NP-complete problems have attracted many studies, including in the neural network community. Recent advances, obtained both from the analysis of average-case complexity (in contrast to the standard worst-case analysis) and from the design of dynamical systems that tackle NP-complete problems, could benefit from substantial cross-fertilization. For example, average-case analysis gives possible indications for the design of new algorithms that could easily be implemented with neural-network-like techniques.
This is a very informal workshop devoted to the exchange of ideas and experiences between people tackling NP-complete problems (both with standard and non-standard techniques, e.g. neural networks) and researchers in the field of average-case complexity.
A non-exhaustive list of issues of interest is:
Alan L. Yuille, Smith-Kettlewell Eye Research Institute, CA, USA
Fundamental Limits of Bayesian Inference: Order Parameters, Phase Transitions, and Convergence Rates
There is a growing interest in formulating vision problems in terms
of Bayesian inference and, in particular, the maximum a posteriori
(MAP) estimator. This approach involves putting prior probability
distributions, P(X), on the variables X to be inferred and a
conditional distribution P(Y|X) for the measurements Y. For example, X
could denote the position and configuration of a road in an aerial
image and Y could be the aerial image itself (or a filtered version). We
observe that these distributions define a probability distribution
P(X,Y) on the set of problem instances. For the specific task of
detecting roads from aerial images, analysis of this ensemble enables
us to determine fundamental bounds on the performance of the MAP
estimate (independent of the inference algorithm employed). We
demonstrate that performance measures -- such as the accuracy of the
estimate and whether the road can be detected at all -- depend on the
probabilities P(Y|X) and P(X) only through an order parameter K. Intuitively,
K summarizes the strength of local cues (as provided by local edge
filters) together with prior information (i.e. the probable shapes of
roads). We demonstrate that there is a phase transition at a critical
value of the order parameter K -- below this phase transition it is
impossible to detect the road by any algorithm. We also derive closely
related order parameters which determine the time and memory
complexity of search and the accuracy of the solution using the A*
search strategy. We summarize results when the model uses the ``wrong
prior''. Finally, we discuss how our results can be extended to more
general classes of inference problems using large deviation theory (Wu
and Zhu 1999, Yuille, Coughlan, Wu, Zhu 1999).
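As a schematic sketch of the setup (our notation; the precise definitions
are in the cited papers), the MAP estimate and an order parameter of this
kind take roughly the form

    X* = argmax_X P(X|Y) = argmax_X P(Y|X) P(X),
    K ~ D(P_on || P_off) + D(P_geom || U) - log Q,

where D(.||.) denotes the Kullback-Leibler divergence, P_on and P_off are
the distributions of local filter responses on and off the road, P_geom is
the geometric prior on road shape, U is a uniform baseline, and Q is the
branching factor of candidate road continuations; the phase transition then
occurs at K = 0, below which no algorithm can detect the road.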
Back to program.
How can the visual computations derived from information processing
considerations be reduced to networks of biologically plausible
neurons? We present a model of visual computation based on tightly
inter-connected cliques of pyramidal cells. It leads to a formal
theory of cell assemblies, a specific relationship between correlated
firing patterns and abstract functionality, and a direct calculation
relating estimates of cortical cell counts to orientation
hyperacuity. We show that, while in general such computations solve
a linear complementarity problem (believed to be NP-complete) and are
formally related to polymatrix games, in this instance a polynomial
algorithm results. Our network architecture is therefore unique in
that (1) it supports a mode of computation which is both reliable
and efficient; (2) the current-spike relations are modeled as an
analog dynamical system in which the requisite computations can take
place on the time scale required for an early stage of visual
processing; and (3) the dynamics are triggered by the spatio-temporal
response of cortical cells. Joint research with D. Miller.
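For readers unfamiliar with the linear complementarity problem (LCP)
mentioned above, here is a minimal illustrative sketch of ours (unrelated
to the specific network model of the talk): given a matrix M and a vector
q, find z >= 0 such that w = Mz + q >= 0 and z'w = 0. A simple projected
Gauss-Seidel iteration solves well-behaved instances, e.g. those with
symmetric positive definite M:

    import numpy as np

    def lcp_projected_gauss_seidel(M, q, iters=1000, tol=1e-10):
        """Find z >= 0 with w = M z + q >= 0 and z'w = 0.
        Projected Gauss-Seidel sweep; converges for well-behaved M
        (e.g. symmetric positive definite), not for arbitrary LCPs."""
        n = len(q)
        z = np.zeros(n)
        for _ in range(iters):
            z_old = z.copy()
            for i in range(n):
                # Residual with z[i] removed, then project onto z[i] >= 0.
                r = q[i] + M[i] @ z - M[i, i] * z[i]
                z[i] = max(0.0, -r / M[i, i])
            if np.linalg.norm(z - z_old) < tol:
                break
        return z

    # Tiny positive definite example: the solution satisfies z >= 0,
    # w = Mz + q >= 0, and z'w = 0 (here z = [1/3, 1/3], w = [0, 0]).
    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    q = np.array([-1.0, -1.0])
    z = lcp_projected_gauss_seidel(M, q)
    print(z, M @ z + q)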
Back to program.
We present a new (continuous) quadratic programming approach for
(sub)graph- and (sub)tree-isomorphism problems which is based on an
equivalent maximum clique formulation. The approach is centered around a
fundamental result proved by Motzkin and Straus in the mid-1960s, and
recently expanded in various ways, which allows us to formulate the
maximum clique problem in terms of a standard quadratic program. The
attractive feature of this formulation is that a clear one-to-one
correspondence exists between the solutions of the quadratic programs and
those of the original combinatorial problems. To approximately solve the
program we use the so-called ``replicator'' equations, a class of
straightforward continuous- and discrete-time dynamical systems developed
in various branches of theoretical biology. We show how, despite their
inherent inability to escape from local solutions, they nevertheless
provide experimental results which are competitive with those obtained
using more sophisticated mean-field annealing heuristics. Applications of
this approach to shape matching problems arising in computer vision and
pattern recognition are also presented.
The results presented raise some questions concerning the connections between
computational complexity and the ``elusiveness'' of global optima.
[work partly done jointly with K. Siddiqi (McGill) and S. W. Zucker (Yale)]
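As a minimal sketch of the Motzkin-Straus/replicator idea described above
(toy code of ours, not the authors' implementation): for a graph with
adjacency matrix A, the Motzkin-Straus program maximizes x'Ax over the
standard simplex, where the maximum equals 1 - 1/omega(G), and the support
of a limit point of the discrete-time replicator equation indicates a
(locally maximal) clique:

    import numpy as np

    def replicator_clique(A, iters=2000, tol=1e-12):
        """Discrete-time replicator dynamics x_i <- x_i (Ax)_i / (x'Ax)
        for the Motzkin-Straus program; assumes the graph has at least
        one edge. The support of the limit point suggests a clique."""
        n = A.shape[0]
        x = np.full(n, 1.0 / n)            # start from the barycenter
        for _ in range(iters):
            Ax = A @ x
            f = x @ Ax                     # current value of x'Ax
            x_new = x * Ax / f             # replicator update (stays on simplex)
            if np.linalg.norm(x_new - x, 1) < tol:
                return np.where(x_new > 1.0 / (2 * n))[0]
            x = x_new
        return np.where(x > 1.0 / (2 * n))[0]

    # Toy graph: a triangle {0, 1, 2} plus a pendant vertex 3.
    A = np.zeros((4, 4))
    for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
        A[i, j] = A[j, i] = 1.0
    print(replicator_clique(A))            # typically [0 1 2]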
Back to program.
Continuous optimization seems to be the ubiquitous formulation of
an impressive number of different problems in science and engineering.
In this paper, a unified framework for problem solving is proposed
using continuous optimization (continuous problem solving) and
the concept of computational suspiciousness is introduced
to address typical troubles arising when using numerical methods
like gradient descent.
The concept is mainly related to the absence of local minima,
that is, to the classic case in which there is no suspicion that
properly designed numerical algorithms will get stuck and fail to
reach the optimal solution. Some intriguing links with
computational complexity theory are also provided.
An optimal algorithm for solving unimodal problems is given
which is based on a canonical form of gradient descent.
It is shown that knowledge of lower bounds on the complexity of
solving a given problem can give straightforward indications of
its computational suspiciousness.
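To make the ``unsuspicious'' case concrete, here is a toy illustration of
ours (not the canonical form of gradient descent proposed in the talk): on
a unimodal problem there are no local minima to get stuck in, so plain
gradient descent with a suitable step size reaches the global optimum:

    import numpy as np

    def gradient_descent(grad, x0, lr=0.1, iters=500, tol=1e-9):
        """Plain gradient descent; on a unimodal (e.g. strictly convex)
        objective it converges to the unique global minimum."""
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            x = x - lr * g
        return x

    # Unimodal example: f(x) = (x - 3)^2 + 1, with gradient 2(x - 3).
    print(gradient_descent(lambda x: 2 * (x - 3.0), x0=[0.0]))   # ~ [3.]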
Back to program.
This talk will address two interacting themes:
- average-case complexity versus (quasi) worst-case complexity
- simple greedy (conventional) versus neural network
(unconventional) algorithms
With regard to the first, it is well known that random instances (i.e.,
average case ones) are much EASIER to solve (at least near-optimally)
than worst case ones for several problems. This is the case for
the Maximum Clique problem. We will begin by discussing our own experience
with this problem, which is consistent with this fact.
We will then examine a more complex issue, that of generating random instances
that exhibit (quasi) worst-case behavior. One commonly used way to do this
is to generate random instances near a phase transition: instances
generated close to the transition are generally known to be harder than
those far from it. Our approach is very different. We instead replace the UNIFORM
distribution (from which to sample random instances) by a heuristic
efficiently-samplable analog of the Solomonoff-Levin UNIVERSAL distribution. We
will reveal both empirical and theoretical evidence that strongly suggests that
the latter instances are harder than (uniformly) random ones.
With regard to the second theme, we will show that it is the more advanced
of the neural network algorithms in particular that emerge as winners (as compared
to the simple greedy, and the simpler of the neural network methods) on the
UNIVERSALLY-RANDOM instances. On the other hand, all methods perform more or less
the same on UNIFORMLY-RANDOM instances, which is not surprising since the latter
are relatively easy instances.
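As a point of reference for the simple greedy baseline (a toy sketch of
ours, not the authors' code): on a uniformly random graph G(n, 1/2),
greedily growing a clique already finds one of size about log2(n), roughly
half the maximum, which illustrates why UNIFORMLY-RANDOM Maximum Clique
instances are comparatively easy to approximate:

    import numpy as np

    def greedy_clique(A):
        """Greedy heuristic: repeatedly add the candidate vertex with the
        most neighbours among the remaining candidates, then keep only
        candidates adjacent to every vertex chosen so far."""
        n = A.shape[0]
        candidates = set(range(n))
        clique = []
        while candidates:
            v = max(candidates, key=lambda u: sum(A[u, w] for w in candidates))
            clique.append(v)
            candidates = {u for u in candidates if A[u, v] and u != v}
        return clique

    # Uniformly random instance: G(n, 1/2).
    rng = np.random.default_rng(0)
    n = 200
    A = np.triu(rng.random((n, n)) < 0.5, k=1).astype(int)
    A = A + A.T
    print(len(greedy_clique(A)))   # typically around log2(n), i.e. 7-8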
Back to program.