The notion of NP-completeness, introduced by Cook [Coo71] and independently by Levin [Lev73], has provided a rigorous mathematical definition for measuring intractability of NP problems. Karp [Kar72] introduced the methodology of many-one reductions and demonstrated the rich variety of NP-complete problems; and just before 1979, several hundreds of additional NP-complete problems were discovered [GJ79]. Since then, many more computational problems from almost all areas involve computing have been shown to be NP-complete.
Despite many years of intensive study,
no efficient algorithms have
been found for NP-complete problems,
and so NP-complete problems are
generally thought of as being computationally intractable.
However, NP-completeness is a worst-case concept that
does not provide information about how difficult an NP-complete
problem might be on the average case. Indeed,
several NP-complete problems have been shown to be tractable ``on average.''
For example,
although HAMILTONIAN PATH is NP-complete, Gurevich and Shelah
[GS87] have shown that if a graph is chosen in a certain
reasonable way, then the HAMILTONIAN PATH problem can be
solved by a deterministic algorithm whose expected running
time on graphs with
vertices is bounded by a linear polynomial in
.
Also, despite the fact that GRAPH 3-COLORABILITY is NP-complete, Wilf
[Wil84] has shown that it can be solved by
a deterministic algorithm whose expected
running time on graphs with
vertices under a uniform distribution
is actually bounded by a
constant. Thus, the average-case complexity of a problem is, in many
respects, a more significant measure than its worst-case complexity.
Given a problem and a distribution on instances, finding an expected polynomial-time algorithm to solve the problem or proving that such an algorithm does not exist is an important issue. There are two central notions needed in studying this issue along similar lines to the theory of NP-completeness. Namely, a notion for measuring efficiency on the average case and a notion of completeness for measuring intractability on the average case.
Expected polynomial time is a natural notion
to measure average efficiency of an algorithm.
But it is, among other things,
machine dependent (see, for example, [Gur91]
[Ven91]
[Wan96]),
and so it cannot be used to build a general
theory on average-case intractability for NP
problems. To overcome this obstacle,
Levin defined a robust notion on what it
means for the running time of a deterministic algorithm to be polynomial
on average with respect to the input distribution.
A problem with an associated probability distribution is called
a distributional problem.
A distributional problem
is said to be in AP if
can be
solved by a deterministic algorithm whose running time
is polynomial on average with respect to
.
Levin then defined reductions between distributional problems
in such a way that reductions are transitive and if a distributional
problem is reducible to a second distributional problem which is in
AP, then the original distributional problem is also in AP.
With this machinery in place, Levin showed that distributional
TILING with a simple distribution is average-case NP-complete,
meaning that it is not in AP unless every NP problem under
every polynomial-time computable distribution is in AP.
Since then,
several additional average-case NP-complete problems have been found.
Among them are
the distributional Post correspondence problem [Gur91] and
the distributional word problem
for finitely presented groups [Wan95b].
However, as pointed out by Gurevich [Gur91][Gur87],
the reduction defined in [Lev86] has
certain limitations.
Gurevich showed that using such a reduction,
a distributional
problem under a ``flat'' distribution cannot be complete unless
nondeterministic exponential time collapses to deterministic
exponential time. A distribution
is flat if
there exists an
such that for all
,
, where
denotes the
size of
.
Wang and Belanger [WB95] further showed
that, without any assumption, a distributional problem under
a flat distribution cannot be complete under one-one and
p-length-preserving
reductions.
To overcome this obstacle, Venkatesan and Levin [VL88]
introduced the notion of randomized reductions, and they
showed that under such a
reduction,
a graph edge coloring problem with a flat distribution is
average-case NP-complete.
Several additional average-case NP-complete problems with flat distributions
have been found since then. Among them are
the distributional matrix transformation problem [BG95][Gur90] and
the distributional matrix representability problem [VR92].
Distributions on instances of all these average-case NP-complete
problems are simple in that the components of an instance are
selected uniformly at random. Such distributions are polynomial-time
computable.
One might suspect that a more complex distribution
would make it easier to generate hard instances of an NP problem and so
additional average-case NP-complete
problems could be found. This motivated Ben-David et. al. [BCGL92]
to study p-samplable distributions.
A distribution
is p-samplable if there is a randomized algorithm that takes no
inputs and outputs
with probability
in polynomial time of
.
They then showed that
for any standard NP-complete problem, there is a p-samplable distribution
that makes it to be average-case complete. But this p-samplable
distribution, constructed by an enumeration technique,
is far from being considered as natural.
Impagliazzo and Levin [IL90] later showed that
for NP search problems, p-samplable distributions do not
generate harder instances than simply picking instances uniformly at
random. In particular, they showed that any NP search problem
with a p-samplable distribution is reducible to an NP search
problem with a uniform distribution under a randomized reduction.
For decision problems, the same result can be obtained
using a randomized truth-table reduction [BCGL92]
(see also [Wan96]).
Hence, without loss of generality, we shall only focus on
polynomial-time computable distributions.
Two other average time measurements have been recently proposed in [RS93] and [CS96], respectively, in an effort to refine Levin's notion of average time. The definition of average time proposed in [RS93] does not provide harder NP problems [BW][BW95]. The definition of average time proposed in [CS96] provides the same AP as Levin's under a certain reasonable condition on distributions [CS96], and that condition is satisfied in the discussion here. Thus, as far as average-case NP-completeness is concerned, Levin's average time measurement is sufficient, which is also easier to work with.
This paper is organized as follows. Basic definitions and results are presented in Section 2. A list of average-case NP-complete problems are presented in Section 3. Completeness proofs of these problems are given in Sections 4, 5, and 6. Some final remarks and open problems are given in Section 7.