|
|
  
THE VIRTUAL OFFICE OF WILLIAM A. DEMBSKI
Home |
Biosketch |
Personal Story |
Articles |
Other Resources
Why Evolutionary Algorithms Cannot Generate Specified Complexity
William A. Dembski
Appeared as Metaviews 152 (www.meta-list.org). 1999/11/1. Approximately 3575 words.
In my last piece for META, I asserted that evolutionary algorithms
cannot generate specified complexity and motivated this assertion by
pointing to the failure of Richard Dawkins's well-known METHINKS IT
IS LIKE A WEASEL example to generate specified complexity. My point
was that Dawkins's evolutionary algorithm converged on METHINKS IT IS
LIKE A WEASEL with probability one, and therefore reduced the
complexity of generating this sequence to zero. With reference to
specified complexity, complexity and probability are inverse notions:
High complexity presupposes many live possibilities and
correspondingly assigns low probability to anyone of these
possibilities. Thus, while it's true that shaking out random scrabble
pieces would render METHINKS IT IS LIKE A WEASEL highly improbable
(and therefore complex), Dawkins's evolutionary algorithm renders
that sequence certain and thereby removes its complexity. Basically,
the problem here is one of setting the relevant probabilistic
context. Within a random-scrabble-shaking-scenario, this sequence is
complex and specified, but within Dawkins's evolutionary algorithm it
is no longer complex (though it remains specified). I therefore
concluded my last piece by saying that just as Darwinian evolution
only delivers the **appearance** of design (an assertion all
Darwinists perforce accept), so too it only delivers the
**appearance** of specified complexity.
In this piece I want to develop this last point further. I want to
sketch a fuller justification for this failure of evolutionary
algorithms to generate specified complexity. Note that I'm not going
to provide a precise mathematical proof here. I'm simply going to
sketch the general mathematical considerations that lead me to regard
the attempt to generate specified complexity via evolutionary
algorithms as a fool's errand. Indeed, I'm going to argue that the
failure of evolutionary algorithms to generate actual specified
complexity counts decisively against their ability as general purpose
problem solvers and by implication raises substantive doubts about
the power of the Darwinian mechanism generally to produce the marvels
increasingly attributed to them. Note that in this piece I am keeping
the body of the text non-technical. Nonetheless, I include a set of
technical appendixes for those who want to see some of the relevant
mathematics.
In general terms, the problem of generating specified complexity
via an evolutionary algorithm can be conceived as follows. We are
given a phase space of possible solutions to a problem and a fitness
function over that phase space. Our task is to optimize this fitness
function by finding a point in the phase space that attains a certain
level of fitness. Think of it this way: The phase space is a vast
plane, the fitness function is a vast hollowed-out mountain-range
over the plane (complete with low-lying foothills and incredibly high
peaks). The task of an evolutionary algorithm is by moving around in
the plane to get to some point under the mountain-range where it
attains at least a certain height (e.g., 10,000 feet). The collection
of all such places on the plane where the mountain range attains at
least that height (here 10,000 feet) we will call the **target**.
Thus the job of the evolutionary algorithm is by navigating the phase
space to find its way into the target (see Appendix 1 below).
Now, the phase space (which we are picturing as a giant plane)
usually comes with some additional topological structure, typically
given by a metric or distance function (see Appendix 2). This
topological structure tells us how points in the phase space are
related geometrically to nearby points. Also, even though the phase
space is huge, it tends to be finite (strictly finite for problems
represented on computer and topologically finite, or what topologists
call "compact," in general). Moreover, such spaces typically come
with a uniform probability that is adapted to the topology of the
phase space (see Appendix 3).
Basically this means that if you get out your tape measure and
measure off a three by five foot area in one part of the phase space,
the uniform probability will assign it the same probability as a
three by five foot area in another portion of the phase space. All
the spaces to which I've seen evolutionary algorithms applied do
indeed satisfy these two conditions of having a finite topological
structure (i.e., they are compact) and possessing a uniform
probability. Moreover, this uniform probability is what typically
gets used to estimate the complexity/improbability of the target
(i.e., the area of the phase space under the mountain range where it
attains a certain requisite level -- e.g., 10,000 feet).
For instance, in Dawkins's METHINKS*IT*IS*LIKE*A*WEASEL example,
the phase space consists of strings of upper case Roman letters and
spaces (represented by asterisks) of length 28. A uniform probability
on this space assigns equal probability to each of these sequences --
approximately 1 in 10^40. It's this improbability that corresponds to
the complexity of the target sequence and with respect to which this
target sequence constitutes an instance of specified complexity.
In general, given a phase space with a target sitting under those
places where the mountain range attains at least a certain level
(e.g., 10,000 feet), the (uniform) probability of randomly choosing a
point from the phase space and landing in the target will be very
small. In Dawkins's example, the target equals the character string
METHINKS*IT*IS*LIKE*A*WEASEL and the improbability is 1 in 10^40. For
non-toy examples the improbability is typically much less than my
universal probability bound of 1 in 10^150 that I justify in The
Design Inference (Cambridge, 1998; cf. section 6.5). Indeed, if the
probability of the target were not small, a random search through the
phase space would suffice to find a point in the target, and there
would be no need to construct an evolutionary algorithm to find
it.
We therefore suppose that the target is just a tiny portion of the
whole phase space; or, in slightly more technical language, the
(uniform) probability of the target in relation to the phase space as
a whole is exceedingly small. What's more, the target, in virtue of
its explicit identification, is specified (certainly this is the case
in Dawkins's example where the target includes but one point and
coincides with the character string METHINKS*IT*IS*LIKE*A*WEASEL).
Thus it would seem that to find a point in the target would be to
generate specified complexity.
But let's look deeper. Consider an evolutionary algorithm that
does in fact find the target. An evolutionary algorithm can be
conceived as a stochastic process that moves around the phase space
some finite number of times (see Appendix 4). Let's call the
evolutionary algorithm E. The evolutionary algorithm starts at some
point E(0) in the phase space (usually chosen at random). Then it
moves to E(1). Then to E(2). Then to E(3). Etc. For E successfully to
find the target (i.e., to find a point under the mountain range where
it attains at least a certain level -- e.g., 10,000 feet) then means
that within a manageable number of steps n, E is very likely to land
in the target -- i.e., some one of E(0), E(1), ..., E(n) is likely to
land in the target (see Appendix 5). Simply put, the algorithm E has
to get us into the target with high probability and in a relatively
short number of steps. In the Dawkins example, E(n) rapidly converged
to METHINKS*IT*IS*LIKE*A*WEASEL for n around 40.
An evolutionary algorithm needs to be contrasted with pure random
sampling. Pure random sampling treats the phase space as a giant urn
from which we draw items at random according to the uniform
probability. In that case, a random sample from M of size k will
contain a point in the target with probability better than 1/2
provided that k is around the reciprocal of the (uniform) probability
of the target. Since we are assuming that the probability of the
target is less than my universal probability bound of 1 in 10^150
given earlier, it follows that k will need to be at least 10^150.
This number is enormous and far exceeds the number of computations
conceivable for any traditional computer. Moreover, it doesn't seem
that quantum computation is going to render this number tractable
either since the points in phase space need to be made explicit in
any random sampling scheme (implying decoherence and thus preventing
us from exploiting quantum superposition).
Let's now return to the evolutionary algorithm E. We're going to
allow ourselves a certain number of steps, call it m, for E to land
in the target. Clearly m is going to have to be much less than 10^150
if we're going to program E on a computer and have any hope of E
landing in the target. With m fixed, we can determine the probability
that E will land in any subset of phase space in m steps (see
Appendix 6). For instance, in the Dawkins example, for m = 100 and
the target sequence METHINKS*IT*IS*LIKE*A*WEASEL and E the cumulative
selection algorithm Dawkins constructed, the probability of E
attaining the target in m = 100 steps is approximately 1.
What this means is that even though with respect to the uniform
probability on the phase space the target has exceedingly small
probability, the probability for the evolutionary algorithm E to get
into the target in m steps is no longer small. And since complexity
and improbability are for the purposes of specified complexity
parallel notions, this means that even though the target is complex
and specified with respect to the uniform probability on the phase
space, it remains specified but is no longer complex with respect to
the probability induced by evolutionary algorithm E.
Does this mean that the evolutionary algorithm has in fact
generated complex specified information, but that in referring to a
loss of complexity with respect to E I'm simply engaging in some
fancy redefinitions to avoid this conclusion? I don't think so.
Remember that we are interested in the **generation** of specified
complexity and not in its reshuffling.
To see what's at stake here, we need to be clear about a
restriction that needs to be placed on E if it is to count as a
genuine evolutionary algorithm (i.e., a legitimate correlative of the
Darwinian mutation-selection mechanism). It is not, for instance,
legitimate for E to be able to survey the mountain range (i.e.,
fitness landscape), see where in the phase space it attains a global
maximum, and then head in that direction. That would be teleology.
No, E must be able to navigate its way to the target either by
randomly choosing points from the phase space or by using those as
starting points and then selecting other points in the phase space
based **solely** on the topology of the phase space and without
recourse to the fitness function, except to evaluate the fitness
function at individual points of the phase space already traversed by
E. In other words, E must move around the phase space only on the
basis of its topology and the elevation of the fitness function at
points in the phase space already traversed by E.
We can think of it this way: E(0), the first point selected by the
evolutionary algorithm, is selected randomly from the phase space
(i.e., with respect to the uniform probability on the phase space);
the fitness function can then be evaluated at E(0) (i.e., we can
determine how high the mountain range is at that point E(0) in phase
space); given only E(0), the fitness function's height at E(0), and
the topology of phase space, E selects E(1); then the height of the
fitness function can be evaluated at E(1); then E(2) is selected
based only on E(0), E(1), the height of the fitness function at these
two points, and the topology of M; etc. In other words: The
evolutionary algorithm E must be independent of the fitness function
except for those points that E has hitherto traversed and then only
insofar as the fitness function is evaluated at those points.
Certainly this means that the evolutionary algorithm E is highly
constrained in the use it can make of the fitness function. But
there's more. It means that the success of E in hitting the target
depends crucially on the structure of the fitness function. If, for
instance, the fitness function is totally flat and close to the
ground whenever it is outside the target, then it fails to
discriminate between points outside the target and so cannot be any
help guiding an evolutionary algorithm into the target. For such a
fitness function, the probability of the evolutionary algorithm
landing in the target is no better than the probability of pure
random sampling landing in the target, which as we know is inadequate
to get us there (see Appendix 7).
But the problem is even worse. It follows by a combinatorial
argument that for any partition of the phase space into pieces none
of which has probability more than the probability of the target
(which by assumption is less than 1 in 10^150), for the vast majority
of these partition elements the probability of the evolutionary
algorithm E entering them is going to be no better than pure random
sampling. It follows that the vast majority of fitness functions on
the phase space that coincide with our original fitness function on
the target but reshuffle the function on the partition elements
outside the target will not land the evolutionary algorithm in the
target (this result is essentially a corollary of the No Free Lunch
theorems by Wolpert and Macready). Simply put, the vast majority of
fitness functions will not guide E into the target even if they
coincide with our original fitness function on the target (see
Appendix 8).
This result refutes the claim that evolutionary algorithms can
generate specified complexity, for it means that they can yield
specified complexity only if such algorithms along with their fitness
functions are carefully adapted to the complex specified targets they
are meant to attain. In other words, all the specified complexity we
get out of an evolutionary algorithm has first to be put into the
construction of the evolutionary algorithm and into the fitness
function that guides the algorithm. Evolutionary algorithms therefore
do not generate or create specified complexity, but merely harness
already existing specified complexity. Like a bump under a rug, the
specified complexity problem has been shifted around, but it has not
been eliminated.
I have omitted many details. I have also omitted some
complications which to my mind make the problem of generating
specified complexity via evolutionary algorithms even more
problematic (in nature, for instance, the fitness function will not
stay fixed but vary over time). Some of the details are treated in
chapter 6 of my recently released Intelligent Design: The Bridge
Between Science & Theology (InterVarsity). A full treatment will
have to await a book I'm currently writing (Redesigning Science: Why
Specified Complexity Is a Reliable Empirical Marker of Actual
Design). But I want to make these preliminary results available
because the misconception that one can purchase specified complexity
on the cheap is widespread and ill-conceived.
The only known generator of specified complexity that we know is
intelligence. Sans intelligence, a process that yields specified
complexity merely converts already existing specified complexity. We
are seeing a similar phenomenon with inflationary cosmologies, which
attempt to wash out cosmological fine-tuning but invariably seem to
smuggle it back in. Smuggling in specified complexity is not the same
as **generating** specified complexity. I challenge the biological
community to take these results seriously, and reevaluate how it
understands the generation of specified complexity.
+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_+_
BRIEF TECHNICAL APPENDIXES
==BEGIN APPENDIX 1== Let's call the phase space M and the fitness
function f. Let's say f maps into the non-negative reals and that by
optimization here we mean maximizing f. Thus we'll want to find a
point in that subset of M where f attains at least a certain level a*
(a* being a positive real number). Let's refer to this subset as the
target T. This subset is represented mathematically as T = {x in M |
f(x) >= a*}, i.e., the set of all x in M where f(x) is at least
a*. Our task, then, is to find some point q in T, i.e., a point q in
the phase space satisfying f(q) >= a*. ==END APPENDIX 1==
==BEGIN APPENDIX 2== A metric or distance function d on M is a
non-negative real-valued function on the cartesian product of M
satisfying the following properties: (1) for all x in M, d(x,x) = 0
(2) for all x and y in M, d(x,y) = d(y,x) [symmetry] (3) for
all x, y, and z in M, d(x,z) <= d(x,y) + d(y,z) [triangle
inequality] ==END APPENDIX 2==
==BEGIN APPENDIX 3== Ordinarily the phase space M satisfies a
uniformizability condition, i.e., M possesses a uniform probability
measure U that is adapted to the metric on M so that geometrically
congruent pieces of M get assigned identical probabilities (for a
general account of this see my article "Uniform Probability,"
Journal of Theoretical Probability, Vol. 3, No. 4, 1990, pp.
611-626). ==END APPENDIX 3==
==BEGIN APPENDIX 4== An evolutionary algorithm E can be conceived
as a stochastic process from the natural numbers N = {0, 1, 2, ...}
into the phase space M, but which will truncated after a certain
point (we don't allow it to circulate in the phase space endlessly).
To call E a stochastic process on the natural numbers N is to say
that E is a measurable function from the cartesian product of NxS
into M where S is a probability space with some probability measure
P. ==END APPENDIX 4==
==BEGIN APPENDIX 5== More formally, E is a stochastic process on
NxS where S has probability P. Thus for E successfully to optimize f
with respect to M means that within a manageable number of steps n,
the set {E(0,s), E(1,s), ..., E(n,s)} contains a point q in T = {x in
M | f(x) >= a*} with high probability for s in S, i.e., P({s in S
| {E(0,s), E(1,s), ..., E(n,s)} intersects T}) is large. ==END
APPENDIX 5==
==BEGIN APPENDIX 6== E induces a probability assignment E* on M
where for each subset A of M (technically, "Borel subset"), E*(A) =
P({s in S | {E(0,s), E(1,s), ... E(m,s)} intersects A}). Note that E*
is not a probability measure since additivity fails. ==END APPENDIX
6==
==BEGIN APPENDIX 7== For such a fitness function f, for E to land
in T = {x in M | f(x) >= a*} in the requisite m number of steps,
cannot exceed m times the uniform probability of the target, i.e.,
the number of points in phase space traversed by E times the uniform
probability of the target. Since that uniform probability is no
greater than 1 in 10^150 and since m must be vastly smaller than
10^150 for E to be computationally tractable, it follows that for
such a fitness landscape, evolutionary algorithms are no better than
taking purely random samples of size m. ==END APPENDIX 7==
==BEGIN APPENDIX 8== Let A(1), A(2), ..., A(r) be a partition of M
(i.e., a mutually exclusive and exhaustive compartmentalization of M)
such that for each element of this partition A(i), U(A(i)) <= U(T)
(< 1 in 10^150). Also assume that one of the A(i)'s is T. Then r
>= 10^150 (because the sum of the U(A(i))'s for i going from 1 to
r is unity). It follows that for any s in S, {E(0,s), E(1,s), ...
E(m,s)} intersects at most m+1 of the partition elements. Since m is
by assumption vastly smaller than 10^150 (it has to be if the
evolutionary algorithm is going to be computationally tractable),
permuting f on the partition elements other than the target T = {x in
M | f(x) >= a*} (thus leaving f fixed on the target) indicates
that only a proportion of around m/r of these permutations will be
likely to land E in T (m/r is an incredibly small number). On the
other hand, the vast majority of these permutations (i.e., a
proportion of 1-m/r) will land in T with probability on the order m x
U(T) (which remains an incredibly small probability). ==END APPENDIX
8==
This essay first appeared in Metanexus:
The Online Forum on Religion and Science
(www.metanexus.net)
and is reproduced here with permission. Copyright © 1999 William A. Dembski.
Copyright © William A. Dembski. All Rights Reserved.
Email this to a friend
copyright
© 1995-2008
Leadership U. All rights reserved.
Updated: 13 July 2002
|