The Steppingstone Problem

And the Limits of Evolutionary Potential

Sean D. Pitman, M.D.

© March 2004

Imagine yourself beside a very wide river. As you look out across this river you see various steppingstones. Close to the bank of the river there are lots of these steppingstones such that the average distance between them is rather minimal. However, you notice that the number of steppingstones rapidly decreases as you look out farther and farther from the bank. The average distance between the stones quickly grows, so that a simple jump from one to the next becomes impossible without getting wet.

This is the fundamental problem faced by evolutionists. How do the mindless processes of random mutation and natural selection get from one novel steppingstone function to the next without getting wet at higher and higher levels of functional complexity which require greater and greater minimum structural threshold requirements (i.e., more and more specifically arranged amino acid residues)?

Meaningless or Meaningful

Of course, random mutations (or "letter changes") to the codes of life do occur quite often in every living thing. These letter changes can result in the evolution of a new type or level of function or in no functional change at all. When no functional change is realized, this is called "neutral evolution."¹ For example, a change from the letter sequence grft to agrft via the addition of the letter a would be a neutral change with respect to meaning in the English language system since both letter sequences are equally meaningless.

The information systems that code for all the parts of living things often have such functionally neutral mutations. In fact, the large majority of all mutational changes are thought to be functionally neutral. What is especially interesting about these neutral mutations is that nature cannot tell the difference between them, since nature only recognizes differences in function, not "spelling." However, on occasion, a mutation will actually change the meaning or function of a genetic word or phrase.

For example, if the spelling of vacation happened to get "mutated" to read vocation or even vucation, there would be a big change in meaning. Of course the word vucation has no meaning in the English language, but a loss of the meaning of the word vacation might be beneficial in certain circumstances, as would the gain of the meaning of the word vocation. Such meaningful changes, when they happen in the genetic codes of living things, can be detected by natural selection as either beneficial or detrimental. If they are deemed to be beneficial, they are kept for the next generation to use, but if detrimental, they are eliminated from the gene pool over the course of time.

A Brutal Game

Nature plays a brutal game of competition, where the strongest survives to pass on genetic information while the weakest, along with the weaker genetic information, dies out. However brutal this game of survival is, it is a real game and it works very well as a preserving force that keeps the strong and gets rid of the weak. The question is, are there any examples of mindless evolutionary processes actually creating novel functions that were not there before?

The clear answer to this question is yes; mindless evolutionary processes do actually create novel functions that were never there before. For example, antibiotic resistance is a famous case of evolution in action. As it turns out, all bacteria seem to be able to rapidly evolve de novo resistance to just about any antibiotic that comes their way. But how, exactly, do such novel functions evolve?

Antibiotic Resistance

In the case of de novo antibiotic resistance, such rapid evolution is made possible because there are so many beneficial "steppingstones" so close together, right beside what the bacterial colony already has. Success is only one or two mutational steps away in many different directions since a multitude of different single mutations will result in a beneficial increase in resistance. How is this possible?

In short, this is made possible because of the way in which antibiotics work. All antibiotics attack rather specific target sequences inside certain bacteria. Many times all the colony under attack has to do is alter the target sequence in just one bacterium by one or two genetic "characters" and resistance will be gained since the offspring of this resistant bacterium, being more fit than their peers, will take over the colony in short order. A simple "spelling change" made the target less recognizable to the antibiotic, and so the antibiotic became less effective. In other words, the pre-established antibiotic-target interaction was damaged or destroyed by one or two monkey-wrench mutations. As with Humpty Dumpty and all the king's men, it is far easier to destroy or interfere with a pre-established function or interaction than it is to create a new one, since there are so many more ways to destroy than there are to create.

So, do all functions within living things evolve as easily as the antibiotic resistance function? As it turns out, those independent functions that are not based on the destruction of or interference with other pre-established functions are much more difficult to evolve. For example, single protein enzymes catalyze many biochemical events within living things. They help to build and break down other molecules via their own independent abilities, which are not based on the gain or loss of any other system, function, or interaction.

Consider that several forms of antibiotic resistance are based on the production and activity of various enzymes. Perhaps the most famous anti-antibiotic enzyme is the penicillinase enzyme, which is produced by various bacteria having the proper penicillinase code in their DNA. What the penicillinase enzyme does is chop up part of the penicillin antibiotic so that it can no longer attack its target and kill the bacterium. Many people think that bacteria evolve this enzyme just like they can evolve other forms of antibiotic resistance. This is simply untrue.

All the King's Horses

The information required to produce an enzyme which is specific enough to chop up penicillin is far greater than the information required to block the antibiotic-target interaction, since there are far fewer ways to make such a specific enzymatic function compared to the number of ways to block a specific antibiotic function. Creating a block to a previous function is like breaking Humpty Dumpty, while creating the function of an independent enzyme is like putting Humpty Dumpty back together again.

As it turns out, the required code needed for producing the penicillinase enzyme has never been observed to evolve in any bacterial colony de novo. Either a penicillinase-producing colony already had this code before it was exposed to penicillin, or it gained this code by genetic transfer from some other bacterial population that already had the code.² Simply put, the penicillinase enzyme does not evolve, or at least not often enough to have been observed in real time, while other forms of antibiotic resistance that are based on interference with or destruction of pre-established functions or interactions evolve all the time.

Evolution in Action?

But what about other enzymes? Have any novel enzymatic functions ever been shown to evolve in real time? Interestingly enough, several enzymes with entirely new and beneficial functions have been shown to evolve in real time. For example, Kenneth Miller, in his book, Finding Darwin's God, references a very interesting research study published by Barry Hall, an evolutionary biologist from the University of Rochester.³

In this study, Hall deleted the lactase genes in certain E. coli bacteria. These genes produced and regulated the production of a lactase enzyme called β-galactosidase. What this enzyme does is break apart a type of sugar molecule called lactose into two smaller sugar molecules called glucose and galactose, both of which E. coli can use for energy production. Obviously then, without the genes needed to make this lactase enzyme, the mutant E. coli were no longer able to use lactose for energy despite being placed in a lactose-enriched environment, unless of course they evolved a new enzyme to replace the one that they lost. And sure enough, they did just that. In just one or two generations these E. coli successfully evolved a brand new gene that produced a new lactase enzyme. Aha! Evolution in action yet again!

Although most descriptions of Hall's experiments stop right here, including the one found in Miller's book, what Hall did next is most interesting. He deleted the newly evolved gene as well, to see if any other gene would evolve the lactase function . . . and nothing happened! Despite tens of thousands of generations with large population numbers and high mutation rates, no new lactase enzyme evolved. Hall himself noted in his paper that these double mutant bacteria seemed to have "limited evolutionary potential."

Limited Potential

Other unfortunate bacteria seem to be just as limited in their evolutionary potential. Even though they would significantly benefit, many types of bacteria, after more than a million generations (more generations than it supposedly took humans to evolve from ape-like creatures), have not been observed to evolve a relatively simple lactase enzyme. One should also note that these same bacteria, unable to evolve a lactase enzyme, are all able to evolve, in relatively short order, resistance to any antibiotic that comes their way. So what is it, exactly, that "limits" the evolutionary potential of living things, like bacteria, in their ability to evolve some functions but not others?

I propose that the answer can be found in the number and density of beneficial "steppingstones" available (in the form of genetic sequences). For forms of antibiotic resistance that are gained by blocking the antibiotic-target function, there are lots of beneficial steppingstones very close together, but not so for the enzymatic functions of lactase or penicillinase. Relatively speaking, there are very few such enzymes, compared to the total number of possible sequences.

For example, there are 676 potential two-letter words in the English language. Of these, 96 are defined as meaningful, creating a ratio of meaningful to meaningless of 1 in 7. Now, there are 296 more meaningful three-letter words, totaling 972, but the total number of potential words increases 26 fold to 17,576. Since the number of meaningful words only increased by a fraction of this amount, the ratio of meaningful to meaningless dropped to 1 in 18.
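
As a quick check on this arithmetic, the short Python sketch below recomputes both ratios. The counts of meaningful words (96 and 972) are taken from the text as given; the names ALPHABET_SIZE, MEANINGFUL, and one_in are illustrative only.

```python
ALPHABET_SIZE = 26

# Meaningful-word counts quoted in the text (word length -> count).
# The dictionary used to obtain them is not specified there.
MEANINGFUL = {2: 96, 3: 972}


def one_in(length: int) -> float:
    """Return N such that about 1 in N letter strings of this length is meaningful."""
    total = ALPHABET_SIZE ** length
    return total / MEANINGFUL[length]


for length in sorted(MEANINGFUL):
    total = ALPHABET_SIZE ** length
    print(f"{length} letters: {total:,} possible strings, "
          f"about 1 in {one_in(length):.1f} meaningful")
```

The total grows 26 fold with each added letter, so the ratio falls whenever the count of meaningful words grows by less than that factor.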

A Random Walk

Still, such ratios are relatively high, and a random walk can get from any one-, two-, or three-letter word to any other via a path of meaningful words, as in the steppingstone sequence of cat - hat - bat - bad - bid - did - dig - dog. "Evolution" (changing meaning or "function") at this level is rather simple because the steppingstones are so close together. But, with each additional minimum letter requirement, the growth of the meaningless sequences quickly outpaces the growth of the total number of meaningful sequences, and the ratio of meaningful to meaningless gets smaller and smaller at an exponential rate.
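
The cat-to-dog sequence can be checked mechanically. The Python sketch below confirms that each step changes exactly one letter; it does not check that each word is meaningful, which would require a dictionary. The function names hamming and is_single_step_path are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_single_step_path(words: list[str]) -> bool:
    """True if every consecutive pair of words differs by exactly one letter."""
    return all(hamming(a, b) == 1 for a, b in zip(words, words[1:]))


# The steppingstone sequence quoted in the text.
LADDER = ["cat", "hat", "bat", "bad", "bid", "did", "dig", "dog"]

print(is_single_step_path(LADDER))   # every step is a single "mutation"
print(hamming("cat", "dog"))         # letters that differ between the endpoints
```

A direct jump from cat to dog would need all three letters to change at once, while the path reaches it through single-letter steps that each land on a word.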

For example, there are around 30,000 meaningful seven-letter words and combinations of smaller words totaling seven letters, but there are 8,031,810,176 potential seven-letter sequences. This produces a situation in which an average meaningful seven-letter sequence is surrounded by over 250,000 meaningless sequences. Obviously then, compared to three-letter steppingstones, it is much harder to "evolve" between meaningful seven-letter steppingstones without having to cross through a little ocean of meaningless sequences.
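
To put a number on "close by," the Python sketch below counts the single-letter substitutions available to a word and estimates how many of them would land on another meaningful sequence. It rests on an assumption not made in the text: that meaningful sequences are spread uniformly at random through the space, which real vocabularies only approximate. The word counts are those quoted above; the function names are illustrative.

```python
ALPHABET_SIZE = 26


def single_substitution_neighbors(length: int) -> int:
    """Number of strings reachable by changing exactly one letter."""
    return length * (ALPHABET_SIZE - 1)


def expected_meaningful_neighbors(length: int, meaningful: int) -> float:
    """Expected meaningful one-step neighbors, ASSUMING a uniform random spread."""
    density = meaningful / ALPHABET_SIZE ** length
    return single_substitution_neighbors(length) * density


total_seven = ALPHABET_SIZE ** 7
print(f"seven-letter strings: {total_seven:,}")
print(f"strings per meaningful sequence: {total_seven / 30_000:,.0f}")
print(f"expected neighbors, 3 letters: {expected_meaningful_neighbors(3, 972):.2f}")
print(f"expected neighbors, 7 letters: {expected_meaningful_neighbors(7, 30_000):.5f}")
```

Under that assumption a three-letter word has several meaningful neighbors one step away, while a seven-letter sequence has, on average, far less than one.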

The same thing happens with the genetic codes in living things. The more genetic letters that are required to achieve a particular function, and the higher the level of the specificity of their arrangement, the more junk there is compared to the relatively few beneficial sequences at such a level of complexity.

For example, a simple BLAST⁴ database search of known proteins will show that the shortest working lactase enzyme found in a living organism seems to require well over 400 amino acids at minimum with at least a fair degree of specificity. Some estimates suggest that the total number of beneficial sequences at the 400-amino-acid level of specified complexity totals less than 10¹⁰⁰ sequences.⁵,⁶ Now, considering that the total number of atoms in the entire known universe is around 10⁸⁰, this 10¹⁰⁰ number seems absolutely huge!⁷ Huge, that is, until one considers that there are over 10⁵²⁰ possible sequences at this level of complexity (20 possible amino acids at each of 400 positions), which creates a ratio of beneficial to non-beneficial sequences of roughly 1 in 10⁴²⁰ (which is like finding a single atom in zillions of universes).
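
These numbers are too large to handle directly, but their base-10 logarithms are easy to work with. The Python sketch below takes the 400-residue length and the 10¹⁰⁰ upper estimate from the text, assumes the standard 20 amino acids, and computes the exponent of the total sequence count and of the resulting ratio. The variable names are illustrative.

```python
import math

AMINO_ACIDS = 20          # standard amino acid alphabet (assumed)
LENGTH = 400              # minimum residues quoted in the text
LOG10_BENEFICIAL = 100    # upper estimate of beneficial sequences, as an exponent

# log10(20**400) = 400 * log10(20)
log10_total = LENGTH * math.log10(AMINO_ACIDS)

# Dividing the two counts means subtracting their exponents.
log10_ratio = log10_total - LOG10_BENEFICIAL

print(f"possible sequences: about 10^{log10_total:.1f}")
print(f"beneficial to total: about 1 in 10^{log10_ratio:.1f}")
```

The exponent of the ratio is simply the difference between the two exponents, so any revision of the estimate of beneficial sequences shifts the result by the same amount.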

Notice also, in the Choi and Kim paper (illustrated in the figure above), their "global view of the protein structure space." As they describe it, "1,898 nonredundant protein structures from Protein Data Bank are mapped in the 3D space [down from the hyperdimensional space of protein-sequence space] to visualize the major feature of the map. The protein structure space is sparsely populated, and all of the proteins of known structures cluster mostly into four elongated regions, which correspond approximately to four SCOP classes (all-α, all-β, α+β, and α/β) of protein structures indicated by red, yellow, purple, and cyan spheres, respectively. The small proteins and multidomain protein classes are represented by green and black spheres, respectively. All structural class assignments were based on the SCOP classification. Three axes are drawn in to visualize high-population regions of all-α, all-β, and α/β class proteins, and the "origin" is represented by a large orange ball at the point where two of the axes meet." ⁸

Given this description, notice how the small proteins (green spheres) are much more closely spaced and clustered together compared to the multidomain proteins (black spheres) and other larger proteins (other colors), which occupy much larger sequence spaces. There is a progressive increase in the average distance between beneficial protein structures with increasing size requirements. This feature is illustrated in an even clearer way in the figure below (c). In this figure you will note a size scale where the shortest proteins are colored dark blue, medium-sized proteins green to yellow, and the largest proteins red. Guess which beneficial protein systems have the greatest average distance from each other?

Again, this only highlights the fact that increasing structural threshold requirements produce a markedly lower ratio and wider non-beneficial gaps between potentially viable and beneficial protein-based systems in sequence/structural space. Also consider that the three-dimensional illustration presented dramatically understates the actual distances that exist in hyperdimensional sequence space. It is like projecting the shadows of a large number of widely spaced objects that exist in three-dimensional space onto a two-dimensional screen. The resulting dots on the two-dimensional screen would appear much closer together than they really are in three-dimensional space. Now, extrapolate this effect by hundreds and thousands of dimensions (one extra dimension for every one amino acid residue increase in protein system size) to understand the true gap distances illustrated by Choi and Kim. Commenting on this projection, Choi and Kim write:

"The dissimilarity matrix then was subject to the classical multidimensional scaling (MDS) procedure to find the positional coordinates in a multidimensional (1,898 dimension) space of the protein structure universe. We used S_99.95 to prevent a few extremely large similarity scores from dominating the distribution feature of the structural space map. To capture and visualize the major features of the high dimensional space, we represent the protein structure space in three dimensions by using the three components with highest eigenvalues, which are substantially greater than the rest." ⁸

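The shadow analogy can be illustrated numerically. The Python sketch below scatters random points in a 400-dimensional unit cube and compares their true pairwise distances with the distances seen when only three coordinates are kept. This is a simplified stand-in, not the MDS procedure Choi and Kim used: dropping coordinates is just the simplest kind of projection, and the point count, dimension, and seed are arbitrary choices.

```python
import math
import random


def distance(p, q, dims=None):
    """Euclidean distance using only the first `dims` coordinates (all if None)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p[:dims], q[:dims])))


random.seed(0)            # fixed seed so the run is repeatable
DIMENSIONS = 400          # arbitrary illustrative choice
points = [[random.random() for _ in range(DIMENSIONS)] for _ in range(20)]

# Every unordered pair of points.
pairs = [(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

full = [distance(p, q) for p, q in pairs]         # distances in all dimensions
shadow = [distance(p, q, 3) for p, q in pairs]    # distances in the 3-D "shadow"

mean_full = sum(full) / len(full)
mean_shadow = sum(shadow) / len(shadow)
print(f"mean distance in {DIMENSIONS} dimensions: {mean_full:.2f}")
print(f"mean distance in the 3-D projection: {mean_shadow:.2f}")
```

Discarding coordinates can only remove terms from the sum under the square root, so no pair of points ever appears farther apart in the projection than it really is.
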
Erich Bornberg-Bauer's paper dealing with model protein structures (comparable to real proteins) supports the notion that sequence space is sparsely populated with fairly evenly distributed viable proteins even at low levels of structural threshold requirements. I propose that these features only become exponentially more accentuated with each step up the ladder of minimum structural threshold requirements.

"Roughly speaking, however, distances are randomly distributed. This means that, although only a small fraction of sequence space yields uniquely folding sequences, sequence space is occupied nearly uniformly. No "higher order" clustering (i.e., except the trivial case of the homologous sequences) is visible." ⁹

Real Life

Of course, since nature cannot tell the difference between two meaningless genetic sequences, it cannot select between them, making natural selection blind to such neutral changes. Since there are no recognizable "steppingstones" close by, all that nature has left, to find new beneficial sequences, is a blind random walk through enormous piles of junk sequences. Of course, this random, curvy walk takes a lot longer than a direct walk would take, and the time involved increases exponentially with each increase in the minimum sequence and specificity requirements for a particular function. Random selection of sequences within sequence space starting from a beneficial island (like throwing darts at a dartboard) has no statistical advantage when it comes to finding novel beneficial sequences over neutral random walk. This prediction is reflected in real life by an exponential decline in the ability of mindless evolutionary processes to evolve anything beyond the lowest levels of functional complexity.

Many simple functions, such as de novo antibiotic resistance, are easy to evolve for any bacterial colony in short order. Moving up a level of complexity, there are far fewer examples of single protein enzymes evolving where a few hundred amino acids at minimum are required to work together at the same time (and many types of bacteria cannot evolve even at this level). However, there are absolutely no examples in the scientific literature of any function requiring more than a thousand or so amino acids working at the same time (as in the simplest bacterial motility system) ever evolving, period. The beneficial "steppingstones" are just too far apart due to all the junk that separates the few beneficial islands of function from every other island in the vast universe of junk sequences at such levels of informational complexity. The average time needed to randomly sort through enough junk sequences to find any other beneficial function at such a level of complexity quickly works its way into trillions upon trillions of years, even for an enormous population of bacteria (all the bacteria on Earth, about 10³⁰) with a high mutation rate (one mutation per 100,000 base pairs per individual every 20 minutes).
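
The shape of such a time estimate can be sketched in a few lines of Python. This is not the calculation behind the figure above, only an illustration under loudly stated assumptions: the population and generation time come from the text, but the sketch assumes each individual tries one new sequence per generation, that no sequence is ever tried twice, and it uses a purely hypothetical gap of 10⁶⁰ sequences as a placeholder. The per-base-pair mutation rate is not used because no genome length is given.

```python
import math

POPULATION = 1e30               # all bacteria on Earth, figure from the text
MINUTES_PER_GENERATION = 20     # generation time from the text
TRIALS_PER_INDIVIDUAL = 1.0     # ASSUMPTION: one new sequence per generation

GENERATIONS_PER_YEAR = 365.25 * 24 * 60 / MINUTES_PER_GENERATION


def log10_trials_per_year() -> float:
    """log10 of the sequences sampled per year by the whole population."""
    return math.log10(POPULATION * GENERATIONS_PER_YEAR * TRIALS_PER_INDIVIDUAL)


def log10_years_to_sample(log10_sequences: float) -> float:
    """log10 of the years needed to try 10**log10_sequences sequences once each."""
    return log10_sequences - log10_trials_per_year()


HYPOTHETICAL_GAP = 60   # placeholder exponent, not a figure from the text
print(f"trials per year: about 10^{log10_trials_per_year():.1f}")
print(f"years to sample 10^{HYPOTHETICAL_GAP} sequences: "
      f"about 10^{log10_years_to_sample(HYPOTHETICAL_GAP):.1f}")
```

Because the result is a difference of exponents, each additional factor of ten in the gap adds a factor of ten to the waiting time, whatever the starting assumptions.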

At this point the mindless processes of evolution simply become untenable as any sort of viable explanation for the high levels of diverse complexity that we see within all living things. The only process left that is known to give rise to functional systems at comparable levels of complexity involves human intelligence or beyond. No lesser intelligence, and certainly no other known mindless processes, have ever come close to producing something like the informational complexity found in the simplest bacterial motility system.

For the invisible things of him from the creation of the world are clearly seen, being understood by the things that are made, even his eternal power and Godhead. (Romans 1:20)

1. Kimura, M. 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press.

2. Pitman, S.D. 2003. Antibiotic resistance. http://naturalselection.0catch.com/Files/antibioticresistance.html

3. Hall, B.G. 1982. Evolution on a petri dish: the evolved β-galactosidase system as a model for studying acquisitive evolution in the laboratory. Evolutionary Biology 15:85-150.

4. BLAST Search: http://www.ncbi.nlm.nih.gov/BLAST

5. Yockey, H.P. 1992. Information Theory and Molecular Biology. Cambridge University Press, pp. 255, 257.

6. Yockey, H.P. 1977. On the information content of cytochrome c. Journal of Theoretical Biology 67:345-376.

7. Anonymous. n.d. The Universe. National Solar Observatory, Sacramento Peak. http://www.nso.edu/sunspot/pr/answerbook/universe.html/ [Ed. note: The number of atoms according to this reference is estimated to be 10⁷⁹.]

8. Choi, I.-G., and Kim, S.-H. 2006. Evolution of protein structural classes and protein sequence families. PNAS 103(38):14056-14061.

9. Bornberg-Bauer, E. 1997. How are model protein structures distributed in sequence space? Biophysical Journal 73:2393-2403.
