By Topic

IEEE Quick Preview
  • Abstract
  • Authors
  • Figures
  • Multimedia
  • References
  • Cited By
  • Keywords

Conservation of Information in Search: Measuring the Cost of Success

Conservation of information theorems indicate that any search algorithm performs, on average, as well as random search without replacement unless it takes advantage of problem-specific information about the search target or the search-space structure. Combinatorics shows that even a moderately sized search requires problem-specific information to be successful. Computers, despite their speed in performing queries, are completely inadequate for resolving even moderately sized search problems without accurate information to guide them. We propose three measures to characterize the information required for successful search: 1) endogenous information, which measures the difficulty of finding a target using random search; 2) exogenous information, which measures the difficulty that remains in finding a target once a search takes advantage of problem-specific information; and 3) active information, which, as the difference between endogenous and exogenous information, measures the contribution of problem-specific information for successfully finding a target. This paper develops a methodology based on these information measures to gauge the effectiveness with which problem-specific information facilitates successful search. It then applies this methodology to various search tools widely used in evolutionary search.



OVER 50 years ago, Leon Brillouin, a pioneer in information theory, wrote “The [computing] machine does not create any new information, but it performs a very valuable transformation of known information” [5]. When Brillouin's insight is applied to search algorithms that do not employ specific information about the problem being addressed, one finds that no search performs consistently better than any other. Accordingly, there is no “magic-bullet” search algorithm that successfully resolves all problems [10], [44].

In the last decade, Brillouin's insight has been analytically formalized under various names and in various ways in regard to computer-optimization search and machine-learning algorithms.

  1. Schaffer's Law of Conservation of Generalization Performance [39] states “a learner … that achieves at least mildly better-than-chance performance [without specific information about the problem in question]… is like a perpetual motion machine.”
  2. The no free lunch theorem (NFLT) [46] likewise establishes the need for specific information about the search target to improve the chances of a successful search. “[U]nless you can make prior assumptions about the … [problems] you are working on, then no search strategy, no matter how sophisticated, can be expected to perform better than any other” [23]. Search can be improved only by “incorporating problem-specific knowledge into the behavior of the [optimization or search] algorithm” [46].
  3. English's Law of Conservation of Information (COI) [15] notes “the futility of attempting to design a generally superior optimizer” without problem-specific information about the search.

When applying Brillouin's insight to search algorithms (including optimization, machine-learning procedures, and evolutionary computing), we use the umbrella-phrase COI. COI establishes that one search algorithm will work, on average, as well as any other when there is no information about the target of the search or the search space. The significance of COI has been debated since its popularization through the NFLT. On the one hand, COI has a leveling effect, rendering the average performance of all algorithms equivalent. On the other hand, certain search techniques perform remarkably well, distinguishing themselves from others. There is a tension here, but no contradiction. For instance, particle-swarm optimization [2], [14], [16], [22] and genetic algorithms [1], [16], [19], [36], [37], [38], [45], [49], [51] perform well on a wide spectrum of problems. Yet, there is no discrepancy between the successful experience of practitioners with such versatile search algorithms and the COI-imposed inability of the search algorithms themselves to create novel information [7], [13], [15]. Such information does not magically materialize but instead results from the action of the programmer who prescribes how knowledge about the problem gets folded into the search algorithm.

Practitioners in the earlier days of computing sometimes referred to themselves as “penalty function artists” [21], a designation that reflects the need for the search practitioner to inject knowledge about the sought-after solution into the search procedure. Problem-specific information is almost always embedded in search algorithms. Yet, because this information can be so familiar, we can fail to notice its presence.

COI does not preclude better-than-average performance on a specific problem class [8], [18], [20], [31], [35], [43]. For instance, when choosing the best Formula$k$ of Formula$n$ features for a classification problem, hill-climbing approaches are much less effective than branch-and-bound approaches [25]. Hill-climbing algorithms require a smooth fitness surface uncharacteristic of the feature-selection problem. This simple insight when smooth fitness functions are appropriate constitutes knowledge about the search solution; moreover, utilizing this knowledge injects information into the search. Likewise, the ability to eliminate large regions of the search space in branch-and-bound algorithms supplies problem-specific information. In these examples, the information source is obvious. In other instances, it can be more subtle.

Accuracy of problem-specific information is a fundamental requirement for increasing the probability of success in a search. Such information cannot be offered blindly but must accurately reflect the location of the target or the search-space structure. For example, to open a locked safe with better-than-average performance, we must either do the following.

  1. Know something about the combination, e.g., knowing all of the numbers in the combination are odd is an example of target information.
  2. Know something about how the lock works, e.g., listening to the lock's tumblers during access is an example of search-space structure information.

In either case, the information must be accurate. If we are incorrectly told, for example, that all of the digits to a lock's combination are odd when in fact they are all even, and we therefore only choose odd digits, then the probability of success nosedives. In that case, we would have been better simply to perform a random search.

Recognition of the inability of search algorithms in themselves to create information is “very useful, particularly in light of some of the sometimes-outrageous claims that had been made of specific optimization algorithms” [7]. Indeed, when the results of an algorithm for even a moderately sized search problem are “too good to be true” or otherwise “overly impressive,” we are faced with one of two inescapable alternatives.

  1. The search problem under consideration, as gauged by random search, is not as difficult as it first appears. Just because a convoluted algorithm resolves a problem does not mean that the problem itself is difficult. Algorithms can be like Rube Goldberg devices that resolve simple problems in complex ways. Thus, the search problem itself can be relatively simple and, from a random-query perspective, have a high probability of success.
  2. The other inescapable alternative, for difficult problems, is that problem-specific information has been successfully incorporated into the search.

As with many important insights, reflection reveals the obviousness of COI and the need to incorporate problem-specific information in search algorithms. Yet, despite the widespread dissemination of COI among search practitioners, there is to date no general method to apply COI to the analysis and design of search algorithms. We propose a method that does so by measuring the contribution of problem-specific target and search-space structure information to search problems.



If we have a combination lock with five thumb wheels, each numbered from zero to nine, our chances of opening the lock in eight tries or less is not dependent on the ordering of the combinations tried. Using the results of the first two failed combinations, for example, [0, 1, 2, 3, 4] and [4, 3, 2, 1, 0], tells us nothing about what the third try should be. After eight tries at the combination lock with no duplicate combinations allowed, no matter how clever the method used, the chances of choosing the single combination that opens the lock is Formula$p = 1 - (99\,992/100\,000) = 8.00 \times 10^{-5}$ corresponding to an endogenous information of Formula$I_{\Omega} = -\log_{2}p = 13.6\ \hbox{b}$. The endogenous information provides a baseline difficulty for the problem to be solved. In this example, the problem difficulty is about the same as sequentially forecasting the results of 14 flips of a fair coin.

To increase the probability of finding the correct combination, more information is required. Suppose, for example, we know that all of the single digits in the combination that opens the lock are even. There are Formula$5^{5} = 3125$ such combinations. Choosing the single successful combination in eight tries or less is Formula$q = 1 - (3117/3125) = 2.56 \times 10^{-3}$, or exogenous information Formula$I_{S} = 8.61\ \hbox{b}$. The difference, Formula$I_{+} = I_{\Omega} - I_{S} = 5.00\ \hbox{b}$, is the active information and assesses numerically the problem-specific information incorporated into the search that all the digits are even.

More formally, let a probability space Formula$\Omega$ be partitioned into a set of acceptable solutions Formula$T$ and a set of unacceptable solutions Formula$\overline{T}$. The search problem is to find a point in Formula$T$ as expediently as possible. With no problem-specific information, the distribution of the space is assumed uniform. This assumption dates to the 18th century and Bernoulli's principle of insufficient reason [4], which states that “in the absence of any prior knowledge, we must assume that the events [in Formula$\Omega$] … have equal probability” [34]. Problem-specific information includes both target-location information and search-space structure information. Uniformity is equivalent to an assumption of maximum (informational) entropy on the search space. The assumption of maximum entropy is useful in optimization and is, for example, foundational in Burg's maximum entropy method [6], [9], [34].

The probability of choosing an element from Formula$T$ in Formula$\Omega$ in a single query is then Formula TeX Source where Formula$\vert\cdot\vert$ denotes the cardinality of a set. On the average, without use of active information, all queries will have this same probability of success.

For a single query, the available endogenous information is Formula TeX Source Endogenous information is independent of the search algorithm used to find the target [39], [46]. To increase the probability of a search success, knowledge concerning the target location or search-space structure must be prescribed. Let Formula$q$ denote the probability of success of a query by incorporating into the search problem-specific information. The difficulty of the problem has changed and is characterized by the exogenous information Formula TeX Source

If the problem-specific information incorporated into a search accurately reflects the target location, the probability of success of a query will increase. Thus, Formula$q > p$ and the corresponding exogenous information will be smaller than the endogenous information (i.e., Formula$I_{\Omega} > I_{S}$). The difference between these two information measures we call the active information. It is conveniently represented as Formula TeX Source This definition satisfies important boundary criteria that are consistent with properties we expect of problem-specific information.

  1. For a perfect search, Formula$q = 1$, and the active information is Formula$I_{+} = I_{\Omega}$. The perfect search therefore extracts all available information from the search space.
  2. If there is no improvement, the probability of success is the same as that of a random query and Formula$q = p$. The active information in (4) is appropriately zero, and no knowledge of the target location or search-space structure has been successfully incorporated into the search.
  3. If the active information degrades performance, then we can have Formula$q < p$ in which case the active information is negative.

As with other log-ratio measures, such as decibels, the active information measure is taken with respect to a reference or baseline, in this case the chance of success of a single random query Formula$p$. Other baselines can be used.

Single queries generalize directly to multiple queries or samples. Obtaining one or more sixes in four or less rolls of a die, for example, can be considered a target Formula$T$ in a fourfold Cartesian product of the sample space of a single die. Thus, the sequence of four queries can be considered as a single query in this larger search space. References to a single query can therefore correspond to a number of experiments rather than a single measurement.

Active information captures numerically the problem-specific information that assists a search in reaching a target. We therefore find it convenient to use the term “active information” in two equivalent ways:

  1. as the specific information about target location and search-space structure incorporated into a search algorithm that guides a search to a solution;
  2. as the numerical measure for this information and defined as the difference between endogenous and exogenous information.

Search often requires numerous queries to achieve success. If Formula$Q$ queries are required to extract all of the available endogenous information, then the average active information per query is Formula TeX Source



We next illustrate how active information can be measured and offer, as examples, the active information for search using the common tools of evolutionary search in commonly used search algorithms, including repeated queries, importance sampling, partitioned search, evolution strategy, and use of random mutation. In each case, the source of active information is identified and is then measured.

A. Repeated Queries

Multiple queries clearly contain more information than a single query. Active information is therefore introduced from repeated queries. Assuming uniformity, the probability of at least one success in Formula$Q$ queries without replacement is Formula TeX Source The NFLT states that, without added information, the probability of success on the average will be the same no matter what procedure is used to perform the Formula$Q$ queries. The NFLT assumes sampling without replacement [46]. For Formula$p \ll 1$ and Formula$Q \ll \vert\Omega\vert$, sampling with and without replacement are nearly identical, and we can write to a good approximation Formula TeX Source For Formula$pQ \ll 1$, we can approximate Formula$1 - (1 - p)^{Q} \approx pQ$. Then, for Formula$q = p_{wo}$ in (4), we obtain the interesting result Formula TeX Source

When the stated assumptions apply, the active information from multiple queries is therefore independent of both the size of the sample space and the probability of success. The repeated random queries also provide diminishing returns. The active information from two queries is 1 b. The active information from 1024 queries is only 1 b more than that from 512.

B. Subset Search

A simple example of active information, due to Brillouin [5], occurs when the target is known to reside in some subset Formula$\Omega^{\prime} \subset \Omega$. In subset search, we know that Formula$T \subset \Omega^{\prime} \subset \Omega$ and Formula$q = \vert T\vert/ \vert\Omega^{\prime}\vert$. Active information is therefore introduced by restricting the search to a smaller number of possibilities. The Brillouin active information is Formula TeX Source

Subset search is a special case of importance sampling.

C. Importance Sampling

The simple idea of importance sampling [42] is to query more frequently near to the target. As shown in Fig. 1, active information is introduced by variation of the distribution of the search space to one where more probability mass is concentrated about the target.

Figure 1
Fig. 1. Importance sampling imposes a probability structure on Formula$\Omega$ where more probability mass is focused around the target Formula$T$. The probability of a successful search increases, and active information is introduced. Without importance sampling, the distribution over Formula$\Omega$ is uniform. Note that the placement of the distribution about Formula$T$ must be accurate. If placed at a different location in Formula$\Omega$, the distribution over Formula$T$ can dip below uniform. The active information is then negative.

Our use of importance sampling is not conventional. The procedure is typically used to determine expected values using Monte Carlo simulation using fewer randomly generated queries by focusing attention on queries closer to the target. We are interested, rather, in locating a single point in Formula$T$. The probability of doing so is Formula TeX Source where Formula$h(\vec{x})$ denotes the probability mass function with the outer product image Formula$\{q_{1}\quad q_{2}\quad q_{3}\quad \cdots\quad q_{N}\}^{\times L}$. The active information follows from substituting (8) into (4).

Example 1

Let the target Formula$T$ be the first of Formula$\vert\Omega\vert = 8$ possible outcomes. Then, Formula$p = 1/8$ and Formula$I_{\Omega} = 3\ \hbox{b}$. If importance sampling is applied so that the probability of choosing the target doubles, then the resulting active information is Formula$I_{+} = 1\ \hbox{b}$. If importance sampling is inappropriately applied and the probability of choosing the target is half of Formula$p$, then the resulting active information is negative: Formula$I_{+} = -1\ \hbox{b}$.


Frequency-of-occurrence (FOO) search is applicable for searching for a sequence of Formula$L$ sequential elements using an alphabet of Formula$N$ characters. With no active information, a character must be assumed to occur at the same rate as any other character. Active information can be bettered by querying in accordance to the FOO of the characters [5], [41]. The endogenous information using a uniform distribution is Formula TeX Source A priori knowledge that certain characters occur more frequently than others can improve the search probability. In such cases, the average information per character reduces from Formula$\log_{2}N$ to the average information or entropy Formula TeX Source where Formula$p_{n}$ is the probability of choosing the Formula$n$th character.The active information per character is then 1 Formula TeX Source

If capitalization and punctuation are not used, English has an alphabet of size Formula$N = 27$ (26 letters and a space).Some letters, like Formula$E$, occur more frequently than others, like Formula$Z$. Use of FOO information reduces the entropy from Formula$\log_{2}(27) = 4.76\ \hbox{b}$ to an entropy of about Formula$H_{N} = 4.05\ \hbox{b}$ per character [5]. The active information per character is Formula TeX Source

DNA has an alphabet of length Formula$N = 4$ corresponding to the four nucleotide A, C, G, and T. When equiprobable, each nucleotide therefore corresponds to 2 b of information per nucleotide. The human genome has been compressed to a rate of 1.66 b per nucleotide [27] and corresponds to an active information per nucleotide of Formula TeX Source

Active FOO information can shrink the search space significantly. For Formula$L\gg 1$, the asymptotic equipartition property [9], [41] becomes applicable. In English, for example, if the probability of choosing an Formula$E$ is 10%, then, as Formula$L$ increases, the proportion of Formula$E$'s in the message will, due to the law of large numbers [33], approach 10% of the elements in the message. Let Formula$\ell_{n}$ be the number of elements in a typical message of length Formula$L$. Then, for large Formula$L$, we can approximate Formula$\ell_{n}\approx p_{n}L$ and Formula TeX Source

Signals wherein (14) is approximately true are called typical. The approximation becomes better as Formula$L$ increases. For a fixed Formula$L$, the number of typical signals can be significantly smaller than Formula$\vert\Omega \vert$, thereby decreasing the portion of the search space requiring examination.

To see this, let Formula$\omega \subset \Omega$ denote the set of typical signals corresponding to given FOO probabilities.The added FOO information reduces the size of the search space from Formula$\vert\Omega\vert = N^{L}$ to 2 Formula TeX Source where Formula$H_{N}$ is measured in bits. When search is performed using the FOO, the fractional reduction of size of the search space is specified by the active information. Specifically Formula TeX Source where Formula$I_{+}$ is measured in bits. Equation (16) is consistent with (7). For English, from (12), the reduction due to FOO information is Formula$\vert\omega\vert/\vert\Omega\vert = (2^{-0.709})^{L} = (0.612)^{L}$, a factor quickly approaching zero. From (13), the same is true for the nucleotide sequence for which the effective search-space size reduces by a proportion of Formula$\vert\omega \vert/\vert\Omega \vert = (2^{-0.340})^{L} = (0.975)^{L}$. In both cases, the effective reduced search space, though, will still be too large to practically search even for moderately large values of Formula$L$.

Yockey [50] offers another bioinformatic application. The amino acids used in protein construction by DNA number Formula$N = 20$. For a peptide sequence of length Formula$L$, there are a total of Formula$N^{L}$ possible proteins. For 1-iso-cytochrome c, there are Formula$L = 113$ peptides and Formula$\vert\Omega\vert = N^{L} = 20^{113} = 1.04\times 10^{147}$ possible combinations. Yockey shows, however, that active FOO information reduces the number of sequences over 36 orders of magnitude to Formula$6.42\times 10^{111}$ possibilities. The search for a specific sequence using this FOO structure supplies about a quarter Formula$(I_{+} = 119\ \hbox{b})$ of the required Formula$(I_{\Omega} = 491\ \hbox{b})$ information. The effective search-space size is reduced by a factor of Formula$\vert\omega\vert/\vert\Omega\vert = 2^{-119} = 1.50 \times 10^{-36}$. The reduced size space is still enormous.

Active information must be accurate. The asymptotic equipartition property will work only if the target sought obeys the assigned FOO. As Formula$L$ increases, the match between the target and FOO table must become closer and closer. The boundaries of Formula$\omega$ quickly harden, and any deviation of the target from the FOO will prohibit in finding the target. An extreme example is the curious novel Gadsby [48] which contains no letter Formula$E$. Imposing a 10% FOO for an Formula$E$ for even a short passage in Gadsby will nosedive the probability of success to near zero. Negative active information can be fatal to a search.

1) Searching a Type Class

More restrictive than active FOO information is a message of length Formula$L$ where the exact count of each character is known but their order is not.With this information about the target, the search space shrinks even further to the space Formula$\varpi \subset \omega \subset \Omega$. The cardinality of this type class is bounded [9]. Formula TeX Source From this inequality, we can bind the active information. From (7) Formula TeX Source where we have used (9).

E. Partitioned Search

Partitioned search [12] is a “divide and conquer” procedure best introduced by example. Consider the Formula$L = 28$ character phrase Formula TeX Source Suppose that the result of our first query of Formula$L = 28$ characters is Formula TeX Source Two of the letters Formula$\{{\bf E}, {\bf S}\}$ are in the correct position. They are shown in a bold font. In partitioned search, our search for these letters is finished. For the incorrect letters, we select 26 new letters and obtain Formula TeX Source Five new letters are found, bringing the cumulative tally of discovered characters to Formula$\{{\bf T}, {\bf S}, {\bf E}, \ast, {\bf E}, {\bf S}, {\bf L}\}$. All seven characters are ratcheted into place. The 19 new letters are chosen, and the process is repeated until the entire target phrase is found.

Assuming uniformity, the probability of successfully identifying a specified letter with sample replacement at least once in Formula$Q$ queries is Formula$1 - (1 - 1/N)^{Q}$, and the probability of identifying all Formula$L$ characters in Formula$Q$ queries is Formula TeX Source

For the alternate search using purely random queries of the entire phrase, a sequence of Formula$L$ letters is chosen. The result is either a success and matches the target phrase, or does not. If there is no match, a completely new sequence of letters is chosen. To compare partitioned search to purely random queries, we can rewrite (5) as Formula TeX Source

For Formula$L = 28$ and Formula$N = 27$ and moderate values of Formula$Q$, we have Formula$p \ll q$ corresponding to a large contribution of active information. The active information is due to knowledge of partial solutions of the target phrase. Without this knowledge, the entire phrase is tagged as “wrong” even if it differs from the target by one character.

The enormous amount of active information provided by partitioned search is transparently evident when the alphabet is binary. Then, independent of Formula$L$, convergence can always be performed in two steps. From the first query, the correct and incorrect bits are identified. The incorrect bits are then complemented to arrive at the correct solution. Generalizing to an alphabet of Formula$N$ characters, a phrase of arbitrary length Formula$L$ can always be identified in, at most, Formula$N - 1$ queries. The first character is offered, and the matching characters in the phrase are identified and frozen in place. The second character is offered, and the process is repeated. After repeating the procedure Formula$N - 1$ times, any phrase characters not yet identified must be the last untested element in the alphabet.

Partitioned search can be applied at different granularities. We can, for example, apply partitioned search to entire words rather than individual letters. Let there be Formula$W$ words each with Formula$L/W$ characters each. Then, partitioned search probability of success after Formula$Q$ queries is Formula TeX Source Equations (22) and (23) are special cases for Formula$W = L$ and Formula$W = 1$. If Formula$N^{-L/W} \ll 1$, we can make the approximation Formula$p_{W}\approx Q^{W}N^{-L}$ from which it follows that the active information is Formula TeX Source With reference to (6), the active information is that of Formula$W$ individual searches: one for each word.

F. Random Mutation

In random mutation, the active information comes from the following sources.

  1. Choosing the most fit among mutated possibilities. The active information comes from knowledge of the fitness.
  2. As is the case of prolonged random search, in the sheer number of offspring. In the extreme, if the number of offspring is equal to the cardinality of the search space and all different, a successful search can be guaranteed.

We now offer examples of measuring the active information for these sources of mutation-based search procedures.

1) Choosing the Fittest of a Number of Mutated Offspring

In evolutionary search, a large number of offspring is often generated, and the more fit offspring are selected for the next generation. When some offspring are correctly announced as more fit than others, external knowledge is being applied to the search, giving rise to active information. As with the child's game of finding a hidden object, we are being told, with respect to the solution, whether we are getting “colder” or “warmer” to the target.

Consider the special case where a single parent gives rise to Formula$\Phi$ children. The single child with the best fitness is chosen and used as the parent for the next generation. If there is even a small chance of improvement by mutation, the number of children can always be chosen to place the chance of improvement arbitrarily close to one. To show this, let Formula$\Delta\kappa$ be the improvement in fitness. Let the cumulative distribution of Formula$\Delta\kappa$ be Formula$F_{\Delta \kappa}(x) = \Pr[\Delta\kappa \leq x]$. Requiring there be some chance that a single child will have a better fitness is assured by requiring Formula$\Pr[\Delta\kappa > 0] > 0$ or, equivalently, Formula$F_{\Delta\kappa}(0) < 1.$ Generate Formula$\Phi$ children, and choose the single fittest. The resulting change of fitness will be Formula$\Delta\kappa_{\max} = \mathop{\max} \limits_{1\leq \varphi \leq \Phi}\Delta\kappa_{\varphi}$, where Formula$\Delta\kappa_{\varphi}$ is the change in fitness of child Formula$\varphi$. It follows that Formula$F_{\Delta\kappa_{\max}}(x) = F_{\Delta \kappa}^{\Phi}(x)$ so that the probability the fitness increases is Formula TeX Source Since Formula$F_{\Delta\kappa}(0) < 1$, this probability can be placed arbitrarily close to one by choosing a large enough number of children Formula$\Phi$. If we define success of a single generation as better fitness, the active information of having Formula$\Phi$ children as opposed to one is Formula TeX Source

2) Optimization by Mutation

Optimization by mutation, a type of stochastic search, discards mutations with low fitness and propagates those with high fitness. To illustrate, consider a single-parent version of a simple mutation algorithm analyzed by McKay [32]. Assume a binary Formula$(N = 2)$ alphabet and a phrase of Formula$L$ bits. The first query consists of Formula$L$ randomly selected binary digits. We mutate the parent with a bit-flip probability of Formula$\mu$ and form two children. The fittest child, as measured by the Hamming distance from the target, serves as the next parent, and the process is repeated. Choosing the fittest child is the source of the active information. When Formula$\mu \ll 1$, the search is a simple Markov birth process [34] that converges to the target.

To show this, let Formula$k[n]$ be the number of correct bits on the Formula$n$th iteration. For an initialization of Formula$k_{0}$, we show in the Appendix that the expected value of this iteration is Formula TeX Source where the overbar denotes expectation. Example simulations are shown in Fig. 2.

Figure 2
Fig. 2. Twenty-five emulations of optimization using simple mutation as a function of the number of generations. The bold line is (28) with Formula$k_{0} = 0$. One parent births two children, the fittest of which is kept. There are cases where the fitness decreases, but they are rare. Parameters are Formula$L = 100$ and Formula$\mu = 0.00005$.

The endogenous information of the search is Formula$I_{\Omega} = L\ \hbox{bits}$. Let us estimate that this information is achieved by a perfect search in Formula$G$ generations when Formula$\overline{k[G]} = \alpha L$ and Formula$\alpha$ is very close to one. Assume Formula$k_{0} = \beta L$. Then Formula TeX Source A reasonable assumption is that half of the initially chosen random bits are correct, and Formula$\beta = 1/2$. Set Formula$\alpha = (L - (1/2))/L$. Then, since the number of queries is Formula$Q = 2G$, the number of queries for a perfect search Formula$(I_{+} = I_{\Omega})$ is Formula TeX Source The average active information per query is thus Formula TeX Source or, since Formula$\ln(1 - 2\mu) \simeq -2 \mu$ for Formula$\mu \ll 1$ Formula TeX Source For the example shown in Fig. 2, Formula$I_{\oplus} \approx 0.0022\ \hbox{b}$ per query.

3) Optimization by Mutation With Elitism

Optimization by mutation with elitism is the same as optimization by mutation in Section III-F2 with the following change. One mutated child is generated. If the child is better than the parent, it replaces the parent. If not, the child dies, and the parent tries again. Typically, this process gives birth to numerous offspring, but we will focus attention on the case of a single child. We will also assume that there is a single bit flip per generation.

For Formula$L$ bits, assume that there are Formula$k$ matches to the target. The chance of improving the fitness is a Bernoulli trial with probability Formula$(N - k)/N$. The number of Bernoulli trials before a success is geometric random variable equal to the reciprocal Formula$N/(N - k)$. We can use this property to map the convergence of optimization by mutation with elitism. For discussion's sake, assume we start with an estimate that is opposite of the target at every bit. The initial Hamming distance between the parent and the target is thus Formula$L$. The chance of an improvement on the first flip is one, and the Hamming distance decreases to Formula$L - 1$. The chance of the second bit flip to improve the fitness is Formula$L/(L - 1)$. We expect Formula$L/(L - 1)$ flips before the next improvements. The sum of the expected values to achieve a fitness of Formula$L - 2$ is therefore Formula TeX Source Likewise Formula TeX Source or, in general Formula TeX Source It follows that Formula TeX Source where 3 Formula TeX Source is the Formula$L$th harmonic number [3]. Simulations are shown in Fig. 3, and a comparison of their average with (30) is in Fig. 4. For Formula$\ell = L$, we have a perfect search and Formula TeX Source Since the endogenous information is Formula$L$ bits, the active information per query is therefore Formula TeX Source This is shown in Fig. 5.

Figure 3
Fig. 3. One thousand implementations of optimization by mutation with elitism for Formula$L = 131\ \hbox{b}$.
Figure 4
Fig. 4. Average of the curves shown in Fig. 3 superimposed on the stair-step curve from (30). The two are nearly graphically indistinguishable.
Figure 5
Fig. 5. Formula$1/{\cal H}_{L}$ is the active information per query for optimization by mutation with elitism for a bit string of duration Formula$L$. A plot of Formula$1/{\cal H}_{L}$ versus Formula$L$ is shown here. The asymptote follows from Formula${\cal H}_{L} \longrightarrow \ln(L) + \gamma$ as Formula$L\rightarrow \infty$ where Formula$\gamma = 0.5772156649\ldots$ is the Euler–Mascheroni constant.

G. Stair-Step Search

By stair-step search, we mean building on more probable search results to achieve a more difficult search. Like partitioned search, active information is introduced through knowledge of partial solutions of the target.

We give an example of stair-step search using FOO. Consider the following sequence of Formula$K = 3$ phrases:

  1. STONE_;
  2. TEN_TOES_;

The first phrase establishes the characters used. The second establishes the FOO of the characters. The third has the same character FOO as the second but is longer. From an Formula$N = 27$ letter alphabet, the final phrase contains Formula$I_{\Omega} = 36\log_{2}27 = 171\ \hbox{b}$ of endogenous information. To find this phrase, we will first search for phrase 1 until a success is achieved. Then, using the FOO of phrase 1, search for phrase 2. The FOO of phrase 2 is used to search for phrase 3.

Generally, let Formula$p_{n}[k]$ be the probability of the Formula$n$th character in phrase Formula$k$ and let the probability of achieving success for the all of the first Formula$k$ phrases be Formula$P_{k} = \Pr[k, k - 1, k - 2, \ldots, 1]$. Then Formula TeX Source where the conditional probability is Formula$P_{k\vert k - 1} = \Pr[k\vert k - 1, k - 2, \ldots, 1]$. The process is first-order Markov and can be written as Formula TeX Source where Formula$\ell_{n}[k] = L_{k}p_{n}[k]$ is the occurrence count of the Formula$n$th letter in the Formula$k$th phrase and Formula$L_{k}$ is the total number of letters in the Formula$k$th phrase.Define the information required to find the Formula$k$th phrase by Formula$I_{k} = -\log(P_{k})$. From (31) and (32), it follows that Formula TeX Source where the cross entropy between phrase Formula$k$ and Formula$k - 1$ is [32] Formula TeX Source

The solution to the difference equation in (33) is Formula TeX Source where Formula$H(1, 0): = H(1)$. The active information follows as Formula TeX Source

Use of the Formula$K = 3$ phrase sequence in our example yields Formula$I_{K} = 142\ \hbox{b}$ corresponding to an overall active information of Formula$I_{+} = 29\ \hbox{b}$. Interestingly, we do better by omitting the second phrase and going directly from the first to the third phrase. The active information increases to Formula$I_{+} = 50\ \hbox{b}$.4 Each stair step costs information so that a step's contribution to the solution must be weighed against the cost.



Christensen and Oppacher [7] note the “sometimes-outrageous claims that had been made of specific optimization algorithms.” Their concern is well founded. In computer simulations of evolutionary search, researchers often construct a complicated computational software environment and then evolve a group of agents in that environment. When subjected to rounds of selection and variation, the agents can demonstrate remarkable success at resolving the problem in question. Often, the claim is made, or implied, that the search algorithm deserves full credit for this remarkable success. Such claims, however, are often made as follows: 1) without numerically or analytically assessing the endogenous information that gauges the difficulty of the problem to be solved and 2) without acknowledging, much less estimating, the active information that is folded into the simulation for the search to reach a solution.

A. Monkey at a Typewriter

A “monkey at a typewriter” is often used to illustrate the viability of random evolutionary search. It also illustrates the need for active information in even modestly sized searches. Consider the production of a specified phrase by randomly selecting an English alphabet of Formula$N = 27$ characters (26 letters and a space). For uniform probability, the chance of randomly generating a message of length Formula$L$ is Formula$p = N^{-L}$. The chances are famously small. Presentation of each string of Formula$L$ characters requires Formula$-\log_{2}(p)\ \hbox{b}$ of information. The expected number of repeated Bernoulli trials (with replacement) before achieving a success is a geometric random variable with mean Formula TeX Source Thus, in the search for a specific phrase, we would expect to use Formula TeX Source Astonishingly, for Formula$N = 27$, searching for the Formula$L = 28$ character message in (19) with no active information requires on average Formula$B = 1.59 \times 10^{42}\ \hbox{b}$—more than the mass of 800 million suns in grams [11]. For Formula$B = 10^{100}\ \hbox{b}$, more than there are atoms in the universe, solving the transcendental equation in (38) reveals a search capacity for a phrase of a length of only Formula$L = 68$ characters. Even in a multiverse of Formula$10^{1000}$ universes the same size as ours, a search can be performed for only Formula$L = 766$ characters. A plot of the number of bits Formula$B$ versus phrase length Formula$L$ is shown in Fig. 6 for Formula$N = 27$.

Figure 6
Fig. 6. For Formula$N = 27$, the number of bits Formula$B$ expected for a search of a phrase of length Formula$L$. From (38), as Formula$L \rightarrow \infty$, we have the asymptote Formula$\log(B) \rightarrow L \log N$. This explains the linear appearance of this plot.

Active information is clearly required in even modestly sized searches.

Moore's law will not soon overcome these limitations. If we can perform an exhaustive search for a Formula$B$ bit target in 1 h, then, if the computing speed doubles, we will be able to search for Formula$B + 1\ \hbox{b}$ in the same time period. In the faster mode, we search for Formula$B$ bits when the new bit is one, and then for Formula$B$ more bits when the new bit is zero. Each doubling of computer speed therefore increases our search capabilities in a fixed period of time by only 1 b. Likewise, for Formula$P$ parallel evaluations, we are able to add only Formula$\log_{2}P\ \hbox{b}$ to the search. Thus, 1024 machines operating in parallel adds only 10 b. Quantum computing can reduce search by a square root [17], [24]. The reduction can be significant but is still not sufficient to allow even moderately sized unassisted searches.



Endogenous information represents the inherent difficulty of a search problem in relation to a random-search baseline. If any search algorithm is to perform better than random search, active information must be resident. If the active information is inaccurate (negative), the search can perform worse than random. Computers, despite their speed in performing queries, are thus, in the absence of active information, inadequate for resolving even moderately sized search problems. Accordingly, attempts to characterize evolutionary algorithms as creators of novel information are inappropriate. To have integrity, search algorithms, particularly computer simulations of evolutionary search, should explicitly state as follows: 1) a numerical measure of the difficulty of the problem to be solved, i.e., the endogenous information, and 2) a numerical measure of the amount of problem-specific information resident in the search algorithm, i.e., the active information.


Derivation of (28) in Section III-F2.

Since Formula$\mu \ll 1$, terms of order Formula$\mu^{2}$ and higher are discarded. Let Formula$Z_{k[n]}[n]$ be the number of correct bits that become incorrect on the Formula$n$th iteration. We then have the conditional probability Formula TeX Source Similarly, let Formula$O[n]$ be the number of incorrect bits that become correct on the Formula$n$th iteration. Then Formula TeX Source For a given Formula$k[n]$, Formula$O[n]$ and Formula$Z[n]$ are independent Bernoulli random variables. The total chance in fitness is Formula TeX Source

Since Formula$\mu \ll 1$, the random variable Formula$\Delta k[n]$ can take on values of (−1, 0, 1). The parent is mutated twice, and two mutated children are birthed with fitness changes Formula$\Delta k_{1}[n]$ and Formula$\Delta k_{2}[n]$. The child with the highest fitness is kept as the parent for the next generation which has a fitness of Formula TeX Source where Formula$\Delta \varkappa[n] = \max(\Delta k_{1}[n], \Delta k_{2}[n])$. It then follows that Formula TeX Source where we have used a more compact notation, e.g., Formula$k[n] = k$. Ridding the equation of Formula$\mu^{2}$ terms gives Formula TeX Source Thus, for Formula$\mu \ll 1$, there is a near-zero probability of generating a lower fitness, since doing so requires both offspring to be of a lower fitness and has a probability on the order of Formula$\mu^{2}$. The mutation search is then recognized as a Markov birth process.

The conditional expected value of (44) is Formula$E[\Delta\varkappa\vert k] = 2(L - k)\mu$. We can then take the expectation of (42) and write Formula$\overline{k[n + 1]} = \overline{k[n]} + 2(L - \overline{k[n]})\mu$ or, equivalently, Formula$\overline{k[n + 1]} = (1 - 2\mu)\overline{k[n]} + 2L\mu$, where the overline denotes expectation. The solution of this first-order difference equation is (28).


This paper was recommended by Associate Editor Y. Wang.

W. A. Dembski is with the Department of Philosophy, Southwestern Baptist Theological Seminary, Fort Worth, TX 76115 USA (e-mail: wdembski@designinference.com).

R. J. Marks II is with the Department of Electrical and Computer Engineering, Baylor University, Waco, TX 76798 USA (e-mail: robert_marks@baylor.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

1 This expression is recognized as the Kullback–Leibler distance between the structured search space and the uniform distribution [9].

2 The derivation is standard [9], [41], [50]. The number of ways Formula$\{\ell_{n}\vert1\leq n\leq N\}$ objects can be arranged is given by the multinomial coefficient Formula$\vert\omega \vert = L!/\prod_{n = 1}^{N}\ell_{n}!$. For large Formula$M$, from Stirling's formula, Formula$\ln M!\rightarrow M$ Formula$\ln M$ and Formula$\vert\omega \vert\approx e^{L\ln L} \exp(\prod_{n = 1}^{N}\ell_{n}\ln \ell_{n})$. Applying (14) gives Formula$\vert\omega \vert\approx e^{LH_{N}}$ where the entropy is measured in nats. A nat is the unit of information when a natural log is used, for example, in (2). When measured in bits Formula$(\log_{2})$, we obtain (15).

3The conventional notation for the Formula$L$th harmonic number is Formula$H_{L}$. We use Formula${\cal H}_{L}$ to avoid any confusion with the symbol for entropy.

4Deleting the first phrase also increases the active information—but by not as much. If the second phrase is searched using a uniform prior, then the added information for finding the third phrase is Formula$I_{+} = 38\ \hbox{b}$.


1. H. M. Abbas and M. M. Bayoumi

"Volterra-system identification using adaptive real-coded genetic algorithm"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 36, no. 4, pp. 671-684, 2006

2. S. Agrawal , Y. Dashora , M. K. Tiwari and Y.-J. Son

"Interactive particle swarm: A pareto-adaptive metaheuristic to multiobjective optimization"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 258-277, 2008

3. M. Aigner

Discrete Mathematics

2007, Amer. Math. Soc.

4. J. Bernoulli

"Ars conjectandi (the art of conjecturing)"

Tractatus De Seriebus Infinitis, 1713

5. L. Brillouin

Science and Information Theory

1956, Academic

6. J. P. Burg

Maximum entropy spectral analysis


7. S. Christensen and F. Oppacher

"What can we learn from no free lunch? A first attempt to characterize the concept of a searchable"

Proc. Genetic Evol. Comput., pp. 1219-1226, 2001

8. T.-Y. Chou , T.-K. Liu , C.-N. Lee and C.-R. Jeng

"Method of inequality-based multiobjective genetic algorithm for domestic daily aircraft routing"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 299-308, 2008

9. T. M. Cover and J. A. Thomas

Elements of Information Theory

2006, Wiley-Interscience

10. J. C. Culberson

"On the futility of blind search: An algorithmic view of \'no free lunch\'"

Evol. Comput., vol. 6, no. 2, pp. 109-127, 1998

11. J. D. Cutnell and J. W. Kenneth


1995, Wiley

12. R. Dawkins

The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe Without Design

1996, Norton

13. S. Droste , T. Jansen and I. Wegener

"Perhaps not a free lunch but at least a free appetizer"

Proc. 1st GECCO, pp. 833-839,

14. R. C. Eberhart , Y. Shi and J. Kennedy

Swarm Intelligence

2001, Morgan Kaufmann

15. T. M. English

"Some information theoretic results on evolutionary optimization"

Proc. CEC, vol. 1, p. 795, 1999

16. M. S. Fadali , Y. Zhang and S. J. Louis

"Robust stability analysis of discrete-time systems using genetic algorithms"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 29, no. 5, pp. 503-508, 1999

17. L. K. Grover

"A fast quantum mechanical algorithm for data search"

Proc. ACM Symp. Theory Comput., pp. 212-219, 1996

18. P. Guturu and R. Dantu

"An impatient evolutionary algorithm with probabilistic tabu search for unified solution of some NP-hard problems in graph and set theory via clique finding"

IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 3, pp. 645-666, 2008

19. T. D. Gwiazda

Genetic Algorithms Reference

2006, Tomasz Gwiazda

20. J. S. Gyorfi and C.-H. Wu

"An efficient algorithm for placement sequence and feeder assignment problems with multiple placement-nozzles and independent link evaluation"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 437-442, 2008

21. M. J. Healy

"Personal Communication"

The Boeing Company

22. S.-Y. Ho , H.-S. Lin , W.-H. Liauh and S.-J. Ho

"OPSO: Orthogonal particle swarm optimization and its application to task assignment problems"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 288-298, 2008

23. Y.-C. Ho and D. L. Pepyne

"Simple explanation of the no free lunch theorem"

Proc. 40th IEEE Conf. Decision Control, pp. 4409-4414, 2001

24. Y.-C. Ho , Q.-C. Zhao and D. L. Pepyne

"The no free lunch theorems: Complexity and security"

IEEE Trans. Autom. Control, vol. 48, no. 5, pp. 783-793, 2003

25. C. A. Jensen , R. D. Reed , R. J. Marks II, M. A. El-Sharkawi , J.-B. Jung , R. T. Miyamoto , G. M. Anderson and C. J. Eggen

"Inversion of feedforward neural networks: Algorithms and applications"

Proc. IEEE, vol. 87, no. 9, pp. 1536-1549, 1999

26. M. Koppen , D. H. Wolpert and W. G. Macready

"Remarks on a recent paper on the \'no free lunch\' theorems"

IEEE Trans. Evol. Comput., vol. 5, no. 3, pp. 295-296, 2001

27. G. Korodi , I. Tabus , J. Rissanen and J. Astola

"DNA sequence compression"

IEEE Signal Process. Mag., vol. 47, no. 1, pp. 47-53, 2007

28. C.-Y. Lee

"Entropy—Boltzmann selection in the genetic algorithms"

IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 33, no. 1, pp. 138-149, 2003

29. R. E. Lenski , C. Ofria , R. T. Pennock and C. Adami

"The evolutionary origin of complex features"

Nature, vol. 423, no. 6936, pp. 139-144, 2003

30. Y. Li and C. O. Wilke

"Digital evolution in time-dependent fitness landscapes"

Artif. Life, vol. 10, no. 2, pp. 123-134, 2004

31. B. Liu , L. Wang and Y.-H. Jin

"An effective PSO-based memetic algorithm for flow shop scheduling"

IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 1, pp. 18-27, 2007

32. D. J. C. MacKay

Information Theory, Inference and Learning Algorithms

2002, Cambridge Univ. Press

33. R. J. Marks II

Handbook of Fourier Analysis and Its Application

2009, Oxford Univ. Press

34. A. Papoulis

Probability, Random Variables, and Stochastic Processes

pp. 537-542, 1991, McGraw-Hill

35. U. S. Putro , K. Kijima and S. Takahashi

"Adaptive learning of hypergame situations using a genetic algorithm"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 30, no. 5, pp. 562-572, 2000

36. R. D. Reed and R. J. Marks II

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks

1999, MIT Press

37. K. Takita and Y. Kakazu

"Automatic agent design based on gate growth-application to wall following problem"

Proc. 37th Int. Session Papers SICE Annu. Conf., pp. 863-868, 1998

38. F. Rojas , C. G. Puntonet , M. Rodriguez-Alvarez , I. Rojas and R. Martin-Clemente

"Blind source separation in post-nonlinear mixtures using competitive learning, Simulated annealing, and a genetic algorithm"

IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 34, no. 4, pp. 407-416, 2004

39. C. Schaffer H. Hirsh and W. W. Cohen

"A conservation law for generalization performance"

Proc. 11th Int. Conf. Mach. Learn., pp. 259-265, 1994

40. T. D. Schneider

"Evolution of biological information"

Nucleic Acids Res., vol. 28, no. 14, pp. 2794-2799, 2000

41. C. E. Shannon

"A mathematical theory of communication"

Bell Syst. Tech. J., vol. 27, pp. 379-423, 1948

42. R. Srinivasan

Importance Sampling

2002, Springer-Verlag

43. B. Weinberg and E. G. Talbi

"NFL theorem is unusable on structured classes of problems"

Proc. CEC, vol. 1, pp. 220-226, 2004

44. A. M. Turing

"On computable numbers with an application to the entscheidungs problem"

Proc. Lond. Math. Soc. Ser. 2, vol. 42, pp. 230-265, 1936

45. G. S. Tewolde and W. Sheng

"Robot path integration in manufacturing processes: Genetic algorithm versus ant colony optimization"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 278-287, 2008

46. D. Wolpert and W. G. Macready

"No free lunch theorems for optimization"

IEEE Trans. Evol. Comput., vol. 1, no. 1, pp. 67-82, 1997

47. D. H. Wolpert and W. G. Macready

"Coevolutionary free lunches"

IEEE Trans. Evol. Comput., vol. 9, no. 6, pp. 721-735, 2005

48. E. V. Wright


1997, Lightyear Press

49. Y.-C. Xu and R.-B. Xiao

"Solving the identifying code problem by a genetic algorithm"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 37, no. 1, pp. 41-46, 2007

50. H. P. Yockey

Information Theory, Evolution, and the Origin of Life

2005, Cambridge Univ. Press

51. F. Yu , F. Tu and K. R. Pattipati

"A novel congruent organizational design methodology using group technology and a nested genetic algorithm"

IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 36, no. 1, pp. 5-18, 2006


William A. Dembski

William A. Dembski

William A. Dembski (M'07–SM'07) received the B.A. degree in psychology, the M.S. degree in statistics, the Ph.D. degree in philosophy, and the Ph.D. degree in mathematics in 1988 from the University of Chicago, Chicago, IL, and the M.Div. degree from Princeton Theological Seminary, Princeton, NJ, in 1996.

He was an Associate Research Professor with the Conceptual Foundations of Science, Baylor University, Waco, TX, where he also headed the first intelligent design think-tank at a major research university: The Michael Polanyi Center. He was the Carl F. H. Henry Professor in theology and science with The Southern Baptist Theological Seminary, Louisville, KY, where he founded its Center for Theology and Science. He has taught at Northwestern University, Evanston, IL, the University of Notre Dame, Notre Dame, IN, and the University of Dallas, Irving, TX. He has done postdoctoral work in mathematics with the Massachusetts Institute of Technology, Cambridge, in physics with the University of Chicago, and in computer science with Princeton University, Princeton, NJ. He is a Mathematician and Philosopher. He is currently a Research Professor in philosophy with the Department of Philosophy, Southwestern Baptist Theological Seminary, Fort Worth, TX. He is currently also a Senior Fellow with the Center for Science and Culture, Discovery Institute, Seattle, WA. He has held National Science Foundation graduate and postdoctoral fellowships. He has published articles in mathematics, philosophy, and theology journals and is the author/editor of more than a dozen books. In The Design Inference: Eliminating Chance Through Small Probabilities (Cambridge University Press, 1998), he examines the design argument in a post-Darwinian context and analyzes the connections linking chance, probability, and intelligent causation. The sequel to The Design Inference (Rowman & Littlefield, 2002) critiques Darwinian and other naturalistic accounts of evolution. It is titled No Free Lunch: Why Specified Complexity Cannot Be Purchased Without Intelligence. He has edited several influential anthologies, including Uncommon Dissent: Intellectuals Who Find Darwinism Unconvincing (ISI, 2004) and Debating Design: From Darwin to DNA (Cambridge University Press, 2004, coedited with M. Ruse). His newest book, coauthored with J. Wells, is titled The Design of Life: Discovering Signs of Intelligence in Biological Systems (Foundation for Thought and Ethics, 2007). His area of interest in intelligent design has grown in the wider culture; he has assumed the role of public intellectual. In addition to lecturing around the world at colleges and universities, he is frequently interviewed on the radio and television. His work has been cited in numerous newspaper and magazine articles, including three front-page stories in The New York Times as well as the August 15, 2005 Time Magazine cover story on intelligent design. He has appeared on the BBC, NPR (Diane Rehm, etc.), PBS (Inside the Law with Jack Ford and Uncommon Knowledge with Peter Robinson), CSPAN2, CNN, Fox News, ABC Nightline, and the Daily Show with Jon Stewart.

Robert J. Marks II

Robert J. Marks II

Robert J. Marks II (S'71–M'72–SM'83–F'94) is currently the Distinguished Professor of Engineering at the Department of Engineering, Baylor University, Waco, TX. He served for 17 years as the Faculty Advisor with the Campus Crusade for Christ, University of Washington chapter. He is currently a Distinguished Professor of engineering with the Department of Electrical and Computer Engineering, Baylor University, Waco, TX. His consulting activities include Microsoft Corporation, Pacific Gas & Electric, and Boeing Computer Services. Eleven of his papers have been republished in collections of seminal works. He is the author, coauthor, Editor, or Coeditor of eight books published by MIT Press, IEEE, and Springer–Verlag. His most recent text is Handbook of Fourier Analysis and Its Applications (Oxford University Press, 2009). His research has been funded by organizations such as the National Science Foundation, General Electric, Southern California Edison, Electric Power Research Institute, the Air Force Office of Scientific Research, the Office of Naval Research, the Whitaker Foundation, Boeing Defense, the National Institutes of Health, The Jet Propulsion Laboratory, the Army Research Office, and the National Aeronautics and Space Administration (NASA).

Dr. Marks is Fellow of the Optical Society of America. He was given the honorary title of Charter President of the IEEE Neural Networks Council. He is a former Editor-in-Chief of the IEEE TRANSACTIONS ON NEURAL NETWORKS. He was the recipient of numerous professional awards, including a NASA Tech Brief Award and a Best Paper Award from the American Brachytherapy Society for prostate-cancer research. He was the recipient of the IEEE Outstanding Branch Councilor Award, the IEEE Centennial Medal, the IEEE Computational Intelligence Society Meritorious Service Award, the IEEE Circuits and Systems Society Golden Jubilee Award, and, for 2007, the IEEE CIS Chapter of the IEEE Dallas Section Volunteer of the Year Award. He was named a Distinguished Young Alumnus of the Rose-Hulman Institute of Technology and is an inductee into the Texas Tech Electrical Engineering Academy. He was the recipient of the Banned Item of the Year Award from the Discovery Institute and a recognition crystal from the International Joint Conference on Neural Networks for contributions to the field of neural networks (2007).

Cited By

Cited by IEEE

1. Evolutionary Inversion of Swarm Emergence Using Disjunctive Combs Control

Ewert, W., Yu, A., Thompson, B.B., Marks, R.J.

Systems, Man, and Cybernetics: Systems, IEEE Transactions on, vol. 43, no. 5, pp. 1063-1076, 2013

2. Bernoulli's principle of insufficient reason and conservation of information in computer search

Dembski, W.A., Marks, R.J., II

Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on, pp. 2647-2652, 2009

3. Evolutionary synthesis of nand logic: Dissecting a digital organism

Ewert, W., Marks, R.J., II, Dembski, W.A.

Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on, pp. 3047-3053, 2009

4. OWLPath: An OWL Ontology-Guided Query Editor

Valencia-Garcia, Rafael, Garcia-Sanchez, Francisco, Castellanos-Nieves, Dagoberto, Fernndez-Breis, Jesualdo Tom??s

IEEE Transactions on Systems Man and Cybernetics - Part A Systems and Humans, vol. 41, no. 1, p. 121, 2011

Cited by Other Publishers

1. The effects of low-impact mutations in digital organisms

Nelson, Chase W, Sanford, John C

Theoretical Biology and Medical Modelling, vol. 8, no. 1, p. 9, 2011

2. A robust search paradigm with Enhanced Vine Creeping Optimization

Young, C. N., Le Brese, C., Zou, J. J., Leo, C. J.

Engineering Optimization, p. 1, 2012

3. Arbitrary function optimisation with metaheuristics

García-Martínez, Carlos, Rodriguez, Francisco J., Lozano, Manuel

Soft Computing, vol. 16, no. 12, p. 2115, 2012


Kojonen, Erkki Vesa Rope

Zygon&#xae;, vol. 48, no. 2, p. 251, 2013