## Thursday, July 31, 2014

### Proving a theorem of Ewert, Marks, and Dembski

In what appears to be a forthcoming journal article, ID creationists Winston Ewert, William Dembski, and Robert J. Marks II state a formal result, and justify it merely by citing a three-page paper in the proceedings of a symposium. I was already annoyed with the reviewers and the editors for missing a huge defect, and decided to while away the sleepless hours by checking the putative proof in On the Improbability of Algorithmic Specified Complexity.

The formalism and the argument are atrocious. I eventually decided that it would be easier to reformulate what I thought the authors were trying to say, and to see if I could generate my own proof, than to penetrate the slop. It took me about 20 minutes, working directly in LaTeX. Then I decided to provide some explanation that is missing in the paper.

The theorem is correct. As Ewert, Marks, and Dembski put it, "The probability of obtaining an object exhibiting α$\alpha$ bits of [algorithmic specified complexity] is less than or equal to 2α$2^{-\alpha}$." It is something that they can establish a property like this. But algorithmic specified complexity is a sum of bits of Shannon self-information and bits of Kolmogorov complexity, which seem like apples and oranges to me.

I should mention that the result probably comes from Ewert's doctoral dissertation, Algorithmic Specified Complexity, still under wraps at Baylor. (I'm guessing that he refrained from plagiarism, this go around.) Evidently Dembski, a mathematician, did not edit the paper.

The following makes more sense if you read Sections I and II of the paper. Those of you with a bit of math under your belts will be amazed by the difference.

$\newcommand{\suppmu}{\mathrm{supp}(\mu)} \newcommand{\B}{\mathcal{B}} \newcommand{\X}{\mathcal{X}} \renewcommand{\S}{\mathcal{S}} \newcommand{\P}{\mathcal{P}} \newcommand{\Bprime}{\mathcal{B}^\prime}$The set of all strings (finite sequences) on B={0,1}$\B = \{0, 1\}$ is B.$\B^*.$ Assume that the binary encoding e:SB$e: \S \rightarrow \B^*$ of the set S$\S$ of objects of interest is 1-to-1. This allows use of the set of codewords e(S)$e(\S)$ in place of S.$\S.$

The set of all programs P$\P$ for universal computer U$U$ is a prefix-free subset of B.$\B^*.$ That is, no program is a proper prefix of any other. The conditional Kolmogorov complexity K(x|y)$K(x|y)$ is the length (p)$\ell(p)$ of the shortest program p$p$ that outputs x$x$ on input of y,$y,$ i.e.,

K(x|y)=minp:U(p,y)=x(p).
The Kraft inequality
xX2(x)1
holds for all prefix-free sets XB,$\X \subset \B^*,$ including the prefix-free set of programs P.$\P.$ It follows that for all XB,$\X \subseteq \B^*,$
xX2K(x|y)pP2(p)1,
where y$y$ is a string in B.$\B^*.$ In the first sum, all terms correspond to distinct programs, and each exponent K(x|y)$-\!K(x|y)$ is the negative length of a program that outputs x$x$ on input of y$y$.

Theorem 1. Let μ$\mu$ be a probability measure on encoded objects e(S).$e(\S).$ Also let

X={xsupp(μ)log2μ(x)K(x|y)α},
where y$y$ is a string in B$\B^*$ and α0.$\alpha \geq 0.$ Then μ(X)2α.$\mu(X) \leq 2^{-\alpha}.$

Proof. Rewrite the property of string x$x$ in X$X$ to obtain a bound on μ(x):$\mu(x):$

log2μ(x)K(x|y)log2μ(x)+K(x|y)log2μ(x)μ(x)αααK(x|y)2αK(x|y).
Applying the bound,
μ(X)=xXμ(x)xX2αK(x|y)=xX2α2K(x|y)=2αxX2K(x|y)2α.
The last step follows by the Kraft inequality. Q.E.D.