$ touch nothing $ chmod +x nothing $ ./nothing $
Quines
A quine is a selfreplicating program, that is, a quine prints itself when run. If you’ve never written a quine, I strongly urge you to try to do so before reading on, so you can collect a large reward.
Nothing works
Some systems accept the empty program as instructions to do nothing before terminating, thus fulfilling the definition of a quine. For example, on Linux:
The smr
entry of the 1994
International Obfuscated C Code Contest won using this principle. Some C
compilers accept the empty input and produce an empty executable which we’ve
just seen prints the empty string. The IOCCC rules now forbid empty programs,
as do we.
Another easy solution is to use a builtin instruction that prints the program, available in languages such as BASIC:
10 LIST
Computer viruses often execute instructions to read the memory where they have been loaded in order to selfreplicate, which is partly demonstrated by my bootsector quine.
Though ingenious, these examples cheat; they satisfy the definition of a quine only in letter and not in spirit.
Think inside the box
An honest quine is more mathematical. In such a program, as Landin put it:
the thing an expression denotes, i.e. its "value", depends on the values of its subexpressions, not on other properties of them.
For example, the LIST
command of BASIC is disqualified because its meaning
differs for different programs. Similarly, the effect of a memory read
instruction depends on the contents of the memory, and not just the read
instruction itself. Contrast this with an expression like 1 + 1
, which always
evaluates to 2 wherever it appears.
Thus we seek a mostly pure expression that evaluates to a string that represents the original expression. For practical purposes, we’ll allow syscalls that print a character or a string to the output.
Stringing along
With these stricter conditions, a first guess might be to write a single print
statement for the program that prints out the source. This works in, say, the
echo
language in which every "expression" evaluates to its own
representation, but in general, a single print statement fails because it omits
the call to the print function. For example:
puts("myself");
prints myself
, but not the puts()
call, nor the quotes.
We could try fixing this by quoting the entire program and passing it to
puts()
:
puts("puts(\"myself\")");
But this prints our first program, and not itself.
Nesting more and more print statements is of no use. We’ll always be off by one.
Von Neumann
John von Neumann thought deeply about selfreplication, and solved this turtlesallthewaydown problem by designing a selfreplicating machine with three parts:

a program, which we may think of as a string of bits

a universal constructor, which can interpret the program, that is, carry out the instructions represented by a string of bits

a universal copy machine, which can duplicate a given string of bits.
The constructor builds the new machine except for its program. The copy machine duplicates the program.
Scientific American describes a delightful analogy involving an architect drawing a blueprint to duplicate their studio, with the catch that the blueprint is part of the studio: "the blueprint would include a plan for building a photocopy machine. Once the new studio and the photocopier were built, the construction crew would simply run off a copy of the blueprint and put it into the new studio." Then "the selfdescription need not contain a description of itself".
In other words, separating construction from copying sidesteps the infinite regress. Discovering this key insight is the reward for writing a quine without looking up the solution first. You get to feel as smart as von Neumann! Or at least a little bit: von Neumann had more to worry about, because he envisioned a selfreplicating machine with a capacity to evolve, and worked through the cellular automata version of the problem.
We programmers have it easy. The print statement is our universal constructor: it outputs any string we give it. And a universal copy machine is only a little more work: we just need to declare a given string as a string literal by following the rules of our language. For example, it may be enough to surround it with quotation marks.
Haskell has a builtin copy machine: the show
function. When given a string,
it returns a quoted version of the string, ready for use as a string literal:
> putStrLn $ show "foo" "foo"
This leads to a a simple Haskell quine:
main = let f s = s ++ show s in putStrLn $ f "main = let f s = s ++ show s in putStrLn $ f"
In the definition of f
, the first s
is the constructor building the new
machine, while show s
is the copier duplicating the blueprint. We concatenate
their outputs with (++)
because juxtaposition means function application in
Haskell.
The string literal can be considered a program in a simple language which can only do one thing: print the input string and append a quoted version of it. This is exactly what our interpreter does.
Boutique quines
For some reason, brevity is often prized in quines, so we might use the Reader monad to reduce the program’s size:
main=putStrLn$(++)<*>show$"main=putStrLn$(++)<*>show$"
Unlocking the secret of quines enables the construction of curiosities such as a quinelike program that prints a different version of itself every run by incrementing a serial number:
f (n, p) = unlines $ ("serial = " ++ show n):p ++ [' ':show (n + 1, p)] main = putStr $ f (0, ["f (n, p) = unlines $ (\"serial = \" ++ show n):p ++ [' ':show (n + 1, p)]" , "main = putStr $ f"])
Another goal might be to make the quine less obvious. The following is not a quine:
main = let f s = (toEnum <$> s) ++ show s in putStrLn $ f t t = fromEnum <$> "main = let f s = (toEnum <$> s) ++ show s in putStrLn $ f"
However, it produces the following program, which is a quine where the string has been replaced by a list of ASCII codes:
main = let f s = (toEnum <$> s) ++ show s in putStrLn $ f[109,97,105,110 ,32,61,32,108,101,116,32,102,32,115,32,61,32,40,116,111,69,110,117,109,32 ,60,36,62,32,115,41,32,43,43,32,115,104,111,119,32,115,32,105,110,32,112 ,117,116,83,116,114,76,110,32,36,32,102]
Use or Mention?
William Van Orman Quine wrote Mathematical Logic, based on his graduate teaching in the 1930s and 1940s, where he explains the usemention distinction with examples:

Boston is populous.

Boston is disyllabic.

"Boston" is disyllabic.
In the first two cases, the first word plays an active role (use), while it plays a passive role (mention) in the third case. This distinction alone makes the second statement false and the third true.
Quine notes "Frege seems to have been the first logician to recognize the importance of scrupulous use of quotation marks for avoidance of confusion between use and mention of expressions", and laments that this lesson was largley ignored for 30 years.
Today, thanks to the web, even nonprogrammers may be aware of this distinction. URLs often appear intelligible except for bits and pieces replaced by percent signs and numbers. A layman might rightly suspect that this is a form of quoting which prevents the browser from acting a certain way on certain characters, that is, a means to indicate a mention and not a use.
Quine’s Paradox
Selfreplicating programs are called quines because of Quine’s paradox:

yields falsehood when preceded by its quotation.
If we do what it suggests, we get:

"yields falsehood when preceded by its quotation" yields falsehood when preceded by its quotation.
which is a variant of the venerable paradox: "This sentence is false".
We see the same string appear twice, once with quotes (mention), and once
without (use), just like our Haskell quine. The unquoted part tempts the reader
to place it in quotes, akin to show
in our Haskell quine. And Quine’s
motivation for constructing his paradox was to avoid direct selfreference,
just as our programs must.
Knowing Quine’s paradox might spoil the challenge for firsttime quine authors, which is why we only revealed it now.
Combinator quines
Given a combinator term, suppose performing a single reduction counts as running a program. Let \(M\) be the combinator that Smullyan dubbed the "mockingbird":
Then \(MM\) is a quine, as \(MM\) reduces to \(MM\).
Even this tiny example features the above concepts. When reducing combinators, the head combinator plays an active role (use) while its arguments play passive roles (mention). Thus by printing one \(M\) before another \(M\), we have produced a constructorandcopier (the first \(M\)) and its quotation (the second \(M\)).
What if we say normalization is execution? (Recall this means that as long as we’re able to, we reduce any part of the expression.) Then the problem is trivial: any combinator term in normal form is a quine. A similar argument applies to weak head normalization.
Better is supplying a continuation or print syscall \(f\) as the first argument to a quine. We might say \(x\) is a quine if:
where an arrow denotes normalization. We imagine the syscall \(f\) printing its argument \(x\), which is the original program. A term starting with \(f\) is considered to be in weak normal form.
In this case, the \(T\) combinator is a quine, since \(Tf = fT\) for any \(f\).
More interesting is the similar:
where we consider \(xf\) to be complete original program: one could argue \(x\) alone is incomplete because it needs to be given a syscall to work so \(f\) ought to appear as well.
Curry
A combinator \(x\) such that for all \(f\), both \(x f\) and \(f(x f)\) can reduce to the same term is called a fixedpoint combinator (or the easiertosay fixpoint combinator).
We use the equals sign to indicate this: \(x f = f(x f)\). We’ve also been using the equals sign to define combinators, so we rely on context to distinguish the two. That is, in a combinator definition, the equals sign defines the meaning of the combinator, while in other contexts it indicates that both sides can be reduced to the same term. (Some authors avoid any possible confusion by writing, say, the triple bar (≡) for definitions.)
In mathematics, a fixed point of a function \(f\) is a solution of \(f(z) = z\), and fixedpoint theorems are important. Thus \(x\) is a called fixedpoint combinator because \(x f\) is a fixed point of \(f\) for any combinator \(f\).
The most famous fixedpoint combinator is the \(Y\) combinator, commonly attributed to Haskell Curry:
However, the \(Y\)combinator is unsuitable for our purposes: although both \(Y f\) and \(f(Y f)\) can be reduced to the same term, there is no way to reduce \(Y f\) to \(f(Y f)\).
Enter its less famous cousin, due to Alan Turing:
Then \(\Theta\) is a quine:
The definitions of both \(Y\) and \(\Theta\) apply a combinator to itself, reminding us of the \(MM\) quine. This is no coincidence. We’ll see fixedpoint combinators are a sort of parameterized version of \(M\).
Curry’s Paradox
Cardone and Hindley, History of Lambdacalculus and Combinatory Logic, page 8, state that Turing’s fixedpoint combinator \(\Theta\) was the first to be published (1937, The pfunction in λKconversion, in Journal of Symbolic Logic 2), and the Y combinator was first published by Rosenbloom in 1950, in the form:
where \(W x y = x y y\).
However, in a 1929 letter to Hilbert, Curry wrote:
\[ (\lambda y.N(y y))(\lambda y.N(y y)) \]
where \(N\) represents negation. It seems likely Curry had something like the Y combinator in his head all along.
Another example is Curry’s 1942 paper, The Inconsistency of Certain Formal Logics, where he introduces what is now known as Curry’s paradox. Again, in order to find a combinator that satisfies a certain recurrence, rather than define \(Y\) separately, he writes down a normalized \(YX\) for a particular function \(X\).
Thus we credit Curry for the Y combinator. This allows us to say Curry proved Curry’s paradox arises in certain logics with Curry’s Y combinator!
What is Curry’s paradox? Like Quine, Curry sought a paradox satisfying certain constraints. Quine wanted to avoid direct selfreference, while Curry wanted to avoid negation.
Russell’s paradox is often told in the form of a story about a barber. It so happens Curry’s paradox can be viewed as a generalization of Russell’s paradox which changes the barber paradox to the following.
Let \(P\) be any proposition, such as "Germany borders China". The more ludicrous the better.
Curry’s barber shaves all men who, if they shaved themselves, then \(P\) is true. Does the barber shave himself?
If so, then \(P\) is true, because the barber shaves men for which this strange implication holds.
We just said if the barber shaves himself, then \(P\) is true. Thus the barber is exactly the kind of man the barber shaves. So yes, the barber does shave himself. This implies \(P\). That is, any proposition \(P\) is true!
Fixedpoint Combinators
How were fixedpoint combinators found? Let’s try to force our way through, and simply define:
This definition of \(X\) is illegal because \(X\) also appears on the righthand side. Again, we encounter the problem of a selfdescription attempting to contain itself.
We can make it legal by replacing \(X\) on the righthand side a new variable \(x\), which we’ll introduce on the lefthand side:
It may seem unclear how this helps, but one weird trick is all we need. If we
apply all occurences of x
on the righthand side to themselves, we are in a
position to exploit the usemention distinction:
We find:
Thus \(X = X''X''\) solves our equation above. The first \(X''\) is a use, the second a mention. As a lambda term, we have:
So we have in fact just derived Turing’s \(\Theta\) combinator.
Here’s an alternate solution. We adopt Curry’s point of view and treat \(f\) as a given function; a free variable.
We try to use brute force to define a fixed point of \(f\):
We apply the usemention trick to remove the \(Z\) on the righthand side:
If we now bind \(f\) with a lambda, we wind up with the \(Y\) combinator:
Recursion
Recursion is intimately connected to quines. A recursive function can be considered a selfreplicating function, except instead of printing its clone, it calls it with different parameters.
We typically define a recursive function \(F\) with equations where \(F\) also appears on the righthand side, that is, the selfdescription contains a description of itself. For instance:
fib n = add (fib (pred n)) (fib (pred (pred n)))
As above, such a definition is illegal in the world of combinators:
And as above, we can fix this with the usemention trick:
However, now that we have derived \(Y\) and \(\Theta\), a simpler approach is to replace each \(F\) with \(f\) and prepend \(\lambda f\):
then pass the whole thing to a fixedpoint combinator like \(\Theta\):
We find:
Using a predefined fixedpoint combinator for recursion may be preferable to always applying the usemention selfapplication trick because:

The lambda terms are simpler.

When translated, the combinator terms are simpler.

In some languages, recursive functions fail to typecheck, while the argument of a fixedpoint combinator may be legal. Adding a builtin fixedpoint combinator allows us to run it as intended, while clearly demarcating the only parts of the program that might loop forever.
We can explore fixedpoint combinators in Haskell:
import Data.Function (fix) fib = fix (\f n > if n <= 1 then n else (+) (f (pred n)) (f (pred (pred n)))) testFib20 = fib 20 == 6765
Von Neumann Paradox
There is also a paradox bearing von Neumann’s name, and since we already discussed Quine’s and Curry’s paradoxes, it seems only fair to bring it up. Roughly speaking, among other things, the von Neumann paradox uses the axiom of choice to show how to cut up a unit square and shuffle around the pieces to form two unit squares.
In logics based on lambda calculus or combinators, we can stay consistent by introducing types to disallow pathogens like fixedpoint combinators which lead to paradoxes.
In mathematics, paradoxes like von Neumann’s are strange but tolerable. While one can live without the axiom of choice, most prefer to keep it.
A Bonus Paradox
In Haskell, the type of the fixedpoint combinator is (a → a) → a
.
Transporting this to logic via the CurryHoward correspondence, this is
"for all A, if A implies A, then A is true". Since any proposition implies
itself, this implies any proposition is true.
Epilogue
Some authors shorten the definitions of \(Y\) and \(\Theta\) by using \(M = \lambda x.x x\) to apply a term to itself.
We could have defined:
Our remarks mostly apply if we choose weak head normalization instead of normalization, though \(M = SII\) no longer works.
Smullyan, in Diagonalization and SelfReference, attempts naturallanguage versions of some of the above, but his examples turn out to be flawed.
Amazingly, von Neumann figured out selfreplication years before DNA was discovered. DNA can be viewed as a program. In some contexts, DNA plays a passive role (mention) and is simply duplicated, and in others, DNA plays an active role (use) and causes various amino acids to combine in certain ways to build things.
Turing is wellknown for helping crack the Enigma Machine in World War II. Perhaps less known is that Quine was also a key figure in Allied military intelligence. He served in the United States Navy, deciphering messages from German submarines.