Haskell - Code That Counts

= Code That Counts =

Combinatorial calculations with combinator calculus.

Several explanations in http://math.mit.edu/\~rstan/ec/[_Enumerative
Combinatorics Volume 1_] by http://math.mit.edu/\~rstan/[Richard P. Stanley]
are outlines of algorithms. We follow along parts of the first chapter, but
replace descriptions of algorithms with real code.

Sets take center stage in mathematics, but sets are troublesome for computation
because we must work to decide whether one representation of a set is
equivalent to another. For example, we only know \(\{1,2,3\} = \{3,1,2\}\) by
confirming one is a permutation of the other. In addition, at some point we
must check for duplicates.

We therefore focus on lists instead of sets. Translating results back to the
sets is easy enough, and our choice turns out to have an advantage when we
consider multisets and ordered sets.

== Subsets ==

From `Data.List`, we have:

[ ]:

isSubsequenceOf :: (Eq a) => [a] -> [a] -> Bool
isSubsequenceOf []    _                    = True
isSubsequenceOf _     []                   = False
isSubsequenceOf a@(x:a') (y:b) | x == y    = isSubsequenceOf a' b
                               | otherwise = isSubsequenceOf a b
-- Examples:
and
  [ [] `isSubsequenceOf` [1..5]
  , [2,4] `isSubsequenceOf` [1..5]
  , [1..5] `isSubsequenceOf` [1..5]
  , not $ reverse [1..5] `isSubsequenceOf` [1..5]
  , not $ [6] `isSubsequenceOf` [1..5]
  ]

We represent a set \(U\) with a list `u` of distinct elements. A _subset_ or
_combination_ of \(U\) is represented by a list `s` satisfying `isSubsequenceOf
s u`. In other words:

  1. The empty set, represented by the empty list, is a subset of every set.

  2. The empty set has no non-empty subsets.

  3. There are two kinds of non-empty subsets of a non-empty set:
  those that begin with its head element `x`, and those that do not. In both
  subcases, the remaining subset elements are to be found among `a`,
  the tail of the list representing the set.

Subsets of size \(k\) are called _k-subsets_ or _k-combinations_.
To generate them, we closely follow the above cases:

[ ]:

subsets :: [a] -> Int -> [[a]]
subsets _     0 = [[]]
subsets []    _ = []
subsets (x:a) k = map (x:) (subsets a (k - 1)) ++ subsets a k

subsets "abcde" 3

Write \({n \choose k}\) for the number of \(k\)-subsets of a set of size \(n\).

We read this as "n choose k".

The first case of `subsets` implies for all \(n \ge 0\):

\[{n \choose 0} = 1\]

The second case implies that for any \(k \gt 0\):

\[{0 \choose k} = 0\]

The third case implies:

\[{n \choose k} = {n - 1 \choose k - 1} + {n - 1 \choose k }\]

where \(n, k \gt 0\). The left-hand term of the sum counts all subsets starting
with `x`, while the other side counts all subsets not containing `x`. In
the latter, we know `x` must be absent because of the uniqueness of each
item of the list representing `u`.

These numbers are called the _binomial coefficients_, because the coefficient
of \(x^k\) in \((1 + x)^n\) is \({n \choose k}\). We can prove via induction
that the coefficients satisfy the same recurrence relations.

Pascal's triangle brings these equations to life:

[ ]:

pascal = iterate (\p -> zipWith (+) ([0] ++ p) (p ++ [0])) [1]
showPascal n = unlines $ take n $ unwords . map show <$> pascal

putStr $ showPascal 16

By induction, we can show:

\[
{n \choose k} = \frac{n^\underline{k}}{k!}
\]

where we adopt Knuth's underline
notation for the _falling factorial:_

\[
n^\underline{k} = n(n-1)..(n-k+1)
\]

Stanley writes +++\( (n)_k \)+++, but we prefer Knuth because it naturally
leads to overline notation for the _rising factorial:_

\[
n^\overline{k} = n(n+1)..(n+k-1)
\]

(Thus \( k! = k^\underline{k} = 1^{\overline{k}} \).)

Although there exists link:/~blynn/pr/choose.html[a much faster algorithm for
computing binomial coefficients], we use falling factorials to compute binomial
coefficients for simplicity:

[ ]:

falling n k = product $ take k $ iterate pred n
choose n k = falling n k `div` product [1..k]

choose 10 5

We can view this formula from another perspective.
There are \(n\) ways to pick an element from a set of size \(n\):

[ ]:

picks = \case
  [] -> []
  x:xs -> (x, xs):(second (x:) <$> picks xs)

picks [1..5]

Iterating this \(k\) times yields \(n(n-1)...(n-k+1)\) ways of picking a
sequence of size \(k\) consisting of distinct elements of the set:

[ ]:

orderedSets xs n = map fst $ iterate (concatMap step) [([], xs)] !! n
  where step (ys, zs) = first (:ys) <$> picks zs

orderedSets "abcde" 3

In the case \(n = k\), we have that the number of permutations of \(k\) objects
is \(k\) factorial.

Now, there are \({n \choose k}\) distinct \(k\)-subsets, and from each we can
produce \(k!\) sequences of size \(k\).
By equating two different ways of counting the same thing:

\[ n^\underline{k} = {n\choose k} k! \]

== Polynomials ==

Recall:

\[
(1+x)^n = \sum_k {n\choose k} x^k
\]

To translate this to code, we represent polynomials with lists of integers,
with \([a_0, a_1, ...]\) representing \(a_0 + a_1 x + ...\). Then addition
and multiplication of polynomials are given by:

[ ]:

p .+. q = zipWith (+) (pad p) (pad q)
  where pad = take (length p `max` length q) . (++ repeat 0)
p      .*. [] = []
[]     .*. q  = []
(a:as) .*. q  = map (a *) q .+. (0:(as .*. q))

take 5 $ iterate ([1, 1] .*.) [1]

It turns out our representation works well the other way, that is, we can
represent a sequence of numbers as a polynomial; in this scheme, an infinite
sequence corresponds to a formal power series. This gives us an algebra for
manipulating sequences, known as _generating functions_.

By redefining \({n\choose k}\) to be the coefficient of \(x^k\) in
\((1 + x)^n\)
written as a power series, we generalize binomial coefficients beyond positive
integers \(n\). When \(n\) is a negative integer, we could derive a formula
via calculus and Taylor series expansions, or using the recurrences and
induction, but we'll soon see a simpler approach.

A _composition_ of \(n\) is an ordered list of positive integers that sum to \(n\). A _k-composition_ of \(n\) is a composition of \(n\) into \(k\) parts.
For example, a 4-composition of 7 is:

\[ 1 + 3 + 2 + 1 = 7 \]

We obtain related definitions by allowing zero. A _weak composition_ of \(n\)
is an ordered list of nonnegative integers that sum to \(n\). A _weak
k-composition_ of \(n\) is a weak composition of \(n\) into \(k\) parts.
For example, a weak 4-composition of 7 is:

\[ 3 + 0 + 2 + 2 = 7 \]

How can we generate all \(k\)-compositions and weak \(k\)-compositions of
\(n\)?

Try answering this before studying the following:

[ ]:

compositions n k = ungap <$> subsets [1..n-1] (k-1)
  where ungap xs = zipWith (-) (xs ++ [n]) ([0] ++ xs)

weakCompositions n k = map pred <$> compositions (n + k) k

putStrLn "4-compositions of 7:"
print $ compositions 7 4

putStrLn "weak 3-compositions of 5:"
print $ weakCompositions 5 3

Thus the number of \(k\)-compositions of \(n\) is \({n-1\choose k-1}\),
and the number of weak \(k\)-compositions of \(n\) is \({n+k-1\choose k-1}\).

== Power Sets ==

The _power set_ of a set \(S\) is the set of all subsets of \(S\).
We can compute it with the standard library `subsequences`
function, or bust out https://wiki.haskell.org/Blow_your_mind[a well-known
mind-blowing trick]:

[ ]:

powerset = filterM $ const [False, True]

powerset "abc"

The power set of an \(n\)-set has size \(2^n\), as suggested by the length of
the list `[False, True]`. In brief, the list monad is syntax sugar for
`concatMap`. Applying each of \(x\) functions to each of \(y\) items in a list
and concatenating everything leads to a list of size \(xy\), so iterating this
procedure on \(n\) lists of size 2 leads to a list of size \(2^n\).

Less abstrusely, for each item, we can decide to include it or leave it out,
leading to \(2^n\) total different decision paths.

Accordingly, we denote the power set of \(S\) by \(2^S\).

== Multisets ==

Informally, a _multiset_ is a set where elements may appear more than once.
More rigorously, a _finite multiset_ \(M\) on a set \(S\) is a function
\(v:S\to\Bbb{N}\) with \(\sum_{s\in S} v(s) < \infty\). We're back to ordinary
subsets when \(v(s) \le 1\).

A _k-combination with repetition_ of a set \(S\) is a multiset on \(S\) of size
\(k\). The number of \(k\)-combinations with repetition of an \(n\)-set is denoted
by \(\left({n \choose k}\right)\).

We can generate all \(k\)-combinations with repetition of \([1..n]\)
as follows:

[ ]:

multisets n k = zipWith (flip (-)) [0..] <$> subsets [1..n + k - 1] k

which implies \(\left({n \choose k}\right) = {n + k - 1 \choose k}\):

[ ]:

multichoose n k = choose (n + k - 1) k

We also have:

\[
(1 + x + x^2 + ...)^n = \sum x^{|M|} [M \text{ is a multiset on } [1..n]]
= \sum_{k\ge 0} \left({n \choose k}\right) x^k
\]

Thus:

\[
(1 + x + x^2 +... )^n = (1 - x)^{-n} = \sum_{k\ge 0} {-n \choose k}(-1)^k x^k
\]

and therefore:

\[\left({n\choose k}\right) = (-1)^k {-n\choose k} = {n + k - 1\choose k}\]

This is an easy way to compute binomial coefficients for negative \(n\). The
first equation is an example of a _combinatorial reciprocity theorem_.

We generalize binomial cofficients to _multinomial coefficients_:
if \(a_1,...,a_m\) are nonnegative integers that sum to \(n\), that is, a
weak \(m\)-composition of \(n\), then define \({n\choose a_1,...,a_m}\) to be
number of ways to divide \([1..n]\) into \(m\) different categories. We can
also view this as the number of permutations of the multiset where there are
\(a_i\) copies of the \(i\)th element. Additionally, we have:

+++\[
(x_1 + ... + x_m)^n = \sum {n\choose a_1,...,a_m} x_1^{a_1}...x_m^{a_m}
\]+++

When \(m = 2\), we have binomial coefficients. Conventionally, we write
\({n\choose k}\) instead of \({n\choose k,n-k}\). The latter notation does
have advantages: we immediately have \({n\choose k} = {n\choose n-k}\), and
we're mindful that choosing \(k\) elements out of an \(n\)-set is the same
as choosing to omit \(n-k\) elements. The recurrence of Pascal's triangle also
becomes symmetric:

\[{n\choose a,b} = {n-1\choose a-1, b} + {n-1\choose a,b-1}\]

Multinomials can be computed by:

\[
{n\choose a_1,...,a_m} = \frac{n!}{a_1!...a_m!}
\]

and satisfy the recurrence:

\[
{n\choose a_1,...,a_m} =
{n\choose a_1-1,...,a_m} + ... + {n\choose a_1,...,a_m-1}
\]

The following is a multiset version of our `picks` function.
We rely on a helper function to keep out degenerate empty lists.
We expect identical elements in the input list to be grouped together, for
instance, `["b", "aaa", "nn"]` instead of "banana".

[ ]:

multipicks = \case
  [] -> []
  a@(x:a'):rest -> (x, consNonEmpty a' rest) : (second (a:) <$> multipicks rest)
  where
  consNonEmpty = \case
    [] -> id
    a  -> (a:)

multipicks ["b", "aaa", "nn"]

== Permutations ==

We've already exhibited code to enumerate all permutations of a list `xs`.
Simply evaluate `orderedSets xs (length xs)`.

In _Algorithm Design with Haskell_, Richard Bird and Jeremy Gibbons describe
a more direct approach:

[ ]:

permsPicks :: [a] -> [[a]]
permsPicks = \case
  [] -> [[]]
  xs -> concatMap (\(x, ys) -> (x:) <$> permsPicks ys) $ picks xs

permsPicks "abc"

They also describe another function for generating all permutations,
one that is inductive as opposed to recursive:

[ ]:

perms :: [a] -> [[a]]
perms = foldr (concatMap . inserts) [[]] where
  inserts x = \case
    [] -> [[x]]
    y:ys -> (x:y:ys) : ((y:) <$> inserts x ys)

perms "abc"

They discuss important features of these algorithms, which we regrettably
ignore for now because we're focusing on combinatorics.

We can use the multiset version of `picks` to enumerate all permutations of
elements of a multiset:

[ ]:

permsMultipicks :: [[a]] -> [[a]]
permsMultipicks = \case
  [] -> [[]]
  xs -> concatMap (\(x, ys) -> (x:) <$> permsMultipicks ys) $ multipicks xs

permsMultipicks ["b", "aaa", "n"]

A permutation of a set can be represented as a _word_. In particular,
we can represent the permutation \(\pi\) on \([1..n]\) defined by
\(\pi(i) = a_i\) as the word \(a_1 ... a_n\).

Given a permutation \(\pi\) on a finite set \(S\), for any \(x\in S\) the
sequence \(x, \pi(x), \pi(\pi(x)), ...\) must eventually repeat, thus
permutations can be written as a _product_ of _cycles_:

[ ]:

cycles xs = f [] [1..n] where
  f [] [] = []
  f [] (i:is) = f [i] is
  f cs is | head cs == next = cs : f [] is
          | otherwise       = f (cs ++ [next]) $ filter (/= next) is
          where next = xs!!(last cs - 1)
  n = length xs

-- Seven cycles of length 1:
cycles [1..7]

-- One cycle of length 7:
cycles $ 7:[1..6]

-- Should be [[1,4],[2],[3,7,5],[6]]:
cycles [4,2,7,1,3,6,5]

We define a _standard representation_ for writing products of cycles. We sort
cycles in increasing order by their maximum elements, and each cycle is written
starting with its maximum element. We write \(\hat\pi\) for the standard
representation of \(\pi\).

Such a representation is unique, and furthermore, we can remove all bracketing:
a new cycle begins precisely when an element exceeds the previous maximum,
namely, on all the _left-to-right maxima_. Thus the standard representation
defines a bijection on permutations.

[ ]:

maximum = foldr1 max
minimum = foldr1 min
partition p xs = foldr (select p) ([],[]) xs
select p x (ts,fs) | p x       = (x:ts,fs)
                   | otherwise = (ts, x:fs)
sort = \case
  [] -> []
  x:xt -> sort a ++ [x] ++ sort b where (a, b) = partition (<= x) xt

sortOn f xs = snd <$> sort (zip (f <$> xs) xs)

standardize :: Ord a => [[a]] -> [a]
standardize = concatMap (\c -> uncurry (flip (++)) $ span (< maximum c) c) .
  sortOn maximum

cyclesFromStandard = \case
  [] -> []
  x:xs -> (x:a) : cyclesFromStandard b where (a, b) = span (< x) xs

standardize $ cycles [4,2,7,1,3,6,5]  -- Expect [2,4,1,6,7,5,3]

Given a permutation \(\pi\), let \(c_i(\pi)\) be the number of cycles of length
\(i\). Then the _type_ of a permutation \(\pi\) is the sequence
\( (c_1(\pi),...,c_n(\pi)) \), and the total number of cycles of \(\pi\) is the
sum of this sequence and denoted \(c(\pi)\).

[ ]:

cyclesOfLength xs k = length $ filter (== k) $ length <$> cycles xs

permType xs = cyclesOfLength xs <$> [1..length xs]

numCycles = length . cycles

Exercise: Show the number of permutations of type \((c_1,...,c_n)\) is:

+++\[
\frac{n!}{1^{c_1}c_1!...n^{c_n}c_n!}
\]+++

Solution: Suppose we write \([1..n]\) in every possible order, and for each such
ordering, from left to right we divide the elements into \(c_1\) cycles of
length 1, then \(c_2\) cycles of length 2, and so on up to \(c_n\) cycles of
length \(n\). We have produced all permutations of the desired type, but we
have counted each permutation multiple times because we can write a cycle of
length \(k\) in \(k\) different ways and for each \(i\) we can permute \(c_i\)
cycles amongst themselves without changing the product.

Following Knuth (who credits Jovan Karamata), we write:

\[{n \brack k}\]

for the number of permutations of \([1..n]\) with exactly \(k\) cycles.
(Stanley writes \(c(n, k)\).) Thanks to standard representations, this is also
the number of permutations of \([1..n]\) with \(k\) left-to-right maxima. These
are known as the _signless Stirling numbers of the first kind_.

The following enumerates them all:

[ ]:

kCyclePerms :: Int -> Int -> [[[Int]]]
kCyclePerms 0 0 = [[]]
kCyclePerms _ 0 = []
kCyclePerms 0 _ = []
kCyclePerms n k = concatMap grows (kCyclePerms (n-1) k)
  ++ map ([n]:) (kCyclePerms (n-1) (k-1))
  where
  grows = \case
    [] -> []
    xs:xst -> (:xst) <$> ins n xs <|> (xs:) <$> grows xst
  ins n = \case
    [] -> []
    x:xt -> (x:n:xt) : ((x:) <$> ins n xt)

putStrLn "Permutations of [1..4] consisting of 2 cycles:"
kCyclePerms 4 2

The code implies the recursion:

\[
{n \brack k} = (n-1) {n-1 \brack k} + {n-1 \brack k-1}
\]

[ ]:

stirling1 0 0 = 1
stirling1 0 k = 0
stirling1 n 0 = 0
stirling1 n k = (n - 1) * stirling1 (n - 1) k + stirling1 (n - 1) (k - 1)

putStr $ unlines $ take 8
  [unwords [show $ stirling1 n k | k <- [0..n]] | n <- [0..]]

We can prove:

+++++
\[
\sum_{k=0}^n {n\brack k} x^k = x^\overline{n} = x(x+1)...(x+n-1)
\]
+++++

by showing the coefficients satisfy the same recursive relations.

Another proof: the coefficient of \(x^k\) on the right-hand side is:

\[
\sum_{1\le a_1,...,a_{n-k}\le n-1} a_1 ... a_{n-k}
\]

which can viewed as the count of tuples \((S, f)\) where
\(S\) is a \((n-k)\)-subset of \([1..n-1]\) and \(f : S -> [1..n - 1]\) is
a function satisfying \(f(i) \le i\).

We show each such tuple corresponds to a \(k\)-cycle permutation, which can be
written as a permutation of \([1..n]\) with \(k\) left-to-right maxima.

Subtract each element of \(S\) from \(n\) to get \(b_1 > ... > b_{n-k}\). The
\(k\) remaining elements of \([1..n\) are our left-to-right maxima, so we start
by writing them in ascending order. Then we insert the \(b_i\) values so that
exactly \(f(b_i)\) numbers larger than \(b_i\) appear to the left of \(b_i\).

[ ]:

permFromSF n s fs = foldl ins ([1..n] \\ bs) $ zip bs fs where
  bs = (n -) <$> s
  ins xs (b, f) = uncurry (++) $ second (b:) $ splitBigsN f xs where
    splitBigsN 0 xs     = ([], xs)
    splitBigsN r (x:xt) = first (x:) $ splitBigsN (r - fromEnum (x > b)) xt

permFromSF 9 [1,3,4,6,8] [1,2,1,3,6]  -- Expect (2)(4)(753)(9168)

Yet another proof: we show the polynomials in question agree for all positive
integers \(x\). For such an \(x\), the left-hand side is the number of
\(k\)-cycle permutations multiplied by the number of functions from \(k\)
objects to \([1..x]\), while the right-hand side is the number of integer
sequences \((a_1,...,a_n)\) satisfying \(0 \le a_i \le x - 1 + n - i\). (The
\(a_i\) are "backwards" and zero-indexed for historical reasons; our code will
undo this.)

We can produce a permutation and function from such a sequence as follows:

[ ]:

permFunctionFromSequence n x as = foldl f ([], []) $ zip bs [n, n-1..] where
  f (ps, fs) (b, n) | b <= x    = ([n]:ps, b:fs)
                    | otherwise = (ins n (b - x) ps, fs)
  bs = (+1) <$> reverse as
  ins n k (p:ps) | length p >= k = (p0 ++ n:p1) : ps
                 | otherwise     = p : ins n (k - length p) ps
                 where (p0, p1) = splitAt k p

Setting \(x = 1\) shows for all positive integers \(n, k\), the number of
sequences \((a_1, ..., a_n)\) with \(0 \le a_i \le n - i\) containing exactly
\(k\) zeroes is \({n\brack k}\). Or in code:

[ ]:

prop_137 n k =
  stirling1 n k == length (filter ((k ==) . length . filter (0 ==)) $
    sequence $ enumFromTo 0 . (n -) <$> [1..n])

We can build a permutation \(\pi\) in word form more directly from such a
sequence:

[ ]:

permFromInversionTable n as = foldl f [] $ zip (reverse as) [n,n-1..] where
  f p (a, n) = p0 ++ n:p1 where (p0, p1) = splitAt a p

In the sequence, \(a_i\) is the number of entries \(j\) in \(\pi\) to the left
of \(i\) such that \(j > i\). As hinted by our function name, if
\(\pi = b_1 ... b_n\) then \((b_i, b_j)\) is an _inversion_ if \(i < j\) and
\(b_i > b_j\), and the sequence \(a_i\) is called the _inversion table_ of the
permutation \(\pi\), and written \(I(\pi)\). An inversion table is another way
to represent a permutation.

[ ]:

inversionTable p = (\a -> length $ filter (> a) $ takeWhile (/= a) p) <$> p

If \(i(\pi)\) denotes the number of inversions in the permutation \(\pi\), we
see:

\[
\sum_{\pi\in S_n} q^{i(\pi)} = (1 + q)(1 + q + q^2)...(1 + q + ... + q^{n-1})
\]

The _descent set_ \(D(\pi)\) of a permutation \(\pi = a_1 ... a_n\) is defined
by:

[ ]:

descentSet p = map fst $ filter snd $ zip [1..] $ zipWith (>) p $ tail p
-- List comprehension variant.
descentSet' p = [i | i <- [1..length p - 1], p!!(i-1) > p!!i]

For a subset \(S\) of \([1..n-1]\), define \(\alpha(S)\) and \(\beta(S)\) as
follows:

[ ]:

alpha n s = length [p | p <- perms [1..n], descentSet p `isSubsequenceOf` s]
beta n s  = length [p | p <- perms [1..n], descentSet p == s]

Thus an alternative definition of \(\alpha(S)\) is:

[ ]:

alpha' n s = sum $ beta n <$> powerset s

If we write \(S\) as an increasing sequence \(s_1, ..., s_k\), then
\[
\alpha(S) = {n \choose {s_1, s_2 - s_1, ..., n - s_k}}
\]

For a permutation \(\pi\), write \(des(\pi)\) for the number of descents
\(|D(\pi)|\). Then

\[
A_n(x) = \sum_{\pi\in S_n} x^{1+des(\pi)}
\]

is an _Eulerian polynomial_. We write \(A(n, k)\) for the coefficient of \(x^k\)
in \(A_n(x)\), and we call it an _Eulerian number_. Hence:

[ ]:

eulerian n k = length $ filter (((k - 1) ==) . length . descentSet) $ perms [1..n]

eulerianPolys = [[eulerian n k | k <- [1..n]] | n <- [1..]]

The definition of the standard representation of a permutation implies:

\[
n - des(\hat\pi) = \#\{i \in [1..n] | \pi(i) \ge i\}
\]

A number \(i\) is called a _weak excedance_ of \(\pi\) if \(\pi(i)\ge i\),
and an _excedance_ if \(\pi(i) > i\):

[ ]:

weakExcedances  p = map fst $ filter (uncurry (>=)) $ zip p [1..]
excedances      p = map fst $ filter (uncurry (>))  $ zip p [1..]

From:

[ ]:

prop_weakVsStrictExcedances p =
  length (weakExcedances p) == n - length (excedances $ (n + 1 -) <$> reverse p)
  where n = length p
prop_descentCounts p = null p ||
  length (descentSet p) == n - 1 - length (descentSet $ reverse p)
  where n = length p

it follows that the Eulerian number \(A(n, k+1)\) is equal to the number of
permutations of \(S_n\) with \(k\) excedances and also the number of
permutations of \(S_n\) with \(k+1\) weak excedances.

The sum of the descent set of a permutation \(\pi\) is called the _greater
index_ and is denoted \(\iota(\pi)\). It is also called the _major index_ and
denoted \(maj(\pi)\). It turns out:

[ ]:

prop_remarkable n =
  (sort $ sum . inversionTable <$> perms [1..n]) ==
  (sort $ sum . descentSet     <$> perms [1..n])

== Trees ==

A binary tree of numbers has the _heap property_ if each child node exceeds its
parent node. We represent a permutation as a heap as follows:

[ ]:

data Tree a = Node a [Tree a] deriving Show

heap [] = Node Nothing  []
heap p  = Node (Just m) [heap p0, heap p1]
  where (p0, m:p1) = span (/= minimum p) p

Define an element \(w_i\) of a word \(w_1...w_n\) to be

|======================================================================
|a _double rise_ or _double ascent_ | if \(w_{i-1} < w_i < w_{i+1} \)
|a _double fall_ or _double descent_ | if \(w_{i-1} > w_i > w_{i+1} \)
|a _peak_ | if \(w_{i-1} < w_i > w_{i+1} \)
|a _valley_ | if \(w_{i-1} > w_i < w_{i+1} \)
|======================================================================

where we set the sentinels \(w_0 = w_{n+1} = 0\). These four classes of terrain
determine whether a node has no children, a left child, a right child, or both.

From this bijection, we see:

  * The number of binary trees with nodes \([1..n]\) with the heap property
  is \(n!\).

  * The number of such trees with \(k\) nodes with left children is \(A(n, k + 1).\)

  * The number of such trees with \(k\) leaf nodes is equal to the number of such
  trees with \(k\) nodes with two children.

  * The number of such trees that are full (that is, each node has zero or two
  children) with \(2n + 1\) nodes is equal to the number of
  _alternating permutations_ of \(S_{2n+1}\):
+
\[ a_1 > a_2 < a_3 > ... < a_{2n+1} \]

We can also represent a permutation \(\pi\) as an unoriented heap. We prepend the
sentinel value of 0 to our permutation, and start with root node of 0.
Then the children of the node \(n\) are all nodes \(a\) that appear to the right of
\(n\) in \(\pi\) satisfying \(n < a\) and \(b > a\) for all numbers \(b\) that
appear between \(n\) and \(a\) in \(\pi\).

[ ]:

unorientedHeap p = f (0:p) 0 where
  f p n = Node n $ f p <$>
    filter (\a -> n < a && head (dropWhile (> a) q) == a) q
      where q = tail $ dropWhile (/= n) p

Thus:

 * The number of unoriented heaps with \(n+1\) nodes is \(n!\).

 * The number of such trees where the root has \(k\) children is \({n\brack k}\).

 * The number of such trees with \(k\) leaf nodes is \(A(n, k)\).

== Partitions ==

A _partition_ of a natural number \(n\) is a non-increasing sequence of
integers that sums to \(n\). We ignore terminating 0s when deciding equality of
partitions. A partition is like a composition, except the order of the summands
is unimportant.

Write \(p(n)\) for the number of partitions of \(n\), \(p_k(n)\) for the number
of partitions of \(n\) into \(k\) parts, and \(p(j, k, n)\) for the number of
partitions of \(n\) into \(k\) parts with largest part at most \(j\).

We can generate all partitions with:

[ ]:

intPartitions 0 0 = [[]]
intPartitions n k
  | k <= 0 || n <= 0 = []
  | otherwise = ((++ [1]) <$> intPartitions (n - 1) (k - 1)) ++
    (map (+1) <$> intPartitions (n - k) k)

intPartitions 7 4

which gives the recurrence:

\[ p_k(n) = p_{k-1}(n - 1) + p_k(n - k) \]

[ ]:

-- Awkward name to avoid defining `p` at top-level.
pPar 0 0 = 1
pPar n k
  | k <= 0 || n <= 0 = 0
  | otherwise       = pPar (n - 1) (k - 1) + pPar (n - k) k

The following code produces the _Ferrers diagram_ or _Ferrers graph_ of a
partition:

[ ]:

ferrers = unlines . map (`replicate` '*')

If we had drawn squares instead, then we would have a _Young diagram_.

A _partition_ of a finite set \(N\) is a set of
disjoint subsets \(\{B_1,...,B_k\}\) of \(N\) whose union is \(N\).
A subset \(B_i\) is called a _block_.

Roughly speaking, partitions of a set are like partitions of an integer, except
the elements of a set are distinguishable. We can generate them with:

[ ]:

partitions [] 0 = [[]]
partitions [] _ = []
partitions _  0 = []
partitions (x:xs) k =
  (ins x =<< partitions xs k) ++ (([x]:) <$> partitions xs (k - 1))
  where
  ins x = \case  -- Insert x in each block.
    [] -> []
    ys:yss -> ((x:ys):yss) : map (ys:) (ins x yss)

partitions "abcd" 2

Define
\({n \brace k}\)
to be the number of ways to partition \(N\) into \(k\) blocks.
These are called the _Stirling numbers of the
second kind_. (Again, we adopt Knuth's recommendation instead of Stanley's
\(S(n, k)\).)

Our `partitions` function implies:

[ ]:

stirling2 0 0 = 1
stirling2 _ 0 = 0
stirling2 0 _ = 0
stirling2 n k = k * stirling2 (n - 1) k + stirling2 (n - 1) (k - 1)

stirling2 7 4

By induction:

+++++
\[
x^n = \sum_k {n\brace k} x^\underline{k}
\]
+++++

The total number of partitions of an \(n\)-set is called a _Bell number_ and
is denoted \(B(n)\).

[ ]:

bell n = sum $ stirling2 n <$> [0..n]

bell <$> [0..10]  -- [1,1,2,5,15,52,203,877,4140,21147,115975]

To recap:

\[
{0 \choose 0} = {0 \brack 0} = {0 \brace 0} = 1
\]

For \(k > 0\):
\[
{0 \choose k} = {0 \brack k} = {0 \brace k} = 0
\]

For \(n > 0\):
\[
{n \choose 0} = 1
\]

\[
{n \brack 0} = {n \brace 0} = 0
\]

For \(n, k > 0\):

+++++
\[
\begin{align}
{n \choose k} &=&      &{n-1 \choose k} + {n-1 \choose k-1} \\
{n \brack k} &=& (n-1) &{n-1 \brack k} + {n-1 \brack k-1} \\
{n \brace k} &=& k     &{n-1 \brace k} + {n-1 \brace k-1}
\end{align}
\]
+++++

Summations:
\[
\begin{align}
\sum_k {n \choose k} &= 2^n \\
\sum_k {n \brack k} &= n! \\
\sum_k {n \brace k} &= B(n)
\end{align}
\]

== The Twelvefold Way ==

Two functions \(f, g : N \to X\) are _equivalent with \(N\) indistinguishable_
if there exists a bijection \(\pi : N \to N\) such that \(f \circ \pi = g\).

They are _equivalent with \(X\) indistinguishable_ if there exists a bijection
\(\sigma : X \to X\) such that \(\sigma \circ f = g\).

They are _equivalent with \(N\) and \(X\) indistinguishable_ if there exists
bijections \(\pi : N \to N\) and \(\sigma : X \to X\) such that
\(\sigma \circ f \circ \pi = g\).

Given \(n, x\) and two booleans representing the distinguishability of an
\(n\)-set \(N\) and and \(x\)-set \(X\), the following function returns a list
containing the number of functions, injections, and surjections from \(N\) to
\(X\).

[ ]:

numFunctions n x nDist xDist = case (nDist, xDist) of
  (True , True)  -> [x^n, product [x-n+1..x], product [1..x] * s n x]
  (False, True)  -> [multichoose x n, choose x n, multichoose x $ n - x]
  (True , False) -> [sum $ s n <$> [1..x], fromEnum $ n <= x, s n x]
  (False, False) -> [sum $ p n <$> [1..x], fromEnum $ n <= x, p n x]
  where
    p = pPar
    s = stirling2

Ben Lynn blynn@cs.stanford.edu 💡

Contents