Power Series

Consider a power series such as:

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \]

How would you put it in a computer program?

Reading M. Douglas McIlroy's paper Power Series, Power Serious shocked me. What sorcery is this?! How can a handful of succinct lines do so much? And how can a computer possibly understand them? They look like mathematical equations that only make sense to highly trained humans. I felt compelled to keep digging until I could answer these questions.

Today, I’m proud that not only do I know how it all works, but I have also written a Haskell compiler that can run McIlroy’s code on this Jupyter-esque webpage. Some changes are needed because I strayed from standard Haskell. I dislike the Num typeclass; I prefer to categorize mathematical objects like a mathematician. I have a Ring typeclass for things that can be added, subtracted, and multiplied, and there is no requirement for signum or abs, which lack meaning in some rings. Nonetheless, porting my code to standard Haskell is straightforward.

For power series, we have:

instance Ring a => Ring [a] where
  (+) = zipWith (+)
  (-) = zipWith (-)
  (f:ft) * gs@(g:gt) = f*g : map (f*) gt + ft*gs
  fromInteger = (: repeat 0) . fromInteger
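To see that the port really is straightforward, here is a sketch of the same instance transplanted onto standard Haskell's Num class (the abs and signum stubs are my addition; as noted above, they have no sensible meaning for series):

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- The Ring [a] instance above, ported to standard Haskell's Num class.
instance Num a => Num [a] where
  (+) = zipWith (+)
  (-) = zipWith (-)
  (f:ft) * gs@(g:gt) = f*g : (map (f*) gt + ft*gs)
  fromInteger = (: repeat 0) . fromInteger
  negate = map negate
  abs = error "abs is meaningless for power series"
  signum = error "signum is meaningless for power series"

-- The series x = 0 + 1*x + 0*x^2 + ...; the trailing 1 is promoted to
-- the constant-one series by fromInteger.
x :: [Integer]
x = 0 : 1

main :: IO ()
main = print (take 5 ((1 + x) ^ 2))  -- (1 + x)^2 = 1 + 2x + x^2
```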

How about division? Equational reasoning spits out the answer. Suppose:

f:ft = (q:qt) * gs@(g:gt)

From the definition of multiplication:

f:ft = q*g : map (q*) gt + qt * gs

Thus for nonzero g:

q = f/g
qt = (ft - map (q*) gt)/gs
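As a concrete check (my worked example, not from the paper), divide 1 by \(1 - x\). Here f:ft is \(1, 0, 0, \dots\) and g:gt is \(1, -1, 0, \dots\), so:

\[ q = \frac{1}{1} = 1, \qquad qt = \frac{(0,0,\dots) - 1\cdot(-1,0,\dots)}{(1,-1,0,\dots)} = \frac{(1,0,0,\dots)}{(1,-1,0,\dots)}, \]

which is the original division all over again: every coefficient of the quotient is 1, giving the geometric series \(1/(1-x) = 1 + x + x^2 + \cdots\).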

This is a serviceable definition, which appears in the paper, but McIlroy later refines it:

ft = map (q*) gt + qt * gs
   = map (q*) gt + (0 : qt*gt) + map (g*) qt
   = qs * gt + map (g*) qt

using qt*gs = map (g*) qt + (0 : qt*gt) in the second step, and qs*gt = map (q*) gt + (0 : qt*gt) in the third.

Hence:

qt = map (/g) (ft - qs * gt)

Adding a case for g = 0 yields the division routine:

instance (Eq a, Ring a, Field a) => Field [a] where
  (0:ft) / (0:gt) = ft/gt
  (f:ft) / (g:gt) = qs where qs = f/g : map ((1/g)*) (ft - qs*gt)

Differentiation and integration from 0 to \(x\) are easy, since \(D x^n = n x^{n-1}\).

diff (_:ft) = zipWith (*) ft [1..]
int fs = 0 : zipWith (/) fs [1..]
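A quick sanity check (my example): the derivative of the geometric series \(1/(1-x) = 1 + x + x^2 + \cdots\) is \(1 + 2x + 3x^2 + \cdots\), and int undoes diff on series with zero constant term:

```haskell
-- Differentiation and integration on coefficient lists, as defined above.
diff (_:ft) = zipWith (*) ft [1..]
int fs = 0 : zipWith (/) fs [1..]

main :: IO ()
main = do
  -- D (1 + x + x^2 + ...) = 1 + 2x + 3x^2 + ...
  print (take 5 (diff (repeat 1 :: [Double])))
  -- int . diff recovers a series whose constant term is zero
  print (take 5 (int (diff (0 : repeat 1 :: [Double]))))
```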

And then, magic.

Small is beautiful

I forget how I felt this far into the paper. I might have been unmoved: sure, some expressions may look nicer in Haskell, and maybe laziness makes some ideas easier to express, but is it really that different from other languages? However, my outlook had definitely changed by the time I’d read the next definitions.

We define sine and cosine as power series satisfying:

\[ \begin{aligned} D \sin x &= \cos x, & \sin 0 &= 0 \\ D \cos x &= -\sin x, & \cos 0 &= 1 \end{aligned} \]

Equivalently:

\[ \begin{aligned} \sin x &= \int_0^x \cos t \, dt \\ \cos x &= 1 - \int_0^x \sin t \, dt \end{aligned} \]

In Haskell:

sins = int coss
coss = 1 - int sins

take 10 sins
take 10 coss

Unlike with conjuring tricks, knowing the secret spoils nothing: I still experience a sense of wonder every time I see this. Jack Rusher echoes my sentiments in his talk, Stop Writing Dead Programs:

  • "the most beautiful piece of program that I’ve ever seen"

  • "the best source code in the world"

  • "I want to cry when I look at this because of how beautiful it is"

Sadly, he then claims it has "nothing to do with software engineering". He has a point: current compilers are unable to turn these two lines into efficient machine code. I myself use CORDIC to compute trigonometric functions.

But is this because of a fundamental limit? If humans can turn elegant equations into quick calculations, surely we can teach computers to do the same. So I say we should view McIlroy’s gems as shining beacons, drawing in computer scientists to work towards systems that routinely transform beautiful mathematics into fast code.

I also claim that software engineering includes rapid prototyping, for which our code is well-suited. We can already use the few lines we’ve written so far to compute trigonometric functions; eval produces the terms of a series evaluated at a point, which we then sum:

eval (f:ft) x = f : map (x*) (eval ft x)

sum $ take 16 $ eval sins (3.141592654/6)
sum $ take 16 $ eval coss (3.141592654/4)

Anyway, back to McIlroy. The exponential function can be defined similarly to the sine and cosine:

exps = 1 + int exps

take 10 exps
fromRational $ sum $ take 10 exps :: Double

The composition of \(f(z)\) and \(g(z)\) is \(f(g(z))\), which we can think of as evaluating a power series at another power series rather than a number. Our code can only compute the answer’s constant term when \(g(0) = 0\), leading to a definition that is similar to eval:

(f:ft) # gs@(0:gt) = f : gt*(ft#gs)
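For instance (my example): composing \(1/(1-z) = 1 + z + z^2 + \cdots\) with \(g(z) = z^2\) should give \(1/(1-z^2) = 1 + z^2 + z^4 + \cdots\). The block below restates series multiplication as a standard-Haskell Num instance so it runs standalone:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Series arithmetic, so this demo stands alone.
instance Num a => Num [a] where
  (+) = zipWith (+)
  (-) = zipWith (-)
  (f:ft) * gs@(g:gt) = f*g : (map (f*) gt + ft*gs)
  fromInteger = (: repeat 0) . fromInteger
  negate = map negate
  abs = undefined
  signum = undefined

-- Composition, valid when g's constant term is zero.
(f:ft) # gs@(0:gt) = f : gt*(ft#gs)

main :: IO ()
-- 1/(1-z) composed with z^2 gives 1/(1-z^2) = 1 + z^2 + z^4 + ...
main = print (take 7 ((repeat 1 :: [Integer]) # (0:0:1)))
```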

The reversion or inverse of \(f(z)\) is the power series \(r(z)\) satisfying \(f(r(z)) = z\). Much has been published on this operation; Knuth dedicates 4 pages to it near the end of The Art of Computer Programming, Volume 2.

In contrast, McIlroy applies a dash of equational reasoning. From above, we must have \(r(0) = 0\), thus:

(f:ft) # rs@(0:rt) = f : rt*(ft#rs) = 0 : 1

Hence f = 0 and rt = 1 / (ft#rs):

revert (0:ft) = rs where rs = 0 : 1/(ft#rs)
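A test drive (my example): reverting \(f(z) = z + z^2\) should produce \(r(z) = z - z^2 + 2z^3 - 5z^4 + \cdots\), since solving \(z = r + r^2\) order by order yields Catalan numbers with alternating signs. The block restates the needed definitions (on Num and Fractional, for standard Haskell) so it stands alone:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Self-contained: series arithmetic, division, composition, reversion.
instance Num a => Num [a] where
  (+) = zipWith (+)
  (-) = zipWith (-)
  (f:ft) * gs@(g:gt) = f*g : (map (f*) gt + ft*gs)
  fromInteger = (: repeat 0) . fromInteger
  negate = map negate
  abs = undefined
  signum = undefined

instance (Eq a, Fractional a) => Fractional [a] where
  (0:ft) / (0:gt) = ft/gt
  (f:ft) / (g:gt) = qs where qs = f/g : map ((1/g)*) (ft - qs*gt)
  fromRational = (: repeat 0) . fromRational

-- Composition, valid when g's constant term is zero.
(f:ft) # gs@(0:gt) = f : gt*(ft#gs)

revert (0:ft) = rs where rs = 0 : 1/(ft#rs)

main :: IO ()
main = print (take 5 (revert (0:1:1 :: [Rational])))
```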

Done! Let’s take it for a spin on:

\[ \tan^{-1} x = \int_0^x \frac{dt}{1 + t^2} \]

We revert to get the tangent power series:

tans = revert $ int $ 1/(1:0:1)
take 10 tans

We confirm \(\tan(x) = \sin(x) / \cos(x)\) for several terms:

take 10 $ tans - sins / coss

Next up are square roots. Suppose \((q(x))^2 = f(x)\). By the chain rule:

\[ Dq = \frac{Df}{2q} \]

If \(f(0) = 1\) then:

\[ q(x) = 1 + \int_0^x \frac{Df(t)\,dt}{2q(t)} \]

We generalize slightly to obtain a square root routine that works for power series whose lowest-order nonzero term has even degree and coefficient 1:

sqrt1 = \case
  0:0:ft -> 0 : sqrt1 ft
  fs@(1:ft) -> qs where qs = 1 + int((diff fs)/(2*qs))

Now we can test another identity:

take 10 $ sins - sqrt1(1 - coss^2)

We can do much more with McIlroy’s code, such as solve difficult combinatorics problems.
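For example (my illustration): the generating function \(C(z)\) of the Catalan numbers satisfies \(C = 1 + zC^2\), so its constant term is 1 and the rest of its coefficients are exactly those of \(C^2\), which a lazy list can state verbatim:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Series arithmetic as before, so the block stands alone.
instance Num a => Num [a] where
  (+) = zipWith (+)
  (-) = zipWith (-)
  (f:ft) * gs@(g:gt) = f*g : (map (f*) gt + ft*gs)
  fromInteger = (: repeat 0) . fromInteger
  negate = map negate
  abs = undefined
  signum = undefined

-- C = 1 + z*C^2: constant term 1, then the coefficients of C^2.
cats :: [Integer]
cats = 1 : cats * cats

main :: IO ()
main = print (take 8 cats)  -- [1,1,2,5,14,42,132,429]
```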


Ben Lynn blynn@cs.stanford.edu 💡