= Get a Life = ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Zoom: Steps: 2^

Presets:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ When I was a kid, I wrote a BASIC program to explore https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life[Conway's Game of Life] on my 8088. It was painfully slow, so I began an assembly rewrite. But I failed at Life: I abandoned the project in frustration because I kept freezing up my system. It was just as well. I believe the only optimizations I had in mind were bit twiddling and focusing on or around live cells. So I was astounded when I later learned of https://en.wikipedia.org/wiki/Hashlife[Bill Gosper's Hashlife algorithm], which miraculously simulates an exponential number of generations in linear time. All the bit twiddling in the world couldn't come close. I eagerly studied any explanations I could find. Unfortunately, although Hashlife's ingredients are elementary in isolation (quadtrees; memoization; interning), some aspects of the mixture confused me. I hoped one day to figure it all out and run Life in the fast lane. I have at last fulfilled this lifelong ambition. Perhaps I acted out of anger, as now and then I'll see a post about coding the Game of Life, only to discover the naive approach lurking behind it. Seriously?! Who cares about the elegant source or whatever if the algorithm is mediocre? [I exaggerate; in truth, I enjoy seeing https://dfns.dyalog.com/c_life.htm[the Game of Life in one line of APL], with comonads, in the Game of Life itself, and so on.] At some point it's easier to quit complaining and fight back. Here, then, is yet another write-up on programming the Game of Life, but one of the minority featuring Hashlife. Batteries included: we walk through link:life.lhs[the entire source for the demo above]. \begin{code} {-# LANGUAGE CPP #-} #ifdef __HASTE__ {-# LANGUAGE PackageImports #-} #endif {-# LANGUAGE LambdaCase, TupleSections, RecordWildCards #-} {-# LANGUAGE FlexibleContexts #-} #ifdef __HASTE__ import "mtl" Control.Monad.State.Strict import Haste.DOM import Haste.Events import Haste.Graphics.Canvas import Data.IORef import Text.Read (readMaybe) #else import Control.Monad.State.Strict #endif import Data.List (find) import Data.Map (Map, (!)) import qualified Data.Map as M \end{code} == A matter of life and death == Life takes places on an infinite grid of cells, each of which is dead or alive. A cell's neighbours are the eight cells surrounding it (https://en.wikipedia.org/wiki/Moore_neighborhood[Moore neighbourhood]). Then: 1. A dead cell with exactly 3 live neighbours comes to life. 2. A live cell with exactly 2 or 3 live neighbours remains alive. 3. Otherwise the cell will be dead. We represent a dead cell with 0 and a live cell with 1. The following function summarizes the rules: \begin{code} nextLife :: Int -> Int -> Int nextLife 0 3 = 1 nextLife 1 2 = 1 nextLife 1 3 = 1 nextLife _ _ = 0 \end{code} Curiously, this is the same as saying a cell is alive exactly when the sum or product of these two numbers is 3. == Quad workout == https://en.wikipedia.org/wiki/Quadtree[Quadtrees] are fundamental to Hashlife. In our rendition, at the lowest level, we represent a single cell with a bit. At the level above, we organize the cells in 2x2 blocks. We represent the 4 bits of a block with an Int. The following functions convert between an Int and the 4 cells in a 2x2 block: \begin{code} data ZNode = ZNode Int Int Int Int deriving (Show, Eq, Ord) pack4 :: Int -> Int -> Int -> Int -> Int pack4 a b c d = a + 2*b + 4*c + 8*d unpack4 :: Int -> ZNode unpack4 n = ZNode (f 1) (f 2) (f 4) (f 8) where f k = div n k `mod` 2 \end{code} We record the 4 cells of a 2x2 block in https://en.wikipedia.org/wiki/Z-order_curve[Z-order]. If we imagine a compass, then the order is NW, NE, SW, SE, the same order a pen travels when we write the letter Z. We use screen coordinates. \begin{code} zorder :: [(Int, Int)] zorder = [(0,0), (1,0), (0,1), (1,1)] \end{code} At the next level above, we organize the 2x2 blocks themselves in 2x2 superblocks. These are represented as a `ZNode`; each number represents a 2x2 block of cells in Z-order. Consider a 4x4 block of labelled cells, where each cell's state is represented with a bit: ------------------------------------------------------------------------ a0 a1 b0 b1 a2 a3 b2 b3 c0 c1 d0 d1 c2 c3 d2 d3 ------------------------------------------------------------------------ We represent the above with `ZNode a b c d` where `a` is the Int with binary representation `a3 a2 a1 a0`, and similarly for the other fields. Given a 4x4 block of cells, the following returns the 2x2 central block of cells of the next generation. The rules imply no cell outside the 4x4 grid may affect this calculation. They also imply this 2x2 central block is the only part of the next generation we may reliably compute. \begin{code} base :: Int -> Int -> Int -> Int -> Int base a b c d = pack4 nw ne sw se where ZNode a0 a1 a2 a3 = unpack4 a ZNode b0 b1 b2 b3 = unpack4 b ZNode c0 c1 c2 c3 = unpack4 c ZNode d0 d1 d2 d3 = unpack4 d nw = nextLife a3 $ sum [ a0, a1, b0 , a2, b2 , c0, c1, d0 ] ne = nextLife b2 $ sum [ a1, b0, b1 , a3, b3 , c1, d0, d1 ] sw = nextLife c1 $ sum [ a2, a3, b2 , c0, d0 , c2, c3, d2 ] se = nextLife d0 $ sum [ a3, b2, b3 , c1, d1 , c3, d2, d3 ] \end{code} == The building blocks of Life == We label each a ZNode with an Int. Then we can represent an 8x8 block of cells as a ZNode, whose fields are now labels of 4x4 cells. We can go further: by induction, for any n, we can represent any 2^n^ x 2^n^ block of cells with a ZNode, which we interpret as a 2x2 block of 2^n-1^ x 2^n-1^ cells. We sometimes call this n the _level_. It is not stored in the ZNode itself, so functions taking ZNodes often need another argument that contains the level. We only store one copy of a ZNode (think https://en.wikipedia.org/wiki/Hash_consing[hash consing]; https://en.wikipedia.org/wiki/String_interning[string interning]; maximal sharing; the flyweight pattern). Life patterns are often repetitive, so a handful of ZNodes sometimes suffice for representing many cells. This is how Hashlife got its name, though we shall use an ordered tree instead of hashing to map ZNodes to Ints. We maintain `Data.Map` of ZNodes to Ints. On encountering a ZNode, we first check this map: if already present, then we reuse its corresponding Int. Otherwise we assign it an unused Int and add it to the map. The Ints start from 16 to avoid colliding with numbers that represent 2x2 blocks of cells. We also maintain an inverse map from an Int to the ZNode it represents. In addition, each ZNode is paired with an Int that represents its future center; see below. Empty cells are a special case. The Int 0 represents empty squares of any level. This comes in handy when we want to embed a given pattern in a large empty space. \begin{code} data Mem = Mem (Map Int (ZNode, Int)) (Map ZNode Int) deriving Show initMem :: Mem initMem = Mem mempty mempty intern :: ZNode -> Int -> State Mem Int intern z future = do Mem m idxs <- get let next = M.size idxs + 16 put $ Mem (M.insert next (z, future) m) $ M.insert z next idxs pure next recall :: Int -> State Mem (ZNode, Int) recall 0 = pure (ZNode 0 0 0 0, 0) recall k = (\(Mem m _) -> m!k) <$> get \end{code} == Double Life == Suppose we are given 8x8 cells. As this happens to be the size of a chessboard, let's label the cells the same way a chess player would: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-chessboard.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Our `base` function takes 4x4 cells and returns the central 2x2 cells one generation into the future. On the top-left quadrant: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-a8d5.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ it computes the following 2x2 cells, where the prime symbol indicates one generation into the future: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-b7c6p.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Similarly, on the 4x4 cells: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-c8f5.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ the `base` function returns: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-d7e6p.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Thus pasting together the result of 9 calls to `base` yields: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-b7g2p.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Similarly, pasting together the result of 4 more calls to `base` yields: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ include::life-c6f3pp.svg[] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ where a double prime indicates 2 generations into the future. To recap, given 8x8 cells, repeated calls to `base` reveals the central 4x4 cells 2 generations into the future. We can scale this up for higher powers of 2. For example, given 256x256 cells, a bunch of recursions gives us the central 128x128 cells exactly 64 generations into the future. We have the perfect conditions for https://en.wikipedia.org/wiki/Memoization[memoization]. Thus we record the future central cells the first time we compute them, and look them up every time thereafter. This is why we associate each ZNode with an Int; the latter references the future central cells. We split off a function `reduce3x3` that applies a function `f` to each 2x2 box in a 3x3 grid, then applies `g` to their results. \begin{code} gosper :: Int -> Int -> Int -> Int -> Int -> State Mem Int gosper 0 a b c d = pure $ base a b c d gosper n a b c d = do (ZNode _ a1 a2 a3, x0) <- recall a (ZNode b0 _ b2 b3, x2) <- recall b (ZNode c0 c1 _ c3, x6) <- recall c (ZNode d0 d1 d2 _, x8) <- recall d x1 <- rec a1 b0 a3 b2 x3 <- rec a2 a3 c0 c1 x4 <- rec a3 b2 c1 d0 x5 <- rec b2 b3 d0 d1 x7 <- rec c1 d0 c3 d2 reduce3x3 rec (memo n) x0 x1 x2 x3 x4 x5 x6 x7 x8 where rec a b c d = memo n a b c d >>= fmap snd . recall reduce3x3 f g x0 x1 x2 x3 x4 x5 x6 x7 x8 = do y0 <- f x0 x1 x3 x4 y1 <- f x1 x2 x4 x5 y2 <- f x3 x4 x6 x7 y3 <- f x4 x5 x7 x8 g y0 y1 y2 y3 memo :: Int -> Int -> Int -> Int -> Int -> State Mem Int memo _ 0 0 0 0 = pure 0 memo 0 a b c d = pure $ pack4 a b c d memo n a b c d = seek >>= maybe create pure where z = ZNode a b c d seek = (\(Mem _ idxs) -> M.lookup z idxs) <$> get create = intern z =<< gosper (n - 1) a b c d \end{code} == Such is Life == We're done! We've presented a complete implementation of Hashlife. However, we need a bit more code to show it off. We define a bookkeeping data type to hold the current pattern's size (it measures 2^lifeSize+1^ x 2^lifeSize+1^ cells), the coordinates of the top-left corner, and its ZNode index. \begin{code} data Life = Life { lifeSize :: Int , lifeOrigin :: (Int, Int) , lifeIndex :: Int , lifeMemory :: Mem } deriving Show \end{code} A straightforward recursion turns a list of points into a quadtree: \begin{code} enliven :: [(Int, Int)] -> Life enliven [] = Life 0 (0, 0) 0 initMem enliven ps = uncurry (Life sz (xmin, ymin)) $ runState (enc sz (xmin, ymin)) initMem where (xs, ys) = unzip ps xmin = minimum xs ymin = minimum ys xmax = maximum xs ymax = maximum ys loggish n = max 0 $ head (filter (\k -> 2^k >= n) [0..]) - 1 sz = loggish $ max (ymax - ymin) (xmax - xmin) + 1 enc _ (ox, oy) | ox > xmax || oy > ymax = pure 0 enc n (ox, oy) = mapM go zorder >>= (\[a,b,c,d] -> memo n a b c d) where go (dx, dy) | n == 0 = pure $ fromEnum $ elem (ox + dx, oy + dy) ps | otherwise = enc (n - 1) (ox + dx*p, oy + dy*p) p = 2^n \end{code} We'd like an inverse to this function, but we first deal with an annoyance. For quadtree nodes at the lowest level, it's irritating that we break a number into 4 bits rather than look it up in a map. We add a function to hide this: \begin{code} visit :: Int -> State Mem ZNode visit k | k < 16 = pure $ unpack4 k | otherwise = fst <$> recall k \end{code} Listing the coordinates of the live cells of a given pattern is now another straightforward recursion: \begin{code} points :: Life -> [(Int, Int)] points Life{..} = evalState (coords lifeSize lifeOrigin lifeIndex) lifeMemory where coords :: Int -> (Int, Int) -> Int -> State Mem [(Int, Int)] coords _ _ 0 = pure [] coords n (x, y) k = do ZNode a b c d <- visit k concat <$> zipWithM go [a,b,c,d] zorder where go p (dx, dy) | n == 0 = pure $ if p == 0 then [] else [(x+dx, y+dy)] | otherwise = coords (n - 1) (x + e*dx, y + e*dy) p e = 2^n \end{code} We perform a quick sanity check with GHCi. We define a pattern: \begin{code} rPentomino :: [(Int, Int)] rPentomino = [(1,0), (2,0), (0,1), (1,1), (1,2)] \end{code} Then confirm: ------------------------------------------------------------------------ points $ enliven rPentomino ------------------------------------------------------------------------ returns the same points. Let's predict the future! The following function takes a 2^n+1^ x 2^n+1^ pattern and returns its 2^n^ x 2^n^ center, 2^n-1^ generations into the future. We just pluck out the memoized future center index and repackage it: \begin{code} scry :: Life -> Life scry Life{..} = Life n (ox + p, oy + p) x lifeMemory where (ox, oy) = lifeOrigin n = lifeSize - 1 p = 2^n x = snd $ evalState (recall lifeIndex) lifeMemory \end{code} We find: ------------------------------------------------------------------------ points $ scry $ enliven rPentomino ------------------------------------------------------------------------ returns the single point `[(1,2)]`. The r-pentomino fits in a 4x4 square, so `scry` returns the middle 2x2 square 1 generation into the future. The result is correct, but underwhelming. To travel further through time and space, we repeatedly pad our pattern with empty squares before calling `scry`. Here, we find it handy that 0 represents an empty square of any size. For the sake of code reuse, to double the dimensions of a pattern, we surround it with empty squares to make a 3x3 grid, then reduce with `middle`, a function which takes a 4x4 grid and discards all but the central 2x2 grid. \begin{code} pad :: Life -> Life pad Life{..} = Life { lifeSize = n , lifeOrigin = (ox - p, oy - p) , lifeIndex = i' , lifeMemory = st } where (ox, oy) = lifeOrigin p = 2^lifeSize n = lifeSize + 1 i = lifeIndex (i', st) = runState (reduce3x3 (middle n) (memo n) 0 0 0 0 i 0 0 0 0) lifeMemory middle :: Int -> Int -> Int -> Int -> Int -> State Mem Int middle n a b c d = do ZNode _ _ _ a3 <- visit a ZNode _ _ b2 _ <- visit b ZNode _ c1 _ _ <- visit c ZNode d0 _ _ _ <- visit d memo (n - 1) a3 b2 c1 d0 \end{code} Rather than return a list of points, let's draw the central 64x64 cells: \begin{code} #ifndef __HASTE__ main :: IO () main = putStr $ unlines $ [[if elem (c, r) ps then '#' else ' ' | c <- [-32..32]] | r <- [-32..32]] where ps = points $ scry $ iterate pad (enliven rPentomino) !! 10 #endif \end{code} What should we see? Well, we started with a 4x4 pattern, then padded it 10 times, yielding a 2^12^ x 2^12^ pattern. Hashlife should return the central 2^11^ x 2^11^ cells 2^10^ = 1024 generations into the future, and we display the central 64x64 cells: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ------------------------------------------------------------------------ ## # # # # # # # # ## ## ## # # # # # ### ### # # # # # # # # # # ## # # ### ## # # # ## # ## # ## ## # ## #### # # ## # # # # # # # # # # # ## # ## # ## ## # # # # ### ## ------------------------------------------------------------------------ ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ This took under a quarter of a second on my laptop. Evidently we take less than 250 microseconds per generation. If we bump up the 10 to 20, we find the running time hardly changes, so we're somehow beating 250 nanoseconds a generation. Bump it up some more, and it seems we compute each generation in less than a CPU clock cycle! Of course, in reality, the pattern has stabilized and we're just recycling memoized answers. == Baby steps == Hashlife is great for exploring the distant future, but what if we want to take it slow and watch the pattern change one generation at a time? No problem. Instead of traveling some power of 2 generations into the future in each reduction step, we simply travel 0 generations, except for the base case which remains the same as before, namely, it computes one generation ahead. More generally, to step forward 2^k^ generations at a time for any k, we travel 0 generations as we reduce until we reach the desired level, then switch to the original Hashlife algorithm. We start with a more general version of `gosper` that is like the big brother of `reduce3x3`: a function that slides a 2x2 window over a 4x4 grid and applies a function `f` to each, followed by applying `g` to the 3x3 results. We could rewrite `gosper` to use this function, but that would result in more lookups, and we prefer to have the code above to stand on its own. \begin{code} reduce4x4 f g a b c d = do ZNode a0 a1 a2 a3 <- visit a ZNode b0 b1 b2 b3 <- visit b ZNode c0 c1 c2 c3 <- visit c ZNode d0 d1 d2 d3 <- visit d x0 <- f a0 a1 a2 a3 x1 <- f a1 b0 a3 b2 x2 <- f b0 b1 b2 b3 x3 <- f a2 a3 c0 c1 x4 <- f a3 b2 c1 d0 x5 <- f b2 b3 d0 d1 x6 <- f c0 c1 c2 c3 x7 <- f c1 d0 c3 d2 x8 <- f d0 d1 d2 d3 g x0 x1 x2 x3 x4 x5 x6 x7 x8 \end{code} We begin by recursively reducing with `middle`, as this function extracts the center cells with no time travel. Upon reaching the level `k` we look up the future center. \begin{code} baby :: Int -> Life -> Life baby k Life{..} = Life { lifeSize = sz , lifeOrigin = (ox + p, oy + p) , lifeIndex = i' , lifeMemory = st } where (ox, oy) = lifeOrigin sz = lifeSize - 1 p = 2^sz go _ 0 0 0 0 = pure 0 go n a b c d | n <= k = memo (n + 1) a b c d >>= fmap snd . recall | otherwise = reduce4x4 (middle n) (reduce3x3 (go (n - 1)) (memo n)) a b c d (i', st) = runState (visit lifeIndex >>= \(ZNode a b c d) -> go sz a b c d) lifeMemory \end{code} There may be live cells on the edge of a pattern, so before we step, we pad it with enough empty cells to guarantee no live cells are lost after reduction. At the least, this requires quadrupling both dimensions. Afterwards, the extra room may prove unnecessary. In an interactive setting, we repeatedly display the pattern after advancing some number of generations, so to avoid runaway padding, we shrink the pattern when possible. We look for a 2x2 box in a 4x4 grid such that all cells outside the box are dead. If one exists, then we discard all but the box and recurse. \begin{code} shrink :: Life -> Life shrink Life{..} = uncurry ($) $ runState (go lifeSize lifeOrigin lifeIndex) lifeMemory where f a b c d = pure $ ZNode a b c d zsum (ZNode a b c d) = a + b + c + d go 0 d k = pure $ Life 0 d k go n (dx, dy) k = do ZNode a b c d <- visit k reduce4x4 f g a b c d where g x0 x1 x2 x3 x4 x5 x6 x7 x8 = let tot = sum $ zsum <$> [x0, x2, x6, x8] xs = [x0,x1,x2,x3,x4,x5,x6,x7,x8] xds = zip xs [0..] in case find ((tot ==) . zsum . fst) xds of Just (ZNode a b c d, i) -> let (y, x) = divMod i 3 in go (n-1) (dx + x*2^(n-1), dy + y*2^(n-1)) =<< memo (n-1) a b c d Nothing -> pure $ Life n (dx, dy) k run :: Int -> Life -> Life run k lf@Life{..} = shrink $ baby k $ iterate pad lf !! n where n = max 2 $ k + 1 - lifeSize \end{code} Our display is limited to a viewport, so we refine our `points` function to prune cells outside a given rectangle. We also use difference lists to avoid costly concatenation. \begin{code} -- | Assumes x0 y0 even, x1 y1 odd, x0 < x1, y0 < y1. crop :: (Int, Int) -> (Int, Int) -> Life -> [(Int, Int)] crop (x0, y0) (x1, y1) Life{..} = evalState (go lifeSize lifeOrigin lifeIndex) lifeMemory [] where go _ _ 0 = pure id go n (x, y) k | x > x1 || y > y1 || x + 2*e <= x0 || y + 2*e <= y0 = pure id | otherwise = do ZNode a b c d <- visit k foldr (.) id <$> zipWithM f [a,b,c,d] zorder where f p (dx, dy) | n == 0 = pure $ if p == 0 then id else ((x+dx, y+dy):) | otherwise = go (n - 1) (x + e*dx, y + e*dy) p e = 2^n \end{code} Lastly, we throw together a makeshift UI for the interactive demo at the top of this page: \begin{code} #ifdef __HASTE__ main :: IO () main = withElems ["canvas", "goB", "level", "customB", "customT", "zoomDown", "zoomUp"] $ \[canvasE, goB, lvl, customB, customT, zoomUp, zoomDown] -> do Just canvas <- fromElem canvasE lf <- newIORef $ enliven rPentomino zoomRef <- newIORef 4 let snapshot = do zoom <- readIORef zoomRef let w = div 320 (zoom * 2) h = div 240 (zoom * 2) cell (x', y') = renderOnTop canvas $ fill $ rect (x, y) (x + z, y + z) where x = fromIntegral $ zoom*(x' + w) y = fromIntegral $ zoom*(y' + h) z = fromIntegral zoom render canvas $ setFillColor (RGB 0 0 0) mapM_ cell . crop (-w,-h) (w-1,h-1) =<< readIORef lf setup but pat = do Just b <- elemById but void $ b `onEvent` Click $ const $ do writeIORef lf $ enliven pat snapshot snapshot setup "acorn" [(1,0),(3,1),(0,2),(1,2),(4,2),(5,2),(6,2)] setup "rpent" rPentomino setup "pihept" [(0,0),(1,0),(2,0),(0,1),(0,2),(2,1),(2,2)] setup "rabbits" [(0,0),(4,0),(5,0),(6,0),(0,1),(1,1),(2,1),(5,1),(1,2)] void $ customB `onEvent` Click $ const $ do ps <- maybe [] id . readMaybe <$> getProp customT "value" writeIORef lf $ enliven ps snapshot void $ goB `onEvent` Click $ const $ do n <- maybe 0 id . readMaybe <$> getProp lvl "value" setProp lvl "value" $ show n modifyIORef lf $ run n snapshot void $ zoomUp `onEvent` Click $ const $ do modifyIORef zoomRef $ max 1 . (`div` 2) snapshot void $ zoomDown `onEvent` Click $ const $ do modifyIORef zoomRef $ min 16 . (*2) snapshot #endif \end{code} == A better Life == We could replace our persistent tree-based maps with mutable hash maps in the ST monad. I avoided this to cause less trouble when compiling with Haste, which I use to build the JavaScript version. It may be cleaner to split off the cached future center into another map. And for faster baby steps, perhaps we could memoize multiple center blocks at different points in the future and only compute futures lazily. It turns out we can remove the `pack4` and `unpack4` mess (and use our generic ZNode scheme to hold 4 bits) at the cost of some performance; about 10% on the r-pentomino. The code would be marginally cleaner, but the base case is still special, and our write-up would need to mention interning earlier. On the other hand, to speed up our program, we could double down on bit twiddling and make 8x8 the lowest level. We could place entries of a 4x4 or 3x3 matrix in an array instead of making them function arguments and writing out everything explicitly. See https://www.ioccc.org/2019/dogon/apholife.c[Gil Dogon's annotated C implementation of Hashlife].