Euclid's Algorithm vs. Euclid's Geometry
Euclid alone has looked on Beauty bare. On the other hand, Euclid’s algorithm alone shrouds the beauty of Euclidean geometry in a fog of algebra, yet is surprisingly effective at providing answers:
Divide and conquer
Euclid’s algorithm is powered by a division that finds a remainder smaller than the divisor. Traditionally, we start from two natural numbers \(s\) and \(p \ne 0\), and find \(q, r\) such that:

\[ s = q p + r \]

with \(r < p\). We iterate until we reach the smallest possible remainder, namely zero, and in doing so we find the answer we seek, which in this case is the greatest common divisor of \(s\) and \(p\).
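The natural-number version fits in a few lines. This is a standalone sketch with a name of my own (`myGcd`), separate from the program below:

```haskell
-- Standalone sketch of Euclid's algorithm on natural numbers: keep
-- replacing (s, p) by (p, s `mod` p); the remainder strictly shrinks,
-- so we eventually hit zero and the last nonzero value is the gcd.
myGcd :: Integer -> Integer -> Integer
myGcd s 0 = s
myGcd s p = myGcd p (s `mod` p)

main :: IO ()
main = print (myGcd 270 192)  -- 6
```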
Euclid’s algorithm also works for polynomials if the coefficients are rational, or from some other field. In this setting, we measure size by looking at the degree of a polynomial, and once again the remainder polynomial is guaranteed to decrease until we reach zero. Namely, given the dividend and divisor polynomials \(s(x)\) and \(p(x) \ne 0\), we find quotient and remainder polynomials \(q(x), r(x)\) such that:

\[ s(x) = q(x)\, p(x) + r(x) \]
with \(\deg r(x) < \deg p(x)\). (Here, we consider the degree of 0 to be negative.)
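To see the shrinking degrees concretely, here is a standalone sketch (separate from the program below, with names of my own) of long division over the rationals, representing polynomials as coefficient lists with the highest degree first:

```haskell
-- Standalone sketch: univariate polynomials over the rationals as
-- coefficient lists, highest degree first; [] is the zero polynomial.
type P = [Rational]

deg :: P -> Int
deg p = length p - 1  -- the degree of the zero polynomial is negative

-- One step of long division: cancel the leading term of s against p.
step :: P -> P -> P
step s p = dropWhile (== 0) $ tail $
  zipWith (-) s (map (c *) p ++ replicate (deg s - deg p) 0)
  where c = head s / head p

-- The remainder of s divided by p: iterate until deg r < deg p.
pMod :: P -> P -> P
pMod s p
  | deg s < deg p = s
  | otherwise = pMod (step s p) p

main :: IO ()
main = print (pMod [1, 0, 1] [2, 0])  -- x^2 + 1 = (x/2)(2x) + 1
```

The division step relies on the field structure: `head s / head p` always exists because the divisor's leading coefficient is nonzero.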
But what if the coefficients are integers, or themselves polynomials in other variables? Then the first step of polynomial long division only works if the leading coefficient of the dividend is an exact multiple of the leading coefficient of the divisor. Even then, we run into the same issue in the next step of the division, and again require compatible leading coefficients.
This suggests loosening the rules. We permit multiplying the dividend by some constant, that is, we allow the choice of some constant factor \(a\) in the following pseudodivision:

\[ a\, s(x) = q(x)\, p(x) + r(x) \]
As before \(p \ne 0\) and we want \(\deg r < \deg p\).
If we choose \(a\) to be a sufficiently high power of the leading coefficient of the divisor \(p(x)\), then pseudodivision succeeds, though with finesse we can sometimes scale up less.
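Here is a standalone sketch of pseudodivision with `Integer` coefficients, again as coefficient lists with the highest degree first. The names are my own; unlike the `euclid` function below, it applies no finesse and scales by the divisor's leading coefficient on every step:

```haskell
-- Standalone sketch of pseudodivision with Integer coefficients,
-- highest degree first. Returns (k, r) with a^k * s = q * p + r for
-- some quotient q, where a is the leading coefficient of p.
pseudoRem :: [Integer] -> [Integer] -> (Int, [Integer])
pseudoRem s p = go 0 s
  where
    a = head p
    go k t
      | length t < length p = (k, t)  -- deg t < deg p, so we are done
      | otherwise = go (k + 1) $ dropWhile (== 0) $ tail $
          zipWith (-) (map (a *) t)
                      (map (head t *) (p ++ replicate (length t - length p) 0))

main :: IO ()
main = print (pseudoRem [1, 0, 0] [2, 1])  -- 4 x^2 = (2x - 1)(2x + 1) + 1
```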
Wu’s method uses the Euclidean algorithm with pseudodivision to solve Euclidean geometry problems. See chapter 5 of John Harrison, Handbook of Practical Logic and Automated Reasoning.
{-# LANGUAGE CPP #-}
import Data.Function (on)
import Data.Either (partitionEithers)
import Data.List (minimumBy, intersperse, nub, sortOn)
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe (catMaybes)
#ifdef __HASTE__
import Text.Parsec
import Control.Monad (void)
import Haste.DOM
import Haste.Events
import Haste.Foreign (ffi)
import Haste.JSString (pack)
lookupMax m = if M.null m then Nothing else Just $ M.findMax m
type Parser = Parsec String ()
digitChar = digit; letterChar = letter
anySingleBut = noneOf . pure; some = many1
#else
import Text.Megaparsec
import Text.Megaparsec.Char
lookupMax = M.lookupMax
type Parser = Parsec () String
#endif
Multivariate Madness
We represent a monomial (product of variables) with a map of variables to their multiplicities:
type Var = String
type Mo = Map Var Int
We represent a polynomial with a map of monomials to their coefficients:
newtype Poly = Poly (Map Mo Integer) deriving Eq
In some applications, the monomials should be ordered by (combined) degree. Fortunately, we won’t need this.
Unfortunately, we do need the ability to view a multivariate polynomial as a univariate polynomial in some given variable x. We represent a univariate polynomial with a map from the degree of a nonzero term to its coefficient, which is a multivariate polynomial from which x should be absent.
type Univ = Map Int Poly
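To make the encoding concrete, here is a standalone example (using Data.Map directly rather than the Poly newtype) of the polynomial \(3x^2y - 5\):

```haskell
import Data.Map (Map)
import qualified Data.Map as M

type Var = String
type Mo = Map Var Int

-- 3 x^2 y - 5 in the nested-map encoding: a monomial maps variables to
-- multiplicities, and a polynomial maps monomials to coefficients.
sample :: Map Mo Integer
sample = M.fromList
  [ (M.fromList [("x", 2), ("y", 1)], 3)  -- the term 3 x^2 y
  , (M.empty, -5)                         -- the constant term -5
  ]

main :: IO ()
main = print (M.lookup M.empty sample)  -- Just (-5)
```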
We now have three varieties of related maps bouncing around our code. It might be best to hide the maps behind data types and use a Ring typeclass so we can reuse various mathematical operators. However, this requires more declarations than I’d prefer for a toy demo. I settled on only hiding the map for the multivariate polynomials. I found this clear enough.
We give the multivariate polynomials pretty binary operators. In contrast, operations on monomials are mostly inlined.
noZ :: (Num n, Eq n) => Map k n -> Map k n
noZ = M.filter (/= (fromIntegral 0))
pZero = Poly mempty
pOne = Poly $ M.singleton mempty 1
pIntegral n = Poly $ M.singleton mempty $ fromIntegral n
pVar v = Poly $ M.singleton (M.singleton v 1) 1
pNull (Poly p) = M.null p
pNegate (Poly p) = Poly $ M.map negate p
(Poly p) .+ (Poly q) = Poly $ noZ $ M.unionWith (+) p q
p .- q = p .+ pNegate q
(Poly p) .* (Poly q) = Poly $ noZ $ M.fromListWith (+)
[(k1 `moMul` k2, v1 * v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]
where moMul xs ys = noZ $ M.unionWith (+) xs ys
pSetZero v (Poly p) = Poly $ M.filterWithKey (\xs _ -> M.notMember v xs) p
Operations on univariate polynomials are similar. One difference arises in multiplication: instead of multiplying monomials, we add degrees.
uCanon = M.filter $ \(Poly p) -> not $ M.null p
uConst a = uCanon $ M.singleton 0 a
uAdd p q = uCanon $ M.unionWith (.+) p q
uNegate p = M.map pNegate p
uSub p q = uAdd p $ uNegate q
uMul p q = uCanon $ M.fromListWith (.+)
[(k1 + k2, v1 .* v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]
Now for pseudodivision on polynomials, the star of the show. Let s and p be univariate polynomials. Let a be the leading coefficient of the divisor. Then our next function returns k and r such that

\[ a^k s = q\, p + r \]
for some quotient q.
Each step of the long division involves \(x^{m-n}\) times the divisor times an appropriate constant, where \(m\) is the degree of the dividend and \(n\) is the degree of the divisor.
We optimize slightly: if the leading coefficients happen to match then we can skip one step of scaling up the dividend by a.
euclid :: Univ -> Univ -> (Int, Univ)
euclid s p = go 0 s where
(n, a) = M.findMax p
go k s
 | M.null s = (k, s)
 | m < n = (k, s)
 | a == b = go k $ s `uSub` p'
 | otherwise = go (k + 1) $ (uConst a `uMul` s) `uSub` (uConst b `uMul` p')
where
(m, b) = M.findMax s
p' = M.mapKeysMonotonic (+ (m - n)) p
The respect function views a multivariate polynomial as a univariate polynomial with respect to a given variable. I only used its inverse disrespect during testing.
respect :: Var -> Poly -> Univ
respect v (Poly p) = M.map Poly $ M.fromListWith M.union $ go <$> M.toList p
where
go (xs, coeff) = case M.lookup v xs of
Nothing -> (0, M.singleton xs coeff)
Just k -> (k, M.singleton (M.delete v xs) coeff)
disrespect :: Var -> Univ -> Poly
disrespect v u = Poly $ M.fromList $
[(if k > 0 then M.insert v k xs else xs, r)
 | (k, Poly p) <- M.toList u, (xs, r) <- M.toList p]
Triangles in Algebra
Cartesian coordinates transform questions of geometry into questions of algebra. Consider the theorem: if M is the midpoint of AC, and BM is perpendicular to AC, then AB = BC. We translate it as follows.
If:

\(A_x + C_x - 2 M_x = 0\)

\(A_y + C_y - 2 M_y = 0\)

\((A_x - C_x) (M_x - B_x) + (A_y - C_y) (M_y - B_y) = 0\)

Then:

\((A_x - B_x)^2 + (A_y - B_y)^2 - (B_x - C_x)^2 - (B_y - C_y)^2 = 0\)
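Before unleashing any algebra, we can sanity-check the translation at concrete coordinates. This standalone snippet (variable names are mine) places A and C on the x-axis with M their midpoint and B directly above M, and confirms that every premise and the conclusion evaluate to zero:

```haskell
-- Concrete coordinates satisfying the premises: A and C on the x-axis,
-- M their midpoint, B straight above M on the perpendicular.
(ax, ay) = (0, 0) :: (Integer, Integer)
(bx, by) = (2, 3) :: (Integer, Integer)
(cx, cy) = (4, 0) :: (Integer, Integer)
(mx, my) = (2, 0) :: (Integer, Integer)

premises :: [Integer]
premises =
  [ ax + cx - 2 * mx                               -- M is the midpoint of AC
  , ay + cy - 2 * my
  , (ax - cx) * (mx - bx) + (ay - cy) * (my - by)  -- BM is perpendicular to AC
  ]

conclusion :: Integer
conclusion = (ax - bx)^2 + (ay - by)^2 - (bx - cx)^2 - (by - cy)^2

main :: IO ()
main = print (premises, conclusion)  -- ([0,0,0],0)
```

One satisfying instance proves nothing, of course; the point of Wu's method is to certify the implication for all coordinates at once.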
More generally, theorems of Euclidean geometry often take the following form: when the multivariate polynomials \(p_1, ..., p_n\) are all zero, the multivariate polynomial \(p\) is zero.
Suppose the variable \(x\) appears in at least two of the \(p_1,...,p_n\). We view two of these as univariate polynomials \(f(x)\) and \(g(x)\), labeled so that \(\deg f \ge \deg g\). By pseudodivision, we can scale up \(f\) and subtract a multiple of \(g\) to produce a polynomial \(f'\) with \(\deg f' < \deg g\).
The zeros of \(f\) and \(g\) are also zeros of \(f'\), so we may replace \(f\) with \(f'\). This is like Gaussian elimination, except irreversible, as the converse may be untrue. Recall the scaling factor possibly contains variables that we are temporarily viewing as constants, so it may really be a polynomial with its own zeros.
Iterating this removes \(x\) from all but one polynomial in \(p_1,...,p_n\). Euclid’s algorithm strikes again! That is, we have accomplished a noteworthy feat by chasing smaller and smaller remainders.
We repeat on the remaining \(n - 1\) polynomials: we pick some other variable \(y\) and eliminate it from all but one of them, and so on. After \(n\) repetitions we reach a triangular form \(p_1,...,p_n\): for some variables \(x_1,...,x_n\), if we view all other variables as constants, then each polynomial \(p_k\) only contains the variables \(x_1,...,x_k\).
Our triangulate function eliminates variables in the order they appear in a given list. It returns a list of polynomials viewed as univariate polynomials in the given variables. The variable names are recorded with their corresponding polynomials.
For pseudodivision we choose a divisor from the candidates of lowest degree as heuristically this ought to reduce the degrees of the other polynomials most rapidly.
triangulate :: [Var] -> [Poly] -> [(Var, [Univ])]
triangulate [] [] = []
triangulate [] ps = error "leftovers"
triangulate (v:vt) ps = go [] $ respect v <$> ps where
findConstant p = case lookupMax p of
Nothing -> Left pZero
Just (0, c) -> Left c
_ -> Right p
go done todo
 | length ncs <= 1 = (v, ncs) : triangulate vt (cs ++ done)
 | otherwise = go (cs ++ done) $ p : filter (not . M.null) (map (\s -> snd $ euclid s p) ncs)
where
(cs, ncs) = partitionEithers $ findConstant <$> todo
p = minimumBy (compare `on` (fst . M.findMax)) ncs
Let us state our transformed problem: given a triangular set of polynomials \(p_1,...,p_n\) with variables \(x_1,...,x_n\) where all other variables are viewed as constants, our goal is to show that when they are all zero, \(p\) is also zero.
Once again, we call upon pseudodivision and recursion. Due to triangulation, only \(p_n\) can possibly shed any light on the effect of \(x_n\) on our answer. Viewing our polynomials as univariate in \(x_n\), we pseudodivide:

\[ a^k p = q\, p_n + r \]
When \(p_n(x_n) = 0\), we have \(p(x_n) = 0\) if \(a \ne 0\) and \(r(x_n)\) is the zero polynomial, that is, each coefficient of \(r(x_n)\) is zero. Again, the implication is one-way; these conditions are sufficient, but may not be necessary.
We record \(a \ne 0\), and as for the coefficients, each is a polynomial not containing \(x_n\), so we can recursively generate more conditions with the triangular set \(p_1,...,p_{n-1}\).
We bottom out when nothing remains in the triangular set. At this point, we have no choice but to simply generate the condition \(r = 0\). As an extreme example, if asked to prove \(AB = CD\) with no premises, then the triangular set is initially empty, and we simply return the condition \(AB = CD\).
Typically, Wu’s method generates conditions that prevent degeneracy. For example, a condition might state that a certain length is nonzero.
genDegens :: [(Var, [Univ])] -> Poly -> [(Poly, Bool)] -> [(Poly, Bool)]
genDegens [] p acc = (p, False):acc
genDegens ((v, qs):tri) p acc
 | pNull p = acc
 | null qs = genDegens tri p acc
 | k == 0 = acc'
 | otherwise = (snd $ M.findMax q, True):acc'
where
q = head qs
(k, r) = euclid (respect v p) q
acc' = foldr (genDegens tri) acc $ M.elems r
Popularity Contest
The order we process variables during triangulation affects the speed of the algorithm, thus our wu function takes a vars list so the caller can choose the order.
The results of Euclidean geometry are invariant under translation and rotation, so we may choose one of the points to lie at the origin, and another one of the points to lie on one of the axes, or more generally, two points to lie on one axis, and a point to lie on the other. If the theorem has certain constraints, we may zero even more variables. Accordingly, our wu function also accepts a list of variables to replace with zero, which should be disjoint from the vars list.
Scale-invariance suggests we could replace another variable with unity, but we must take care when points can possibly coincide. We pass on this idea.
We allow multiple polynomials in the conclusion. We prove each is zero separately.
wu vars zeros antes concls =
(flip (,) False . pVar <$> zeros) ++
concat [genDegens tri p [] | p <- unpop concls]
where
tri = triangulate vars $ unpop antes
unpop = map $ \p -> foldr pSetZero p zeros
Heuristically, for a general problem, it seems we should always select the variable that appears in the fewest polynomials, so there is less to eliminate, and we should zero out the most frequently appearing variables to reduce our workload. We ensure the three winners of the popularity contest contain at least one \(x\)-coordinate and at least one \(y\)-coordinate to avoid losing generality.
wuPop antes concls = wu (reverse rest) pops antes concls where
fvs = map (\(Poly p) -> nub $ concatMap M.keys $ M.keys p) antes
freqs = foldr (M.unionWith (+)) mempty $
map (M.fromList . map (flip (,) 1)) fvs
byPop = map fst $ sortOn snd $ M.toList freqs
(pre0, (popx:post0)) = break ((== 'x') . last) $ reverse byPop
(pre1, (popy:post1)) = break ((== 'y') . last) $ pre0 ++ post0
(pop3:rest) = pre1 ++ post1
pops = [popx, popy, pop3]
Fermat and Descartes
We provide one grammar for polynomials, and a higher-level macro grammar so we can type things like "line A B C", which automatically expands to a polynomial representing that these points are collinear.
The last line is the conclusion of the theorem, and the other lines are the antecedents. A single line may expand into multiple polynomials.
macroDefs =
[ ("line", ["(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)"])
, ("para", ["(1_x - 2_x) * (3_y - 4_y) - (1_y - 2_y) * (3_x - 4_x)"])
, ("perp", ["(1_x - 2_x) * (3_x - 4_x) + (1_y - 2_y) * (3_y - 4_y)"])
, ("mid", ["2_x + 3_x - 2 * 1_x", "2_y + 3_y - 2 * 1_y"])
, ("intersect",
[ "(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)",
"(1_x - 4_x) * (4_y - 5_y) - (1_y - 4_y) * (4_x - 5_x)"])
, ("len", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 - (3_x - 4_x)^2 - (3_y - 4_y)^2"])
, ("sumsq", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 + (3_x - 4_x)^2 + (3_y - 4_y)^2 - (5_x - 6_x)^2 - (5_y - 6_y)^2"])
, ("same", ["1_x - 2_x", "1_y - 2_y"])
]
]
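To see a macro in action, here is a standalone sketch of the positional substitution; `expand` is a hypothetical stand-in for the `sub` helper defined below. Applying a `mid M A C` template replaces each digit-underscore placeholder with the corresponding point's coordinate:

```haskell
-- A hypothetical stand-in for the parser's positional substitution:
-- a digit followed by '_' is replaced by the corresponding argument.
expand :: [String] -> String -> String
expand args (n : t@('_' : _))
  | n `elem` ['1' .. '9'] = args !! (read [n] - 1) ++ expand args t
expand args (h : t) = h : expand args t
expand _ "" = ""

main :: IO ()
main = putStrLn (expand ["M", "A", "C"] "2_x + 3_x - 2 * 1_x")
-- prints: A_x + C_x - 2 * M_x
```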
Fermat and Descartes both pioneered analytic geometry. Fermat started from algebraic equations, while Descartes started from geometric curves, so we name our parsers accordingly.
Thus fermat parses a language representing multivariate polynomials and returns a Poly, while descartes parses a macro language describing geometric constraints and returns strings of our polynomial language.
spc = many $ char ' '
fermat :: Parser Poly
fermat = expr where
sp = (<* spc)
spch = sp . char
expr = foldl (flip ($)) <$> term <*> many
( (.+) <$> (spch '+' *> term)
<|> (flip (.-)) <$> (spch '-' *> term)
)
term = foldl (.*) <$> pwr <*> many (spch '*' *> pwr)
pwr = do
a <- atom
option a $ (iterate (a .*) pOne !!) <$> sp (spch '^' *> (read <$> some digitChar))
atom = between (spch '(') (spch ')') expr
<|> pIntegral . read <$> sp (some digitChar)
<|> pVar <$> var
var = sp $ some $ letterChar <|> char '_'
macro :: Parser ([String], [String])
macro = go . catMaybes <$> some line where
go ls = (concat $ init ls, last ls)
line = spc *> (newline *> pure Nothing <|> nonEmpty <* newline)
nonEmpty = do
mac <- some (anySingleBut ' ') <* spc
case mac of
"raw" -> Just . (:[]) <$> some (anySingleBut '\n')
"#" -> some (anySingleBut '\n') *> pure Nothing
_ -> Just <$> do
args <- some $ some letterChar <* spc
let
sub (n:t@('_':_)) = args !! (read [n] - 1) ++ sub t
sub (h:t) = h:sub t
sub x = x
maybe (fail mac) (pure . map sub) $ lookup mac macroDefs
descartes :: String -> Either String ([Poly], [Poly])
descartes s = either (Left . show) Right $ do
(antes, concls) <- parse macro "" s
(,) <$> mapM (parse fermat "") antes <*> mapM (parse fermat "") concls
Who’s a Pretty Polynomial?
We prettyprint equations in LaTeX format, so MathJax can render them above.
prMo xs = foldr (.) id $ intersperse (' ':) $ (\(x, k) -> (x++) . prPow k) <$> M.toList xs
where
prPow 1 = id
prPow k = showString "^" . shows k
prettyPoly (Poly p) = (if snd h < 0 then ('-':) else id) . go h .
foldr (.) id (map (\(k, v) -> ((if v < 0 then " - " else " + ")++) . go (k, v)) t)
where
go (k, v)
 | M.null k = shows $ abs v
 | otherwise = prCoeff (abs v) . prMo k
prCoeff 1 = id
prCoeff n = shows n . (' ':)
(h:t) = M.toList p
prettyEqn (p, b) = prettyPoly p . showString (if b then " \\ne 0" else " = 0")
I called the following function from GHCi extensively during development:
solve s = putStr $ either show (($ "") . foldr (.) id .
map ((. ('\n':)) . prettyEqn) . uncurry wuPop) $ descartes s
Frontend
Barebones code for a barebones user interface:
#ifdef __HASTE__
renderEqns es = ("<ul>"++)
. foldr (.) id [("<li>\\("++) . prettyEqn e . ("\\)</li>"++) | e <- es]
. ("</ul>"++)
main :: IO ()
main = withElems ["theorem", "out", "wu"] $ \[iEl, oEl, wuB] -> do
let
setTheorem s = do
setProp oEl "innerHTML" "<p><i>Push the button!</i></p>"
setProp iEl "value" s
handle buttonId s = do
Just b <- elemById buttonId
void $ b `onEvent` Click $ const $ setTheorem s
handle "isosceles" isosceles
handle "centroid" centroid
handle "oops" centroidOops
handle "two_one" twoToOne
handle "orthocenter" orthocenter
handle "simson" simson
handle "pythag" pythag
handle "desargues" desargues
handle "area" area
void $ wuB `onEvent` Click $ const $ do
s <- getProp iEl "value"
case descartes s of
Left e -> setProp oEl "innerHTML" $ "<pre>" ++ show e ++ "</pre>"
Right (as, cs) -> do
setProp oEl "innerHTML"
$ ("<p>"++)
. ("If:"++)
. renderEqns (flip (,) False <$> as)
. ("then:"++)
. renderEqns (flip (,) False <$> cs)
. ("provided"++)
. renderEqns (wuPop as cs)
. ("(up to translation and rotation)</p>"++)
$ ""
ffi $ pack "MathJax.typeset"
setTheorem isosceles
#endif
Geometric Algebra
Having been wowed by conformal geometric algebra as well as Wu’s method, I naturally wondered what might happen if we mixed the two. Could we tackle more problems? Would proofs be shorter and simpler? Could we look on Beauty bare?
It turns out others thought of this years ago. For example, see Li, Automated Theorem Proving in the Homogeneous Model with Clifford Bracket Algebra and other papers.