Haskell is one of my favourite programming languages because it allows me to write expressive code. Because of this and the correctness guarantees it provides, I chose to use it for Advent of Code—an advent calendar for programming problems.
Every day from 1 December, a new programming problem is released, increasing in difficulty as Christmas approaches. I used the challenge as programming Katas to hone my coding skills.
Day 18’s problem is interesting because of its recursive structure. On the surface, the nested bracketed expressions, which the problem focuses on, are complex. However, realising that they can be thought of as binary trees makes the problem much simpler. The recursive structure of the expressions makes them very easy to parse using a monadic parser in Haskell. Trees work well as immutable data, and Haskell’s features make it easy to implement new immutable data types and process them recursively.
Day 18: Structural recursion on immutable trees
On day 18 of Advent of Code, you encounter snailfish who can help you find the keys to Santa’s sleigh, but only if you solve some maths problems for them. The maths problems are represented as bracketed expressions that need to be parsed and interpreted.
To solve this problem, I first defined a data type to model the snailfish maths problems as a tree. Then, I parsed them into this model. Because of the recursive tree structure of these maths problems, parsing them was easier than the parsing required for Day 16. The harder part of this problem was an intricate sequence of update steps. Fortunately, Haskell’s idiomatic structural recursion on immutable trees made this more straightforward than it might have been.
Recursive immutable trees
Snailfish numbers are bracketed expressions which look like this:
These expressions have a tree-like structure where each nested pair of brackets can be thought of as a subtree, and each integer can be thought of as a leaf in a tree. Trees can be efficiently manipulated without mutation, and Haskell has various features that make it easy to work with immutable data:
- Data types are immutable by default
- The standard library comes with immutable data structures built in
- The type system distinguishes between pure functions and functions which perform mutation
Using immutable data structures avoids common programming errors and enhances the safety of Haskell code.
The data model I chose for snailfish numbers is a straightforward recursive tree structure:
data Tree = Node Int Tree Tree | Leaf Int
This definition declares that a Tree is either a Node with two subtrees or a Leaf with a value. Using this model, the snailfish number above would be represented as:
Node 2 (Leaf 0) (Node 1 (Leaf 7) (Leaf 5))
The integer stored in a Node is the height of the tree. The height of trees is important for solving the problem, so it makes sense to store this in each subtree and avoid computing it repeatedly.
A recursive parser
Once again, an important step is parsing the snailfish numbers into the data model. Using parsec the parser is straightforward and declarative:
parseTree :: GenParser Char st Tree parseTree = parseLeaf <|> parseNode parseLeaf :: GenParser Char st Tree parseLeaf = Leaf <$> int parseNode :: GenParser Char st Tree parseNode = do _ <- char '[' left <- parseTree _ <- char ',' right <- parseTree _ <- char ']' return (Node (1 + (treeHeight left `max` treeHeight right)) left right) int :: GenParser Char st Int int = read <$> many1 digit treeHeight :: Tree -> Int treeHeight (Node h _ _) = h treeHeight (Leaf _) = 0
The definition of
parseTree says to try to parse a leaf and if that fails then to parse a node. Parsing a leaf is as simple as parsing an integer. Parsing a node requires reading a bracket, parsing the left Tree, reading a comma, parsing the right Tree and closing the bracket, and then constructing a Node with the two subtrees.
Immutable updates using recursion
The problem requires adding snailfish numbers which involve a few steps. First, the trees are added which simply involves combining them into a new tree:
addTrees :: Tree -> Tree -> Tree addTrees one two = Node (1 + (treeHeight one `max` treeHeight two)) one two
Then two steps, taken from the problem statement, are repeatedly applied to the expression until neither is required any more:
- If any pair is nested inside four pairs, the leftmost such pair explodes.
- If any regular number is 10 or greater, the leftmost such regular number splits
The rules given by Advent of Code for splitting an expression are:
To split a regular number, replace it with a pair; the left element of the pair should be the regular number divided by two and rounded down, while the right element of the pair should be the regular number divided by two and rounded up. For example, 10 becomes [5,5], 11 becomes [5,6], 12 becomes [6,6], and so on.
Splitting can be done with simple structural recursion on the tree data type.
split :: Tree -> (Tree, Bool) split l@(Leaf v) = if v >= 10 then let (d, r) = v `divMod` 2 in ((Node 1 (Leaf d) (Leaf (d + r))), True) else (l, False) split n@(Node _ l r) = case (split l, split r) of ((_, False), (_, False)) -> (n, False) ((newL, True), _) -> (Node (1 + (treeHeight newL `max` treeHeight r)) newL r, True) (_, (newR, True)) -> (Node (1 + (treeHeight l `max` treeHeight newR)) l newR, True)
The function returns a tuple (Tree, Bool) of the tree that results after splitting and a boolean indicating whether a split occurred. The boolean is used to detect whether the simplification process is finished.
In the case of a Leaf, if its value is ten or more, then create a new Node and return it along with True. Otherwise, return the Leaf with False. In the case of a Node, the leftmost node that requires splitting must be split, so first try to split the left Tree recursively. If that can’t be split, then recursively split the right Tree. If nothing was split, return the original Node; otherwise, combine the new subtree with the old subtree and return that.
It may seem that the left and right trees are always split in the above definition. However, because Haskell uses lazy evaluation, the right subtree will only be processed if there is nothing to split in the left subtree. Lazy evaluation delays a computation until it is required. If it is never used, then it isn’t evaluated. In this case, it helps write expressive code without sacrificing efficiency.
The function’s return type is a tuple of a new tree and a boolean to indicate whether a split happened. Because the Tree data type is immutable, I don’t have to worry about shared structure being changed during the split operation. Immutable data types in a side-effect-free function let you focus on the logic of only that function because it can’t have far-reaching interactions with the rest of your program. The rule for exploding pairs given by Advent of Code is:
To explode a pair, the pair’s left value is added to the first regular number to the left of the exploding pair (if any), and the pair’s right value is added to the first regular number to the right of the exploding pair (if any). Exploding pairs will always consist of two regular numbers. Then, the entire exploding pair is replaced with the regular number 0.
There is a similar recursive process involved in the explode step, but it’s more intricate. You can find the details here.
Advent of Code is a wonderful annual adventure for programmers. You can treat it as a competition and try to get better times than your friends, or you can treat it as a fun daily exercise to explore new techniques and fine-tune your discipline as a programmer. I used Advent of Code to explore Haskell’s many advantages for writing safe and expressive code. You can check out Part 1 and Part 2 for more on how I did that.
Henry is a Scala engineer at GrapheneDB where he uses functional programming to manage graph databases in the cloud. He loves functional programming because it makes code robust and easy to reason about.