PurelyFunctional.tv Newsletter 434: Re-combination of parts

Issue 434 - July 13, 2021 · Archives · Subscribe

Clojure Tip 💡

Re-combination of parts

A couple of weeks ago, we explored the idea of why not stating the assumptions about your domain can be good. It can lead to enough savings in code to make it significantly shorter. In this issue, I'd like to dive into one example Jack Diederich gave in his talk Stop Writing Classes. He shows how you can write an implementation of Conway's Game of Life in two functions. I'd like to explore the tradeoffs between explicit and implicit code using Clojure.

As far as I can tell, the original inspiration of this version of Game of Life is from an APL implementation, which is a one-liner. There's a version in Clojure Programming by Chas Emerick, Brian Carper, and Christophe Grand. You can see an explanation on Christophe Grand's site. Here it is:

(defn neighbours [[x y]]
  (for [dx [-1 0 1] dy (if (zero? dx) [-1 1] [-1 0 1])]
    [(+ dx x) (+ dy y)]))

(defn step [cells]
  (set (for [[loc n] (frequencies (mapcat neighbours cells))
             :when (or (= n 3) (and (= n 2) (cells loc)))]
         loc)))

Diederich asserts that this is shorter than modeling the problem in an object-oriented way. That is obvious. The OO way requires defining a Cell class, which has a neighbors method and keeps track of the state. And then a Grid class that contains the Cell instances. The Clojure approach I list above seems to be some kind of trick. Where are all of the domain concepts? Like we talked about in a previous issue, they are implicit. But they are all there. We can begin to make them explicit if we wish.

A grid is represented as a set, and the empty grid is an empty set:

(def empty-grid #{})

One can set a cell at a certain point alive by adding it to the set:

(defn make-alive [grid point]
  (conj grid point))

One can kill the cell by removing it:

(defn make-dead [grid point]
  (disj grid point))

And then, we can check if a cell is alive or dead:

(defn alive? [grid point]
  (contains? grid point))

Finally, we'd like to be able to enumerate all live cells:

(defn live-cells [grid]
  (seq grid))

These functions are simple wrappers around existing collection functions. Are they really needed?

We can also make a function that sets a collection of cells alive:

(defn make-all-alive [grid points]
  ;; (into grid points)
  (reduce make-alive grid points))

Notice that by using our own domain functions, we lose the ability to reuse an existing function. That's another reason things get longer.

The neighbours function calculates the neighbors of a point. Perhaps the implementation is a bit too clever. Let's optimize it for clarity:

(defn neighbours [[x y]]
  [[(dec x) (dec y)] [x (dec y)] [(inc x) (dec y)]
   [(dec x) y      ]             [(inc x) y      ]
   [(dec x) (inc y)] [x (inc y)] [(inc x) (inc y)]])

Now for the step function. It employs a clever trick. Instead of counting a cell's live neighbors, it generates all neighbors of live cells and counts them. This trick is useful because it always starts from the live cells (instead of all cells in the grid), and we have the live cells in a set. Since the grid is infinite, we can't really start from every cell, so we have to keep this trick.

As an aside, I think this trick is mind-bending and the kind of thing you typically want to avoid. It doesn't obviously follow from the rules of the game. In addition, I think the economy of this implementation is due in large part to this trick. I don't think it would be as short without it. We can return to the centrality of this trick later.

That said, we can leave the trick in and clarify the code. We'll use the statement of the rules as condensed in Wikipedia:

(defn survives? [live-neighbors]
  (or (= 2 live-neighbors) (= 3 live-neighbors)))

(defn comes-alive? [live-neighbors]
  (= 3 live-neighbors))

(defn cell-alive-next-step? [grid [point live-neighbors]]
  (if (alive? grid point)
    (survives? live-neighbors)
    (comes-alive? live-neighbors)))

(defn step [grid]
  (->> grid
    ;; generate neighbors of all live cells
    (mapcat neighbors)
    ;; count them into [point, count] pairs
    (frequencies)
    ;; keep the ones alive at the next step
    (filter #(cell-alive-next-step? grid %))
    ;; just the points
    (map first)
    ;; make a new grid and add all live cells
    (make-all-alive empty-grid)))

I argue that neighbors and step are much clearer now. However, that clarity comes at a cost. We have made the explicit implicit. We paid for it in code.

We need to revisit the counting neighbors trick. It is the kind of "clever algorithm" that often makes code hard to understand. Notice that we are double-dipping by using a set. It allows us to enumerate the live cells and tell us whether a cell at a given point is alive. And we get that for free because we used an existing collection.

Further, the ability to list the live cells hints at the trick. Would we have even seen the clever solution without that ability? I'm not sure. And the origin of the APL code is lost to time.

In the end, I'm not sure what to conclude. Sometimes I want to lay things out to be very explicit. And other times I like the concision. Explicitness can make things easy to read. It creates an abstraction barrier beyond which you don't have to keep an implementation in your head. It frees up mental resources to think at the level of the problem.

But concision can make things easier to rework at a deeper level. You are freed from the confines of the original domain modeling. And sometimes, you need that freedom to make a breakthrough. In the end, I think I prefer to work at the lower level. At that level, the model is a recombination of existing parts.

We often underestimate how nice it is to have an excellent collection library in Clojure. How many functions we have available! And how much syntactical help we get! Abstraction barriers are useful, but they should be used strategically.

Podcast episode🎙

This week on the podcast, I talk about why I believe Clojure's stance on data is so powerful.

Book update 📘

You can order the print book on Amazon. You can order the print and/or PDF versions on Manning.com (use TSSIMPLICITY for 50% off).

I can't vouch for the Kindle version since I haven't seen them yet. Manning has hired a firm to do a better job than the previous version. I'll keep you updated.

Quarantine update 😷

I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.

I don't want to shame you or anybody that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.

Stay healthy. Wash your hands. Wear a mask. Get vaccinated if you can. Take care of loved ones.

Clojure Challenge 🤔

Last issue's challenge

Issue 433

This week's challenge

Sentence searcher

Sometimes I want to find a word in a document, but I want the context for the word. Write a function that takes a document and a word and returns the sentences that contain that word. The sentences should be returned in the order they appear in the document.

Examples



(search "This is my document." "Hello") ;=> nil
(search "This is my document. It has two sentences." "sentences") ;=> ["It has
two sentences."]
(search "I like to write. Do you like to write?" "Write") ;=> ["I like to
write." "Do you like to write?"]

Sentences end with \., \!, or \?.

The search should be case insensitive.

Return nil if the word is not found.

Thanks to this site for the problem idea, where it is rated Hard in Python. The problem has been modified.

Please submit your solutions as comments on this gist.

Rock on!
Eric Normand