The Making of Cells

Kenny Tilton
9 min readJun 5, 2021

A case study in dumb, blind luck.

Preamble

Almost ten years ago I wrote up a library I had developed twelve years earlier, in another post named The Cells Manifesto. Since then I have been struck by how hard it is for developers to grasp its power, or the power of similar reactive libraries such as the original MobX, or binding.Scala.

Then I had an idea. The goal in this post is to convey the power of dataflow simply by describing the problem we were trying to solve, and how we solved it, how we then fixed the solution, and how we then slippery-sloped up to dataflow, without ever intending the paradigm.

But it worked great. We were shocked, actually. Call it dumb, blind luck. Here is how it went down.

The Problem: Dynamic, Nested UI Layout

We were building an educational math app aimed at struggling students, so the last thing we wanted was for them to be fighting to type in their math. We developed a WYSIWYG maths editor as a start, then worried about them running out of screen space as they worked a sizable, multi-step problem.

Resize bars need not apply: these kids are unhappy already.

I came up with a nifty dynamic layout calculation scheme, but on just day two my lead developer JD reported a problem with my approach. The use case was quite specific: fractions.

Background: the geometry I conceived for any visual element consisted of (a) that element’s (x,y) offset within its parent container, and (b) the element’s local bounds: left, top, right, bottom relative to (0,0) (hence “local”). Where did a box get drawn? Sum the offsets of my containers starting from the window’s (0,0) and shift the local bounding box accordingly.

Fun note: a later subordinate complained that my geometry was understandable only to myself. Harrumph. I felt better when I learned OpenGL has the same system.

So what was the problem with fractions? My algorithm hoped to calculate a child’s geometry before calculating its parent’s geometry, but to keep the operands of a fraction centered vertically the numerator, e.g., had to be horizontally inset half the difference between the fraction’s width and the numerator’s width. But the fraction did not know its width until the numerator had decided its width, because the fraction would take the max of the numerator and denominator. So neither child nor parent could go first.

Ok, ok: later I realized fractions should have a vertical alignment attribute saying “centered” and that the numerator should simply position itself horizontally at minus half its width, but I looked at this misbegotten use case and realized my overall idea of deciding an element’s entire geometry in one go would never work. I also knew there was no circularity in fact, only in my algorithm.

“Everyone to the whiteboard!”, I called out as if it were a breach in the city wall of Harfleur.

The Invention

True to my insight that it was not just fractions, I started with laying out the problem number (be it “1.” or “42(a).”), the problem statement (be it “Simplify” or “Solve and characterize as conditional, contradiction, or identity”) and below them the problem of any size the student typed in.

Step Zero

“OK, what is the left edge of the problem number?”, I asked rhetorically, accidentally solving the first problem: my “baby steps” habit when struggling with hard problems had led me to contemplate the derivation of one bit of the widget’s geometry, not its full geometry,

“Zero,” I concluded. “Same with the top.” Unlike OpenGL, most screen geometries increase as you go down the screen.

“We are on a roll,” I said, sensing Harfleur was ours. “Now what is the right edge?”

Step One

Wait.We need some caveats here before diving in to Step One.

  • All code below was typed in using the Blogger text editor. Parens will not balance, typos will abound.
  • If you do not know Lisp, the code may be a challenge. A good example is that (defstruct a b c) is a rarely used short form that defines a struct of type ‘a’ with slots b and c.
  • Hell, it has been a year since I coded Lisp in anger. If something below looks like Clojure, it probably is.
  • Goofs and obscurity aside, if like me you stare hard at examples and try to work out what must be going on, you will come away with a ton of questions and “that won’t work!”s. The story here of incremental elaboration continued for weeks at a steady pace and for years in fits and starts as new challenges to the scheme arose. Maybe this one time, do not stare too hard.

We return you now to the show, picking up with Step One where, after luckily shifting to slot-wise thinking for the left edge (and top) of a widget, I contemplated the right edge of the same widget, which we wanted to grow and shrink as the student edited the problem number.

Step One

“OK, the right edge is the left edge plus the string width of the problem number given the font metrics of the chosen font,” I stated rather obviously since we were already doing that. But the city was ours: a value had been specified as a function of other values.

When making the the textedit widget, we would not specify a reasonable width of 42, too big for most problem numbers and requiring scrolling for larger numbers. Instead, we would specify:

(make-instance ‘textedit
…etc…
:local-right (lambda (self)
(+ (local-left self)
(fontStringWidth (text self)
(math-font *app*))))

Cool. We never again have to worry about in which order to calculate things: other values we read will be calculated JIT. Er, how?

We just need an accessor (reader) that knows what we are up to (and a new defmodel macro to write all these accessors):

(defmethod local-right ((self textedit))
(let ((sv (slot-value self ‘local-right)))
(if (functionp sv)
(funcall sv self)
sv)))

Yeah, we need to do better because a function is a fine value for a slot so we cannot assume it is a rule to be funcalled, but it is a fair assumption for screen locations and it will not last long anyway.

JD knocked it off in an afternoon and we were delighted. Then of course the wheels came off: forever recomputing geometry was too slow. Note that this is an exponential problem, because property P might compute off several other properties and several other properties might compute off P, and recursively so as container after nested container computes their properties.

Step 2

We cache the calculation, and now we need a struct:

(defstruct cell rule cached-value)

Note that this solves our earlier problem: we no longer need to assume a function is a formula. Indeed, a function is a function, we are looking for cells now.

Brilliant: when we get a Mac OS update event we recompute the whole geometry but never recompute anything twice. We clear all the cached values and recompute and redraw.

That lasted two days. Back then 100mhz was considered fast so soon enough we had enough stuff on the screen to make a full recomputation too slow.

The good news is that I had anticipated that it would not scale, but when it comes to optimizing, it is always nice to be asked. We had been asked.

Step 3

To get past this we have to recalculate only as needed. If the student types another digit in the denominator, we need to recalculate the denominator’s width, recalculate its horizontal offset within the fraction, and — iff the denominator is wider than the numerator — recalculate the width of the fraction and pass that up the container hierarchy.

Note the happy optimization: if the denominator was sufficiently less wide than the numerator, the fraction would decide on the same width as before and there would be no need to recompute further up the container hierarchy.

So how do we decide “recalculate only as needed”?

(defstruct cell rule cached-value users)

A cell needs to know what other cells used its value in their calculations, so it can tell them to recalculate if it changes. Hold onto your hats:

(defmethod local-right ((self textedit))
(let ((sv (slot-value self ‘local-right)))
(if (typep sv ‘cell)
(progn
(when *user*
(c-record-user sv *user*))
(let ((*user* sv))
(setf (cached-value sv) (funcall rule self))
sv)))

Ya gotta love Lisp special variables: above we are at once (a) tracking who might be using us and (b) letting anyone our rule happens to access know that we are using them. What may not jump out at the reader is that this works recursively because, at start-up of new structure, the variables our rule accesses may have to JIT compute their values, rebinding *user* to themselves.

Thanks to Lisp macros, this does not require too much typing (so the semantics stand out). First, the macro:

 (defmacro c? (rule)
`(make-instance ‘cell
:rule (lambda (self)
,rule)))

(The sick thing above is that the lambda parameter self will be captured by the code of the rule. Hygiene? We don’t need no stinkin’ hygiene!)

And now:

(make-instance ‘textedit
…etc…
:local-right (c? (+ (local-left self)
(fontStringWidth (text self)
*math-font*)))

But that only records the dependency. Gust two:

(defmethod (setf local-right) (new-value self)
(let ((sv (slot-value self ‘local-right)))
(if (typep sv ‘cell)
(progn
(when (not (eql new-value (value sv)))
(setf (value sv) new-value)
(dolist (user (users sv))
(c-recompute user)))
new-value)
(setf (slot-value self ‘local-right) new-value))))

Note how propagation halts on a non-changing change. We are really making functional scream.

Note also that I was so focused on making functional fast that I missed that the world had turned upside down.

Those are inversion goggles. Wear them for a few days non-stop and suddenly the world becomes easily navigable; the brain flips the world back, as if a single bit controlled our vision.

In this case, the better metaphor might be a river changing direction. Yes, the original problem was knowing in what order to pull information into a computation, but after tweak number three the library was about empowering some initial change in state to domino efficiently to other state. Pull had become push, causation had been harnessed.

There we go again, chewing the scenery. Let’s do practical again: how does the screen actually get updated? Let us just look at the snippet above where change is detected, now with a simple trick to do more than just change other computed values:

(when (not (eql new-value (value sv)))
(c-observe self ‘local-right new-value (value sv)) ;; <<--- NEW!
(setf (value sv) new-value)
(dolist (user (users sv))
(c-recompute user)))

c-observe is a generic multi-method we supply as an “on-change” callback:

(defmethod c-observe ((self vis-obj) (slot (eql ‘l-right)) new old)
… insert here code to trigger the update of the
union of the old + new rectangles of self…
)

Not only are we arranging for efficient, minimal, and complete change, we now have a hook to augment change processing arbitrarily.

One final point: we mentioned up front that nowadays everyone is doing reactive. That is true, but Hoplon/Javelin is the only one that does not require explicit subscribe and notify (and that works by lifting code inspection so it recomputes excessively). Thanks to Lisp macros and some relatively modest engineering, subscribe looks like this (again):

(make-instance ‘textedit
…etc…
:local-right (c? (+ (local-left self)
(fontStringWidth (text self)
*math-font*)))

There is no subscribe, we just code a derivation and Cells notices what other values we consume. And notify looks like this:

(setf (text self) (concatenate ‘string (text self) new-char))

Again, invisible. No need to break my coding flow and code up a publisher at a new source and subscription here, I just write my code and let the engine note the dependency. The more complicated a UI gets, the exponentially greater grows this win.

And that is how Cells got invented, much like Lisp itself:

  • We started with a hard problem;
  • we used meta-programming to solve the hard problem;
  • and we worried about the programmer experience.

--

--

Kenny Tilton

Developer and student of reactive systems. Lisper. Aging lion. Some assembly required.