Beta abstraction for bottom up theory building

When we begin working on a problem, one of our main tasks is to build a theory about the problem space so that we can capture and communicate our understanding about it to others and to machines. Given that when we begin building these theories we might know only a few parts of the elephant and not the whole elephant itself, what chance do we stand to discover the whole elephant if our starting point is a few limited perspectives? In this post, I share an example of how to arrive at higher level theories about a domain via bottom up exploration using systematic beta abstraction.

This post is based on and reuses parts of a discussion on beta abstraction done in our SICP study group at Pramati, Chennai.

Theory building

Peter Naur argued that programming is a process of theory building.

… programming properly should be regarded as an activity by which the programmers form or achieve a certain kind of insight, a theory, of the matters at hand. – Peter Naur, 1985

He used the word “theory” in the sense of Ryle. In Naur’s own words -

… a person who has or possesses a theory in this sense knows how to do certain things and in addition can support the actual doing with explanations, justifications, and answers to queries about the activity of concern.

When we begin exploring a problem space, we have a glimpse of a few interesting facets of the space. As we study it more, we get to turn the diamond around and understand it from multiple perspectives, eventually arriving at what the whole diamond is about. It is at this point that we become capable of answering queries about this diamond, know what we can do with the diamond, etc. Is this a fantasy or do methods exist by which we can proceed systematically to understand the whole from the facets?

A common illustration of this that I give folks is to consider learning to multiply two numbers expressed in the decimal place-value system using the cross multiplication technique. When we master this algorithm, we know how to compute the product of two numbers given their digit representations.

Once we master this algorithm, we have the ability to compute the product of any two numbers no matter how many digits they may be made of. Now if this were our starting point, how can we arrive at an abstract understanding of what multiplying two numbers means? That we could think of it as computing area of a rectangle, using one of the numbers as a “scaling factor” to determine how much to grow or shrink whatever the other number represents, that products are commutative, that they distribute over addition and other myriad properties of conventional number multiplication that we eventually imbibe to form a coherent “theory” of multiplication.

Keep the the above metaphor in mind for bottom-up theory building as we consider a problem more relatable to programmers.

A tale of two tables

Given two tables - one of names of persons and their ids, and the other the geographical coordinates associated with the ids - find out the geographical coordinates of a person given the name.

(define person-info '(("Chandrakanth" 42)
                      ("Suryaprakash" 24)))

(define location-info '((42 -117.0 34.0)
                        (24 -150.0 24.0)))

(define (record-key record)
  (first record))

(define (record-value record)
  (rest record))

; search finds the record corresponding to the given key
; in the given list of records.
(define (search records key)
  (if (null? records)
      #f
      (if (equal? (record-key (first records)) key)
          (record-value (first records))
          (search (rest records) key))))

Our given primitive is a search function that can walk through a list of records and find the record with the given key. Given this primitive, can we put together the information in these two tables to come up with a function that will return the geo coordinates of a person given the name?

The frog in the well

Our first cut at solving the problem is to find the location id from the person-info list and plonk that into the location-info list search to get at the result.

(define (location1 name)
  (search location-info (first (search person-info name))))

We’re now like a frog in a well. We know how to find the location given the name and two lists that correlate with each other. … but we don’t exactly know what it is that we’re doing. Our situation is comparable to knowing how to multiply two given numbers, but not knowing what multiplication is all about, at a high enough level of abstraction.

Climbing the ladder of abstraction

The way we go about raising our level of abstraction is to rewrite the functions we’ve written thus far and use “beta abstraction” rather systematically .. while also using our “taste” in the process - which is important since it is a creative act in itself.

Beta abstraction: Replacing an expression with a variable such as (... x ...) with the expression ((lambda (z) (... z ...)) x). Using the substitution rule, it is easy to see that the two are functionally indistinguishable. This is an “abstraction” because we’re lifting the concrete usage of x out of the expression of interest to us.

We first start with another version of location that separates the various steps we’re doing.

(define (location2 name)
  (define a (search person-info name))
  (define b (search location-info (first a)))
  b)

That didn’t take us to a higher level of abstraction just yet. Let’s do “beta abstraction” on both the “define” lines.

Here is location3 with beta abstraction applied to both its steps. Still, no gain in abstraction at the level of the operation.

(define (location3 name)
  (define a ((lambda (name) (search person-info name)) name))
  (define b ((lambda (locid) (search location-info locid)) (first a)))
  b)

Now we do beta abstraction again on location3, but this time, pull out both the nested lambda expressions in their entirety.

(define (location4 name)
  ((lambda (f1 f2)
     (define a (f1 name))
     (define b (f2 (first a)))
     b)
  (lambda (blah) (search person-info blah))
  (lambda (blah) (search location-info blah))))

At this point, we notice something interesting. The two lines with (define ..) are nearly the same, except for the “first” sticking in front of the second expression. We have a few options here. We can either modify the problem to get rid of the extra “first”, or we can standardize on “first” everwhere. Let’s choose the former. (Exercise: choose the latter and explore.)

(define person-info '(("Chandrakanth" 42)
                      ("Suryaprakash" 24)))

(define location-info '((42 (-117.0 34.0))
                        (24 (-150.0 24.0))))

(define (record-key record)
  (first record))

(define (record-value record)
  (first (rest record)))

(define (location4 name)
  ((lambda (f1 f2)
     (define a (f1 name))
     (define b (f2 a))
     b)
  (lambda (blah) (search person-info blah))
  (lambda (blah) (search location-info blah))))

Now we notice that the first nested “lambda” is simply function composition of single argument, single result functions.

(define (compose f1 f2)
  (lambda (x) (f2 (f1 x))))

We thus have a location 5 in terms of compose -

(define (location5 name)
  ((compose (lambda (k) (search person-info k))
            (lambda (k) (search location-info k)))
   name))

Or more simply, we have

(define location5 (compose (lambda (k) (search person-info k))
                           (lambda (k) (search location-info k))))

The last form is called “point free form” because it elimintes the need to use an explicit variable name in the act of function definition.

The next abstraction we can do is to capture the concept common to both the lambda expressions we’re creating. The are very similar in form and differ only in the table we’re passing to them. Note that each of these abstraction steps is only the beta abstraction, but we’re now trying to write it in more convenient notation than just nested lists.

(define (table records)
  (lambda (k) (search records k)))

… which gives us

(define location5 (compose (table person-info) 
                           (table location-info)))

This is a good point to pause and recap. Using a sequence of beta abstraction steps, we’ve come up with a definition for the problem’s solution that uses two “concepts” -

the concept of a “table” and 2) the concept of composition of tables. The fact that the code for location5 directly expresses how these concepts are employed is indicative that we’ve, at least partially, climbed out of the well - i.e. we know something about what we’re doing, more than just the how.

Exercise

Interpret the consequences of making the following beta abstractions on the final definition of “location5” above.

Abstract on the “compose” function
Abstract on the “table” function

Also what do you get when you abstract on the search function? Discuss whether beta abstraction on anything will yield a higher point of view. If you think so, what is stopping us from abstracting all over the place? If not, how do we judge what to abstract over?

The road less travelled - continuation passing style

Let us revisit the definition of “location2” -

(define (location2 name)
  (define a (search person-info name))
  (define b (search location-info (first a)))
  b)

How do we expect the definition to be evaluated when used within another expression? The “define” expressions will first need to be evaluated before the result can be declared to be the third expression in the sequence - “b”.

Let us make this “declare the result to be EXPR” explicit in the definition of location2 -

(define (location2-r name)
  (define a (search person-info name))
  (define b (search location-info (first a)))
  (return b))

We pretend that the “return” is a function that is to be called with one argument and its effect is to return control to the point at which location2-r was called so that further evaluation can proceed with the given value. The “return” function itself is not expected to “return” … that would be weird!

Now, let us beta-abstract on “return” within the body of location2-r.

(define (location2/c name return)
  (define a (search person-info name))
  (define b (search location-info (first a)))
  (return b))

TIP: When the “/” character occurs within a symbol name in Scheme, it is expected to be read as “with”. So we read the above function as “location2 with c”. We’ll come to what the “c” stands for shortly.

We’ve not performed a beta abstraction as originally described, but we’ve taken some liberty using the language’s multiple arguments feature to just add the abstracted item as a new argument to the original function. This still qualifies as beta abstraction.

Now, what do we get if we make every function we ever write into another form that takes an extra argument - which is to be a function that will itself never “return”, but stands for “the rest of what should be computed”. Such a function is known as a “continuation” - hence the “/c” is to be read as “with continuation”. You may be more familiar with the name “callback” than continuation, in which case you can read it as “with callback” too.

Every function “blah” can be systematically transformed thus into a “blah/c” that takes an extra argument. This transformation is a “continuation passing style transformation”, or CPS transformation for short.

Exercise

Rewrite all the functions involved in the problem solution in CPS form. Use “/c” suffix to indicate that.

Hint: You have to define the following functions - record-key/c, record-value/c, null?/c, equal?/c, first/c, rest/c and search/c … and eventually location5/c, compose/c and table/c. You can assume that “if” construct works as usual and doesn’t need to be CPS transformed. Here are a few of them to get you going.

(define (record-key/c record return)
  (first/c record return))

(define (record-value/c record return)
  (rest/c record return))

(define (null?/c x return)
  (return (null? x)))

(define (equal?/c x y return)
  (return (equal? x y)))

Next up, we’ll take up the design consequences of performing CPS transformation on the “location5” function. It’ll help if you try to work that out yourself before proceeding further.

CPS part 2 - so what?

Firstly, why bother making return explicit like this? In our original code, we were assuming some standard control flow features. When you “call” a function, for instance, once the function has finished its computation, it magically returns to the point within its caller’s code. With the CPS transformation, we’re taking back into our hands what would otherwise be left as such magic that we have no control over. Now that we can pass such “continuations” explicitly, we’re now free to create our own ways to continue from a computation. For example, we can pass in a continuation that will add itself to a run queue and await its turn. This will get us a form of cooperative multi-tasking. Another use could be to store away the continuation in hash table and use the hash key as a “session reference” in a web transaction, so that when a user gets back with a follow up request, we can resume the computation with the new information and we’ll have kept all our previous computational history alive during the period. In other words, making continuations explicit allows us to control the flow of time and events in our application.

.. and did you do the exercise? Well, that was given for a purpose. Anyway, here are the table/c and compose/c functions -

(define (table/c records)
  (lambda (k return)
    (search/c records k return)))

What we note is that table is a higher order function - in this case, one that computes another function. What we required of a higher order CPS-style function is to compute a CPS-style lambda which will search a given records collection and provide its result to a given return function.

We do the same for the compose/c function as well.

(define (compose/c f1 f2)
  (lambda (x return)
    (f1 x (lambda (y)
            (f2 y return)))))

Here, again, compose/c computes a lambda that is itself in CPS form. What we’re not doing here is to make compose/c provide the computed lambda via a separate return argument. Note that the arguments of compose/c are now CPS style functions.

QUESTION: If you did express compose/c and table/c using such an extra argument, what kind of an abstraction would that buy you?

Using these definitions, we can now express location5/c like this -

(define location5/c (compose/c (table/c person-info)
                               (table/c location-info)))

The important lesson to learn in this exercise is that the final form of the location5 function, whether expressed in normal evaluation style or CPS style, is essentially described in terms of two concepts - a) a “table of values” and b) composition of two such tables. This is amply clear by just looking at the forms of location5 and location5/c.

Getting real - transducers in clojure

Rich Hickey, talks about what transducers are in this talk.

Transducers are abstractions that “are the essence of map, filter, etc.”. From 16m6s into the video, Rich Hickey explains how to derive transducers. First, he expresses map, filter etc. in terms of foldr and cons.

(defn mapr [f coll]
  (foldr (fn [x r] (cons (f x) r))
         () coll))

(defn filterr [pred coll]
  (foldr (fn [x r] (if (pred x) (cons x r) r))
         () coll))

And similarly via reduce -

(defn mapl [f coll]
  (reduce (fn [r x] (conj r (f x)))
          [] coll))

(defn filterl [pred coll]
  (reduce (fn [r x] (if (pred x) (conj r x) r))
          [] coll))

(defn mapcatl [f coll]
  (reduce (fn [r x] (reduce conj r (f x)))
          [] coll))

Around 21m47s, Rich performs a beta abstraction on conj on the core mapping function. In essence he does the following -

(defn mapping' [f]
  (fn [r x] (conj r (f x)))

; .. to ..
(defn mapping' [f]
  ((fn [step] (fn [r x] (step r (f x))) conj))

; .. and defines 
(defn mapping [f]
  (fn [step] (fn [r x] (step r (f x))))

; .. so that
(defn mapping' [f]
  ((mapping f) conj))

We can now express the specific mapl and such functions in terms of mapping -

(defn mapl [f coll]
  (reduce ((mapping f) conj)
          [] coll))

Similarly with filterl -

(defn filtering [pred]
  (fn [step]
    (fn [r x] (if (pred x) (step r x) r)))))

We can now compose these functions using normal function composition -

(comp (mapping f)
      (filtering pred))

When expanded, the function composition looks like -

(let [t1 (mapping f)
      t2 (filtering pred)]
  (fn [step]
    (t1 (t2 step))))

Note: While the function composition is applied right one first, the resulting transducer function will run the steps of (mapping f) first and then (filtering pred). This is because the result of (filtering pred) becomes the step function that gets passed to (mapping f).

Having beta abstracted on conj, we’re now able to talk about and compose processes that work on collections or streams or whatever in the abstract, without knowing the specifics of the material that the process is going to work on. Functions such as mapping and filtering make “transducers” which can be composed to express step-wise processing.

That’s right - one beta abstraction step on something that we now wish to consider as incidental - conj - and we have transducers. There is a bit more detail to transducers, but this is pretty much the essence of how we get them. We’ve gone from an understanding of “how to map over a list”, “how to map over a vector”, “how to map over a set” and such specific “how to” instructions, to “this is what mapping is all about” and now we can apply that concept to specifics about lists, vectors and sets. Beta abstraction has once again led us to a higher level of understanding about the kinds of “how to” computations we’re expressing in code so that we can apply a common understanding in multiple specific contexts.

Conclusion

As the masters say, you can never claim to have understood a concept if you understand it in only one way. We approached the problem of composing two tables using two different operational means and arrived at the same two concepts - compose and table - combined in exactly the same way in both the cases. This lends some reality to these concepts, which we arrived at by working from a concrete implementation of cross referencing. We also saw how the derivations of transducers in Clojure is (pretty much) a single beta abstraction step.

Beta abstraction is a powerful systematic tool to explore problem spaces where we can take out entities only incidentally involved in an expression of problem through this mechanism to arrive at the core set of concepts and their relationship to each other - i.e. a “theory” of our domain.

Codaholic

Does the dog have Turing nature?

Contents

Beta abstraction for bottom up theory building

Theory building

A tale of two tables

The frog in the well

Climbing the ladder of abstraction

Exercise

The road less travelled - continuation passing style

Exercise

CPS part 2 - so what?

Getting real - transducers in clojure

Conclusion