The art of replacing long lambdas II: llamas vs. anaphorics in dash and xht

(See Part I.)

Anaphoric macros and llama compete as viable solutions for replacing long lambdas.

But if you're already requiring a library that has anaphoric macros, should you use them? Or should you instead pick the regular non-anaphoric function plus a llama?

To answer that question, it helps if we analyze the matter by arity, because it affects readability.

And we can see that almost half of dash's anaphorics, and almost all of xht's anaphorics, replace dyadic lambdas.

In dash

Dash has about 67 anaphoric macros, of which:

31 replace one monadic lambda
26 replace one dyadic lambda
06 replace two monadic lambdas
02 replace one dyadic plus one monadic lambda
02 are threading macros, so replace no lambda
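As a quick sanity check of the breakdown above, the counts can be summed in plain Emacs Lisp. The names below are made up for this check; the numbers are simply the ones listed above.

```elisp
;; Counts copied from the list above. `my-dash-anaphoric-counts' is a
;; hypothetical name used only for this check.
(defconst my-dash-anaphoric-counts
  '((one-monadic . 31)
    (one-dyadic . 26)
    (two-monadic . 6)
    (dyadic-plus-monadic . 2)
    (threading . 2)))

(defun my-dash-anaphoric-summary ()
  "Return (TOTAL DYADIC PERCENT) for `my-dash-anaphoric-counts'.
DYADIC counts every macro that replaces at least one dyadic lambda."
  (let* ((total (apply #'+ (mapcar #'cdr my-dash-anaphoric-counts)))
         (dyadic (+ (alist-get 'one-dyadic my-dash-anaphoric-counts)
                    (alist-get 'dyadic-plus-monadic my-dash-anaphoric-counts))))
    (list total dyadic (round (/ (* 100.0 dyadic) total)))))

(my-dash-anaphoric-summary) ; → (67 28 42)
```

So 28 of the 67 macros, roughly 42%, involve a dyadic lambda.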

How do anaphorics compare with regular plus llama in dash?

For dyadic lambdas, dash's anaphorics have better readability

For dyadic lambdas, anaphorics' explicit variable-naming can make a difference.

The choice of anaphora quickly conveys its meaning. So llama's shortness is often bought at the cost of some readability.

This one below is fine, because:

you know sorting functions well
the two arguments are consecutive items
and the thing matches exactly what you would expect from something like >:

(--sort (> it other) '(13 20 0 19)) => '(20 19 13 0)
(-sort (##> % %2) '(13 20 0 19))    => '(20 19 13 0)

The next one is already a bit less readable, and it is easier to confuse which argument is the accumulator (although it doesn't matter in this particular example, because we are just adding them).

(-tree-mapreduce (-const 1) (##+ % %2) '(a (b (c d e) (f)) ((g h) i ((j))))) => 10
(--tree-mapreduce 1 (+ it acc) '(a (b (c d e) (f)) ((g h) i ((j)))))         => 10

Here llama gets harder to read:

;; Notice that dash's and seq's map-indexed functions have opposite argument order

(seq-map-indexed (lambda (elt idx) (/ idx elt)) '(3 .5 .5 3)) ; seq  + lambda
(-map-indexed (lambda (idx elt) (/ idx elt)) '(3 .5 .5 3))    ; dash + lambda
(--map-indexed (/ it-index it) '(3 .5 .5 3))                  ; dash anaphoric
(seq-map-indexed (##/ %2 %) '(3 .5 .5 3))                     ; seq  + llama
(-map-indexed (##/ % %2) '(3 .5 .5 3))                        ; dash + llama

=> '(0 2.0 4.0 1)

Quick, look at the two llamas above: which one is index, which is element?
While llama is shorter, it's not obvious what % and %2 refer to.
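To answer the question, it can help to spell one of the llamas out as an ordinary lambda. The sketch below does this for the seq variant, using only the built-in seq library. It assumes that % is shorthand for %1, and the name my-index-over-element is made up for illustration.

```elisp
(require 'seq)

;; Roughly what the seq llama (##/ %2 %) stands for, written by hand.
;; `my-index-over-element' is a hypothetical name, not part of any library.
(defalias 'my-index-over-element
  (lambda (%1 %2)   ; seq-map-indexed calls this with (element index)
    (/ %2 %1)))     ; so this is index divided by element

(seq-map-indexed #'my-index-over-element '(3 .5 .5 3)) ; → (0 2.0 4.0 1)
```

The positional names say nothing about meaning. The reader has to recall the argument order of the mapping function to know that %2 is the index here.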

Finally, well-coded anaphorics include "ignores" in their definitions for when some variable is not used, which allows for shorter syntax. Recent versions of llama allow this too, but readability can still suffer.

Example:

(seq-map-indexed (lambda (item index) (expt index 2)) '(1 2 3 4)) => '(0 1 4 9)
(--map-indexed (expt it-index 2) '(1 2 3 4))                      => '(0 1 4 9)
(seq-map-indexed (##expt %2 2) '(1 2 3 4))                        => '(0 1 4 9)

(Quick, what does %2 refer to above?)
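The "ignores" mentioned above can be sketched with a small macro. This is a minimal illustration, not dash's actual definition: the name my--map-indexed is made up, and it wraps the built-in seq-map-indexed, whose function receives the item first and the index second.

```elisp
(require 'seq)

;; Hypothetical macro, for illustration only.
(defmacro my--map-indexed (form list)
  "Map FORM over LIST with `it' bound to each item and `it-index' to its index."
  `(seq-map-indexed
    (lambda (it it-index)
      ;; Mentioning both variables here means FORM may use only one of
      ;; them (or neither) without triggering unused-variable warnings.
      (ignore it it-index)
      ,form)
    ,list))

(my--map-indexed (expt it-index 2) '(1 2 3 4)) ; only it-index is used
(my--map-indexed (/ it-index it) '(3 .5 .5 3)) ; both are used
```

Because the macro takes care of the unused variable, the caller never has to mark it, and the names it and it-index stay self-explanatory.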

Corner cases and additional comparisons

Anaphorics look simpler when using identity

That's true when arity=1:

(-filter (lambda (x) x) '(a nil nil b nil c)) => '(a b c)
(-filter #'identity '(a nil nil b nil c))
(--filter it '(a nil nil b nil c))
;; these work, but are longer
(-filter (##identity %) '(a nil nil b nil c))
(-filter (##prog1 %) '(a nil nil b nil c))
;; (well, in this one particular case, there are these, too)
(remove nil '(a nil nil b nil c))
(-non-nil '(a nil nil b nil c))

Also true when arity=2, for both arguments.

Here's identity with the second argument:

(-map-indexed (lambda (index item) item) '(a b c)) => '(a b c)
(--map-indexed it '(a b c))
;; these work, but are longer
(-map-indexed (##identity %2) '(a b c))
(-map-indexed (##prog1 %2) '(a b c))

Here's identity with the first argument:

(-map-indexed (lambda (index item) index) '(a b c)) => '(0 1 2)
(--map-indexed it-index '(a b c))
;; these work, but are longer
(-map-indexed (##prog2 _%2 %1) '(a b c))
(-map-indexed (##prog1 %1 _%2) '(a b c))

When using llama, backquotes may need to be replaced

When it happens at the beginning of the body, we need to adapt it:

(funcall (lambda (y) `(2 ,y)) 3) => '(2 3)
(funcall (-cut list 2 <>) 3)
(funcall (##list 2 %) 3)
;; so -cut would work here, but it'd also need to be expanded as list

(funcall (lambda (y) `(2 . ,y)) 3) => '(2 . 3)
(funcall (-cut cons 2 <>) 3)
(funcall (##cons 2 %) 3)
;; so -cut would work here, but it'd also need to be expanded as cons

When nested, we can displace the backquoting:

(funcall (lambda (a b c) `(,a (,b [,c]))) 1 3 5) => '(1 (3 [5]))
(funcall (##list %1 `(,%2 [,%3])) 1 3 5)
;; there seems to be no way to use -cut directly here
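The rewrites above rely on each backquote template having an equivalent constructor call. The following sketch checks those equivalences in plain Emacs Lisp, with no llama or dash involved; the function name is made up for illustration.

```elisp
;; Hypothetical helper: compares each backquote template with the
;; explicit constructor call used in its place above.
(defun my-backquote-equivalent-p (a b c)
  "Return t if the backquoted forms and their rewrites build equal values."
  (and (equal `(2 ,b)         (list 2 b))          ; proper list  -> list
       (equal `(2 . ,b)       (cons 2 b))          ; dotted pair  -> cons
       (equal `(,a (,b [,c])) (list a `(,b [,c]))))) ; outer level only

(my-backquote-equivalent-p 1 3 5) ; → t
```

In the nested case only the outermost template is turned into a list call; the inner backquote can stay, since it is no longer at the head of the body.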

In dash, replacing every anaphoric with regular+llama doesn't seem warranted

Clarity is pretty much the same when dealing with simple monadic lambdas, as in:

(--each bar (foo (1+ it)))

versus

(-each bar (##foo (1+ %)))

But for dyadic lambdas, we have it plus it-index (or other, or acc) versus % plus %2. In these cases, the anaphoric solution is usually clearer than dash's regular function plus a llama.

Moreover, llama would be an additional library, whereas dash's anaphorics would be already available.

Therefore, if dash is already being required, it seems preferable to simply use its anaphorics.

In xht

More than 30 of xht's regular functions have anaphoric counterparts.

How do anaphorics compare with regular plus llama in xht?

For dyadic lambdas, anaphorics are usually more readable than llamas

xht's regular function h-lmap takes one dyadic lambda; h--lmap is its anaphoric counterpart.
They both return a list.

xht's regular function h-hmap takes two dyadic lambdas; h--hmap is its anaphoric counterpart.
They both return a hash table.

What would the regular ones look like when replacing the lambda with a llama?

example 2a

In this example, the function passed is a sharp-quoted symbol.

(h-lmap #'concat
        (h* "Bob"      "cat"
            "Whiskers" "cat"
            "Bubbly"   "fish"))

(h-lmap (##concat % %2)
        (h* "Bob"      "cat"
            "Whiskers" "cat"
            "Bubbly"   "fish"))

(h--lmap (concat key value)
         (h* "Bob"      "cat"
             "Whiskers" "cat"
             "Bubbly"   "fish"))

=> '("Bobcat" "Whiskerscat" "Bubblyfish")

So the first one is the shortest and simplest, which is good.
But it's not clear how many arguments it should receive, nor what they are supposed to be, which is bad.

The llama one makes it a bit longer, which is bad.
But it also makes it clearer that the variadic concat receives two arguments here, which is good.
But then you don't know exactly what those arguments are, which is bad.
But then you know what it is that you're mapping (so it's not that hard to infer), which is good.
But this comes at a small cognitive cost, which is bad.

The anaphoric one makes it still a bit longer, which is bad.
But it also makes it clearer that the variadic concat receives two arguments here, which is good.
And it also shows that these are key and value, so there's no extra cognitive cost, which is good.

The first, sharp-quoted one is the shortest.
The anaphoric one is the clearest.

example 2b

In this example, the function passed is a dyadic lambda that uses both arguments.

(h-lmap (lambda (k v) (format "%s→%s" (type-of k) (type-of v)))
        (h* "a" 1 'd  (h* "d1" 41 "d2" 42)))

(h-lmap (##format "%s→%s" (type-of %) (type-of %2))
        (h* "a" 1 'd  (h* "d1" 41 "d2" 42)))

(h--lmap (format "%s→%s" (type-of key) (type-of value))
         (h* "a" 1 'd  (h* "d1" 41 "d2" 42)))

=> '("string→integer" "symbol→hash-table")

Here, the first one is now too long; llama makes it much shorter, as does the anaphoric.

Between these two, llama is a bit shorter, but the anaphoric provides extra clarity by naming the variables.

example 2c

In this example, two functions are passed, as dyadic lambdas.

Only the first argument is used in the first lambda. In the second lambda, both are used.

(h-hmap  (lambda (k v) (upcase k))
         (lambda (k v) (concat k (capitalize v)))
         (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))

(h-hmap  (##upcase % _%2)
         (##concat % (capitalize %2))
         (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))

(h--hmap (upcase key)
         (concat key (capitalize value))
         (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))

H=> (h* "BOB"      "BobCat"
        "WHISKERS" "WhiskersCat"
        "BUBBLY"   "BubblyFish")

Here, the version with the llamas is shorter than the one with the regular lambdas.
The dyadic nature of the lambda is marked with a _%2.
Without that, h-hmap would throw a wrong-number-of-arguments error.
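The error comes from ordinary lambda arity, which can be reproduced without xht or llama. In the sketch below, a one-argument lambda stands in for what a llama without the _%2 marker would presumably denote; the helper name is made up for illustration.

```elisp
;; Hypothetical helper: calls FN the way a key/value mapper would,
;; with two arguments, and reports the error symbol if the call fails.
(defun my-call-with-pair (fn)
  "Call FN with a key and a value; return its result or the error symbol."
  (condition-case err
      (funcall fn "Bob" "cat")
    (error (car err))))

;; One parameter, two arguments: the call is rejected.
(my-call-with-pair (lambda (k) (upcase k)))    ; → wrong-number-of-arguments

;; Two parameters, the second one ignored: the call succeeds.
(my-call-with-pair (lambda (k _v) (upcase k))) ; → "BOB"
```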

In the anaphoric version, the dyadic nature is already implied.
Both variables are named, which makes it immediately readable.

The llama version is a bit shorter, but the anaphoric version is clearer.

example 2d

In this example, two dyadic lambdas are passed.

Only one of the arguments is used in each of them.

In the second lambda, there's no function call: the value is passed as it is.

(h-hmap  (lambda (k _v) (h-as-keyword k))
         (lambda (_k v) v)
         (h* 'a 1 'b 2 'c 3))

(h-hmap  (##h-as-keyword % _%2)
         (##prog1 %2)
         (h* 'a 1 'b 2 'c 3))

(h--hmap (h-as-keyword key)
         value
         (h* 'a 1 'b 2 'c 3))

H=> (h* :a 1 :b 2 :c 3)

This example is similar to the previous one.

Here, however, there's no function to evaluate in the second lambda. It's just the value.
But llama expects a function to be passed to ##.
So we need to use a prog1, which makes it a bit longer.

In the anaphoric version, value can be passed directly, without the prog1 workaround.

The anaphoric version makes it as clear as it can be. It's also shorter.

For typical use of triadic lambdas, anaphorics are much more readable than llamas

xht's regular function h-2d-hmap takes three triadic lambdas; h--2d-hmap is its anaphoric counterpart.

They both return a 2D hash table.

What would the regular one look like when replacing the lambdas with llamas?

example 3a

(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h-2d-hmap (lambda (k _ _) k)
                (lambda (_ f _) f)
                (lambda (k f _) (expt k f)))
     h->orgtbl)

(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h-2d-hmap (##prog1 %1 _%3)
                (##prog1 %2 _%3)
                (##expt %1 %2 _%3))
     h->orgtbl)

(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h--2d-hmap key field (expt key field))
     h->orgtbl)

O=> "|  n |    2 |     3 |       4 |
     |----+------+-------+---------|
     | 10 |  100 |  1000 |   10000 |
     | 21 |  441 |  9261 |  194481 |
     | 42 | 1764 | 74088 | 3111696 |"

The llama version is only a bit shorter than the lambda one.
It's the least clear about the nature of input variables.
The triadic nature of the lambda is passed to llama with the _%3.
A prog1 is needed to make llama return just the variables.

The anaphoric version passes the named variables directly.
It's by far the shortest.
It's as readable and as simple as it can be.

In xht, anaphorics look better overall

Examples 2d and 3a make a particularly strong case for the value of anaphoric macros.

Anaphorics look better here:

For all the examples above, among the three options, the anaphorics are the clearest, and almost always also the shortest.
The llamas are often shorter than their respective lambdas here, but the lambdas come out more readable.

Moreover, llama would be an additional library, whereas xht's anaphorics would be already available.

Therefore, if xht is already being required, it seems preferable to simply use its anaphorics.

Anaphorics vs. llama: a summary table

We can now look again at the question of anaphorics vs. llama in light of what we've seen so far.

| Anaphoric alternative...   | arity=1                                    | arity=2                        | arity=3     |
|----------------------------|--------------------------------------------|--------------------------------|-------------|
| doesn't exist at all       | `mapatoms` · `mapconcat`                   | `cl-sort` (destructive)        |             |
| exists in another library  | `mapcar` · `mapc` · `mapcan` · `seq-find`  | `maphash`                      |             |
| exists in the same library | `-map` · `-each` · `-mapcat` · `-find`     | `-sort` · `h-lmap` · `h-hmap`  | `h-2d-hmap` |
| has no lambda counterpart  | `-->` · `-some-->`                         |                                |             |

With the exception of the last line (which doesn't apply), the case for llama gets stronger towards the table's upper left, and weaker towards its lower right.

See next

Mixing llamas and anaphorics is possible — but it's not for the faint of heart.

In Part III we see how we can combine them to disentangle nested anaphoras and reorganize threading positions.