The art of replacing long lambdas II: llamas vs. anaphorics in dash and xht
(See Part I.)
Anaphoric macros and llama are both viable ways to replace long lambdas.
But if you're already requiring a library that has anaphoric macros, should you use them? Or should you instead pick the regular non-anaphoric function plus a llama?
Let's have a look at that.
It helps if we analyze the matter by arity, because in practice most lambdas receive a single argument, and there are some distinct features to consider in the other cases.
Some of dash's anaphorics and most of xht's anaphorics are dyadic. We can now look at each of these two libraries separately.
In dash
How do anaphorics compare with llama in dash?
Similar readability for monadic lambdas
Anaphoric dash and llama seem to be similar in terms of readability and length for monadic lambdas, because it and % are pretty much the same:
(--iterate (* it it) 3 4) => '(3 9 81 6561)
(-iterate (##* % %) 3 4)

(--splice (> it 1) (list it it) '(4 0 2)) => '(4 4 0 2 2)
(-splice (##> % 1) (##list % %) '(4 0 2))

(--find (> it 1) '(-1 0 -2 7 3)) => 7
(-find (##> % 1) '(-1 0 -2 7 3))
For dyadic lambdas, dash has better readability
Most of dash's anaphorics happen to be monadic.
But beyond that, anaphorics have explicit variable-naming. The choice of anaphora quickly conveys its meaning. So llama's shortness is often bought at the cost of some readability.
This one below is fine, because:
- you know sorting functions well
- the two arguments are consecutive items
- and the thing matches exactly what you would expect from something like >:
(--sort (> it other) '(13 20 0 19)) => '(20 19 13 0)
(-sort (##> % %2) '(13 20 0 19)) => '(20 19 13 0)
The next one is already a bit less readable, and there could be more confusion about which one is the accumulator (although it wouldn't matter in this particular example, because we are adding them).
(-tree-mapreduce (-const 1) (##+ % %2) '(a (b (c d e) (f)) ((g h) i ((j))))) => 10
(--tree-mapreduce 1 (+ it acc) '(a (b (c d e) (f)) ((g h) i ((j))))) => 10
Here llama gets harder to read:
;; Notice that dash's and seq's map-indexed functions have opposite argument order
(seq-map-indexed (lambda (elt idx) (/ idx elt)) '(3 .5 .5 3)) ; seq + lambda
(-map-indexed (lambda (idx elt) (/ idx elt)) '(3 .5 .5 3))    ; dash + lambda
(--map-indexed (/ it-index it) '(3 .5 .5 3))                  ; dash anaphoric
(seq-map-indexed (##/ %2 %) '(3 .5 .5 3))                     ; seq + llama
(-map-indexed (##/ % %2) '(3 .5 .5 3))                        ; dash + llama
=> '(0 2.0 4.0 1)
Quick, look at the two llamas above: which one is index, which is element?
While llama is shorter, it's not obvious what % and %2 refer to.
Finally, well-coded anaphorics include "ignores" in their definitions for when some variable is not used, which allows for shorter syntax. So does llama since more recent versions — but readability can still suffer.
Example:
(seq-map-indexed (lambda (item index) (expt index 2)) '(1 2 3 4)) => '(0 1 4 9)
(--map-indexed (expt it-index 2) '(1 2 3 4)) => '(0 1 4 9)
(seq-map-indexed (##expt %2 2) '(1 2 3 4)) => '(0 1 4 9)
(Quick, what does %2 refer to above?)
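To make the "ignores" idea concrete, here is a minimal sketch of how an anaphoric macro might be defined. The name my--map-indexed is hypothetical, and this is not dash's actual definition; it only assumes the built-in seq-map-indexed, which passes the element first and the index second:

```elisp
(require 'seq)

;; Hypothetical macro, written for illustration only; not dash's actual code.
(defmacro my--map-indexed (form list)
  "Evaluate FORM for each element of LIST, binding `it' and `it-index'."
  `(seq-map-indexed (lambda (it it-index)
                      ;; Mark both anaphoras as used, so that the byte-compiler
                      ;; stays quiet when FORM mentions only one of them.
                      (ignore it it-index)
                      ,form)
                    ,list))

(my--map-indexed (expt it-index 2) '(1 2 3 4)) ; => (0 1 4 9)
```

Because the macro itself takes care of the unused variable, the caller never has to write anything like _%1 or _it.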
Corner cases and additional comparisons
Anaphorics look simpler when using identity.
That's true when arity=1:
(-filter (lambda (x) x) '(a nil nil b nil c)) => '(a b c)
(-filter #'identity '(a nil nil b nil c))
(--filter it '(a nil nil b nil c))
;; these work, but are longer
(-filter (##identity %) '(a nil nil b nil c))
(-filter (##prog1 %) '(a nil nil b nil c))
;; (well, in this one particular case, there are these, too)
(remove nil '(a nil nil b nil c))
(-non-nil '(a nil nil b nil c))
Also true when arity=2, for both arguments.
Here's identity with the second argument:
(-map-indexed (lambda (index item) item) '(a b c)) => '(a b c)
(--map-indexed it '(a b c))
;; these work, but are longer
(-map-indexed (##identity %2) '(a b c))
(-map-indexed (##prog1 %2) '(a b c))
Here's identity with the first argument:
(-map-indexed (lambda (index item) index) '(a b c)) => '(0 1 2)
(--map-indexed it-index '(a b c))
;; these work, but are longer
(-map-indexed (##prog2 _%2 %1) '(a b c))
(-map-indexed (##prog1 %1 _%2) '(a b c))
When using llama, backquotes need to be replaced with list:
(funcall (lambda (y) `(2 ,y)) 3) => '(2 3)
(funcall (-cut list 2 <>) 3)
(funcall (##list 2 %) 3)
;; so -cut would work here, but it'd also need to be expanded as list
But that replacement is only needed for the outermost form; nested backquotes can stay:
(funcall (lambda (a b) `(,a (,b))) 1 3) => '(1 (3))
(funcall (##list % `(,%2)) 1 3)
;; there seems to be no way to use -cut directly here
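That the backquoted form and the list-based rewrite are interchangeable can be checked with plain lambdas, without llama. A small sketch using only built-ins:

```elisp
;; The original lambda, with a backquote at the outermost level.
(funcall (lambda (a b) `(,a (,b))) 1 3)      ; => (1 (3))

;; Same result with `list' outermost and the inner backquote left alone.
(funcall (lambda (a b) (list a `(,b))) 1 3)  ; => (1 (3))
```

The second form is the shape a llama needs, since its first element is a function name rather than a backquote.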
In dash, replacing every anaphoric with regular+llama doesn't seem warranted
Clarity is pretty much the same when dealing with simple monadic lambdas, as in:
(--each bar (foo (1+ it)))
versus
(-each bar (##foo (1+ %)))
But for dyadic lambdas, we have it plus it-index (or plus other, or plus acc) versus % plus %2. In these cases, the anaphoric solution is usually clearer than using dash's regular function plus a llama.
Moreover, llama would be an additional library, whereas dash's anaphorics would be already available.
Therefore, if dash is already being required, it seems preferable to simply use its anaphorics.
In xht
xht offers an anaphoric alternative for more than 30 of its regular functions.
For dyadic lambdas, anaphorics are usually more readable than llamas
xht's regular function h-lmap takes one dyadic lambda; h--lmap is its anaphoric counterpart.
They both return a list.
xht's regular function h-hmap takes two dyadic lambdas; h--hmap is its anaphoric counterpart.
They both return a hash table.
What would the regular ones look like when replacing the lambda with a llama?
example 2a
In this example, the function passed is a sharp-quoted symbol.
(h-lmap #'concat (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
(h-lmap (##concat % %2) (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
(h--lmap (concat key value) (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
=> '("Bobcat" "Whiskerscat" "Bubblyfish")
So the first one is the shortest and simplest, which is good.
But it's not clear how many arguments it should receive, nor what they are supposed to be, which is bad.
The llama one makes it a bit longer, which is bad.
But it also makes it clearer that the variadic concat expects two arguments, which is good.
But then you don't know exactly what those arguments are, which is bad.
But then you know what it is that you're mapping (so it's not that hard to infer), which is good.
But this comes at a small cognitive cost, which is bad.
The anaphoric one makes it still a bit longer, which is bad.
But it also makes it clearer that the variadic concat expects two arguments, which is good.
And it also shows that these are key and value, so there's no extra cognitive cost, which is good.
The first, sharp-quoted one is the shortest.
The anaphoric one is the clearest.
example 2b
In this example, the function passed is a dyadic lambda that uses both arguments.
(h-lmap (lambda (k v) (format "%s→%s" (type-of k) (type-of v)))
        (h* "a" 1 'd (h* "d1" 41 "d2" 42)))
(h-lmap (##format "%s→%s" (type-of %) (type-of %2))
        (h* "a" 1 'd (h* "d1" 41 "d2" 42)))
(h--lmap (format "%s→%s" (type-of key) (type-of value))
         (h* "a" 1 'd (h* "d1" 41 "d2" 42)))
=> '("string→integer" "symbol→hash-table")
In this case, the first one is now too long, and llama makes it much shorter, as does the anaphoric.
Between these two, llama is a bit shorter, but the anaphoric provides extra clarity by naming the variables.
example 2c
In this example, two functions are passed, as dyadic lambdas.
Only the first argument is used in the first lambda. In the second lambda, both are used.
(h-hmap (lambda (k v) (upcase k))
        (lambda (k v) (concat k (capitalize v)))
        (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
(h-hmap (##upcase % _%2)
        (##concat % (capitalize %2))
        (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
(h--hmap (upcase key)
         (concat key (capitalize value))
         (h* "Bob" "cat" "Whiskers" "cat" "Bubbly" "fish"))
H=> (h* "BOB" "BobCat" "WHISKERS" "WhiskersCat" "BUBBLY" "BubblyFish")
Here, the version with the llamas is shorter than the one with the regular lambdas.
The dyadic nature of the lambda is marked with a _%2.
Without that, h-hmap would signal a wrong-number-of-arguments error.
In the anaphoric version, the dyadic nature is already implied.
Both variables are named, which makes it immediately readable.
The llama version is a bit shorter, but the anaphoric version is clearer.
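The arity point can be reproduced with plain lambdas, with neither llama nor xht loaded. A minimal sketch, assuming only that the caller passes two arguments (key and value) to the function it receives; my-call-with-pair is a hypothetical stand-in for that caller, not an xht function:

```elisp
;; Hypothetical helper: calls FN with two arguments, as a dyadic mapper would,
;; and reports an arity mismatch instead of letting the error propagate.
(defun my-call-with-pair (fn key value)
  (condition-case nil
      (funcall fn key value)
    (wrong-number-of-arguments 'arity-error)))

;; A one-argument function cannot be called with two arguments.
(my-call-with-pair (lambda (k) (upcase k)) "Bob" "cat")    ; => arity-error

;; Declaring the unused second argument fixes the arity.
(my-call-with-pair (lambda (k _v) (upcase k)) "Bob" "cat") ; => "BOB"
```

The _v in the second lambda plays the same role as _%2 in the llama: it is never read, but it makes the function accept two arguments.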
example 2d
In this example, two dyadic lambdas are passed.
Only one of the arguments is used in each of them.
In the second lambda, there's no function call: the value is passed as it is.
(h-hmap (lambda (k _v) (h-as-keyword k))
        (lambda (_k v) v)
        (h* 'a 1 'b 2 'c 3))
(h-hmap (##h-as-keyword % _%2)
        (##prog1 %2)
        (h* 'a 1 'b 2 'c 3))
(h--hmap (h-as-keyword key)
         value
         (h* 'a 1 'b 2 'c 3))
H=> (h* :a 1 :b 2 :c 3)
This example is similar to the previous one.
Here, however, there's no function to evaluate in the second lambda. It's just the value.
But llama expects a function to be passed to ##.
So we need to use a prog1, which makes it a bit longer.
In the anaphoric version, value can be passed directly, without the prog1 workaround.
The anaphoric version makes it as clear as it can be. It's also shorter.
For typical use of triadic lambdas, anaphorics are much more readable than llamas
xht's regular function h-2d-hmap takes three triadic lambdas; h--2d-hmap is its anaphoric counterpart.
They both return a 2D hash table.
What would the regular one look like when replacing the lambdas with llamas?
example 3a
(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h-2d-hmap (lambda (k _ _) k)
                (lambda (_ f _) f)
                (lambda (k f _) (expt k f)))
     h->orgtbl)
(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h-2d-hmap (##prog1 %1 _%3)
                (##prog1 %2 _%3)
                (##expt %1 %2 _%3))
     h->orgtbl)
(->> (h-2d-new '(10 21 42) '(n 2 3 4))
     (h--2d-hmap key field (expt key field))
     h->orgtbl)
O=> "|  n |    2 |     3 |       4 |
|----+------+-------+---------|
| 10 |  100 |  1000 |   10000 |
| 21 |  441 |  9261 |  194481 |
| 42 | 1764 | 74088 | 3111696 |"
The llama version is only a bit shorter than the lambda one.
It's the least clear about the nature of input variables.
The triadic nature of the lambda is passed to llama with the _%3.
A prog1 is needed to make llama return just the variables.
The anaphoric version passes the named variables directly.
It's by far the shortest.
It's as readable and as simple as it can be.
In xht, anaphorics look better overall
Examples 2d and 3a make a particularly strong case for the value of anaphoric macros.
Anaphorics look better here:
- For all the examples above, among the three options, the anaphorics are the clearest, and almost always also the shortest.
- The llamas are often shorter than their respective lambdas here, but the lambdas come out more readable.
Moreover, llama would be an additional library, whereas xht's anaphorics would be already available.
Therefore, if xht is already being required, it seems preferable to simply use its anaphorics.
Anaphorics vs. llama: a summary table
We can now look again at the question of anaphorics vs. llama in light of what we've seen so far.
Anaphoric alternative... | arity=1 | arity=2 | arity=3 |
---|---|---|---|
doesn't exist at all | mapatoms · mapconcat | cl-sort (destructive) | |
exists in another library | mapcar · mapc · mapcan · seq-find | maphash | |
exists in the same library | -map · -each · -mapcat · -find | -sort · h-lmap · h-hmap | h-2d-hmap |
has no lambda counterpart | --> · -some--> |
With the exception of the last row (which doesn't apply), the case for llama gets stronger towards the table's upper left, and weaker towards its lower right.
See next
Mixing llamas and anaphorics is possible — but it's not for the faint of heart.
In Part III we see how we can combine them to disentangle nested anaphoras and reorganize threading positions.