Lisp-ish macros in Bash (sort of)

Introduction

There's a popular article called The Nature of Lisp (2006). Read it if you're interested in Lisp. All those parentheses, those macros — what's the big deal with Lisp, after all?

I remember admiring the article's gentle approach. Something happened, though. While reading it, I caught myself thinking: "Yes, macros are powerful when used judiciously. But are they exclusive to Lisp? Don't I also sort of do what they do, sometimes, in... Bash?"

Gasp! I couldn't be serious. Was I suddenly comparing the "Language of the Enlightened Programmer" to that... "non-language script thing"? A common feeling that mythical creatures called Real Programmers nurture towards Bash seems to be "reluctance to even touch it unless strictly unavoidable".

Why? Because it's slow, it's dangerous, it's untyped, it has inconsistent syntax, and everything is a string, except when it's something that sort of looks like it might be an array, which in the end will no doubt return... strings.

And I love it anyway. Go figure.

I gave it some thought and concluded that constructions (somewhat) resembling Lisp macros are right there, in the entrails of Bash, ready to be used (or abused) by the intrepid.

Ok, true:

  • it's not officially called "macro"¹,
  • it's not as elegant and solid and flexible and graceful as Lisp's,
  • it's not quite homoiconic,
  • it's almost certainly less safe than Lisp's,
  • it's much slower,
  • it abuses Bash's stringness,
  • and it sure has all sorts of lower-level differences from Lisp's.

¹ And yet, if a Lisp macro is basically a Lisp function that generates Lisp code, then is it so unreasonable to call a Bash function that generates Bash code a "Bash macro"?

So be warned: The comparison that follows is inexact, limited, and to be taken with a large grain of salt.

But it's there — at least from the day-to-day perspective of someone wanting to quickly input data and output its evaluation as code.

Let me show you what I mean. Then you can tell me whether it's at least practically useful.

Let's pick a Lisp macro example...

...from The Nature of Lisp article and run with it.

Say, this one (whose code I take the liberty to highlight):

"All we need to do is create the macros that convert our data to appropriate code! For example, a macro similar to our triple C macro we showed earlier looks like this:

(defmacro triple (x)
    '(+ ~x ~x ~x))

The quote prevents evaluation while the tilde allows it. Now every time triple is encountered in lisp code:

(triple 4)

it is replaced with the following code:

(+ 4 4 4)"

That was one example of macros from the article. Neat.

Ok, so here's an implementation of it in that "poor, just glue, non-general, scripting, not-really-a-language, little" language:

#!/bin/bash
triple() (echo $((x="$1",x+x+x))) #<--- Look, Bash can do parentheses, too!
triple 4
12

"No!", you say. "Don't return the value, return the function. It's metaprogramming. Generate code that can be executed!"

No problem:

triple()     (echo "\$(($1+$1+$1))")
maketriple() (echo echo \"You have $(triple $1)\")  # * ** ***
runtriple()  (. <(maketriple "$1"))

Notes:

*   Alternative that avoids escaping quotes, but uses sed ("impure Bash"):
maketriple_alt() (sed 's/.*/echo "&"/' <<< "You have $(triple $1)")
**  For the sake of legibility, in this article I'm not double-quoting
    variables as much as would be recommended for safety. I'm assuming that
    "$1" will be a number that you enter yourself. More careful quoting
    and more checks would be needed otherwise.

*** I'm also "using more parentheses than needed", for aesthetic reasons.
    This sacrifices speed, because it calls a subshell. Regular braces are
    usually faster.

So let's run that:

triple     4
maketriple 4
runtriple  4

Output:

$((4+4+4))
echo "You have $((4+4+4))"
You have 12
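
Note ** above admits that these functions trust their argument. As a minimal sketch of the "more checks" it mentions, here is a guarded generator. The name safetriple is hypothetical, and I'm assuming that valid input means a plain non-negative decimal integer:

```shell
#!/bin/bash
# Hypothetical guarded variant of triple: emit code only for plain integers.
# Assumption: valid input is a non-negative decimal with no leading zeros
# (a leading zero would make Bash arithmetic read the number as octal).
safetriple() {
    if [[ $1 =~ ^(0|[1-9][0-9]*)$ ]]; then
        printf '$((%s+%s+%s))\n' "$1" "$1" "$1"
    else
        echo "safetriple: not a plain integer: $1" >&2
        return 1
    fi
}

safetriple 4                                  # prints $((4+4+4))
safetriple '4; echo oops' || echo "rejected"  # prints "rejected" (plus a message on stderr)
```

Anything that fails the pattern never reaches the generated code, so there is nothing for the dot command to execute by accident.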

Ok, let's try something else. Can we implement a mapping function?

data="4 1 18 9 3"

map()        (for y in $2; do $1 $y; done)
# yes, arrays over here ^ would have been better than unquoting the string to
# pass it to "for", but it'd make our examples here less legible: "${2[@]}", see?

map maketriple "$data"; echo
map runtriple  "$data"

Output:

echo "You have $((4+4+4))"
echo "You have $((1+1+1))"
echo "You have $((18+18+18))"
echo "You have $((9+9+9))"
echo "You have $((3+3+3))"

You have 12
You have 3
You have 54
You have 27
You have 9
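
For the curious, here is roughly what the array-based version hinted at in the comment above could look like. It's only a sketch: map_arr is a hypothetical name, the three generators are restated with braces and stricter quoting so the block stands alone, and I'm assuming Bash 4 or later so that sourcing a process substitution behaves.

```shell
#!/bin/bash
# Same generators as in the article, written with braces and quoted arguments.
triple()     { echo "\$(($1+$1+$1))"; }
maketriple() { echo "echo \"You have $(triple "$1")\""; }
runtriple()  { . <(maketriple "$1"); }

# Hypothetical array-friendly map: the first argument is the function,
# every remaining argument is one item.
map_arr() {
    local f=$1 y
    shift
    for y in "$@"; do "$f" "$y"; done
}

data=(4 1 18 9 3)
map_arr runtriple "${data[@]}"   # prints "You have 12", 3, 54, 27 and 9, one per line
```

Passing the items as separate arguments means no word splitting is needed, at the cost of the noisier "${data[@]}" at the call site.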

More meta? Ok.

makemap()    (echo map $1 \"$2\")
makemap maketriple '$data'
makemap runtriple  '$data'

Output:

map maketriple "$data"
map runtriple "$data"

Still more meta? Ok.

runmap()     (. <(makemap "$@"))
runmap  maketriple '$data'; echo
runmap  runtriple  '$data'

Output:

echo "You have $((4+4+4))"
echo "You have $((1+1+1))"
echo "You have $((18+18+18))"
echo "You have $((9+9+9))"
echo "You have $((3+3+3))"

You have 12
You have 3
You have 54
You have 27
You have 9

And just to drive this home from a different angle, here are stepwise (macro?)expansions of the last expression above. All of the forms below evaluate to that same result.

runmap  runtriple  '$data'
. <(makemap runtriple '$data')
. <(echo map runtriple \"$data\")
. <(echo "for y in \$data; do runtriple \$y; done")
. <(echo "for y in \$data; do . <(maketriple \$y); done")
. <(echo "for y in \$data; do . <(echo echo \"You have $(triple \$y)\"); done")
. <(echo "for y in \$data; do . <(echo echo \"You have $(echo "\$((\$y+\$y+\$y))")\"); done")

. <(echo \
      "for y in \$data; do
         . <(echo \
               echo \
                 \"You have $(a='$(($y+$y+$y))'
                              echo \"$a\")\")
       done")

Look at the form and size of the last expression. Notice:

  1. composition of functions, and
  2. "data" being executed as code.

As you do, overlook for the moment Bash's unsugared and slightly inelegant backslashing and echoing. But also remember that while Lisp's strings-evaluating-to-themselves do away with the "echos", the language can be plagued with worse backslashing than Bash in its regexes, as in this short example from Emacs' org.el:

(defun org-fill-line-break-nobreak-p ()
  "Non-nil when a new line at point would create an Org line break."
  (save-excursion
    (skip-chars-backward "[ \t]")
    (skip-chars-backward "\\\\")
    (looking-at "\\\\\\\\\\($\\|[^\\\\]\\)")))

(I suppose the proper way to translate the last line would be something like:
"You are allowed to create an Org line break whenever you find yourself staring at Ba'al, the Soul‑Eater.")
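
The stepwise expansions above were written out by hand. A rough way to let Bash show one level of expansion before running it is sketched below. The helper show_then_run is hypothetical (a very loose cousin of Lisp's macroexpand-1), and the generators are restated so the block stands alone:

```shell
#!/bin/bash
triple()     { echo "\$(($1+$1+$1))"; }
maketriple() { echo "echo \"You have $(triple "$1")\""; }

# Hypothetical helper: generate code with the given command, print it to
# stderr prefixed with "+ ", then source it in the current shell.
show_then_run() {
    local code
    code=$("$@") || return
    printf '+ %s\n' "$code" >&2
    . <(printf '%s\n' "$code")
}

show_then_run maketriple 4
# stderr: + echo "You have $((4+4+4))"
# stdout: You have 12
```

Capturing the generated text in a variable first is a design choice: it guarantees that what gets displayed is exactly what gets executed.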

What is this odd . <(commands) syntax, anyway?

Let's see:

help .
.: . filename [arguments]
    Execute commands from a file in the current shell.

    Read and execute commands from FILENAME in the current shell.  The
    entries in $PATH are used to find the directory containing FILENAME.
    If any ARGUMENTS are supplied, they become the positional parameters
    when FILENAME is executed.

    Exit Status:
    Returns the status of the last command executed in FILENAME; fails if
    FILENAME cannot be read.

So the dot is a synonym for source, which executes commands in a file as if they were in the current shell (or script).

"But wait!", you say, "We aren't passing a file."

True. We're passing <(commands).
What this means is: execute commands in a subshell and pretend that the results were read from a file.
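
A small sketch to make that "pretend file" tangible. The variable name greeting is just an illustration, and the path printed by the first command differs between systems:

```shell
#!/bin/bash
# The substitution is replaced by a file name; the exact name varies by system
# (often something under /dev/fd).
echo <(true)

# Any command that takes a file name can read the inner command's output from it.
cat <(echo "hello from a pretend file")

# Sourcing it runs the text in the current shell, so the assignment persists.
. <(echo 'greeting="set by generated code"')
echo "$greeting"    # prints: set by generated code
```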

This syntax is useful for feeding commands that expect file names as arguments. A construction I use often is:

paste <(seq 10 12) <(seq 20 22)
10	20
11	21
12	22

since paste takes files as input and I don't want to create two 3-line files.

I also use it in while blocks:

while read -r line
do stuff with the "$line"
done < <(seq 5)

because I don't want to create a temp file, which is what this version would need:

while read -r line
do stuff with the "$line"
done < some-file-with-seq-results

And the following is fine, as long as you don't assign variables inside the while loop: the loop sits after the pipe, so it runs in a subshell and the assignments would be lost (Bash is full of traps, you know).

seq 5 |
    while read -r line
    do stuff with the "$line"
    done
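
Here is a minimal sketch of that trap. The function names are hypothetical, and I'm assuming Bash's default settings, which the first line makes explicit by switching off the lastpipe option:

```shell
#!/bin/bash
shopt -u lastpipe   # the default; stated explicitly so the demo is predictable

# Counts lines with the loop on the right-hand side of a pipe.
count_via_pipe() {
    local count=0
    seq 5 | while read -r _; do count=$((count + 1)); done
    echo "$count"
}

# Counts lines with the loop reading from a process substitution.
count_via_procsub() {
    local count=0
    while read -r _; do count=$((count + 1)); done < <(seq 5)
    echo "$count"
}

count_via_pipe      # prints 0: the increments happened in a subshell
count_via_procsub   # prints 5: the loop ran in the current shell
```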

For more, search man bash:

/Process Substitution

In summary,

. <(commands)

means: pretend the output of commands comes from a file, and execute it.

For more examples of use of this syntax, check my Enjoying Lisp-like quasiquotation in Bash.

Some closing thoughts

Speed

Parentheses here imply calling a subshell. This has a cost in speed, especially if this syntax is used inside loops. When speed is an important concern, Bash is likely to be a poor choice of language anyway.
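
If you want to measure the difference on your own machine, something like the following sketch should do. The two function names are hypothetical, and the loop count of 1000 is arbitrary:

```shell
#!/bin/bash
triple_sub()   ( echo $(( $1 + $1 + $1 )) )    # parentheses: body runs in a subshell
triple_brace() { echo $(( $1 + $1 + $1 )); }   # braces: body runs in the current shell

# Timings are printed on stderr and vary by machine, so none are shown here.
time for i in {1..1000}; do triple_sub   4 >/dev/null; done
time for i in {1..1000}; do triple_brace 4 >/dev/null; done
```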

Security

Executing strings as code can be unsafe. So use it judiciously, or don't.

Is this syntax any safer than eval? I'm not sure. But I do find it more predictable than eval, and simpler to use.
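
What can be said with some confidence is that both forms execute the generated text in the current shell, so an unescaped argument is equally dangerous in either. The sketch below uses hypothetical generator names and a harmless payload. The last variant escapes the argument with printf's %q, which is one possible mitigation rather than a complete answer:

```shell
#!/bin/bash
# Hypothetical generator that pastes its argument into code unescaped.
greet_code()   { echo "echo hello $1"; }
# Variant that escapes the argument with printf's %q before pasting it.
greet_code_q() { printf 'echo hello %q\n' "$1"; }

payload='world; echo INJECTED'

eval "$(greet_code "$payload")"   # prints "hello world", then "INJECTED"
. <(greet_code "$payload")        # same two lines
. <(greet_code_q "$payload")      # prints "hello world; echo INJECTED"
```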

And so

Bash is no Lisp (like, obviously).
But it can (sort of, in a way, so to speak, to some extent, with a grain of salt) do macros.