Underappreciated Bash idioms
There are Bash idioms that I use often and like a lot, yet rarely see in the wild.
Is my taste to your taste? Then maybe these seemingly endangered species can reproduce and populate new habitats.
Pipes at EOL
In a 2021 article about the Oil Shell, there's something the author dislikes:
cat file.txt \
  | sort \         # I can't put a comment here
  | cut -f1 \      # And I can't put one here
  | grep foo
I certainly welcome initiatives to create better shells — and who knows, maybe someday I'll switch to another one.
Yet I never have this particular problem.
This is because I always leave my Bash pipes at the end of the line:
cat file.txt |
sort | # You can put a comment here!
cut -f1 |
# And here, too!
grep foo
Do that and you can comment anywhere.
You also get rid of those backslashes, so:
- your code looks less noisy, and
- you get room for two extra characters at each of the piped lines.
And if you're an Emacs user with orgtbl-mode on, your end-of-line pipes won't get inadvertently parsed as an Org table — as happened to that first code block, whose lines you're seeing blue.
A customizable variable named fancy-joiner-leave-pipes-at-eol shows up in my Fancy Joiner package, to help people do that when in Bash. It defaults to true.
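The end-of-line layout can be tried without any file at hand. Here is a minimal sketch, assuming a made-up function name (first_fields) and inline sample data. It also shows that && and || at the end of a line keep the command open in the same way as the pipe does.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, generated inline so the sketch runs anywhere.
first_fields() {
    printf '%s\n' 'pear 3' 'apple 1' 'fig 2' |
        sort |              # a comment after the pipe is fine
        # so is a comment on a line of its own
        cut -d' ' -f1
}

first_fields

# A trailing && (or ||) also tells Bash that the command continues below.
[[ -n $BASH_VERSION ]] &&
    # still inside the same list
    echo "running under Bash"
```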
You: But then your pipes are not aligned at the end of line.
Me: Your line continuations are not aligned at the end of line either.
You: But I can align them if I want.
cat file.txt \
  | sort     \
  | cut -f1  \
  | grep foo
Me: So can I.
cat file.txt |
sort         |
cut -f1      |
grep foo
You: But pipes at the beginning align automatically. It's painful to align them by hand.
Me: I agree. Fortunately, I have Emacs, and align-regexp is at a handy shortcut, after which it's | RET and done.
You: And you do that all the time?
Me: I could, but I often just leave them unaligned: I find that the alignment of the beginning of the piped lines suffices.
Prepended Here Strings
You can do this when the input is a file:
<somefile sed 's/foo/bar/g' | rev | tac | tr x y >anotherfile
This is neat because:
- You start with the input
- You process it in the middle — with no need to put that inside braces or parentheses
- You end with the output, which you can optionally redirect to some file
Visually easy to parse.
Guess what? You can do the same thing when the input is a variable or a plain string.
<<<"a string" sed 's/foo/bar/g' | rev | tac | tr x y >afile
<<<"$somevar" sed 's/foo/bar/g' | rev | tac | tr x y >afile
In other words: you can put your Here Strings at the beginning.
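A small runnable sketch of the same shape, assuming a made-up variable (greeting) and capturing the result in another variable instead of a file:

```shell
#!/usr/bin/env bash
# Hypothetical variable, used only for illustration.
greeting="hello world"

# Input first, processing in the middle, output last.
shouted=$(<<<"$greeting" tr a-z A-Z | tr ' ' '_')
echo "$shouted"   # HELLO_WORLD
```

This works because Bash accepts a redirection anywhere in a simple command. It does not carry over to compound commands such as while loops or brace groups, where redirections have to follow the closing keyword.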
Prepended Here Documents
And then you come across some multiline string full of quotes.
You don't want to escape them, so you use a Here Document:
{ sed 's/foo/bar/g' | rev | tac | tr x y ;} <<EOF
I have a lot to say about:
- "foo",
- 'foo', and also
- ‘foo’.
EOF
And you can put Here Documents at the beginning, too.
Here's a rewrite of the above, to which I also add a redirect to a file:
<<EOF sed 's/foo/bar/g' | rev | tac | tr x y >afile
I have a lot to say about:
- "foo",
- 'foo', and also
- ‘foo’.
EOF
Goodbye braces.
Note that the EOF sentinel needs to be alone in the last line — you can't add anything else there. But that's fine, because the operation is clear in the first line: input, some pipes, output.
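One related detail, sketched here with a made-up variable (name): a prepended Here Document still follows the usual quoting rule for the delimiter. Left unquoted, variables inside the text are expanded. Quoted, the text goes through literally, which may be what you want when the text is full of dollar signs as well as quotes.

```shell
#!/usr/bin/env bash
name="world"   # hypothetical variable

# Unquoted delimiter: $name is expanded inside the document.
expanded=$(<<EOF tr a-z A-Z
hello $name
EOF
)

# Quoted delimiter: the text is passed through literally.
literal=$(<<'EOF' tr a-z A-Z
hello $name
EOF
)

printf '%s\n' "$expanded" "$literal"
```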
Alas, this inversion confuses Emacs' sh-script.el: syntax highlighting of the Here Document text becomes unstable.
Parameter transformation: expand backslashes
So you have this variable holding a TSV, and it's full of escaped newlines and tabs (\n and \t).
You want to interpret those backslashes and output the variable.
"Well, that's easy", you say. "Some simple echo or printf would do that."
True:
tsv="Name\tSize\tWeight\nFlange 1\t10\t70\nFlange 2\t5\t42"
printf "$tsv"

Name	Size	Weight
Flange 1	10	70
Flange 2	5	42
Now suppose you want to cut the third column, to get only the weights. You try:
tsv="Name\tSize\tWeight\nFlange 1\t10\t70\nFlange 2\t5\t42"
cut -f3 <<< "$tsv"

Name\tSize\tWeight\nFlange 1\t10\t70\nFlange 2\t5\t42
...and it doesn't work, because cut by itself won't cut it: it receives the backslashes literally, so it sees a single line with no tab in it. So you do this:
tsv="Name\tSize\tWeight\nFlange 1\t10\t70\nFlange 2\t5\t42"
printf "$tsv" | cut -f3

Weight
70
42
which works fine.
Yet you can also interpret backslashes directly in the variable, without echo or printf. Like this:
tsv="Name\tSize\tWeight\nFlange 1\t10\t70\nFlange 2\t5\t42"
cut -f3 <<< "${tsv@E}"

Weight
70
42
Here's a quick comparison:
x="a\tb"
echo "1. $x"
echo -e "2. $x"
echo "3. ${x@E}"
sed "s/^/4. /" <<< "$x"
sed "s/^/5. /" <<< "${x@E}"

1. a\tb
2. a	b
3. a	b
4. a\tb
5. a	b
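There is one more reason to prefer the transformation over printf "$x", sketched below with a made-up value: when the variable is used as the format string, any percent sign in the data is read as a conversion specifier. The sketch assumes Bash 4.4 or later, where the @E transformation is available, and shows printf's %b as the alternative that also keeps data as data.

```shell
#!/usr/bin/env bash
# Hypothetical value containing both a backslash escape and a percent sign.
x='50%\tsure'

# printf "$x" would treat the data as a format string and trip over the "%".
# Two ways to expand the escapes while keeping the data as data:
via_printf=$(printf '%b' "$x")
via_transform=${x@E}

printf '%s\n' "$via_printf" "$via_transform"
```

The two are not identical in every corner (%b and @E follow slightly different escape rules), but for plain \t and \n they agree.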
Parameter expansion: change case for case
Say you want to write a function that detects if your input is the string foo — but you want it to be case insensitive, so that Foo, FOO, fOo etc are also good.
I've seen this sort of solution many times:
isfoo() {
    case $1 in
        [Ff][Oo][Oo]) echo "Yes, it's foo." ;;
        *) echo "Unfortunately not a foo."
    esac
}
isfoo fOO

Yes, it's foo.
That unquoted $1, although supposedly OK, makes me nervous, but never mind that.
The point here is that the FfOoOo thing is not elegant. Were it abracadabra, we'd have 44 tediously typed characters there.
One solution is to preprocess $1 with GNU sed's lesser-known \L escape, thereby downcasing it:
isfoo() {
    local low
    low=$(<<<"$1" sed "s/.*/\L&/")
    case "$low" in
        foo) echo "Yes, it's foo." ;;
        * ) echo "Unfortunately not a foo."
    esac
}
isfoo fOO

Yes, it's foo.
You could also ditch the local variable like this:
isfoo() {
    : "$(<<<"$1" sed "s/.*/\L&/")"
    case "$_" in
        foo) echo "Yes, it's foo." ;;
        * ) echo "Unfortunately not a foo."
    esac
}
or directly:
isfoo() {
    case "$(<<<"$1" sed "s/.*/\L&/")" in
        foo) echo "Yes, it's foo." ;;
        * ) echo "Unfortunately not a foo."
    esac
}
But really, we don't need that sed, because Bash can lowercase it with parameter expansion.
(Here's all you need to know about case modification)
| Char | Does |
|---|---|
| ^ | Upcasing |
| , | Downcasing |
| ~ | Togglecasing |
- Use one, and it's applied to the first letter.
- Use two, and it's applied to all.
- Add something after it, and application is restricted to those letters.
Example:
x=foo
echo "${x^}"

Foo
Many examples, output as a table:
x=foo
echo "${x^} ${x^^} ${x~} ${x~~} ${x^g} ${x^f} ${x^^o}"
x=FOOBar
echo "${x,} ${x,,} ${x~} ${x~~} ${x,F} ${x,,O} ${x,,[OB]}"

| Foo | FOO | Foo | FOO | foo | Foo | fOO |
| fOOBar | foobar | fOOBar | foobAR | fOOBar | FooBar | Foobar |
See? Easy.
So you can simply do this:
isfoo() case "${1,,}" in
    foo) echo "Yes, it's foo." ;;
    * ) echo "Unfortunately not a foo."
esac
Note that I got rid of those braces, too. This is because you can have...
Braceless function definitions
I'll show you some unusual function definitions, so brace yourself — or rather, don't.
foo() (cd "$1" && printf "%s\n" y*)  # list all in dir $1 that start with y
foo() ((counter++))  # increment variable $counter
foo() [[ -e "$1" ]]  # does file $1 exist?
foo() while :; do printf "foo: "; wc -l ~/foo; sleep 10m; done  # keep checking
foo() until [[ -e ~/foo ]]; do echo "Not yet"; sleep 10m; done  # keep checking
foo() for f; do [[ -e "$f" ]] && echo "$f: ok" || exit 1; done  # files exist?
foo() if (("$1">42)); then echo "$1: too large" >&2; exit 2; fi  # valid $1?
foo() case "${1,}" in y) echo Ok;; *) echo Quitting; exit 3;; esac  # continue?
ShellCheck will scream at all of these except the first two.
Yet I don't think it should: they're perfectly legal and well-documented. Here's man bash:
Shell Function Definitions
… Shell functions are declared as follows:
fname () compound-command [redirection]
function fname [()] compound-command [redirection]
This defines a function named fname. The reserved word function is
optional. If the function reserved word is supplied, the parentheses
are optional. The body of the function is the compound command
compound-command … usually a list of commands between { and }, but may
be any command listed under Compound Commands above. If the function
reserved word is used, but the parentheses are not supplied, the
braces are recommended.
Ok, but what's a compound-command?
Compound Commands
A compound command is one of the following. …
… (list)
… { list ;}
… ((expression))
… [[ expression ]]
… for … do … done
… select … do … done
… case … esac
… if … then … [elif … then] … else … fi
… while … do … done
… until … do … done
Braces are a good default because they work for most cases, but any of these are equally valid for defining functions.
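To see a few of these bodies at work, here is a minimal sketch with made-up helper names (in_dir, exists, is_even). Each one relies on a property of its compound command: the subshell keeps cd from leaking, and the conditional and arithmetic commands hand their result back as the function's exit status.

```shell
#!/usr/bin/env bash
# Hypothetical helper names, chosen for illustration.

# Subshell body: cd happens in a subshell, so the caller's directory is untouched.
in_dir() (cd "$1" && pwd)

# Conditional body: the function's exit status is the result of the test.
exists() [[ -e "$1" ]]

# Arithmetic body: status 0 when the expression evaluates to non-zero.
is_even() (( $1 % 2 == 0 ))

before=$PWD
in_dir /
[[ $PWD == "$before" ]] && echo "still in $before"
exists / && echo "/ exists"
is_even 4 && echo "4 is even"
```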
And have you noticed the [redirection] detail? This means that if you append a redirect you'll still not need braces, as exemplified by this braceless emoji-flag–generating function:
f-territ() while read -N1 -r c
do
    : "$(printf "$c" | od -An -t x1)"
    : "$(printf "%X" "$((0xa5 + 0x${_// }))")"
    printf "\U1F1$_"
done < <(printf "$1")
f-territ EU #⇒ 🇪🇺
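The same shape in a plainer form, as a sketch with a made-up function name (shout): the redirection belongs to the function definition and is set up on every call, once the function's arguments are in place, which is why "$@" inside it refers to those arguments.

```shell
#!/usr/bin/env bash
# Hypothetical function: upcase each argument, one per line.
# The trailing redirection is part of the definition and is performed on each call.
shout() while IFS= read -r word
do
    printf '%s\n' "${word^^}"
done < <(printf '%s\n' "$@")

shout foo bar
```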
Arithmetic evaluation and arithmetic expansion
They're shorter, more readable, faster.
What's not to like?
They're simple to understand, and it's easy to get started.
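A minimal sketch of the two forms named in the heading, with made-up values for a and b: (( )) is a command that evaluates an expression and returns a status, while $(( )) expands to the result. The test-command version is included only for comparison.

```shell
#!/usr/bin/env bash
a=7 b=5   # hypothetical values

# Arithmetic evaluation: (( )) succeeds when the result is non-zero.
if (( a > b )); then
    echo "a is larger"
fi

# The test-command equivalent needs dollar signs, quotes and a mnemonic operator.
if [ "$a" -gt "$b" ]; then
    echo "a is larger"
fi

# Arithmetic expansion: $(( )) substitutes the result into the command line.
sum=$(( a + b ))
echo "$sum"

# In-place updates need no expansion at all.
(( a += 1 ))
echo "$a"
```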