Counting Words V: More solutions in Emacs Lisp

Years ago, I played with Ben Hoyt’s word-counting exercise. At that time, I wrote 13 solutions in Emacs Lisp. Eleven of these solutions were under relaxed constraints, as explained in the Introduction, whereas the last two strictly followed the instructions.

“Only use the language’s standard library functions” was one of the constraints that I relaxed. Why? I wrote:

Because doing this without dash, f, s, and ht would feel a bit tedious and counterproductive. I’m used to them. And they are common (or “pretty standard”).
So it’s not as if I’m pulling from somewhere some count-word library to “solve” the problem with:

(require 'count-words)  ; Nope, apparently not available
(count-words "kjv×10.txt")

Yet I did think that such a count-words function would be a useful thing to have out-of-the-box from some library.

At that time, I hadn’t yet written xht. So when I did write it, I made it a point of honor to include some hash table functions that could slurp a string and parse its lines or words into a hash table. So we can now do this:

(h<-lines "alice\nbob\ncharlie")  H=> (h* 0 "alice" 1 "bob" 2 "charlie")

which uses 0-indexed line numbers as keys and the line strings as values — and this:

(h-count<-lines "These here\nare 5\nlines\nand not 2\nlines")

H=> (h* "These here" 1
        "are 5"      1
        "lines"      2
        "and not 2"  1)

which uses line strings as keys and their counts as values — and, most relevant to our current topic, this:

(h-count<-words "\
These are words, and words,
and more words.")

H=> (h* "These" 1
        "are"   1
        "words" 3
        "and"   2
        "more"  1)

which counts the words in a string.
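For readers without xht at hand, the idea behind such a word counter can be sketched with built-in functions only. This is not how xht implements h-count<-words: my-count-words is a hypothetical name, and treating words as runs of alphabetic characters is an assumption made for this sketch.

```elisp
;; A minimal sketch using only built-ins; NOT xht's implementation.
;; `my-count-words' is a hypothetical helper name.
(defun my-count-words (string)
  "Return a hash table mapping each word in STRING to its count.
Words are assumed to be maximal runs of alphabetic characters."
  (let ((table (make-hash-table :test #'equal)))
    ;; The third argument t drops empty strings at the edges.
    (dolist (word (split-string string "[^[:alpha:]]+" t) table)
      (puthash word (1+ (gethash word table 0)) table))))

(let ((table (my-count-words "These are words, and words,\nand more words.")))
  (list (gethash "words" table)
        (gethash "and" table)
        (hash-table-count table)))
;; => (3 2 5)
```

The table uses the equal test because strings with the same contents are not eq to each other.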

Yet I hadn’t come back to apply it to this very problem.

Now, you could say that using this would totally count as “pulling from somewhere some count-word library to ‘solve’ the problem” — and I’d agree. But since I wrote that whole multi-thousand-line count-word–capable hash table library myself, if I decide to use it to solve this word-counting problem, then I can’t quite be charged with using some “cheating shortcut”, right? That’s quite the opposite of a shortcut. I’m thus more than comfortable unquoting that “solve”.

So without further ado, here are still more Emacs Lisp solutions, xht edition. Note that none of them uses the cw-sort-alist helper, as many of the previous ones did, so each of these is a complete solution, and among the shortest.

xht solutions

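(require 'xht)
(require 'f)
(require 's)     ; s-replace-regexp
(require 'dash)  ; ->>, -->, -sort, -uniq, -list, --sort, --each, --reduce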
;;;;;;; X1. Slurp file, use only hash tables and simple lists
(defun cw-count-it-X1 (file)
  "Given a FILE, count words."
  (let* ((wc-ht (->> file  f-read  downcase
                     (s-replace-regexp "[^[:alpha:]]+" " ")
                     h-count<-words))
         (cw-ht (h-empty-clone wc-ht)))
    (h--each wc-ht
      (h-put-add! cw-ht value key))
    (with-output-to-string
      (dolist (number (-sort #'> (h-keys cw-ht)))
        (dolist (word (-sort #'string<
                             (-list (h-get cw-ht number))))
          (princ (format "%s %s\n" word number)))))))
;;;;;;; X2. Same as X1, but get the counts sorted earlier on
(defun cw-count-it-X2 (file)
  "Given a FILE, count words."
  (let* ((wc-ht (->> file  f-read  downcase
                     (s-replace-regexp "[^[:alpha:]]+" " ")
                     h-count<-words))
         (sortd (-sort #'> (-uniq (h-values wc-ht))))
         (cw-ht (h-zip-lists sortd '())))
    (h--each wc-ht
      (h-put-add! cw-ht value key))
    (with-output-to-string
      (dolist (number sortd)
        (dolist (word (-sort #'string<
                             (-list (h-get cw-ht number))))
          (princ (format "%s %s\n" word number)))))))
;;;;;;; X3. Open input in a buffer and sort numeric fields
(defun cw-count-it-X3 (file)
  "Given a FILE, count words."
  (with-temp-buffer
    (insert-file-contents file)
    (downcase-region (point-min) (point-max))
    (while (re-search-forward "[^[:alpha:]]+" nil 'noerror)
      (replace-match " "))
    (let ((htbl (h-count<-words (buffer-string)))
          (standard-output (current-buffer)))
      (erase-buffer)
      (h--each htbl
        (princ (format "%s %s\n" key value)))
      (let ((min (point-min))
            (max (point-max)))
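        ;; Reverse-alphabetical first, then ascending by count (field 2);
        ;; flipping the region gives descending counts with ties in
        ;; alphabetical order (this relies on the numeric sort being stable).
        (sort-lines       'rev min max)
        (sort-numeric-fields 2 min max)
        (reverse-region        min max)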
        (buffer-string)))))
;;;;;;; X4. Slurp file and sort numeric fields
(defun cw-count-it-X4 (file)
  "Given a FILE, count words."
  (with-temp-buffer
    (let ((standard-output (current-buffer)))
      (--> file  f-read  downcase
           (s-replace-regexp "[^[:alpha:]]+" " " it)
           h-count<-words
           (h--each it
             (princ (format "%s %s\n" key value))))
      (let ((min (point-min))
            (max (point-max)))
        ;; Same sort-then-flip sequence as in X3.
        (sort-lines       'rev min max)
        (sort-numeric-fields 2 min max)
        (reverse-region        min max)
        (buffer-string)))))
;;;;;;; X5. Slurp file, convert to alist, sort it
(defun cw-count-it-X5 (file)
  "Given a FILE, count words."
  (--> file  f-read  downcase
       (s-replace-regexp "[^[:alpha:]]+" " " it)
       h-count<-words  h->alist
       ;; By word first, then by count: the sort is stable, so ties stay alphabetical.
       (--sort (string< (car it) (car other)) it)
       (--sort (>       (cdr it) (cdr other)) it)
       (with-output-to-string
         (--each it
           (princ (format "%s %s\n" (car it) (cdr it)))))))

They all produce the same result.
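The ordering in X5 depends on the second sort being stable: sorting by word first and by count second leaves words with equal counts in alphabetical order. Here is a minimal sketch of that idea with the built-in sort instead of dash. The name my-sort-word-counts and the sample alist are made up for illustration.

```elisp
;; Hypothetical helper illustrating the two-pass stable sort, built-ins only.
(defun my-sort-word-counts (alist)
  "Return a sorted copy of ALIST, whose elements are (WORD . COUNT).
Higher counts come first; equal counts are ordered alphabetically."
  (let ((by-word (sort (copy-sequence alist)
                       (lambda (a b) (string< (car a) (car b))))))
    ;; `sort' is stable, so equal counts keep their alphabetical order.
    (sort by-word (lambda (a b) (> (cdr a) (cdr b))))))

(my-sort-word-counts
 '(("the" . 2) ("s" . 1) ("bar" . 4) ("foo" . 2) ("bars" . 1)))
;; => (("bar" . 4) ("foo" . 2) ("the" . 2) ("bars" . 1) ("s" . 1))
```

The copy-sequence call is there because the two-argument form of sort may modify the list it is given.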

Below is a test with the minimal test file foo.txt.

(Why aren’t equal, eq, eql, and string= variadic, just like =? If string= were, the test wouldn’t need --reduce.)

;; in/foo.txt:
"  The 12 foozle bar: the
Bars' bar, bar, foo's bar, foo."

(let* ((foo "in/foo.txt")
       (X1 (cw-count-it-X1 foo))
       (X2 (cw-count-it-X2 foo))
       (X3 (cw-count-it-X3 foo))
       (X4 (cw-count-it-X4 foo))
       (X5 (cw-count-it-X5 foo))
       (Xs (list X1 X2 X3 X4 X5)))
  (--reduce (when (string= acc it) acc) Xs))
=> "\
bar 4
foo 2
the 2
bars 1
foozle 1
s 1
"

They likewise work with the exercise’s large kjv×10.txt input file.
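The --reduce form in the test above works around string= taking exactly two arguments. One possible variadic wrapper, as a sketch: my-string= is a hypothetical name, not something provided by Emacs, dash, or xht.

```elisp
(require 'cl-lib)

;; Hypothetical helper: a variadic `string=' built on `cl-every'.
(defun my-string= (string &rest more-strings)
  "Return t if STRING and all of MORE-STRINGS have the same contents."
  (cl-every (lambda (other) (string= string other)) more-strings))

(my-string= "foo 2\n" "foo 2\n" "foo 2\n")  ; => t
(my-string= "foo 2\n" "foo 2\n" "bar 4\n")  ; => nil
```

Unlike the --reduce version, this returns t rather than the matching string, so it suits a pass/fail check more than a test that also displays the shared output.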

So we now have 16+2 Emacs Lisp solutions to the word-counting exercise: 16 under relaxed constraints and 2 that strictly follow the instructions.

📆 2026-W15-4 📆 2026-04-09