Clojure: using the source

by paul, on 11.04.2012

One of the coolest things about Clojure is the readability of the core library. Not only is the source to almost any function available at the REPL, but the code is usually easy to read and understand. In music, one of the best ways to learn is to record yourself playing a tune, listen to a master musician play the same tune, and note the differences; there is often much you can learn.

Tonight I was prototyping some data wrangling on CSV files for work, and I wanted to merge the header of a CSV file with each row to form a map. Of course, zipmap in Clojure does just this. Since it's Sunday night and I'm messing around anyway, why not write my own, probably naive, version, compare it with the canonical one, and see what I can learn?
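For reference, this is the behavior being reproduced, sketched with a made-up header and rows. The names header, rows, and rows->maps are placeholders for illustration, not taken from the actual work files:

```clojure
;; Hypothetical CSV data, already split into fields.
(def header ["id" "name" "city"])
(def rows [["1" "Ada" "London"]
           ["2" "Alan" "Manchester"]])

;; Pair the header with every row to get one map per row.
(defn rows->maps [header rows]
  (map #(zipmap header %) rows))

(rows->maps header rows)
;; => ({"id" "1", "name" "Ada", "city" "London"}
;;     {"id" "2", "name" "Alan", "city" "Manchester"})
```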

Here's the implementation I came up with:

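(defn group-row [header data & {:keys [result] :or {result {}}}]
  (cond
    ;; no data left: return whatever has been collected
    (empty? data) result
    ;; otherwise recurse on the tails, then add the current pair
    :else (assoc
            (group-row (rest header) (rest data) :result result)
            (first header) (first data))))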

Basically, I assoc the first elements of each list onto the result of recursing on the rest of each list. I use an optional argument, defaulting to an empty map, to collect the results. Not terrible, but let's see how it's done in Clojure core:

(source zipmap)
(defn zipmap
  "Returns a map with the keys mapped to the corresponding vals."
  {:added "1.0"
   :static true}
  [keys vals]
    (loop [map {}
           ks (seq keys)
           vs (seq vals)]
      (if (and ks vs)
        (recur (assoc map (first ks) (first vs))
               (next ks)
               (next vs))
        map)))

OK, I actually got pretty close! The author of this function uses loop/recur to carry the extra "result" argument; I used straight recursion even though I should, and do, know better. I didn't know the difference between "rest" and "next", but Stack Overflow has a fairly clear explanation. Looking at the function definition, I am not sure why the explicit "seq" calls are there; I will need to think about this a bit. In any case, I wanted to clean up my initial function. To refactor it to use recur, I needed to move some nested calls around, but I think this looks better:

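(defn group-row [header data & {:keys [result] :or {result {}}}]
  ;; recur into a variadic fn expects the rest args as a single seq,
  ;; so the accumulator is carried in a loop binding instead
  (loop [hs (seq header) vs (seq data) result result]
    (if (empty? vs)
      result
      (recur (rest hs) (rest vs)
             (assoc result (first hs) (first vs))))))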
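On the rest/next question and the seq calls in zipmap, a few REPL checks help. The results below are just how these core functions behave; the reading that zipmap uses seq and next so its loop test can be a plain truthiness check is one plausible interpretation, not something the source states:

```clojure
;; rest always returns a (possibly empty) sequence, which is truthy.
(rest [1])   ;; => ()
(rest [])    ;; => ()

;; next returns nil once nothing is left, and nil is falsey.
(next [1])   ;; => nil
(next [1 2]) ;; => (2)

;; seq turns an empty collection into nil as well.
(seq [])     ;; => nil
(seq [1 2])  ;; => (1 2)

;; With seq and next, the loop test can be a plain truthiness check:
(if (and (seq []) (seq [1 2])) :keep-going :done) ;; => :done

;; With rest, an explicit empty? check is needed, because () is truthy:
(if (rest [1]) :keep-going :done)                 ;; => :keep-going
```

This is also why the group-row version above, which walks the lists with rest, has to test with empty? rather than relying on nil.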

It's always nice to write a little code and learn a little something in the process.

Categories: Technology, Clojure