understanding lazy-seq

by paul, on 02.08.2013

I've been working on improving my clojure skills by occasionally doing problems on 4clojure, a great resource. I've been trying to make my code more idiomatic by comparing my solutions against others' answers, or sometimes clojure core, to see where I could have written a nicer solution. This has been really helpful for me.

One of the 4clojure problems is to re-implement iterate. After I did this, I viewed the clojure source code to see how it's done:

=> (source iterate)
(defn iterate
  "Returns a lazy sequence of x, (f x), (f (f x)) etc. f must be free of side-effects"
  {:added "1.0"
   :static true}
  [f x] (cons x (lazy-seq (iterate f (f x)))))

To someone who's recently learned clojure, this might look funny. Aren't we supposed to be using loop/recur to avoid the dreaded java.lang.StackOverflowError? why is explicit recursion ok here?

The answer is lazy-seq macro, which, when used correctly, allows us to avoid realizing the the entire sequence at once. The lazy-seq macro essentially allows the runtime to step through our sequence; keeping only the portion in memory that is being asked for. It's our responsibility to code our function in such a way that we don't ask for the whole thing at once. Remember, the java.lang.StackOverflowError occurs because each function call creates a stack frame, and those frames are stacked and can't be garbage collected until the recursion is unwound. Wrapping the recursive call with lazy-seq allows us to code recursively, without fully realizing the entire sequence all at once at runtime. The key to lazy-seq is understanding that only part of your sequence should be calculated in memory at once, and writing your code in such a way that we don't inadvertently hold on to a reference for to the full sequence, by storing a reference to the head of the sequence, for example.

There are some caveats to lazy-seq, which are outlined in any good clojure text. Basically, it boils down to the earlier recipe:

  1. Don't keep a reference to the head of your sequence in a top level variable. The "Joy of Clojure" and "Programming Clojure" both have great examples of this, I don't want to plagiarize. They're both well worth the money.
  2. The lazy-seq macro should go at the top level of the function.
  3. Don't use functions that force a full realization. Ensure that your function body doesn't force evaluation; keep everything lazy. The example from the clojure docs is to use "rest" instead of "next". When in doubt, check the docs.

I hope this post is helpful to others; I write these mostly as a way of solidifying my own understanding of the language, not because I think we need yet another blog post explaining lazy-seq. However, I do hope that this might be useful to other folks getting familiar with the language.

Categories: Technology, Clojure