Aug 18, 2021 6 min read Lambda Island

A Tale of Three Clojures

By Alys Brooks

Recently, I was helping a coworker debug an issue with loading a Clojure dependency from a Git repository. (If you didn’t know you could do this, it’s very handy. Here’s a guide.) I realized that there were really two Clojures at play: the Clojure that clojure was running to generate the classpath and the Clojure that was used by the actual Clojure program.

Taking a step back, there are really three things we might mean when we say “Clojure”:

The language of Clojure, as in “Clojure has immutable data structures.”
The command-line tool, as in clj
The dependency of Clojure, as in [org.clojure/clojure "1.10.1"]

Let’s disambiguate these three different Clojures and learn a little about how Clojure works under the hood along the way.

Clojure the language

Usually when we’re talking about “Clojure,” we’re talking about the language, like when we say things like “Clojure is a Lisp” or “Clojure has multiple lock-free concurrency abstractions.”

But what about statements about stack traces, like “If I see another impenetrable Clojure stack trace today, I’m going to give up programming for good and become a lighthouse operator off the coast of Nova Scotia”? Stack traces have a lot to do with the language: in Clojure, each line of the trace includes the namespace and function. However, they also have a lot to do with the implementation. Stack traces of alternate implementations of Clojure, like ClojureScript, reflect their host platform and the resulting decisions the implementers made. For the JVM version of Clojure, the implementation consists of classes (usually bundled as a JAR) specified on the classpath, which we’ll discuss in the final section of this article.

For once, we’re not going to say much about Clojure the language. To actually use Clojure, we need to run it. That’s usually accomplished with Leiningen, Boot, or our next topic, the Clojure CLI tool.

Clojure the CLI tool

Officially, the clojure CLI tool’s purpose is to “declare dependencies, assemble classpaths, and launch Clojure programs with data.”

clojure serves a similar role to interpreters and runtimes like python, node, and java, but as the “with data” above hinted, it does so in a particularly Clojure-y way.

To help understand what it does, let’s grab the JAR, go back in time and start Clojure manually, like Clojure.org recommended in 2008:


java -cp clojure.jar clojure.lang.Repl

One does not simply load a single JAR in the very latest versions because of the new dependency on clojure.spec.alpha, but it does work for Clojure 1.8.0 and earlier versions. If you have it locally, you can update the above command to:


java -cp ~/.m2/repository/org/clojure/clojure/1.8.0/clojure-1.8.0.jar clojure.lang.Repl

What does this do? Well, it starts up a JVM, telling it that we want to load a JAR (basically a zip file of classes), and run the clojure.lang.Repl class from it. All other classes it needs are either in the JAR or are included with the JVM itself.

Say, what if you want to run an actual Clojure program? Simply call:


java -cp clojure.jar clojure.lang.Script script.clj

Easy! If we can start Clojure and run programs in compact one-liners, what’s the point of the clojure CLI tool? Well, if you have any sort of dependencies beyond Clojure itself, manually determining what goes after -cp quickly becomes challenging.
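To get a feel for the manual approach, here is a minimal sketch that glues every JAR in a directory into a classpath string. The build_classpath name and the lib directory are made up for illustration, and the ":" separator assumes a Unix-like system. Note what it does not do: it won't find, download, or pick versions of transitive dependencies, which is exactly the part that gets challenging.

```shell
# Join every JAR in a directory into a classpath string for `java -cp`.
# build_classpath is a hypothetical helper, not part of any Clojure tool.
build_classpath() {
  local lib_dir="$1" cp="" jar
  for jar in "$lib_dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the literal pattern when no JAR matches
    cp="${cp:+$cp:}$jar"        # append, adding ":" only between entries
  done
  printf '%s\n' "$cp"
}

# Usage (assumes a lib/ directory of JARs you collected by hand):
#   java -cp "$(build_classpath lib)" clojure.main script.clj
```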

Even before the clojure CLI tool, Leiningen, or Boot, people used Java tools to help them manage the classpath. When I asked on the Clojurians Slack, a few people commented that they used either Ant or a Maven plugin. According to Alex Miller, the Clojure Core team still uses Maven plus clojure-maven-plugin. However, once introduced, Leiningen swept the Clojure world. The 2012 State of Clojure survey, the first that asked about build tooling, showed that 95 percent of users used Leiningen. Fast-forward to 2021, and Leiningen is now down to 75 percent, Maven is down to 5 percent, and Ant isn’t even on the list anymore (although two people mentioned it under Other).

Thus, the most important job of clojure (as well as Leiningen and Boot) is generating the part of the Java invocation that comes right after -cp. This illustrates a flaw in my comparison of clojure to python or java. While it is true that a Clojure programmer uses clojure to start a Clojure program in the same way she uses python to run a Python script or java to run a compiled Java program, clojure the CLI tool is a launcher (and with tools.build, a build tool)—it isn’t a runtime or interpreter in itself. (That’s java, plus the core language classes in that clojure.jar we saw above.)

How does it know what dependencies to add to the classpath? Because it’s a Clojure tool, the dependencies are stored in the EDN format, which is basically just the parts of Clojure syntax that are useful for specifying data. (Because of the close correspondence between data and code in Lisps, the “parts of Clojure syntax that are useful for specifying data” turns out to be…most of it.) For example, [1 2 3] is valid EDN. (vector 1 2 3) is also valid EDN, but it won’t be evaluated. In other words, it will be read as a list containing the symbol vector followed by the numbers 1, 2, and 3, but the vector symbol won’t be looked up and called, so it will remain a list. Of course, once read, you can pass the list to eval yourself (resulting in [1 2 3]) or do whatever else you want to with it.
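For a concrete picture of dependencies stored as EDN, the sketch below writes a minimal deps.edn into a scratch directory. The map has the same shape as the one passed to -Sdeps elsewhere in this article. The scratch directory is just a throwaway for the example, and the version is the one from the introduction.

```shell
# Write a minimal deps.edn into a throwaway directory created for this sketch.
project_dir="$(mktemp -d)"

cat > "$project_dir/deps.edn" <<'EOF'
;; EDN is data: this map is read, never evaluated.
{:deps {org.clojure/clojure {:mvn/version "1.10.1"}}}
EOF

cat "$project_dir/deps.edn"

# With the Clojure CLI installed, running this from inside "$project_dir"
# should print the classpath computed from the file:
#   clojure -Spath
```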

While we’re on the topic, we can also look at the difference between clojure and clj. clj is actually a script rather than a binary, so we can look at it directly in a text editor. Here’s what it contains on my system:

#!/usr/bin/env bash

if type -p rlwrap >/dev/null 2>&1; then
  exec rlwrap -r -q '\"' -b "(){}[],^%#@\";:'" clojure "$@"
else
  echo "Please install rlwrap for command editing or use \"clojure\" instead."
  exit 1
fi

If rlwrap exists, it runs it with a few Clojure-specific settings, like indicating that parentheses are not part of words but separate them. If rlwrap doesn’t exist, it echoes a message telling the user to install it. In short, clj simply wraps clojure with rlwrap, which adds completion, editing, and command history.

So, what’s in that clojure.jar that clojure loads for us? That brings us to the final of the three Clojures: Clojure the Dependency.

Clojure the Dependency

A lot of Clojure projects, whether they use the Clojure CLI tool (sometimes referred to by its main library, tools.deps), Leiningen, or Boot, include a reference to Clojure itself as a dependency. This Clojure dependency is not the same as the Clojure standard library (although it certainly includes it). More on that later.

However, all three of these tools are themselves written in Clojure, so how does this work?

Here’s an abbreviated version of the process when the user runs clj:

1. Runs the Clojure CLI tool once to generate the proper classpath. On this run, the Clojure CLI tool’s classpath consists of a hard-coded JAR that contains the classes that make up clojure as well as its dependencies.
2. Runs the Clojure CLI tool again with the new classpath in order to actually provide the REPL or run the application.

This means java actually gets started twice, unless it can use a cached classpath and skip step 1.
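The "compute once, reuse afterwards" idea can be pictured with a small shell sketch. Everything here is an assumption made for illustration: the cached_classpath and compute_classpath names, the cache layout, and the cksum-based key are mine, and the real tool's caching logic is more involved.

```shell
# Stand-in for the expensive resolution step; a real one would start a JVM.
compute_classpath() {
  echo "resolved-from:$1"
}

# Return the classpath for a deps file, computing it only on a cache miss.
cached_classpath() {
  local deps_file="$1" cache_dir="$2" key cp_file
  # Key the cache on the file's contents, so editing it invalidates the entry.
  key="$(cksum < "$deps_file" | cut -d' ' -f1)"
  cp_file="$cache_dir/$key.cp"
  if [ ! -f "$cp_file" ]; then
    mkdir -p "$cache_dir"
    # Slow path: the first JVM run would happen here.
    compute_classpath "$deps_file" > "$cp_file"
  fi
  # The second JVM run would then use: java -cp "$(cat "$cp_file")" ...
  cat "$cp_file"
}
```

Calling cached_classpath twice on an unchanged file only reaches the slow path the first time. Changing the file produces a new key and therefore a fresh computation.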

I mentioned earlier that the Clojure dependency is more than the standard library. Let’s open that early Clojure JAR from 2008 to see what’s inside. It is an archive containing over 200 files, so we won’t look at them all. Here are some broad categories of what’s inside:

Actual machinery of the language. For example, clojure/lang/LispReader.class is the reader, which takes the characters and puts them into a data structure: the lists Lisp is so famous for. (The reader is responsible for the Read part of the Read-Evaluate-Print Loop (REPL).)
Persistent data structure implementations. For example, clojure/lang/PersistentArrayMap.class is one of the classes used for maps.
Various interfaces, like clojure/lang/ISeq.class, which is the interface all seqs implement.
Core libraries, like clojure/set/set.clj, which contains the clojure.set namespace.
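If you want to poke around a JAR yourself, one rough way to get an overview is to count its entries by file extension. The count_by_extension helper below is a hypothetical name for this sketch. It reads one entry name per line, so it works with any tool that lists archive contents.

```shell
# Count archive entries by file extension, reading one entry name per line
# from standard input. count_by_extension is a hypothetical helper name.
count_by_extension() {
  awk '
    /\/$/ { next }                    # skip directory entries
    {
      name = $0
      sub(/.*\//, "", name)           # keep only the file name
      n = split(name, parts, ".")
      if (n > 1) counts[parts[n]]++   # last dot-separated part is the extension
    }
    END { for (ext in counts) print ext, counts[ext] }
  ' | LC_ALL=C sort
}

# Usage, assuming the JAR from earlier exists locally and unzip is installed
# (jar tf <file> lists entry names too):
#   unzip -Z1 ~/.m2/repository/org/clojure/clojure/1.8.0/clojure-1.8.0.jar | count_by_extension
```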

What does that mean in practice? One example is the addition of namespace map syntax, which was added in Clojure 1.9. In order for this to work, you need a copy of clojure/lang/LispReader.class that supports this syntax. Running clojure -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.10.1"}}}' -e "(keys #::{:a 1, :b 2})" succeeds, while clojure -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.8.0"}}}' -e "(keys #::{:a 1, :b 2})" fails. Syntax changes like this are actually pretty rare because it’s easy to extend Clojure without adding syntax. More often, the Clojure core team adds features to the core libraries. For example, Clojure 1.9 also added a number of predicates, like boolean? and int?, in order to support the new spec library.

To tie it all together, in order to take advantage of the new namespace map syntax added to the language or the new predicates added to the core library in Clojure 1.9, you need to specify the Clojure 1.9 dependency to the clojure CLI tool.
