Clojure Gazette 1.80

Lean Clojure: An Interview with Reid McKenzie

Clojure Gazette

Issue 1.80June 15, 2014

Dear Clojurists,

Our next interview of the Google Summer of Code in Clojure is with Reid McKenzie.

Clojure Gazette: Are you a student? Where and what are you studying?

Reid McKenzie: I am an undergraduate at the University of Texas at Austin majoring in Computer Science, technically on the Turing Honors degree plan which places an emphasis on research and near graduate level work. In the fall I will be returning to UT for my 4th and final year of school. I hope to repeat Ambrose B. S.'s success in using GSoC work for my undergraduate thesis, hence part of my interest in GSoC.

CG: How did you get started with Clojure?

RM: I have been playing with lisps since high school when they were recommended to me as an inroad to functional programming. The idea of macros also appealed to me as it was evident that they were tightly coupled with the way that the language compiler worked and I've always been interested by how human readable code was translated to machine code.

Clojure was suggested to me by William Ting, a friend from school, as being a lisp family language which would satisfy the discontent I was feeling in exploring Scheme and Common Lisp. I picked up Clojure in the spring of 2012 through a combination of reading Joy of Clojure, Practical Clojure, Clojure Programming and most importantly asking freenode's #clojure for help. Clojure did indeed satisfy my complaints against other Lisp dialects and was interesting to learn as the data immutability model of computation forced me to move away from the imperative update style I was used to from C, Python, and other Lisps towards a much more functional style. Today Clojure has replaced Python as my go-to language for side projects and other work where I have language flexibility because I find that I'm impractically more productive writing Clojure than I am in C or Python because my bugs are less involved, and that anecdotally I have more fun writing Clojure.

CG: What project are you working on?

RM: My GSoC project is to build an alternative Clojure compiler which takes a whole program optimization approach to emitting efficient Clojure bytecode. In order to enable the dynamic REPL driven development model that we're familiar with and love in Clojure the core Clojure compiler trades off execution speed and memory footprint in favor of enabling dynamic behavior like runtime var redefinition and metadata introspection. While these features are awesome from a developer's standpoint, their costs are at present still felt at deployment time when the benifits of runtime dynamism are for the most part no longer apparent.

My GSoC project is to leverage the work that Nicola Mometto did last year in developing tools.analyzer(.jvm) and tools.emitter.jvm to build an alternative Clojure bytecode emission system which seeks to discard runtime dynamic behavior wherever possible in the hope of reaping performance and memory usage improvements. The final product of my project will likely be changes to leiningen allowing users to specify compilation modes as part of their AOT configuration, at a minimum providing choice between the existing Clojure core compiler, Nicola's tools.emitter compiler and my tools.emitter-derived compiler. My hope is that this helps open the way for building other Clojure emitters and compilation modes playing more advanced tricks such as using core.typed to emit more tightly typed bytecode and compiler introduction of transients where possible but both of those goals are beyond the scope of this summer unless things prove much easier than I expect.

CG: What kinds of optimizations are you planning on building into the compiler?

RM: The two big tricks that I'm looking to play are tree shaking and var elimination.

Tree shaking is simple: if some var "foo" is defined but not used, don't define it. This sounds silly, but it could be really interesting when deploying a production application. I'll pick on clojure.zip here, but the choice of victim is arbitrary. Say I use one function from clojure.zip in my entire application and I never use clojure.pprint. At present, all of the sourcecode for clojure.zip and clojure.pprint will be packaged into your application uberjar because no attempt is made to prove that they will never be used. If you do in fact never use clojure.pprint, all its inclusion does is waste memory and load time, as does the rest of clojure.zip that's just baggage. The first goal of my compiler is to be able to do this analysis of exactly what code can be reached at runtime and simply discard everything else ahead of time. Now this of course relies on your never using clojure.core/eval, tools.emitter.jvm/eval, or clojure.core/symbol at runtime, but by blacklisting those symbols and aborting tree shaking when they occur I expect to be able to eliminate significant subsets of many codebases at compile time. This is what I'm working on at present.

Var (or rather IFn) elimination is more involved. The JVM prior to 1.8 with the introduction of bytecode lambdas has no way to move a "function" around as a value. The workaround that Rich came up with when Clojure was first designed was the IFn interface, which provides a standard for the .invoke() method and defines a "function" to be anything which can be .invoke()ed. When a function is compiled, there is a function object that is created and implements IFn appropriately. The wrapping namespace compiles to a class with an init method which installs a Var in the global Clojure var table under the appropriate name and with an instance of the implementing IFn as its value. Uses of this function as a symbol indirect through the global symbol table to look up the var, dereference the Var, and call the .invoke() of the resulting IFn. This means that even in the same namespace, uses of a Var indirect through the global symbol table. This is reasonable from the context of REPL driven development because it ensures that the "most recent" definition of "foo" is always used however most applications do not redefine symbols at runtime except in the well defined case of dynamic vars.

This means that, for the most part, all these little objects with .invoke() methods are wasteful because the IFn class is needed only when a function can be taken as a value, and the var indirection is needed only when a symbol may be redefined. This means that when the var is not redefined and the function is not taken as a value that there is no reason to build a separate object and incur the load time and memory footprint of an extra object when we can get away with merging multiple fn implementations into a single class. Joining several fns into a single class should yield a load time benefit, but most interestingly should yield a performance benefit because fns implemented by the same class in different methods can simply invoke each other as methods rather than having to do a var table lookup. For those fns which are taken as values we have to preserve the IFn class which can be passed as a value, however the fn value class need not provide implementations: it can defer to the merged (and faster!) merged fn blob for its implementations. This opens the doors to some really fun tricks such as merging all the fns at a namespace level, merging fns at some sort of "module" or "library" level and even trying to merge all the fns in the entire application, which would allow for near total elimination of var indirection.

CG: Wow. That sounds really ambitious. Do you have any idea what kind of speedup that will give?

RM: To be blunt I have no idea. I expect that for large applications with many dependencies I will be able to do at least a 10-25% reduction of code size by eliminating unused parts of Clojure core and of library dependencies. This number may well be larger, but any system for measuring dead code will be at least as complicated as the one I need to build so I plan on just finding out.

As far as t he function squashing goes again I have no scientific way to estimate the speedup. However, the var dereference operation chases through a JVM "transient" field which prevents the JVM from inlining the referenced IFn application. As a memory reference is amazingly expensive compared to a branch to an address in the processor instruction cache I could potentially see a factor of two or more speedup in some applications. The real issue I expect to run into here is that most Clojure applications are memory bound. Without compiler introduced transients or aggressive programmer use of transients the functional update style that Clojure promotes has a huge memory footprint and plays hell with hardware prefetching structures and thus performance. Looking at Clojure benchmarks against say Scala, Java and Haskell it's clear that while we often do well in terms of LoC and total runtime we often "feature" a factor of two or more memory usage increase. So for applications involving complex functional updates and copying of datastructures the speedup may not be huge but again there's only one way to find out.

CG: What are the major challenges of the project?

RM: I expect that the biggest challenge I'll have to deal with this summer is feature creep and keeping myself focused. Most compiler infrastructures have well developed, well defined, type driven structures for what exists in a program tree. Because as Tim Baldridge put it tools.analyzer is data driven and self descriptive I don't have that to work from. That represents a significant architectural challenge for me because I need to unlearn working with a type driven, operation limited program tree.

Figuring out what level various transformations are legitimate is gonna be difficult too. I made function merging sound really easy earlier. Unfortunately that's something that will need to take place at the namespace or whatever the merge unit is emission level. Clojure compilers to date have only ever done emission at the expression level, which means that I'm architecturally on my own. All of this combined means that I get to design the first nontrivial compiler and compiler infrastructure for Clojure with all the consequent opportunities to explore architectural rabbit holes and features that won't help me get my core product delivered. It also doesn't help that I've got a dozen other ideas for tricks my finished compiler could be modified to play which don't contribute to my stated MVP.

CG: Sounds like quite a big task. Do you think this will dig up inconsistencies in Clojure's treatment of namespaces? Will this be able to clean things up?

RM: Almost certainly not. As the saying goes, I already have enough rope to hang myself with. As there is no explicit standard laying out shall we say "common Clojure" as there is for ANSI Common Lisp, any deviations from the Java reference implementation that I introduce will just muddy the waters further. No benefit will come to Clojure as a language standard if I introduce more porting concerns than already exist between CLJ and CLJS. There will likely be some minor deviations, such as making the ^:static annotation meaningful again and making implementation details like what metadata is preserved (if any) user-tunable. However, any code that runs atop Oxcart (the working name for my project) should run without changes and with identical results atop standard JVM Clojure and tools.emitter.jvm.

CG: You mentioned before using types for traversing an AST. What language was that in? Can you describe that project?

RM: C++ unfortunately. For the graduate compilers class which I was involved with this last semester we were assigned to implement a variety of traversals and optimizations over the LLVM C++ API which makes aggressive use of in place mutability and a custom pattern matching construct to achieve very nice programmer semantics and a tight memory footprint as these things go. We did a couple things including implemen t ing constant folding and partial evaluation. We ultimately were assigned to build an iterative dataflow analysis framework using the template library and to implement dead code elimination using it. Wasn't a whole lot of fun, but it was awesome exposure to relatively modern research compilers techniques and some of the formal methods involved therein.

CG: Sounds challenging. Is there anything else you'd like to say before the end of the interview?

RM: I'd like to quickly thank Timothy Baldridge and Nicola Mometto for their previous work in this area and continued support, as well as Daniel Solano Gomez and the rest of the Clojure GSoC team for this opportunity to do something awesome and bloody hard with my summer, but that's all I've got.

CG: Where can people follow along in your progress?

RM: The source code for the current version of Oxcart is online. Once functional, Oxcart will likely be restructured into contrib libraries as the core team deems appropriate. However, I'm delaying the involved architecture decisions until after I have a working product. The above comes with the obvious disclaimer that it's being developed full time by an undergrad, is subject to daily changes and comes with no waranty. At present Oxcart does little more than tools.analyzer.jvm, however it is able to analyze and emit both itself and the Clojure core and build ASTs for loaded code. Documentation on the provisions of the Oxcart compiler and its differences from the standard Clojure runtime is forthcoming and subject to change.

CG: Nice. Thanks, Reid, for a great interview!

RM: Thanks Eric, take care!