Clojure Gazette 1.79

Typed Clojure: Interview with Di Xu

Clojure Gazette

Issue 1.79, June 08, 2014

Hi Clojurators,

This week's interview is with Di Xu, who is another Google Summer of Code participant. Di is his given name, Xu is his family name.

Clojure Gazette: Where are you from? What and where are you studying?

Di Xu: I'm from China, currently studying Software Design for my master's degree at USTC.

CG: How did you get interested in Clojure?

DX: Well, before Lisp I learned several programming languages such as Python, C, Java, and JS. The first PL I learned was Python, which supports several programming paradigms: procedural, OO, and functional. So other languages seemed similar to me. But after I read the book Hackers and Painters by Paul Graham, I was interested in Lisp right away, because that book highly recommended a PL called Lisp and quoted Eric Raymond:

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.

I was fascinated and decided to give it a try, so I read several books about Lisp, such as Land of Lisp, ANSI Common Lisp, and SICP, and wrote a blog (in Chinese) to introduce and compare Scheme and Common Lisp. I even wrote a toy interpreter in Scheme and C. But the problem with both Scheme and Common Lisp is that so few companies use these languages in the real world, because of the lack of library support for common functionality. And many Lisp programmers did what Eric Raymond said ("never actually use Lisp itself a lot"): they used Lisp to build another Lisp interpreter just for fun and then threw it away (I was one of them).

After that, I graduated from my undergraduate school and accepted an internship at a company in Beijing. At that time I attended a Lisp meetup in Beijing, and one of the speakers talked about writing web applications in Clojure without a heavyweight framework, just using ring. I didn't know much about Clojure at that time; I had just heard some people mention it. So I read Programming Clojure to learn it, and found the design of Clojure very clean. What I like most is the mature ecosystem around Clojure (lein, ring). You don't have to worry about them while developing. Also, the literal representation of basic data structures is very sweet.

What excited me most in Clojure was the ability to do real work in Lisp. You can write Java code as your library or program infrastructure and provide a clean interface to the user in the REPL, just as cascalog did. So for me, Java is like the assembly code of the JVM, and Clojure can take advantage of many pre-existing Java libraries. This makes Clojure a real-world PL.

I also found the clojure.lang.Persistent* classes really useful on their own. I used them in a school project written purely in Java. The functionality I needed was the ability to share data among many threads to reduce memory usage, with no thread able to interfere with the others. These data structures are really awesome.
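As a rough illustration of that property, here is a minimal sketch in Clojure rather than Java. The names `shared`, `versions`, and `workers` are made up for this example and are not from Di's project. Each thread derives its own version of a shared persistent map, and the original is never modified.

```clojure
;; A persistent map shared by several threads (illustrative data only).
(def shared (zipmap (range 1000) (repeat :initial)))

;; Each thread stores the version it derived here so we can inspect it later.
(def versions (atom []))

;; Four plain Java threads; each one "changes" key 0 in its own copy.
;; assoc returns a new map that shares most of its structure with `shared`.
(def workers
  (mapv (fn [id]
          (Thread. (fn [] (swap! versions conj (assoc shared 0 id)))))
        (range 4)))

(doseq [^Thread t workers] (.start t))
(doseq [^Thread t workers] (.join t))

;; The original map is untouched, whatever the threads did.
(println (get shared 0))
(println (count @versions))
```

Because `assoc` never mutates its argument, no locking is needed to keep the threads from interfering with each other.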

CG: What project are you working on?

DX: I'm currently working on Typed Clojure as my GSOC project. You can read my proposal. The idea I'm trying to implement is adding more mature support for the Assoc type, to make it more useful and so that users can use this type to accurately annotate their own functions. As the proposal mentions, Typed Clojure currently uses a special handler for Assoc.

But the tentative solution mentioned in the proposal will not be used anymore, because Ambrose Bonnaire-Sergeant and I found a better solution for this problem. I'm currently working on that solution.

CG: Can you briefly describe the better solution?

DX: Well, it's not decided either, but I'll have a try.

The tentative solution is to annotate clojure.core/assoc as:

(All [a b c d ...2]
 [a b c d1 d2 ...2 d -> (Assoc a b c d1 d2 ...2 d)])

Let me explain the meaning of it:

All introduces several type variables into the type scope. The vector in that scope is a function type: it accepts a, b, c, d1, and d2 as its arguments. a, b, and c are normal type variables, which can be instantiated with any type, like Number, String, and so on. But d1 and d2 are special type variables called dotted type variables. They are much like * in a regular expression: they can match 0, 2, 4 ... types, increasing by 2 as hinted by ...2. The function returns an Assoc type, which also has several type variables in its body, and we'll do the type check while constructing it.
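To make the "groups of two" concrete, here is a small sketch of how clojure.core/assoc behaves at runtime in plain Clojure, with no Typed Clojure involved. The var names are illustrative, and the exception class is what I'd expect on recent JVM Clojure versions; older versions or other hosts may behave differently.

```clojure
;; assoc takes a map followed by key/value pairs, so everything after
;; the first pair arrives in groups of two (0, 2, 4 ... extra arguments).
(def base {:a 1})

(def one-pair  (assoc base :b 2))        ; 0 extra arguments after the first pair
(def two-pairs (assoc base :b 2 :c 3))   ; 2 extra arguments

;; A dangling key with no value is rejected at runtime on the JVM.
;; The annotation in the interview aims to catch this kind of shape
;; mismatch statically instead.
(def odd-args-result
  (try
    (assoc base :b 2 :c)
    (catch IllegalArgumentException _
      :rejected)))

(println one-pair two-pairs odd-args-result)
```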

The problem with it is that it hardcodes ...2 into the syntax. If we want to extend it to some other function that needs ...3, we have to hardcode ...3 into the syntax too.

So we came up with a better solution for this, which is inspired by Sam Tobin-Hochstadt and Asumu Takikawa, who are Typed Racket guys.

It solves this problem in a very Lisp way: use a list. I'll give you an example of clojure.core/hash-map, which is annotated as:

(All [k v]
  [(HSeq [k v] :repeat true) *@ -> (Map k v)])

HSeq here means HeterogeneousSeq, which is essentially just a list. It has the attribute :repeat set to true, which is just like ...2 and can match 2, 4 ... types. The innovation here is the *@ syntax: it means wrap all the arguments into a list and try to match against it. In this way we don't have to hardcode anything.
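The idea of matching a whole argument list against a repeated pattern can be sketched at runtime in plain Clojure. This is only an analogy: `matches-repeat?` is a hypothetical helper written for this example, not part of Typed Clojure, and it uses predicates where the type checker would use types. It also accepts zero repetitions, which is an assumption on my part.

```clojure
;; Hypothetical helper: true when `args` is a whole number of repetitions
;; of `pattern`, where `pattern` is a vector of predicates.
(defn matches-repeat?
  [pattern args]
  (let [n (count pattern)]
    (and (pos? n)
         ;; The argument count must be a multiple of the pattern length.
         (zero? (rem (count args) n))
         ;; Every chunk of n arguments must satisfy the pattern in order.
         (every? (fn [chunk]
                   (every? identity
                           (map (fn [pred x] (pred x)) pattern chunk)))
                 (partition n args)))))

;; A [k v]-style pattern, as in the hash-map annotation.
(println (matches-repeat? [keyword? number?] [:a 1 :b 2]))
(println (matches-repeat? [keyword? number?] [:a 1 :b]))

;; A three-element pattern needs no new syntax, only a longer vector.
(println (matches-repeat? [keyword? number? string?] [:a 1 "x" :b 2 "y"]))
```

The pattern length is just data here, which mirrors the point of the design: moving from pairs to triples changes the list, not the syntax.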

CG: I see. So you could actually have [a b c] instead of [k v] and it would repeat the three variables.

DX: Yes.

CG: It sounds like the type language is getting very expressive. Do you have a background in type theory?

DX: No, actually I just read a couple of papers that Ambrose Bonnaire-Sergeant gave me while I was preparing for GSOC. I chose this project because it's similar to static analysis, in which I have some experience.

CG: What has been your experience using Typed Clojure on your own projects?

DX: Well, just as I don't have a background in type theory, I had never heard of Typed Clojure before. I've only been using Clojure for a year, so I'm quite a newcomer to Clojure.

CG: Wow! I'm sure you'll learn a lot. I'd love to hear about your experiences working with the Typed Clojure codebase. Do you have a place where people can follow your progress?

DX: Yes, I've learned a lot while reading and changing the Typed Clojure codebase. I've never read or changed such a big codebase (around 30K lines of code) before.

These are things I've learned:

  • Use :pre and :post to check the assumptions of a function. This is really useful when others are trying to contribute code: a failed assertion gives them a stack trace to find out what's wrong.

  • Type annotations not only let you check code statically, but also serve as comments for a function. Typed Clojure type checks itself with its own type system, so there are many type annotations in its code. I found this really helpful when trying to figure out the meaning of a function.

  • Use keyword arguments when you may need to extend a function in the future. It is really hard to refactor code in Clojure, especially for functions not using keyword arguments: I must grep around and run tests to find a missing argument when I add extra parameters to a function. It's much easier when using keyword arguments.

  • Accumulate test cases to do regression testing. Whenever I add a feature, I run the tests before I commit. This ensures my features don't change the code in an unexpected way.
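Two of these tips, :pre/:post conditions and keyword arguments, can be sketched in a few lines. The functions `average` and `summarize` are made up for this example and do not come from the Typed Clojure codebase. The sketch also assumes `*assert*` is left at its default of true, since the conditions are skipped otherwise.

```clojure
;; :pre and :post document and enforce a function's assumptions.
(defn average
  "Returns the arithmetic mean of a non-empty collection of numbers."
  [xs]
  {:pre  [(seq xs) (every? number? xs)]
   :post [(number? %)]}
  (/ (reduce + xs) (count xs)))

;; A violated precondition throws an AssertionError that names the
;; failing check, which is what makes the stack trace useful.
(def empty-input-result
  (try
    (average [])
    (catch AssertionError _
      :precondition-failed)))

;; Keyword arguments: new options can be added later without touching
;; existing call sites, because every option has a default.
(defn summarize
  [xs & {:keys [label round?] :or {label "avg" round? false}}]
  (let [avg (average xs)]
    {:label label
     :value (if round? (Math/round (double avg)) avg)}))

(println (summarize [1 2]))
(println (summarize [1 2] :label "mean" :round? true))
```

Adding a third option to `summarize` later would only mean extending the `:keys` vector and the `:or` map; callers that don't use it stay as they are.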

You can follow my progress in my [repo](https://github.com/xudifsd/core.typed/commits/repeat-support) on GitHub. I was going to push the features I implemented into several branches, which is why I named that branch "repeat-support", but that may be difficult to manage and get merged, so I'm going to use that branch to store all my code.

CG: Great. Is there anything else you'd like to say before the end of the interview?

DX: I'd like to thank Rich for creating such an awesome programming language, and thank Ambrose for creating Typed Clojure and choosing me to participate in GSOC.

CG: Thanks, Di, for a great interview.

DX: Thanks, it's been a very interesting experience.