Clojure Gazette 177: The Hidden Costs of Abstraction

Written by Eric Normand. Published: June 13, 2016.

Read more about this week's sponsor, Takipi, at the end of this letter.

Hi Clojurists,

We started exploring the costs of abstraction with code size. The bigger your code, the more bugs it has and the more money it costs to maintain. Then we looked at runtime costs, which cost money in server usage. If we choose the right tools, we minimize those costs. However, there are some real mental costs to abstraction that we have not addressed.

Whenever we write new code, we have to decide what file to put it in. It may be an easy decision, in which case it's a low cost. But it may be difficult. In fact, in the worst case, we have to look at all of the filenames in our project, evaluate them all, and decide the code has to go in a new file. Then we need to create a new directory and name it, create a new file in it and name it, put all of the required ceremony at the top of the file, write our code, and finally add the incantation that refers to the new file in the file we need to call it from. And what if we want test coverage? We'd need to create a new test file, with all the required ceremony, and only then could we start writing tests, which I'm not even going to count.

All told, the cost of organizing our code into files is enormous. I wonder how much it influences us to avoid abstracting more. When a new abstraction requires so much work just to make a home for it, it may not be worth writing.

In the best case, the file already exists and all we have to do is figure out where in the file it goes. Sometimes that doesn't matter so much, but decisions have real mental cost.

Object-oriented languages often make deciding where to put something easy. The class of the receiver determines the file a method should go in. Functional languages don't have such a luxury. I think the overhead of creating a file is paid for by the first abstraction that goes in that file. It may, in the long run, be averaged across all of the abstractions that eventually go there. But it's a big cost to pay up front, and we don't even know how many abstractions are going to end up in there. How many potential abstractions never reach reality?

Another big cost is finding a good, descriptive name for the abstraction. Most of the reason for making the abstraction in the first place is that you can attach a name to it. Names help attach meaning to a chunk of code. But we know naming is hard, much harder than writing the line of code it describes. Choosing the name is a huge cost. I'm certain I've avoided making an abstraction just to avoid thinking about the name.

A lot of the cost of naming is not in thinking up the name, but in changing the name of an existing thing. Sure, in the best case, that's just deleting the name and writing a new one. But in the worst case that abstraction is referred to by name all over your code. You'll have to change every reference. How many abstractions keep their crappy names because of the cost of changing them?

When you've got a large number of abstractions, how do you know if a particular one is being used? Sometimes an abstraction is only used from a different file. We could delete its last usage and not know it without analyzing the entire codebase. But those analysis tools are rarely available for the language we're using. We have to keep our eyes open for unused code. And as we accumulate more abstractions, that becomes harder.
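To make the "analyze the entire codebase" step concrete, here is a rough sketch of the crudest possible version: a text scan that collects every `defn`/`def` name and reports the ones that are never mentioned again. It is written in Python purely for illustration, and everything in it (`find_unused`, `entry_points`, the sample sources) is hypothetical. It is not how any existing tool works; it does not understand namespaces, macros, or metadata, so treat its output as a list of suspects rather than a verdict.

```python
import re

# Clojure-style definitions: (def name ...), (defn name ...), (defn- name ...).
DEF_PATTERN = re.compile(r"\(defn?-?\s+([^\s()\[\]]+)")

# Characters that can be part of a symbol; "/" is left out on purpose so that
# namespace-qualified uses such as util/slugify still count as mentions.
SYMBOL_CHARS = r"\w\-*+!?<>=."


def find_unused(sources, entry_points=()):
    """Return sorted names that are defined but mentioned only once overall.

    `sources` maps a filename to its text. `entry_points` lists names that
    are called from outside the codebase and should never be reported.
    """
    defined = set()
    for text in sources.values():
        defined.update(DEF_PATTERN.findall(text))

    all_text = "\n".join(sources.values())
    unused = []
    for name in defined - set(entry_points):
        mention = re.compile(
            rf"(?<![{SYMBOL_CHARS}]){re.escape(name)}(?![{SYMBOL_CHARS}])"
        )
        # One mention means the definition itself is the only occurrence.
        if len(mention.findall(all_text)) <= 1:
            unused.append(name)
    return sorted(unused)


# Hypothetical two-file project.
sources = {
    "core.clj": (
        "(ns app.core (:require [app.util :as util]))\n"
        "(defn handler [req] (util/slugify (:title req)))\n"
    ),
    "util.clj": (
        "(ns app.util)\n"
        "(defn slugify [s] s)\n"
        "(defn old-helper [x] x)\n"
    ),
}

print(find_unused(sources, entry_points=("handler",)))
```

Even this toy version shows why the problem is hard to solve by eye: `old-helper` lives in a different file from every caller that could have used it, so nothing in `core.clj` hints that it is dead.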

There's one last cost: if you need a new abstraction, how do you figure out if you've already got the one you need somewhere in the code? You may have already paid all of the costs for that abstraction. You shouldn't pay again if you don't have to. But you don't even know if it's there, hiding in plain sight, in one of the many files in your project. Happy hunting! I don't know of any tools to find functions by purpose other than the Haskell search engine Hoogle. In Hoogle, you enter the type signature of what you're looking for and it tells you possible matches.
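The idea of searching by type signature can be sketched in a few lines. The example below is not Hoogle and makes no claim about how Hoogle works internally; it is a minimal Python illustration that only does exact matches on type annotations, and the names `signature_of`, `search_by_signature`, and the sample functions are all made up for the purpose.

```python
from typing import get_type_hints


def signature_of(fn):
    """Return (parameter types in order, return type) from fn's annotations."""
    hints = get_type_hints(fn)
    returns = hints.pop("return", None)
    return tuple(hints.values()), returns


def search_by_signature(functions, params, returns):
    """Names of the functions whose annotated signature matches exactly."""
    wanted = (tuple(params), returns)
    return sorted(fn.__name__ for fn in functions if signature_of(fn) == wanted)


# A hypothetical catalog of functions already sitting in the project.
def word_count(text: str) -> int:
    return len(text.split())


def char_count(text: str) -> int:
    return len(text)


def shout(text: str) -> str:
    return text.upper()


def repeat(text: str, times: int) -> str:
    return text * times


catalog = [word_count, char_count, shout, repeat]

# "I need something that takes a string and gives me a number."
print(search_by_signature(catalog, [str], int))
```

The search only works if the functions carry type information in the first place, and unannotated parameters are silently ignored here, which may be part of why this kind of tool shows up in a statically typed language like Haskell first.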

I had never written all of these costs down. I've experienced them all, but organizing them has given me a new perspective on them. With all of these costs, we should get some help. Our tools should help us minimize all of these costs. Luckily, Cider and Cursive handle some of this and could be extended. We need to abstract more, and any barrier to that should be systematically eliminated. Could all of these costs be why we don't abstract more? Who is to blame for our messy, unabstracted code? Next time.

Rock on!

PS Want to get this in your email? Subscribe!
PPS Want to advertise to smart and talented Clojure devs?

Sponsor: Takipi

How many exceptions are your users seeing in production? Takipi can help you answer that. Takipi analyzes exceptions from your production servers and shows them with lots of context: local variables and source code. They approached me about a sponsorship and within 5 minutes (an install and one line in my project.clj) I had it set up. After an initial code analysis, it was seeing exceptions happening inside of my existing application. Show them your support for the Gazette. Give it a try. It's free and easy to set up. Then let them know how it went and thank Takipi for sponsoring the issue.