Sunday, 23 December 2012

Lessons learning Haskell



Lessons learning Haskell

It's often claimed that learning Haskell will make you a better programmer in other languages. I like the idea that there's no such thing as a good programmer, just a programmer who follows good practices. As soon as we stop following good practices  we suck again. So, Haskell must introduce and indoctrinate better practices that we carry back to our other languages. Right? I think it's true but it's not obvious, so I've written this article to outline some of the habits and practices that I think changed after I used Haskell for a while.

Purity

Haskell makes your functions pure by default. You have to change the type signature if you want to do IO, and mark every function between your function and main() as tainted with IO. This forces you to be conscious of IO. It encourages you to keep functions that do IO as high up the stack, close to main, as possible. Purity also means you can't read or write global variables, that's just another type of IO. If your function needs some data, you pass it as a parameter. So the type signature of a pure function is a complete itinerary of everything it can access, and therefore is a very good spec for what the function does in most cases.

Experienced programmers who pay attention already know that IO and global vars mustn't be taken lightly. Every IO operation is a potential source of errors, exceptions, and failures. Functions that do IO are difficult or impossible to test. Programmers know this, but Haskell makes sure you never forget it when it matters. It incessantly shunts you in the direction of keeping the call-stack of IO-doers as small as possible.

When I go back to my other languages, I now put all my IO in top-level functions that are called directly from main or the event loop. I gather every scrap of data I need to do the computation. I marshal the data into typed structures and pass it all into a pure function that does the work. Then the structure that's returned is demarshalled and transmitted, displayed or stored as required. If I need to do some computation to determine what data I need to fetch, I make sure this is not commingled with the IO functions, so my code fetches data, calculates what else needs to be fetched, fetches that data, and so on.

This has some nice effects. The IO doers are isolated and distinct. Error handling and exception catching is clearer and simpler. The compute code is pure. This makes it much easier to test, debug, and understand.


Clean Syntax for Static Types


boo :: Map Integer String -> String -> Integer

I have no idea what the word "boo" is supposed to mean when used as a function name. But I can be almost certain what this "boo" does. It takes a map that has Integer keys and String values, and a second argument that is just a String, and it returns an Integer. So I'm fairly sure that this function does a reverse mapping - you give it a String value and a Map, and it finds the Integer key for that value.

A lot of this depends on the fact that the signature tells me that there is no IO going on. If the bit at the end of the line was "-> IO Integer" instead of "-> Integer", all bets are off. The function could be sending the Map and String to launch control, and -> IO Integer could be the number of seconds it took to get a response, or the price of a gallon of gas in pennies (hence "boo", perhaps). The point is, you can't confidently reason about a function from its signature if IO is involved.

The Haskell type signature of a function is particularly clear and easy to follow. Functions just map one type to another "foo :: Author -> DateOfBirth". Parameterized types just list the parameter types "Map Integer String". There are very few boilerplate tokens for the eye to scan.

But how has this changed what I do in other languages? I now sketch out the design for larger components in this Haskell signature notation. Particularly if I'm writing a library with a public API. I've shown these sketches to other developers as we discuss the design of a program, and they get it. Most of the time, I don't mention that the notation is Haskell. The only slight oddity for them is the use of -> between function "inputs". They expect foo :: A,B -> C instead of foo :: A -> B -> C. But they get over it immediately, and I have never had to mention currying or partial application, since they're usually just pleased that the notation is clearer than anything else we've ever used.

Container Operations


I think one of the reasons I started using Lisp, then Erlang and then Haskell was that I must have typed "for (size_t i=0; i<..." just about a million times and I was sick of it. C++ teases with approximations to map, filter, fold, scan, just enough so that you'll try them for a few months until you eventually give up or your colleagues smack you. When I want to filter items from a container, I don't want to start by saying "for(size_t i=0...". I want to say "filter f xs" and I want my colleagues to read that too.

It might seem like an overreaction. But even in big classfull C++ projects, where I was senior developer, I spent my days writing functions. Functions consisting of loops and branches, because C++ didn't do a great job of accommodating "operations on containers". Despite all the guff written about STL separating iterators from algorithms from containers (from allocators...ahem), nobody provided a simple set of primitive container operations that regular programmers would use.  

Using Haskell for a while, the effect goes further. It forced me to think of every such problem as a chain of the primitive list operations, maps, folds, filters and scans. Now I always think in these terms. I "see" the transformation of a container as a simple sequence of these operations. Before, I would have thought in terms of munging multiple actions into the body of a single loop.

I see things using higher level concepts, and I write my comments and code with these in mind. I usually still reach for the trusty for loop in C++, but I'll factor together common container operations where appropriate into higher level functions. Filtering is a really common example that seems to come up all the time.

What's Changed


Using Haskell definitely gives you a lot of warm fuzzy feelings, (until your filehandle is closed before you've actually read the data because you didn't ask for a result, silly). Part of the joy of the language is that it forces you to take a new approach to problems you've solved conventionally before. When the answer clicks and you see that the new approach is more elegant and powerful and general than what you've been using all these years, it's hard not to smile with sheer pleasure.

In real world commercial software projects, if you don't properly test your code, and do code reviews, it doesn't matter what language you use, you're leaving the big wins on the table. A team that does these things well consistently will beat any team that does not, regardless of what language or technology stack they're using.

Using Haskell changed my practices so that the code I write is easier to test and easier to code review.  There's a bunch of other stuff too, some of it can be articulated, some will probably always be just a "warm fuzzy".

8 comments:

  1. And in a parallel universe, programmers write programs that must run with predictable performance or throughput. Haskell kills those with a fallacy: that you don't need to know the "machinery" to write code. Taken to the extreme, programmers that know only Haskell would be blind to the choices of the GHC/whathaveyou implementers in how their code is mapped to actual hardware. They would not be able to make programs that practically work any more.

    In other words Haskell only makes you a better logical mind, by helping you organize your thoughts and make fewer logic mistakes in your code. Whereas it makes you a better *programmer* is another discussion entirely, and I still have my doubts about the quality of a programmer who can't reason comfortably about time and space complexity because their language don't let them.

    ReplyDelete
    Replies
    1. This argument can be made against Java, Ruby, Python... With Haskell, you certainly reason a little differently -- one the one hand, the runtime is much smaller than in those languages and creating new objects is much cheaper; on the other, laziness can be strange and the tricks to control it, even stranger.

      There is a growing (though small) collection of lore about Haskell performance and it is taken very seriously by the community. Thus the plethora of tooling constant-memory stream processing in recent years (I do hope it is consolidated soon).

      It is relatively easy in Haskell to drop to C, too.

      Delete
    2. The point of the article is, what he learned from Haskell and ended up applying to his programming in other languages, not that Haskell is THE language to learn if you want to be a good programmer.

      Delete
    3. It's a bit silly to suggest that it's impossible to reason about time and space behaviour of GHC Haskell programs. There are excellent profiling and thread monitoring tools, and if you make the simplifying assumption that things are going to be evaluated using lazy evaluation, then the performance is analysable.

      Occasionally GHC will improve performance by making things stricter when it can prove that something will always need to be evaluated anyway, but the effect of this on your program is so rarely negative (the compiler authors will consider it a bug if so), that it's ignorable.

      In practice, the mental model where you just think of expressions being reduced in outermost-first order with sharing, and where you regard the number of reduction steps being proportional to time usage, and the textual length of the expressions as reduction occurs as proportional to space usage has been more than sufficient for me to understand the time and space behaviour of real world programs.

      It definitely takes some getting used to, but then again, nobody is born with intuition for the time and space behaviour of imperative programs either. That requires years of learning and getting-used-to, and lazy evaluation turns things inside out and requires perhaps a similar amount of learning. In the end, the analysis can at times be slightly more subtle, but it's really not much harder to be honest.

      In either case, real world programs are complicated enough that you're better off just running the profiler and seeing where the hotspots are that way, and then usually the real solution amounts to just choosing a better data structure or a better algorithm rather than micro-optimisations.

      Delete
  2. knz42, you are talking about a tiny fraction of the world's code that has harsh performance requirements. Consider the huge amount of code in GC'd languages that gets no guarantees about latency, and used anyway.

    Inside a performance-critical application, more than 90% of the code will typically be uncritical.

    The claim that Haskell doesn't *let* you reason about space or time is a fallacy. It is indeed *harder* (the corollary of the ease to reason about correctness), but definitely possible.

    So instead of an uphill struggle for correctness in the entire program, and an easier battle for performance in the tiny part that is critical - Haskell gives you an easy time for the entire program, and a slightly harder time for the tiny part that is critical.

    Haskell also has excellent profiling facilities (time profilers, visual heap profilers, high-performance thread event profilers), and excellent parallelism. These together also make it easier to beat the performance of other languages, especially in a multi-core environment.

    ReplyDelete
  3. And another word about GHC: It makes it pretty easy to actually see how it mapped to the hardware.

    GHC can list all of the optimizations it actually ran, show you the intermediate optimized representations (Core), and that is definitely used by Haskellers when they need to squeeze performance out of their programs.

    In my experience using Haskell, though, I almost never have to worry about performance and correctness is almost free as well. It is an extremely great trade-off.

    ReplyDelete
  4. "if you don't properly test your code, and do code reviews, it doesn't matter what language you use, you're leaving the big wins on the table. A team that does these things well consistently will beat any team that does not, regardless of what language or technology stack they're using."

    Really liked this section of the post - Truer words haven't been spoken.

    ReplyDelete