Sunday, October 19, 2014

HLint now spots bad unsafePerformIO

Summary: I've just released a new version of HLint that can spot an unsafePerformIO which should have NOINLINE but doesn't.

I've just released HLint v1.9.10, a tool to suggest improvements to Haskell code. I don't usually bother with release announcements of HLint, as each version just fixes a few things and adds a few hints - it's all in the changelog (plus there have now been 102 releases). The latest release attempts to make using unsafePerformIO a little bit safer. A common idiom for top-level mutable state in Haskell is:

myGlobalVar :: IORef Int
myGlobalVar = unsafePerformIO (newIORef 17)

That is, define a top-level CAF (a constant applicative form - a definition that takes no arguments) and bind it to unsafePerformIO of some newly created mutable state. But the definition above is unsafe. GHC might decide myGlobalVar is cheap and inline it into several use sites, duplicating the variable so that different functions update different copies. Running this code through the latest version of HLint we see:

Sample.hs:2:1: Error: Missing NOINLINE pragma
Found:
  myGlobalVar = unsafePerformIO (newIORef 17)
Why not:
  {-# NOINLINE myGlobalVar #-}
  myGlobalVar = unsafePerformIO (newIORef 17)

HLint has spotted the problem, and suggested the correct fix.
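
For reference, the complete safe version of the idiom, with the imports filled in, looks something like the following (a minimal sketch; bumpGlobal is just a made-up example of using the shared state):

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE stops GHC inlining myGlobalVar, so every use site shares the same IORef
{-# NOINLINE myGlobalVar #-}
myGlobalVar :: IORef Int
myGlobalVar = unsafePerformIO (newIORef 17)

-- increment and read the shared counter
bumpGlobal :: IO Int
bumpGlobal = do
    modifyIORef myGlobalVar (+1)
    readIORef myGlobalVar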

Trying it for real

Let's take the package slave-thread-0.1.0 and run HLint on it. Slave thread is a new package that helps you ensure you don't end up with ghost threads or exceptions being thrown but missed - a useful package. Running HLint we see:

Sample.hs:19:1: Error: Missing NOINLINE pragma
Found:
  slaves = unsafePerformIO $ Multimap.newIO
Why not:
  {-# NOINLINE slaves #-}
  slaves = unsafePerformIO $ Multimap.newIO

Sample.hs:20:3: Warning: Redundant $
Found:
  unsafePerformIO $ Multimap.newIO
Why not:
  unsafePerformIO Multimap.newIO

Sample.hs:25:1: Error: Eta reduce
Found:
  fork main = forkFinally (return ()) main
Why not:
  fork = forkFinally (return ())

Sample.hs:55:28: Warning: Redundant $
Found:
  PartialHandler.totalizeRethrowingTo_ masterThread $ mempty
Why not:
  PartialHandler.totalizeRethrowingTo_ masterThread mempty

Sample.hs:72:5: Error: Use unless
Found:
  if null then return () else retry
Why not:
  Control.Monad.unless null retry

HLint has managed to spot the missing NOINLINE pragma, and also a handful of tiny cleanups that might make the code a little more readable. Fortunately (and independent of HLint), the NOINLINE pragma was added in slave-thread-0.1.1, so the library no longer suffers from that bug.

Friday, October 17, 2014

Fixing Haddock docs on Hackage

Summary: A few weeks ago Hackage stopped generating docs. I have a script that generates the docs, and also fixes some Haddock bugs.

Update: The Haddock documentation generators are running once again, but may be somewhat behind for a little while. A few weeks ago Hackage stopped generating documentation, so if you look at recently uploaded pages they tend to either lack docs, or have very alert maintainers who did a manual upload. I've packaged up my solution, which also fixes some pretty annoying Haddock bugs. Given that I can now get docs faster and better with my script, I'll probably keep using it even after Haddock on Hackage gets restarted.

How to use it

  • You are likely to get better results if you always install the packages you use with documentation.
  • Ensure you have tar, curl, cp and cabal on your $PATH.
  • cabal update && cabal install neil
  • Make a release, don't touch any other code, then make sure you are in the project directory.
  • neil docs --username=YourHackageUsername
  • Type in your Hackage password at the prompt.

And like that, your docs are online. To see an example of something that was generated with this process, look at Shake.

What I fixed

I automated the process using scripts originally taken from the lens library, supplemented with suggestions from Reddit. I then do a number of manual fixups.

  • Haddock makes cross-module links default to types when it doesn't know what the target is. Concretely, if I write 'Development.Shake.need' in Haddock it generates a link to #t:need, which doesn't exist, when it should be #v:need - I fix that.
  • Update: fixed in Haddock 1.14 or above, as per this ticket.
  • On Windows, if you use CPP and multiline bird-tick (>) Haddock blocks you get a blank line between each line. I fix that.
  • If you follow some of the simpler scripts links outside your package won't work (or at least, didn't for me). I fix that.

The neil tool

The neil tool is my personal set of handy Haskell scripts. I make all my releases with it (neil sdist), and do lots of checks that my packages conform to my rules (neil check). I also use it for driving my Travis instances. It's in fairly regular flux. Until now, I've always kept it in Darcs/Git and never released it - it's personal stuff tuned to how I work.

You might also notice that neil provides a library. Don't use that, I intend to delete it in a few weeks. Update: library removed.

Monday, October 13, 2014

Shake's Internal State

Summary: Shake is not like Make, it has different internal state, which leads to different behaviour. I also store the state in an optimised way.

Update: I'm keeping an up to date version of this post in the Shake repo, which includes a number of questions/answers at the bottom, and is likely to evolve over time to incorporate that information into the main text.

In order to understand the behaviour of Shake, it is useful to have a mental model of Shake's internal state. To be a little more concrete, let's talk about Files which are stored on disk, which have ModTime values associated with them, and a function modtime which gives the ModTime of a FilePath (Shake is actually generalised over all those things). Let's also imagine we have the rule:

file *> \out -> do
    need [dependency]
    run

So file depends on dependency and rebuilds by executing the action run.

The Make Model

In Make there is no additional state, only the file-system. A file is considered dirty if it has a dependency such that:

modtime dependency > modtime file

As a consequence, run must update modtime file, or the file will remain dirty and rebuild in subsequent runs.
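
As a rough sketch of that check in Haskell (not any real Make implementation; dirtyMake is a made-up name):

import System.Directory (getModificationTime)

-- in the Make model a file is dirty if any dependency has a newer modification time
dirtyMake :: FilePath -> [FilePath] -> IO Bool
dirtyMake file deps = do
    mFile <- getModificationTime file
    mDeps <- mapM getModificationTime deps
    return $ any (> mFile) mDeps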

The Shake Model

For Shake, the state is:

database :: File -> (ModTime, [(File, ModTime)])

Each File is associated with a pair containing the ModTime of that file, plus a list of each dependency and their modtimes, all from when the rule was last run. As part of executing the rule above, Shake records the association:

file -> (modtime file, [(dependency, modtime dependency)])

The file is considered dirty if any of the information is no longer current. In this example, if either modtime file changes, or modtime dependency changes.

There are a few consequences of the Shake model:

  • There is no requirement for modtime file to change as a result of run. The file is dirty because something changed, after we run the rule and record new information it becomes clean.
  • Since a file is not required to change its modtime, things that depend on file may not require rebuilding even if file rebuilds.
  • If you update an output file, it will rebuild that file, as the ModTime of a result is tracked.
  • Shake only ever performs equality tests on ModTime, never ordering, which means it generalises to other types of value and works even if your file-system sometimes has incorrect times.

These consequences allow two workflows that aren't pleasant in Make:

  • Generated files, where the generator changes often, but the output of the generator for a given input changes rarely. In Shake, you can rerun the generator regularly, and by using a function that writes only on change (writeFileChanged in Shake) you don't rebuild further - see the sketch after this list. This technique can reduce some rebuilds from hours to seconds.
  • Configuration file splitting, where you have a configuration file with lots of key/value pairs, and want certain rules to only depend on a subset of the keys. In Shake, you can generate a file for each key/value and depend only on that key. If the configuration file updates, but only a subset of keys change, then only a subset of rules will rebuild. Alternatively, using Development.Shake.Config you can avoid the file for each key, but the dependency model is the same.
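
To sketch the generated-files workflow as a Shake rule (the file names and the generator command here are hypothetical, not from any real project):

import Development.Shake

generatorRule :: Rules ()
generatorRule =
    "output.txt" *> \out -> do
        -- rerun whenever the generator or its input changes...
        need ["generator.hs", "input.txt"]
        Stdout contents <- cmd "runhaskell generator.hs input.txt"
        -- ...but only touch the output if the contents actually changed,
        -- so rules that need output.txt stay clean
        writeFileChanged out contents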

Optimising the Model

The above model expresses the semantics of Shake, but the implementation uses an optimised model. Note that the original Shake paper gives the optimised model, not the easy to understand model - that's because I only figured out the difference a few days ago (thanks to Simon Marlow, Simon Peyton Jones and Andrey Mokhov). To recap, we started with:

database :: File -> (ModTime, [(File, ModTime)])

We said that File is dirty if any of the ModTime values change. That's true, but what we are really doing is comparing the first ModTime with the ModTime on disk, and the list of second ModTimes with those in database. Assuming we are passed the current ModTime on disk, then a file is valid if:

valid :: File -> ModTime -> Bool
valid file mNow =
    mNow == mOld &&
    and [fst (database d) == m | (d,m) <- deps]
    where (mOld, deps) = database file

The problem with this model is that we store each File/ModTime pair once for the file itself, plus once for every dependency. That's a fairly large amount of information, and in Shake both File and ModTime can be arbitrarily large for user rules.

Let's introduce two assumptions:

Assumption 1: A File only has at most one ModTime per Shake run, since a file will only rebuild at most once per run. We use Step for the number of times Shake has run on this project.

Consequence 1: The ModTime for a file and the ModTime for its dependencies are all recorded in the same run, so they share the same Step.

Assumption 2: We assume that if the ModTime of a File changes, and then changes back to a previous value, we can still treat that as dirty. In the specific case of ModTime that would require time travel, but even for other values it is very rare.

Consequence 2: We only use historical ModTime values to compare them for equality with current ModTime values. We can instead record the Step at which the ModTime last changed, assuming all older Step values are unequal.

The result is:

database :: File -> (ModTime, Step, Step, [File])

valid :: File -> ModTime -> Bool
valid file mNow =
    mNow == mOld &&
    and [sBuilt >= changed (database d) | d <- deps]
    where (mOld, sBuilt, sChanged, deps) = database file
          changed (_, _, sChanged, _) = sChanged

For each File we store its most recently recorded ModTime, the Step at which it was built, the Step when the ModTime last changed, and the list of dependencies. We now check that the Step at which this file was built is at least the Step at which each dependency last changed (for example, a file built at Step 5 stays clean over a dependency that last changed at Step 3). Using the assumptions above, the original formulation is equivalent.

Note that instead of storing one ModTime per dependency, plus one for the file itself, we now store exactly one ModTime plus two small Step values.

We still store each file many times, but we reduce that by creating a bijection between File (arbitrarily large) and Id (small index) and only storing Id.
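
As a rough sketch of that bijection (this is not Shake's real code; Ids are just wrapped Ints):

import qualified Data.Map as Map
import qualified Data.IntMap as IntMap

newtype Id = Id Int deriving (Eq, Ord, Show)

-- store each full File value once, and refer to it everywhere else by its small Id
data Ids file = Ids
    {toId   :: Map.Map file Id
    ,fromId :: IntMap.IntMap file
    }

-- look up a file's Id, allocating a fresh one if it hasn't been seen before
addFile :: Ord file => file -> Ids file -> (Id, Ids file)
addFile f ids = case Map.lookup f (toId ids) of
    Just i -> (i, ids)
    Nothing -> (Id n, Ids (Map.insert f (Id n) (toId ids)) (IntMap.insert n f (fromId ids)))
        where n = IntMap.size (fromId ids)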

Implementing the Model

For those who like concrete details, which might change at any point in the future, the relevant definition is in Development.Shake.Database:

data Result = Result
    {result    :: Value   -- the result when last built
    ,built     :: Step    -- when it was actually run
    ,changed   :: Step    -- when the result last changed
    ,depends   :: [[Id]]  -- dependencies
    ,execution :: Float   -- duration of last run
    ,traces    :: [Trace] -- a trace of the expensive operations
    } deriving Show

The differences from the model are:

  • ModTime became Value, because Shake deals with lots of types of rules.
  • The dependencies are stored as a list of lists, so we still have access to the parallelism provided by need, and if we start rebuilding some dependencies we can do so in parallel.
  • We store execution and traces so we can produce profiling reports.
  • I haven't shown the File/Id mapping here - that lives elsewhere.
  • I removed all strictness/UNPACK annotations from the definition above, and edited a few comments.

As we can see, the code follows the optimised model quite closely.

Tuesday, October 07, 2014

Bake - Continuous Integration System

Summary: I've written a continuous integration system, in Haskell, designed for large projects. It works, but probably won't scale yet.

I've just released bake, a continuous integration system - an alternative to Jenkins, Travis, Buildbot etc. Bake eliminates the problem of "broken builds": a patch is never merged into the repo before it has passed all the tests.

Bake is designed for large, productive, semi-trusted teams:

  • Large teams where there are at least several contributors working full-time on a single code base.
  • Productive teams which are regularly pushing code, many times a day.
  • Semi-trusted teams where code does not go through manual code review, but code does need to pass a test suite and perhaps some static analysis. People are assumed to be fallible.

Current state: At the moment I have a rudimentary test suite, and it seems to mostly work, but Bake has never been deployed for real. Some optional functionality doesn't work, some of the web UI is a bit crude, the algorithms probably don't scale and all console output from all tests is kept in memory forever. I consider the design and API to be complete, and the scaling issues to be easily fixable - but it's easier to fix after it becomes clear where the bottleneck is. If you are interested, take a look, and then email me.

To give a flavour, the web GUI of a running Bake system looks like this:

The Design

Bake is a Haskell library that can be used to put together a continuous integration server. To run Bake you start a single server for your project, which coordinates tasks, provides an HTTP API for submitting new patches, and a web-based GUI for viewing the progress of your patches. You also run some Bake clients which run the tests on behalf of the server. While Bake is written in Haskell, most of the tests are expected to just call some system command.

There are a few aspects that make Bake unique:

  • Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.
  • Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.
  • Bake allows multiple clients to run tests, even if some tests can only be run on certain clients, allowing both parallelisation and specialisation (testing both Windows and Linux, for example).
  • Bake can detect that tests are no longer valid, for example because they access a server that is no longer running, and report the issue without blaming the submitted patches.

An Example

The test suite provides both an example configuration and commands to drive it. Here we annotate a slightly simplified version of the example; for the lists of imports see the original code.

First we define an enumeration for where we want tests to run. Our server is going to require tests on both Windows and Linux before a patch is accepted.

data Platform = Linux | Windows deriving (Show,Read,Eq)
platforms = [Linux,Windows]

Next we define the test type. A test is something that must pass before a patch is accepted.

data Action = Compile | Run Int deriving (Show,Read)

Our type is named Action. We have two distinct types of tests, compiling the code, and running the result with a particular argument. Now we need to supply some information about the tests:

allTests = [(p,t) | p <- platforms, t <- Compile : map Run [1,10,0]]

execute :: (Platform,Action) -> TestInfo (Platform,Action)
execute (p,Compile) = matchOS p $ run $ do
    cmd "ghc --make Main.hs"
execute (p,Run i) = require [(p,Compile)] $ matchOS p $ run $ do
    cmd ("." </> "Main") (show i)

We have to declare allTests, the list of all tests that must pass, and execute, which gives information about a test. Note that the test type is (Platform,Action), so a test is a platform (where to run the test) and an Action (what to run). The run function gives an IO action to run, and require specifies dependencies. We use an auxiliary matchOS to detect whether a test is running on the right platform:

#if WINDOWS
myPlatform = Windows
#else
myPlatform = Linux
#endif

matchOS :: Platform -> TestInfo t -> TestInfo t
matchOS p = suitable (return $ p == myPlatform)

We use the suitable function to declare whether a test can run on a particular client. Finally, we define the main function:

main :: IO ()
main = bake $
    ovenGit "http://example.com/myrepo.git" "master" $
    ovenTest readShowStringy (return allTests) execute
    defaultOven{ovenServer=("127.0.0.1",5000)}

We define main = bake, then fill in some configuration. We first declare we are working with Git, and give a repo name and branch name. Next we declare what the tests are, passing the information about the tests. Finally we give a host/port for the server, which we can visit in a web browser or access via the HTTP API.

Using the Example

Now we have defined the example, we need to start up some servers and clients using the command line for our tool. Assuming we compiled as bake, we can write bake server and bake client (we'll need to launch at least one client per OS). We can view the state by visiting http://127.0.0.1:5000 in a web browser.

To add a patch we can run bake addpatch --name=cb3c2a71, using the SHA1 of the commit, which will try and integrate that patch into the master branch, after all the tests have passed.

Sunday, October 05, 2014

How to Rewrite the Prelude

Summary: Rewriting the Prelude should be done in a separate package and take a few years to complete, hopefully building a consensus and improving the result.

My recent suggestion that the Prelude shouldn't be rewritten to incorporate Foldable/Traversable generated a lot of feedback. It's clear some people strongly agree, and some strongly disagree. So instead of continuing the argument, in this post I'm going to describe how I think the people who want to change the Prelude should go about it.

There were a lot of comments that we should optimise Haskell for practitioners, not beginners - in particular that there should be Prelude (for advanced users) and Prelude.Simple (for beginners). My personal opinion is that if we have two Preludes, why not make the beginner one the default one? Switching to the non-default one probably takes one line of moderately advanced syntax (currently import Prelude(); import Prelude.Advanced). For a beginner, writing a simple one line program, that more than doubles the complexity. For a moderate user, that's almost certainly outweighed by lines of LANGUAGE pragmas. (I'm also not sure that I want the "Advanced" Prelude, but that's not relevant to this post.)

Step 1: Write the new Prelude in a package

The first step to proposing a new Prelude should be to write Prelude.Advanced in a separate package on Hackage. Let's assume we have a package base-advanced with a module Prelude.Advanced. The Prelude is not trivial, and translating high-level design goals (e.g. move Foldable into the Prelude) into concrete code requires making a number of design choices.
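
As a purely hypothetical sketch of how such a module might start out (only the base-advanced/Prelude.Advanced names from above are assumed; everything else is the ordinary base library):

module Prelude.Advanced
    ( module Prelude
    , Foldable, foldMap, toList
    , Traversable, traverse, mapM
    ) where

-- hide the list-specific mapM and re-export the generalised one
import Prelude hiding (mapM)
import Data.Foldable (Foldable, foldMap, toList)
import Data.Traversable (Traversable, traverse, mapM)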

Step 2: Gain users and feedback

Rather than roll out the Prelude to everyone at once, slowly try and persuade people that your Prelude is superior to the existing one. In doing so, advanced users can start trying out the Prelude, and describing their experiences with it. Does it require Foldable to gain a size method? Does it show up a GHC error message that could be better? Do we need some new type of defaulting? Does profiling give poor results without some manually inserted cost centre? Is a direct list isomorphism a better type class to start from (as F# does with IEnumerable)? Given the number of enthusiastic supporters, this step should be easy.

Step 3: Incorporate feedback

The feedback from step 2 needs incorporating into the package, and potentially into GHC itself. I expect that will result in a significant number of revisions, and perhaps significant departures from the original design. Note that the classy-prelude package is following the steps above, and has had 38 revisions on Hackage so far.

Step 4: Gain acceptance

Before going any further, some critical mass of people need to agree the new way is preferable. There may turn out to be competing proposals, and hopefully as a community we can find consensus and identify the best ideas (as we are currently doing with IO streaming libraries).

Step 5: Move to base

Once we have agreement on what the new Prelude should look like, it should probably be moved into base. At this point, based on all the experience we've got, we can decide whether it becomes Prelude or Prelude.Advanced. At the same time as moving into base, we should decide on the story for the simpler Prelude, but what that decision looks like will depend on what the proposed Prelude looks like.

The current approach

There are a number of areas where the current approach worries me. The change was made directly in the base Prelude, based on some agreement of high-level design decisions, and has already resulted in unexpected additions to the Foldable class. The Prelude will first be exposed to lots of users once GHC 7.10 ships, by which point iterating on feedback will be slow and difficult. The plan for the "beginner" version is to encourage beginners to use the haskell2010 package, which I don't think has been thought out (as someone who tried to use the haskell98 package for a short while, the exception change meant you couldn't use haskell98 and base in the same package, which is basically fatal).

The Haskell community changed Applicative to be a superclass of Monad. That change was simple and thoroughly thought out over a period of years. Warnings were added to GHC, packages were analysed, people experimented with what it would mean, and once decided the change took a number of GHC releases to fully roll out. I consider this change to have been done the right way. In contrast, the complete rewrite of the Prelude is happening far more quickly, and I would argue too quickly. By taking a bit more time hopefully we can come up with something even better.

Wednesday, October 01, 2014

Why Traversable/Foldable should not be in the Prelude

Summary: For GHC 7.10, Traversable and Foldable are going to be in the Prelude. I missed the original discussion, but I suspect it's a bad idea.

Types are how Haskell programmers communicate their intentions to each other. Currently, the Haskell Prelude contains:

mapM :: Monad m => (a -> m b) -> [a] -> m [b]

As of GHC 7.10, as part of something known as the Burning Bridges Proposal (ticket, discussion, I can't actually find a full proposal...), that will become:

mapM :: (Traversable t, Monad m) => (a -> m b) -> t a -> m (t b)

Surely that's a good thing? Aren't more general types always better? Isn't the Prelude an archaic beast from the time before? I'd argue functions which are highly polymorphic are hard to use, and hard to think about, especially for beginners. I'd also argue the Prelude is remarkably well designed, not perfect, but quite an impressive feat.
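
To make the comparison concrete, here is a small illustration (readLines and readAny are made-up names):

import Data.Traversable (Traversable, traverse)

-- with the list-specific mapM the structure is pinned down: the reader knows
-- the input and output are lists
readLines :: [FilePath] -> IO [String]
readLines = mapM readFile

-- with the generalised version the same code also works on Maybe, Map and so on,
-- but the signature no longer tells you the container is a list
readAny :: Traversable t => t FilePath -> IO (t String)
readAny = traverse readFile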

What makes a type signature complex?

I've been thinking recently about what makes type signatures complex, both to practitioners, and to relative beginners. My rough metric is:

  • Fully concrete types are usually simple, as long as they aren't too long. The longer a type gets, the more complex it gets.
  • Types with functions in them aren't too bad (order-1 types), but as you go up to order-2 types things start to get more complex.
  • Fully polymorphic functions can be simpler than concrete functions, since they declare what you don't need to worry about.
  • Functions with type classes are more complex, since you need to read the type signature while looking at the context, and need to know each class being used.
  • Simple type classes (Eq, Show) aren't too bad, but custom type classes impose more of a burden.
  • As you add more type classes, the complexity grows faster than linearly. Three type classes are not three times as complex as one, but quite a bit harder.
  • Higher kinded type classes are significantly more complex than kind * type classes, e.g. Monad, Functor. The reason is that instead of having a hole you fill in, you now have a hole which itself has a hole.
  • The higher-kinded type classes Monad and Functor aren't as bad as the others, since Functor is really the "simplest" higher-kinded type class, and Monad is required knowledge for IO.
  • As you have more higher kinded type classes, the complexity burden grows even worse than for kind * type classes. Two is significantly more complex than one.

By that metric, the old mapM isn't too bad, but the new mapM is quite complex. It has two higher-kinded type classes, and one of them is not one of the common ones. I appreciate that making Foldable and Traversable key to Haskell will probably lead to them being more used, but now all beginners are going to have to wade through the Monad tutorial, the Foldable tutorial and the Traversable tutorial before they start programming (or just give up).

Why generality hurts

There are two main reasons why generality hurts:

Reading type signatures becomes difficult/impossible. We already have that problem with the Control.Arrow module, which (as far as most people use it) is just a pile of tuple combinators. But unlike other tuple combinators, these are ones whose type signature can't be understood. When I want to use &&& or *** I just pick randomly, see if it type checks, then try again. When other people I know want to use these functions they just use an explicit lambda. No one thinks of referring to the documentation, since the documentation presents a unification problem (which most of the people I know could solve), not an intuition.
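
As a small illustration of the sort of thing I mean (minMax is a made-up example):

import Control.Arrow ((&&&))

-- using the arrow combinator: to see that this builds a pair you have to
-- unify the very general type of (&&&) with the function instance in your head
minMax1 :: [Int] -> (Int, Int)
minMax1 = minimum &&& maximum

-- using an explicit lambda: the tuple shape is immediately visible
minMax2 :: [Int] -> (Int, Int)
minMax2 = \xs -> (minimum xs, maximum xs)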

Reading code becomes difficult. Haskell is brilliant for letting you write a composable pipeline of code that takes some input, does some processing, and produces some output. But that only works if you have enough concrete pieces in each function to read each piece in isolation. As an example:

test = foo . mapM baz . bar

Using the current mapM definition I can, in a fraction of a second, know the approximate shape of what foo consumes, and what bar produces. With the new mapM I don't, and have to keep more context in my head to reason about the code.

Who it hurts

Generality of this nature tends to hurt two types of people:

Beginners are hurt because they need to know more concepts just to get going. As a beginner I read through Data.List regularly to build up weapons in my arsenal to attack larger problems. The new Data.List will be generalised, and reading it won't give the insights I enjoyed. Maybe the beginner can instantiate all Foldable things to [], but that adds a mental burden to exactly those people who can bear it least.

Practitioners, those who are paid to code for a living, will have greater problems with maintenance. This isn't an unsubstantiated guess... I have taken over a project which made extensive use of the generalised traverse and sequence functions. Yes, the code was concise, but it was read-only, and even then, required me to "trust" that the compiler and libraries snapped together properly.

Who it benefits

The benefit probably comes from those who are already using the Applicative/Traversable classes regularly. For these people, they can probably avoid an import Prelude(). I am not against ever changing the Prelude, but I do think that for changes of this magnitude the ideas should probably be prototyped as a separate package, widely accepted, and only then should significant surgery be attempted on the Prelude. The classy-prelude work has gone in that direction, and I wish them luck, but the significant changes they've already iterated through suggest the design space is quite large.

Concluding remarks

I realise that I got to this discussion late, perhaps too late to expect my viewpoint to count. But I'd like to leave by reproducing Henning Thielemann's email on the subject:

David Luposchainsky wrote:

+1. I think the Prelude should be a general module of the most commonly
needed functions, which (generalized) folds and traversals are certainly
part of; right now it feels more like a beginner module at times.

It is certainly a kind of beginner module, but that's good. Experts know
how to import. Putting the most general functions into Prelude does not
work because:

1. There are often multiple sensible generalizations of a Prelude
function.

2. You have to add more type annotations since types cannot be inferred
from the functions.

There is simply no need to change Prelude and all packages that rely on
specific types. Just don't be lazy and import the stuff you need!

I should change my vote to:

-10