tag:blogger.com,1999:blog-12286980152665477372024-03-13T06:52:16.201-07:00Computational ThoughtsThe ramblings of a passionate computer scientistJosefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.comBlogger6125tag:blogger.com,1999:blog-1228698015266547737.post-36912630723853660832010-10-01T12:47:00.000-07:002010-10-01T12:51:29.395-07:00Powers of minus one and some bit twiddling<p>Since the start of this year I'm employed as a postdoc on a project for designing a new domain specific language for writing dsp algorithms. The language is called Feldspar and if you want to check it out you can look at the <a href="http://feldspar.inf.elte.hu/feldspar/">official homepage</a> or download it from <a href="http://hackage.haskell.org/package/feldspar-language">hackage</a>. As you can see on these pages Feldspar is a collaboration between <a href="http://www.chalmers.se/cse">Chalmers</a> (where I'm employed), <a href="http://www.ericsson.com">Ericsson</a> and <a href="http://www.inf.elte.hu">Elte University in Budapest</a>.</p>
<p>There are two main goals for the design of Feldspar: first of all it should be possible to program at a very high level, close to how dsp algorithms are normally specified. Secondly, the generated code needs to be very efficient as the kind of applications that Ericsson has for dsp applications are performance critical.</p>
<p>I'm involved in various parts of Feldspar but one particular thing on my plate is making sure that the generated code is fast.</p>
<p>Recently the language has been used in a pilot project within Ericsson to implement part of the 3gpp standard (a mobile broadband standard, unsurprisingly, given Ericssons involvement). Having people using the language is tremendously useful for us language implementors as we really get a chance to see how well the language works in its intended environment.</p>
<p>The Feldspar code that was written in this pilot project was kept very close to the standard and was very similar to the mathematical specification of the algorithms. It's a very nice feature of Feldspar that this is possible, but it poses a challenge for us language implementers. While eyeballing some of the code I notice a little piece of code that I would like to discuss a little:</p>
<pre><code>(-1) ^ v30</code></pre>
<p>Just to clarify, in Feldspar this means minus one to the power of <code>v30</code> and it has nothing to do with xor. <code>v30</code> is just a variable name.</p>
<p>Powers of minus one is a common idiom in mathematics for saying that a value should change sign. If the exponent is even the result will be positive and if the exponent is odd the result is negative. This kind of thing is useful in various places and apparently also in the 3gpp standard.</p>
<p>However, actually performing exponentiation would in this case be ridiculously inefficient. But the question is what kind of code we should generate for this? Currently our compiler generates C so the code examples I will show from here on will be written in C.</p>
<p>Remember what I wrote above that the result only depends on whether the exponent is even or not. That is very easy to check, it's just the least significant bit! So we might generate the following code (assuming that the exponent is <code>v30</code> as in the example above):</p>
<pre><code>v30 & 1 ? -1 : 1;</code></pre>
<p>This is very short and nice and most likely as good as we can hope for, at least for the kind of processors we are targeting.</p>
<p>However, I started programming in the 80's and I still instinctively flinch when I see branches in performance critical code. So I got curious to see if I could write the above as straight line code.</p>
<p>A straight line code solution which computes the above function would most likely involve some bit twiddling. I spent some time trying to come up with a solution on my own but wasn't very happy with what I managed to produce. I was aiming for a three instruction solution and my solutions were nowhere near that. So I decided to look around, maybe someone else had solved the problem before me.</p>
<p><a href="http://graphics.stanford.edu/~seander/bithacks.html">Bit Twiddling Hacks</a> to the rescue! This wonderful list of various bit twiddling tricks doesn't have anything which solves my particular problem but there is one little nugget which is close enough that I could make good use of it: <a href="http://graphics.stanford.edu/~seander/bithacks.html#ConditionalNegate">"Conditionally negate a value without branching"</a>. The value that we will be negating is <code>1</code>, we want to negate it depending on the least significant bit of our input value (<code>v30</code> in our example above). I will not reproduce the code from Bit Twiddling Hacks, you can check it out yourself via the link. Instead I'm just going to present the final result of using that code on my particular problem (using C code):</p>
<pre><code>int v; //The exponent
int r; //Will contain -1 to the power of
int isOdd = v & 1; //Is the exponent odd?
r = (1 ^ (-isOdd)) + isOdd;
</code></pre>
<p>This is a fairly clever piece of code and I'm quite happy with it. However, it will compile to four instructions on typical architectures and I was really hoping for a three instruction solution. Anyone out there who knows of a shorter solution?</p>Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com5tag:blogger.com,1999:blog-1228698015266547737.post-13159078472280167172010-07-26T09:18:00.000-07:002010-07-26T09:31:36.421-07:00Using Stow<h1 id="stow">Stow</h1>
<p>Sometimes I've found myself in the position where I would like to have access to several versions of the same programs. Maybe not all at once but at least a convenient way to switch between versions. Especially as a developer this need comes up every now and then, for instance trying to compile a piece of source code with several different versions of a compiler.</p>
<p>In the past when I wanted to try out an alternative version of a program I installed it in a temporary directory and ran it from there. It works but it's not terribly convenient. One has to be very careful to always specify the correct path, or chaos ensues.</p>
<p>So a while ago I started looking for a saner solution. This is where I came upon <a href="http://www.gnu.org/software/stow/"><code>stow</code></a>,
a little command line tool for switching between different versions of the same program. I've found it quite useful, it's simple but does it the job.</p>
<p>However, I use <code>stow</code> with very irregular intervals and sometimes I forget some detail about how it works. The natural thing then is to turn to the <code>stow</code> manual. Unfortunately, the <code>stow</code> manual is not the most helpful of documents. It describes in some detail what <code>stow</code> does to the file system when it is invoked. While this is good to know if you want to implement your own version of <code>stow</code> it leaves you to infer yourself how to invoke it if you're an ordinary user.</p>
<p>Therefor I've written down some notes on how <code>stow</code> works. These notes are mostly for myself but I've put them on this blog in the hope that others might find them useful. This short user manual is by no means complete, but it cover my own usecase and I think that's the one that most users care about.</p>
<h2 id="a-short-user-manual-for-stow">A short user manual for <code>stow</code></h2>
<h3 id="preliminaries">Preliminaries</h3>
<p>Before going in to how to use <code>stow</code> it is useful to know a little bit about how <code>stow</code> expects things to be organized.</p>
<p>To begin with I'm going to use a running example. Suppose you have a program called <code>frobnitz</code> and you would like to have several versions of it installed and be able to switch between them seamlessly.</p>
<p>Normally when programs like <code>frobnitz</code> are installed they end up somewhere under the directory <code>/usr/local</code>. There's where the shell looks for programs when you try to execute them on
the command line. All programs are stored together in one big lump there. But <code>stow</code> likes things differently. It expects a new directory <code>/usr/local/stow</code>. This is a directory which you have to create yourself. Each version of each program will have its own directory in the <code>/usr/local/stow</code> directory. So, if you have versions <code>1.2</code>, <code>1.3</code> and <code>1.4b</code> of <code>frobnitz</code> installed then they will be installed in directories <code>/usr/local/stow/frobnitz-1.2</code>, <code>/usr/local/stow/frobnitz-1.3</code> and <code>/usr/local/stow/frobnitz-1.4b</code> respectively.</p>
<h3 id="installing-programs-for-use-stow">Installing programs for use <code>stow</code></h3>
<p>The first step is to install a program that you which to have under <code>stow</code>'s control. Using <code>stow</code> to control the installation of your programs is not compatible with other package managers such as those that come with your distribution. So if you have a program which is already installed via your distribution you must first uninstall it.</p>
<p>Recall from above that <code>stow</code> expects each version of each program to be in a separate directory. Installing a program into its own directory like this is typically just a matter of setting the right prefix when configuring the software. Below is a fictional example of how to install version <code>1.3</code> of <code>frobnitz</code>.</p>
<pre class="bash"><code>./configure --prefix=/usr/local/stow/frobnitz-1.3
make
sudo make install
</code></pre>
<p>I've seen more complicated ways of commicating with both <code>configure</code> and <code>make</code> to tell them exactly where files are expected to exist. But the above method has worked without problems for me so I'm sticking with it.</p>
<h3 id="enabling-a-program">Enabling a program</h3>
<p>Installing a program in its own directory like I showed above means that it is not automatically available from the command line and other things like man pages will not show up either. So to enable the program we have to tell <code>stow</code> that we want to use it. This is done by cd:ing to the <code>stow</code> directory and invoking <code>stow</code> with the program that we want to enable.</p>
<p>Here's how to enable version <code>1.2</code> of the frobnitz software (assuming it's already installed).</p>
<pre class="bash"><code>cd /usr/local/stow
sudo stow frobnitz-1.2
</code></pre><p>After executing the above commands you will have access to the program <code>frobnitz</code> and you will be using version <code>1.2</code>.</p>
<h3 id="switching-programs">Switching programs</h3>
<p>The whole purpose of using <code>stow</code> is so that we can switch between different versions when we want to. Here's how to do that.</p>
<p>Following our running example, suppose you have version <code>1.2</code> of the program frobnitz enabled and you would like to switch to version <code>1.4b</code>. This is done by first disabling <code>1.2</code>. This is done with the delete command in <code>stow</code>. It might seem a little scary to invoke a command called delete but <code>stow</code> will never destroy things for you. After disabling version <code>1.2</code> it's a simple matter to enable <code>1.4b</code> afterwards.</p>
<p>Here's how you switching programs is done:</p>
<pre class="bash"><code>cd /usr/local/stow
sudo stow --delete frobnitz-1.2
sudo stow frobnitz-1.4b
</code></pre>
<h2 id="wrapping-up">Wrapping up</h2>
<p>It's not hard to use <code>stow</code> once you know the basic use cases. I hope you've found this little mini-guide useful.</p>
<p>There is at least one other tool which helps having multiple versions of the same program installed at once. The package manager <a href="http://nixos.org/">nix</a> can be tricked into doing this but last time I checked it involved to edit some configuration files and I prefer a tool which is a little more user friendly. It is also overkill if all you're after is the functionality that <code>stow</code> offers. That being said, nix looks really nice and I've been meaning to try it out. Any decade now.</p>Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com1tag:blogger.com,1999:blog-1228698015266547737.post-43725670539517192922009-02-15T05:48:00.000-08:002009-02-15T05:56:48.603-08:00A comment on Parameterized MonadsThis post was really intended to be just a comment to another blog post, but since I wanted to include some code it turned out to be better to write a little post myself.
sigfpe has a nice post about <a href="http://blog.sigfpe.com/2009/02/beyond-monads.html">Parameterized Monads</a>, which pop up naturally ones you've done a bit of monadic programming. He argues that the ordinary Monad concept we have in Haskell is the wrong one, Paramterized Monads fall out so naturally that they ought to be the default. He then goes on to say "At the very least, do-notation needs to be adapted to support ParameterisedMonad." But GHC already supports do-notation for Parameterized Monads! Here's a proof:
<pre>
> {-# LANGUAGE NoImplicitPrelude #-}
> module PMonad where
> import Prelude (fromInteger,(+),(*),show)
> class PMonad m where
> return :: a -> m s s a
> (>>=) :: m s1 s a -> (a -> m s s2 b) -> m s1 s2 b
> (>>) :: m s1 s2 a -> m s2 s3 b -> m s1 s3 b
> fail :: m s1 s2 a
> data State s1 s2 a = State { runState :: s1 -> (a,s2) }
> instance PMonad State where
> return a = State (\s -> (a,s))
> f >>= m = State (\s1 ->
> case runState f s1 of
> (a,s2) -> runState (m a) s2)
> m1 >> m2 = m1 >>= \_ -> m2
> fail = fail
> get = State (\s -> (s,s))
> put s = State (\_ -> ((),s))
> test1 = do x <- return 1
> y <- return 2
> z <- get
> put (x+y*z)
> return z
> go1 = runState test1 10
> test2 = do x <- return 1
> y <- return 2
> z <- get
> put (show (x+y*z))
> return z
> go2 = runState test2 10
</pre>
<pre>
[1 of 1] Compiling PMonad ( PMonad.hs, interpreted )
Ok, modules loaded: PMonad.
*PMonad> go2
(10,"21")
</pre>
And there you go. It's the NoImplicitPrelude that does the trick.Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com4tag:blogger.com,1999:blog-1228698015266547737.post-61207653998803530772008-03-12T08:30:00.000-07:002008-03-12T08:46:01.383-07:00Some examples of Software Transactional Memory in HaskellI've just finished teaching a course on Concurrent Programming. It's been fun. It's an increasingly important subject as software becomes more complex and processors get more cores. Most of the course has focused on lock-based shared memory synchronization. It's a tricky subject, you have to be very careful to avoid all the possibilities for deadlock. Fairness is also a tricky issue.
<p>
In the course I went through a couple of language constructs for
synchronization, for example semaphores, monitors with condition
variables and message passing. But at the end of the course I had time
for one extra lecture and so I thought it would be nice to talk about
transactional memory, which is the new hot language construct for
simplifying concurrent programming. Sure, it can be implemented as a
library as well, and that's how it's made in most languages, but that's
a very fragile solution.
<p>
So, despite the fact that the languages used in the course are Java,
<a href="http://www.cs.ucdavis.edu/~olsson/research/jr/">JR</a> and
Erlang I chose to use Haskell in the lecture, simply because GHC
provides better support for software transactional memory (STM) than
any other system that I know of. I think it worked well, the kind of
Haskell I used in the lectures is pretty imperative looking, simply
because STM in Haskell uses monads, and so the code doesn't look to
foreign even for someone not used to functional programming.
<p>
In this post I thought I'd share some snippets of code that I wrote
for the lecture. They're all just standard examples in concurrent
programming. Most of them look rather unimpressive but that's the
whole point: programming with transactional memory is such a relief
compared to fiddling with lock-based solutions.
<p>
This post is a literate Haskell file. Just copy-n-paste it into a file
and you can try it out yourself.
<pre><span class='varop'>></span> <span class='keyword'>module</span> <span class='conid'>STM</span> <span class='keyword'>where</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='keyword'>import</span> <span class='conid'>Random</span>
<span class='varop'>></span> <span class='keyword'>import</span> <span class='conid'>Control</span><span class='varop'>.</span><span class='conid'>Monad</span>
<span class='varop'>></span> <span class='keyword'>import</span> <span class='conid'>Control</span><span class='varop'>.</span><span class='conid'>Concurrent</span>
<span class='varop'>></span> <span class='keyword'>import</span> <span class='conid'>Control</span><span class='varop'>.</span><span class='conid'>Concurrent</span><span class='varop'>.</span><span class='conid'>STM</span>
</pre>
Let's start with something easy: binary semaphores. Or locks as
they're also called. I've used the traditional names p and v here for
acquiring and releasing the lock. They're really awful names though.
<pre><span class='varop'>></span> <span class='keyword'>type</span> <span class='conid'>Semaphore</span> <span class='keyglyph'>=</span> <span class='conid'>TVar</span> <span class='conid'>Bool</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>newSem</span> <span class='keyglyph'>::</span> <span class='conid'>Bool</span> <span class='keyglyph'>-></span> <span class='conid'>IO</span> <span class='conid'>Semaphore</span>
<span class='varop'>></span> <span class='varid'>newSem</span> <span class='varid'>available</span> <span class='keyglyph'>=</span> <span class='varid'>newTVarIO</span> <span class='varid'>available</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>p</span> <span class='keyglyph'>::</span> <span class='conid'>Semaphore</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>p</span> <span class='varid'>sem</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>b</span> <span class='keyglyph'><-</span> <span class='varid'>readTVar</span> <span class='varid'>sem</span>
<span class='varop'>></span> <span class='keyword'>if</span> <span class='varid'>b</span>
<span class='varop'>></span> <span class='keyword'>then</span> <span class='varid'>writeTVar</span> <span class='varid'>sem</span> <span class='conid'>False</span>
<span class='varop'>></span> <span class='keyword'>else</span> <span class='varid'>retry</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>v</span> <span class='keyglyph'>::</span> <span class='conid'>Semaphore</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>v</span> <span class='varid'>sem</span> <span class='keyglyph'>=</span> <span class='varid'>writeTVar</span> <span class='varid'>sem</span> <span class='conid'>True</span>
</pre>
The most impressive thing with the transactional memory implementation
in GHC is in my opinion the <kbd>retry</kbd> combinator. It handles
all conditional synchronization by complete magic. Sure, I happen to
know how it works under the hood but the simplicity of the API is just
marvellous. Whenever we have a condition that is not fulfilled, in the
above code the condition is that the lock happens to be already taken,
we just call <kbd>retry</kbd>. The runtime system will take care of
waking up the process at the appropriate time. No messing around with
condition variables or anything, like with monitors. It just couldn't
be simpler. We'll see more of this in the examples that follow.
<p>
A slightly more interesting example: an unbounded buffer which
processes can insert stuff into and then take out. It's written with
clarity in mind, not efficiency. It's easy to make it more efficient
but I didn't want to introduce more data types to the students so I
used a naive list implementation. I leave an efficient implementation
as an exercise to the reader.
<pre><span class='varop'>></span> <span class='keyword'>type</span> <span class='conid'>Buffer</span> <span class='varid'>a</span> <span class='keyglyph'>=</span> <span class='conid'>TVar</span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>newBuffer</span> <span class='keyglyph'>::</span> <span class='conid'>IO</span> <span class='layout'>(</span><span class='conid'>Buffer</span> <span class='varid'>a</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>newBuffer</span> <span class='keyglyph'>=</span> <span class='varid'>newTVarIO</span> <span class='conid'>[]</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>put</span> <span class='keyglyph'>::</span> <span class='conid'>Buffer</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>put</span> <span class='varid'>buffer</span> <span class='varid'>item</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>ls</span> <span class='keyglyph'><-</span> <span class='varid'>readTVar</span> <span class='varid'>buffer</span>
<span class='varop'>></span> <span class='varid'>writeTVar</span> <span class='varid'>buffer</span> <span class='layout'>(</span><span class='varid'>ls</span> <span class='varop'>++</span> <span class='keyglyph'>[</span><span class='varid'>item</span><span class='keyglyph'>]</span><span class='layout'>)</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>get</span> <span class='keyglyph'>::</span> <span class='conid'>Buffer</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='varid'>a</span>
<span class='varop'>></span> <span class='varid'>get</span> <span class='varid'>buffer</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>ls</span> <span class='keyglyph'><-</span> <span class='varid'>readTVar</span> <span class='varid'>buffer</span>
<span class='varop'>></span> <span class='keyword'>case</span> <span class='varid'>ls</span> <span class='keyword'>of</span>
<span class='varop'>></span> <span class='conid'>[]</span> <span class='keyglyph'>-></span> <span class='varid'>retry</span>
<span class='varop'>></span> <span class='layout'>(</span><span class='varid'>item</span><span class='conop'>:</span><span class='varid'>rest</span><span class='layout'>)</span> <span class='keyglyph'>-></span> <span class='keyword'>do</span> <span class='varid'>writeTVar</span> <span class='varid'>buffer</span> <span class='varid'>rest</span>
<span class='varop'>></span> <span class='varid'>return</span> <span class='varid'>item</span>
</pre>
Again the code is so clear and easy to understand that it makes you
wonder how is could be any different. It should be said though that
this implementation gives priority to processes writing to the
buffer. If there is a single process writing to the buffer he can make
an arbitrary amount of readers retry. A lock-based implementation
provides mutual exclusion so only one process at a time has access to
the buffer. In such an implementation processes get access in the same
order as they requested it so writers do not have priority over
readers. But that implementation of course has all the usual problems
that a lock-based implementation has.
<p>
One standard pattern that I used a lot in the course was resource
allocation. I used a distilled version where there is a counter
keeping track of the amount of resources available. Processes may ask
for an arbitrary amount of resources and block if they're not
available. This interface has the advantage that it's resource
agnostic and can be implemented in a language like Java. I'm sure
there are nicer ways to do it in Haskell, depending on the resource.
<pre><span class='varop'>></span> <span class='keyword'>type</span> <span class='conid'>Resource</span> <span class='keyglyph'>=</span> <span class='conid'>TVar</span> <span class='conid'>Int</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>acquire</span> <span class='keyglyph'>::</span> <span class='conid'>Resource</span> <span class='keyglyph'>-></span> <span class='conid'>Int</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>acquire</span> <span class='varid'>res</span> <span class='varid'>nr</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>n</span> <span class='keyglyph'><-</span> <span class='varid'>readTVar</span> <span class='varid'>res</span>
<span class='varop'>></span> <span class='keyword'>if</span> <span class='varid'>n</span> <span class='varop'><</span> <span class='varid'>nr</span>
<span class='varop'>></span> <span class='keyword'>then</span> <span class='varid'>retry</span>
<span class='varop'>></span> <span class='keyword'>else</span> <span class='varid'>writeTVar</span> <span class='varid'>res</span> <span class='layout'>(</span><span class='varid'>n</span> <span class='comment'>-</span> <span class='varid'>nr</span><span class='layout'>)</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>release</span> <span class='keyglyph'>::</span> <span class='conid'>Resource</span> <span class='keyglyph'>-></span> <span class='conid'>Int</span> <span class='keyglyph'>-></span> <span class='conid'>STM</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>release</span> <span class='varid'>res</span> <span class='varid'>nr</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>n</span> <span class='keyglyph'><-</span> <span class='varid'>readTVar</span> <span class='varid'>res</span>
<span class='varop'>></span> <span class='varid'>writeTVar</span> <span class='varid'>res</span> <span class='layout'>(</span><span class='varid'>n</span> <span class='varop'>+</span> <span class='varid'>nr</span><span class='layout'>)</span>
</pre>
As a final example I will give an implementation of the dining
philosophers. It's one of the classic problems in concurrent
programming and so it's interesting to see how STM performs.
<p>
To make this program interesting we have to output what the
philosophers are doing so that we can verify that the simulation is
correct. But the standard I/O primitives in Haskell are not thread
safe so we have to take a bit of care when writing stuff to standard
out. The way I've solved it is to have a buffer which all the
philosopher processes writes to and then I have a separate thread
which reads from the buffer and writes to the screen. That way only
one process does the outputting and there is no risk of weird output.
<p>
I'm not going to go through the exact formulation of the dining
philosophers problem. If you're unsure how it goes I suggest you read
<a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem">the
wikipedia article</a> on it. I've chosen to implement the forks as the
binary semaphores that we defined above.
<pre><span class='varop'>></span> <span class='varid'>simulation</span> <span class='varid'>n</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>forks</span> <span class='keyglyph'><-</span> <span class='varid'>replicateM</span> <span class='varid'>n</span> <span class='layout'>(</span><span class='varid'>newSem</span> <span class='conid'>True</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>outputBuffer</span> <span class='keyglyph'><-</span> <span class='varid'>newBuffer</span>
<span class='varop'>></span> <span class='varid'>for</span> <span class='keyglyph'>[</span><span class='num'>0</span><span class='keyglyph'>..</span><span class='varid'>n</span><span class='comment'>-</span><span class='num'>1</span><span class='keyglyph'>]</span> <span class='varop'>$</span> <span class='keyglyph'>\</span><span class='varid'>i</span> <span class='keyglyph'>-></span>
<span class='varop'>></span> <span class='varid'>forkIO</span> <span class='layout'>(</span><span class='varid'>philosopher</span> <span class='varid'>i</span> <span class='varid'>outputBuffer</span>
<span class='varop'>></span> <span class='layout'>(</span><span class='varid'>forks</span><span class='varop'>!!</span><span class='varid'>i</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='layout'>(</span><span class='varid'>forks</span><span class='varop'>!!</span><span class='layout'>(</span><span class='layout'>(</span><span class='varid'>i</span><span class='varop'>+</span><span class='num'>1</span><span class='layout'>)</span><span class='varop'>`mod`</span><span class='varid'>n</span><span class='layout'>)</span><span class='layout'>)</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>output</span> <span class='varid'>outputBuffer</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>output</span> <span class='varid'>buffer</span> <span class='keyglyph'>=</span>
<span class='varop'>></span> <span class='keyword'>do</span> <span class='varid'>str</span> <span class='keyglyph'><-</span> <span class='varid'>atomically</span> <span class='varop'>$</span> <span class='varid'>get</span> <span class='varid'>buffer</span>
<span class='varop'>></span> <span class='varid'>putStrLn</span> <span class='varid'>str</span>
<span class='varop'>></span> <span class='varid'>output</span> <span class='varid'>buffer</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>for</span> <span class='keyglyph'>=</span> <span class='varid'>flip</span> <span class='varid'>mapM_</span>
</pre>
This first bit of code just sets everything up for the simulation. The
function <kbd>simulation</kbd> take the number of philosophers as an
argument. Then it creates the required number of forks, the output
buffer and spawns off all the philosopher processes which are given
their corresponding forks. Finally the main thread goes into a loop
which reads of the strings from the output buffer and prints them.
<pre><span class='varop'>></span> <span class='varid'>philosopher</span> <span class='keyglyph'>::</span> <span class='conid'>Int</span> <span class='keyglyph'>-></span> <span class='conid'>Buffer</span> <span class='conid'>String</span> <span class='keyglyph'>-></span> <span class='conid'>Semaphore</span> <span class='keyglyph'>-></span> <span class='conid'>Semaphore</span> <span class='keyglyph'>-></span> <span class='conid'>IO</span> <span class='conid'>()</span>
<span class='varop'>></span> <span class='varid'>philosopher</span> <span class='varid'>n</span> <span class='varid'>out</span> <span class='varid'>fork1</span> <span class='varid'>fork2</span> <span class='keyglyph'>=</span>
<span class='varop'>></span> <span class='keyword'>do</span> <span class='varid'>atomically</span> <span class='varop'>$</span> <span class='varid'>put</span> <span class='varid'>out</span> <span class='layout'>(</span><span class='str'>"Philosopher "</span> <span class='varop'>++</span> <span class='varid'>show</span> <span class='varid'>n</span> <span class='varop'>++</span> <span class='str'>" is thinking."</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>randomDelay</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>atomically</span> <span class='varop'>$</span> <span class='keyword'>do</span>
<span class='varop'>></span> <span class='varid'>p</span> <span class='varid'>fork1</span>
<span class='varop'>></span> <span class='varid'>p</span> <span class='varid'>fork2</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>atomically</span> <span class='varop'>$</span> <span class='varid'>put</span> <span class='varid'>out</span> <span class='layout'>(</span><span class='str'>"Philosopher "</span> <span class='varop'>++</span> <span class='varid'>show</span> <span class='varid'>n</span> <span class='varop'>++</span> <span class='str'>" is eating."</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>randomDelay</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>atomically</span> <span class='varop'>$</span> <span class='keyword'>do</span>
<span class='varop'>></span> <span class='varid'>v</span> <span class='varid'>fork1</span>
<span class='varop'>></span> <span class='varid'>v</span> <span class='varid'>fork2</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>philosopher</span> <span class='varid'>n</span> <span class='varid'>out</span> <span class='varid'>fork1</span> <span class='varid'>fork2</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>randomDelay</span> <span class='keyglyph'>=</span> <span class='keyword'>do</span> <span class='varid'>r</span> <span class='keyglyph'><-</span> <span class='varid'>randomRIO</span> <span class='layout'>(</span><span class='num'>100000</span><span class='layout'>,</span><span class='num'>500000</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>threadDelay</span> <span class='varid'>r</span>
</pre>
Again, this implementation is so simple that you wonder what the
problem was in the first place. The power comes from being able to
compose several transactions together sequentially and perform them
atomically.
<p>
Note that the philosophers execute in the IO monad and only invoke the
transactional memory bit when it has to do synchronization. This is
the typical usage of transactional memory, something we didn't see
with the earlier examples.
<p>
There's one thing that's not so nice with GHC's implementation of
STM though. You can see it for yourself if you change the call to
<kbd>randomDelay</kbd> to a call to <kbd>threadDelay</kbd> with a
fixed time. What will happen is that one of the philosophers will
starve. I will not say more about that here though, it's the topic of
another blog post.
<p>
That's all for this post. I hope you find the examples useful.Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com6tag:blogger.com,1999:blog-1228698015266547737.post-5803758687149474622007-08-13T08:28:00.000-07:002008-01-09T10:03:01.250-08:00Fun Functions: FlattenThis post is going to be about a function which may not seem like much, it takes a tree and returns the elements of the tree in a list the way an inorder traversal would visit them. While this is not an uncommon task in programming it is so mundane that it hardly deserves a blog post on its own. So why am I writing this? The thing that have captured me is a particular implementation of this function.
Let us first set the scene.
This blog post is a literate Haskell file and you should be able so save it to a file and run it yourself.
<pre><span class='varop'>></span> <span class='keyword'>module</span> <span class='conid'>Flatten</span> <span class='keyword'>where</span>
<span class='varop'>></span> <span class='keyword'>import</span> <span class='conid'>Test</span><span class='varop'>.</span><span class='conid'>QuickCheck</span>
<span class='varop'>></span> <span class='keyword'>data</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>=</span> <span class='conid'>Node</span> <span class='layout'>(</span><span class='conid'>Tree</span> <span class='varid'>a</span><span class='layout'>)</span> <span class='varid'>a</span> <span class='layout'>(</span><span class='conid'>Tree</span> <span class='varid'>a</span><span class='layout'>)</span> <span class='keyglyph'>|</span> <span class='conid'>Leaf</span> <span class='keyword'>deriving</span> <span class='conid'>Show</span>
</pre>
So we have a data type for trees, it's the standard one. Nothing surprising here. The scene is set.
How would we go about implementing flatten then? There are in fact a whole number of ways to implement it and I'm going to show several of them. The last version will be my favorite one and after having seen all the other versions I hope you will appreciate it as I do.
Let us then start with something rather naive. A straightforward recursive function which turns a tree into a list:
<pre><span class='varop'>></span> <span class='varid'>flatten1</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>flatten1</span> <span class='conid'>Leaf</span> <span class='keyglyph'>=</span> <span class='conid'>[]</span>
<span class='varop'>></span> <span class='varid'>flatten1</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='varid'>t2</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>flatten1</span> <span class='varid'>t1</span> <span class='varop'>++</span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span> <span class='varop'>++</span> <span class='varid'>flatten1</span> <span class='varid'>t2</span>
</pre>
That's very concise and easy to understand. But this solution has a slight problem. It takes quadratic time. This is due to the fact that <kbd>++</kbd> take linear time in it's left argument. It is a well known problem with many solutions. The standard solution, due to John Hughes, is to represent lists as functions from lists to lists. This way we get constant time concatenation.
Incidentally, some people seem to think that this version is fine. It is used on the type level <a href="http://www.alpheccar.org/en/posts/show/88">by alpheccar in his blog</a>.
Implementing lists the Hughes way gives us our second variation of flatten.
<pre><span class='varop'>></span> <span class='varid'>flatten2</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>flatten2</span> <span class='varid'>tree</span> <span class='keyglyph'>=</span> <span class='varid'>help2</span> <span class='varid'>tree</span> <span class='conid'>[]</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>help2</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>help2</span> <span class='conid'>Leaf</span> <span class='keyglyph'>=</span> <span class='varid'>id</span>
<span class='varop'>></span> <span class='varid'>help2</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='varid'>t2</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>help2</span> <span class='varid'>t1</span> <span class='varop'>.</span> <span class='layout'>(</span><span class='varid'>a</span> <span class='conop'>:</span><span class='layout'>)</span> <span class='varop'>.</span> <span class='varid'>help2</span> <span class='varid'>t2</span>
</pre>
Hmmmm. I don't know what you think of this version but I don't like it. For two reasons. It uses higher order programming when it isn't necessary, and that just obfuscates things. Secondly, it needs a helper function. It would be nice if we could do without that.
Let's start by tackling the first problem of higher order programming. One way to remove the higher orderness is to give the function <kbd>help2</kbd> an extra argument (called eta expansion, if that means anything to you). This gives us the following version.
<pre><span class='varop'>></span> <span class='varid'>flatten3</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>flatten3</span> <span class='varid'>tree</span> <span class='keyglyph'>=</span> <span class='varid'>help3</span> <span class='varid'>tree</span> <span class='conid'>[]</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>help3</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>help3</span> <span class='conid'>Leaf</span> <span class='varid'>rest</span> <span class='keyglyph'>=</span> <span class='varid'>rest</span>
<span class='varop'>></span> <span class='varid'>help3</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='varid'>t2</span><span class='layout'>)</span> <span class='varid'>rest</span> <span class='keyglyph'>=</span> <span class='varid'>help3</span> <span class='varid'>t1</span> <span class='layout'>(</span><span class='varid'>a</span> <span class='conop'>:</span> <span class='varid'>help3</span> <span class='varid'>t2</span> <span class='varid'>rest</span><span class='layout'>)</span>
</pre>
I call the extra argument <kbd>rest</kbd> because it contains the rest of the elements that we should put in the list.
This solution still uses a helper function which is a bit of a nuisance but unfortunately I don't know an easy way to remove it. Instead I'm going to present another way to think about flattening that will take us down another road.
How would you go about implementing an inorder traversal with a while loop instead of recursion? The solution would be to use a stack to remember the parts of the tree that we haven't visited yet. We can use the same idea to write a new version of flatten, using a list as the
stack. The top of the stack will be the part of the tree that we are visiting at the moment. We will need a little trick to remember when we have visited a subtree and that is to simply remove it from its parent. The final code looks like this.
<pre><span class='varop'>></span> <span class='varid'>flatten4</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>flatten4</span> <span class='varid'>tree</span> <span class='keyglyph'>=</span> <span class='varid'>help4</span> <span class='keyglyph'>[</span><span class='varid'>tree</span><span class='keyglyph'>]</span>
<span class='varop'>></span>
<span class='varop'>></span> <span class='varid'>help4</span> <span class='keyglyph'>::</span> <span class='keyglyph'>[</span><span class='conid'>Tree</span> <span class='varid'>a</span><span class='keyglyph'>]</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>help4</span> <span class='conid'>[]</span> <span class='keyglyph'>=</span> <span class='conid'>[]</span>
<span class='varop'>></span> <span class='varid'>help4</span> <span class='layout'>(</span><span class='conid'>Leaf</span> <span class='conop'>:</span> <span class='varid'>ts</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>help4</span> <span class='varid'>ts</span>
<span class='varop'>></span> <span class='varid'>help4</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='conid'>Leaf</span> <span class='varid'>a</span> <span class='varid'>t</span> <span class='conop'>:</span> <span class='varid'>ts</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>a</span> <span class='conop'>:</span> <span class='varid'>help4</span> <span class='layout'>(</span><span class='varid'>t</span> <span class='conop'>:</span> <span class='varid'>ts</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>help4</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='varid'>t2</span> <span class='conop'>:</span> <span class='varid'>ts</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>help4</span> <span class='layout'>(</span><span class='varid'>t1</span> <span class='conop'>:</span> <span class='conid'>Node</span> <span class='conid'>Leaf</span> <span class='varid'>a</span> <span class='varid'>t2</span> <span class='conop'>:</span> <span class='varid'>ts</span><span class='layout'>)</span>
</pre>
This version is pretty clever. It takes apart the tree and spreads it out into the list until there is nothing left of the tree. But one might argue that this version is too clever. It is certainly not easy to understand from the code. Plus it uses a helper function, something that we have been trying to get away from.
The final version that I will show builds upon the idea that we can represent the list of trees from our previous version as simply a tree. We are going to use the input tree as the stack and modify it as we go along. It goes like this.
<pre><span class='varop'>></span> <span class='varid'>flatten5</span> <span class='keyglyph'>::</span> <span class='conid'>Tree</span> <span class='varid'>a</span> <span class='keyglyph'>-></span> <span class='keyglyph'>[</span><span class='varid'>a</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>flatten5</span> <span class='conid'>Leaf</span> <span class='keyglyph'>=</span> <span class='conid'>[]</span>
<span class='varop'>></span> <span class='varid'>flatten5</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='conid'>Leaf</span> <span class='varid'>a</span> <span class='varid'>t</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>a</span> <span class='conop'>:</span> <span class='varid'>flatten5</span> <span class='varid'>t</span>
<span class='varop'>></span> <span class='varid'>flatten5</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='varid'>t2</span><span class='layout'>)</span> <span class='varid'>b</span> <span class='varid'>t3</span><span class='layout'>)</span> <span class='keyglyph'>=</span> <span class='varid'>flatten5</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>a</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t2</span> <span class='varid'>b</span> <span class='varid'>t3</span><span class='layout'>)</span><span class='layout'>)</span>
</pre>
What happens in the third equation is that the tree is restructured so that it becomes a linear structure, just like a list. It is then simple to read off the element in the second equation.
This is my favorite version. It's short, without helper functions and no higher order stuff. Plus it is efficient. Now, you might complain that it does more allocations that some of the previous versions. Indeed we have to allocate new trees in the third case. But note that that case is tail recursive. That means that it doesn't use any stack. It is possible to implement this version of the function so that it doesn't any more space than any of the other versions
above. The control stack is encoded into the tree.
One additional thing about this version is that it's termination is not entirely obvious. This function will fool several termination checkers that I know of. In particular, in the third equation the argument tree in recursive call is not smaller than the input
tree. This is a criterion that many termination checker use. To understand that the function terminates you must instead see that it is the left subtree of the input that gets smaller together with the fact that trees are finite. To me this is an extra bonus. You might
think otherwise.
Finally, I have some code to test that all the version I've given above are equal.
<pre><span class='varop'>></span> <span class='keyword'>instance</span> <span class='conid'>Arbitrary</span> <span class='varid'>a</span> <span class='keyglyph'>=></span> <span class='conid'>Arbitrary</span> <span class='layout'>(</span><span class='conid'>Tree</span> <span class='varid'>a</span><span class='layout'>)</span> <span class='keyword'>where</span>
<span class='varop'>></span> <span class='varid'>arbitrary</span> <span class='keyglyph'>=</span> <span class='varid'>sized</span> <span class='varid'>arbTree</span>
<span class='varop'>></span> <span class='keyword'>where</span> <span class='varid'>arbTree</span> <span class='num'>0</span> <span class='keyglyph'>=</span> <span class='varid'>return</span> <span class='conid'>Leaf</span>
<span class='varop'>></span> <span class='varid'>arbTree</span> <span class='varid'>n</span> <span class='keyglyph'>|</span> <span class='varid'>n</span><span class='varop'>></span><span class='num'>0</span> <span class='keyglyph'>=</span> <span class='varid'>frequency</span> <span class='varop'>$</span>
<span class='varop'>></span> <span class='keyglyph'>[</span><span class='layout'>(</span><span class='num'>1</span><span class='layout'>,</span> <span class='varid'>return</span> <span class='conid'>Leaf</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='layout'>,</span><span class='layout'>(</span><span class='num'>4</span><span class='layout'>,</span> <span class='keyword'>do</span> <span class='varid'>elem</span> <span class='keyglyph'><-</span> <span class='varid'>arbitrary</span>
<span class='varop'>></span> <span class='varid'>t1</span> <span class='keyglyph'><-</span> <span class='varid'>arbTree</span> <span class='layout'>(</span><span class='varid'>n</span> <span class='varop'>`div`</span> <span class='num'>2</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>t2</span> <span class='keyglyph'><-</span> <span class='varid'>arbTree</span> <span class='layout'>(</span><span class='varid'>n</span> <span class='varop'>`div`</span> <span class='num'>2</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>return</span> <span class='layout'>(</span><span class='conid'>Node</span> <span class='varid'>t1</span> <span class='varid'>elem</span> <span class='varid'>t2</span><span class='layout'>)</span><span class='layout'>)</span><span class='keyglyph'>]</span>
<span class='varop'>></span> <span class='varid'>prop_flatten2</span> <span class='varid'>t</span> <span class='keyglyph'>=</span> <span class='varid'>flatten1</span> <span class='varid'>t</span> <span class='varop'>==</span> <span class='varid'>flatten2</span> <span class='layout'>(</span><span class='varid'>t</span> <span class='keyglyph'>::</span><span class='conid'>Tree</span> <span class='conid'>Int</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>prop_flatten3</span> <span class='varid'>t</span> <span class='keyglyph'>=</span> <span class='varid'>flatten1</span> <span class='varid'>t</span> <span class='varop'>==</span> <span class='varid'>flatten3</span> <span class='layout'>(</span><span class='varid'>t</span> <span class='keyglyph'>::</span><span class='conid'>Tree</span> <span class='conid'>Int</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>prop_flatten4</span> <span class='varid'>t</span> <span class='keyglyph'>=</span> <span class='varid'>flatten1</span> <span class='varid'>t</span> <span class='varop'>==</span> <span class='varid'>flatten4</span> <span class='layout'>(</span><span class='varid'>t</span> <span class='keyglyph'>::</span><span class='conid'>Tree</span> <span class='conid'>Int</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>prop_flatten5</span> <span class='varid'>t</span> <span class='keyglyph'>=</span> <span class='varid'>flatten1</span> <span class='varid'>t</span> <span class='varop'>==</span> <span class='varid'>flatten5</span> <span class='layout'>(</span><span class='varid'>t</span> <span class='keyglyph'>::</span><span class='conid'>Tree</span> <span class='conid'>Int</span><span class='layout'>)</span>
<span class='varop'>></span> <span class='varid'>runTests</span> <span class='keyglyph'>=</span> <span class='varid'>mapM_</span> <span class='varid'>quickCheck</span> <span class='keyglyph'>[</span><span class='varid'>prop_flatten2</span><span class='layout'>,</span><span class='varid'>prop_flatten3</span><span class='layout'>,</span>
<span class='varop'>></span> <span class='varid'>prop_flatten4</span><span class='layout'>,</span><span lass='varid'>prop_flatten5</span><span class='keyglyph'>]
</span><span class='varop'>></span> </pre>Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com0tag:blogger.com,1999:blog-1228698015266547737.post-73339834234018887202007-08-13T07:39:00.000-07:002007-11-27T08:43:08.995-08:00Learning Monads in 1996Given the <a href="http://www.haskell.org/haskellwiki/Monad_tutorials_timeline">plethora of monad tutorials</a> these days it's hard to imagine how anybody could have learned about this concept during the 90's. I learned most of what I know about monads in 1996, back in the days when even researchers had a hard time understanding the list monad. This post will not be another monad tutorial. I'm simply going to explain how I managed to become friendly with monads in a time where monad tutorials where very scarce.
So, why did I start looking into monads in the first place? In 1995 I started my undergraduate studies at Chalmers and the first language we were taught was Gofer, a now extinct dialect of Haskell. When the time came for teaching us I/O they didn't show us the monadic way of doing things. Apparently our teacher (who by the way was not a functional programmer, a strange fact considering that Chalmers abound with functional programmers) thought that the concept of monads was too scary and complicated and taught us continuation I/O, something which I experienced as painfully horrible and utterly confusing. But some of us students had heard about monads and how they were supposed to be they way of the future when it came to I/O and other things. They sounded really cool, and hey, they couldn't possibly be worse than continuation I/O right? So I set my mind to learning monads.
I tried to read everything that was available on the subject which wasn't much at the time. There were two papers which I spent a considerable time with: <a href="http://homepages.inf.ed.ac.uk/wadler/topics/monads.html">"The essence of functional programming" and "Monads for functional programming"</a> by Phil Wadler. I implemented every single bit of code in those papers, tore the code to pieces and put it back again. I started implementing monadic parser libraries based on the paper <a href="http://www.cs.nott.ac.uk/~gmh/bib.html#monparsing">"Monadic parser combinators"</a> by Hutton and Meijer. Trying out different implementations and combinators I wrote variation after variation of these libraries. I don't know how many of parser libraries I implemented in the end. I've lost count. Slowly I was starting to grasp monads as I played around with them.
After some time I was even able to digest and understand the paper <a href="http://web.cecs.pdx.edu/~mpj/pubs/modinterp.html">"Monad Transformers and Modular Interpreters"</a> but it certainly didn't happen the first time I read it. What finally pushed me over the edge was when a friend asked me how to combine different monads. I didn't know the answer and for some reason it kind of bugged me that I couldn't answer his question. But I had a vague recollection of ideas in the above paper. Determined to understand this, I sat down and started to hack on monad transformers and didn't finish until I finally had a rudimentary understanding of them, several hours later.
Monadic I/O turned out to be a breeze to learn thanks to the nice <a href="http://research.microsoft.com/~adg/Publications/details.htm#haskell95">"Monadic I/O in Haskell 1.3"</a> by Kevin Hammond and Andrew Gordon.
And there you have it. The way I learned about monads. No magic, just hard work. I'm a slow learner and you can probably get comfortable with monads much faster than I. But I don't think you'll find any method which is significantly different from mine. What makes me say that?
Why are monads such a difficult concept to learn? The conclusion I've come to is that the difficulty lie in the fact that it's an abstraction. Abstract things are simply hard, especially for people who have little experience with abstractions before. As an example take an abstraction like the concept of a group. What is a group? It's a set, call it G, and a binary operator on G which is associative and has inverses and an identity element. For anyone used to abstractions this is a perfectly fine definition. But not so for your average man on the street, or even programmer. They will react something like this: "Ok, I understand the definition. But what *is* a group? I mean, in *reality*?" I remember having similar thoughts myself. The fact that a group is not a concrete thing is a great confusion. In my experience this only settles after the person has gained some experience with lots of examples of what a group is, and a few examples of things that fail to be a group. By then most people will have gotten pretty comfortable with the concept.
It's the same thing with monads. They're an abstract concept which will take time to digest. And this digestion is best done by playing around with various examples of monads and occasionally meet examples of data types which are not monads and for what reason they fail. I don't think there are any shortcuts to learning abstract concepts. But I think that one can become better at it and it certainly helps to have a good tutorial when learning a subject, such as monads.Josefhttp://www.blogger.com/profile/13272830598221833253noreply@blogger.com3