The single assignment cargo cult
Erlang programmers are awfully vocal about the idea of single assignment: unlike virtually every other language on the planet where you are free to bind values to the same identifier over and over again, in Erlang you can do it once and only once. However, if you ask an Erlang programmer why multiple assignment is a bad thing you’re unlikely to get a straight answer:
"It is reason why once variable assignment is in Erlang and why it is good think. One advice for you: Don’t change well proved language until you worth know it."
"Are you saying Joe Armstrong is wrong?"
"If I have to explain to you why multiple assignment is bad, you obviously don’t know Erlang well enough"
I was rather relieved to see Damien Katz, one of the principal authors of CouchDB, expressing similar concerns about Erlang’s lack of multiple assignment in his blog post "What Sucks About Erlang":
In C, lets say you have some code:
int f(int x) {
    x = foo(x);
    x = bar(x);
    return baz(x);
}
And you want to add a new step in the function:
int f(int x) {
    x = foo(x);
    x = fab(x);
    x = bar(x);
    return baz(x);
}
Only one line needs editing,
Consider the Erlang equivalent:
f(X) ->
    X1 = foo(X),
    X2 = bar(X1),
    baz(X2).
Now you want to add a new step, which requires editing every variable thereafter:
f(X) ->
    X1 = foo(X),
    X2 = fab(X1),
    X3 = bar(X2),
    baz(X3).
Yariv Sadan, the creator of ErlyWeb and perhaps the most prominent Erlang blogger, agreed that he’s run into it before and that it can be annoying. In his response to Damien Katz, he suggested the following:
If you’re writing code like in Damien’s example and you want to be able to insert lines without changing a bunch of variable names, I have a tip: increment by 10. This will prevent the big cascading variable renamings in most situations. Instead of the original code, write
f(X) -> X10 = foo(X), X20 = bar(X10), baz(X20).
then change it as follows when inserting a new line in the middle:
f(X) -> X10 = foo(X), X15 = fab(X10), X20 = bar(X15), baz(X20).
Yes, I know, it’s not exactly beautiful, but in the rare cases where you need it, it’s a useful trick.

What is this? BASIC? Are we back to line numbers as a good solution to this problem? I have the utmost respect for Yariv and am using his excellent smerl library as part of Reia, but that’s a truly silly answer.
I’m going to try to go through some of the more common complaints against multiple assignment and see if I can address them…
Immutability and Multiple Assignment Are NOT a Boolean Decision
Perhaps one of the most common complaints against multiple assignment goes a little something like this:
Immutability is a key feature of a concurrent programming language
Multiple assignment breaks immutability
Therefore, multiple assignment does not belong in a concurrent programming language
Except there’s one thing wrong with that argument: multiple assignment and immutability are two completely different things. The single assignment cargo cult loves to conflate the two terms. I cannot tell you how many times I’ve seen someone toss out immutability as a red herring in arguments about multiple assignment. They are not the same thing and have absolutely nothing to do with each other.
To begin, let’s look at how Erlang’s creator Joe Armstrong explains how to convert a statement which uses multiple assignment into one that does not:
How can you express something like X = X + 1 in Erlang?
The answer is easy. Invent a new variable whose name hasn’t been used before (say X1), and write X1 = X + 1.
Simple as that! Each time we encounter the potential reassignment of a variable, just pick a new name and bind to that. At this point you might be wondering: couldn’t the compiler do that for you? Isn’t it easy to take something like:
X = 42
Y = 1
X = X + 1
Y = Y + X
and transform it into an equivalent set of code which uses single assignment? Of course it is:
X0 = 42
Y0 = 1
X1 = X0 + 1
Y1 = Y0 + X1
In the new form, each variable is assigned exactly once, with the original variables split into versions. The compiler can take the original code which uses multiple assignment and convert it to what’s known as static single assignment form.
There’s nothing magic about this. If the values the variables are bound to are immutable due to language constraints, they remain immutable. This is just a simple compile-time transformation which has zero effect on the program when it is running.
Repeat after me: immutability and multiple assignment have absolutely nothing to do with each other. In fact, SSA form has been proven equivalent to the Continuation-Passing Style that was a popular intermediate form for functional languages (functional programmers may appreciate this paper instead). SSA form is effectively an intermediate lambda calculus representation of languages which allow multiple assignment. In other words, there exists an intermediate functional representation for imperative programs.
That said, today I began writing a pass for the Reia compiler which converts the original Reia abstract syntax with multiple assignment into SSA form before compiling it to Erlang forms. In other words, I’m preprocessing a language with multiple assignment and compiling it to an equivalent Erlang program with single assignment.
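To make the renaming idea concrete, here is a minimal sketch of such a pass, written in Erlang. It is not Reia's actual implementation: the module name ssa_sketch, the tuple representation of statements ({assign, Name, Expr}) and the expression forms ({int, N}, {var, Name}, {add, L, R}) are all invented for illustration. It only handles straight-line code; a real SSA pass also has to merge versions where control flow joins, which this sketch ignores.

```erlang
-module(ssa_sketch).
-export([rename/1, demo/0]).

%% Toy program representation (an assumption for this sketch):
%%   Stmt :: {assign, Name, Expr}
%%   Expr :: {int, N} | {var, Name} | {add, Expr, Expr}

%% rename/1 gives every assignment target a fresh version number and
%% rewrites every use to point at the latest version of that name.
rename(Stmts) ->
    {Renamed, _Versions} = lists:mapfoldl(fun rename_stmt/2, #{}, Stmts),
    Renamed.

rename_stmt({assign, Name, Expr}, Versions) ->
    %% Rewrite the right-hand side first: it must see the old bindings.
    NewExpr = rename_expr(Expr, Versions),
    Next = case maps:find(Name, Versions) of
               {ok, N} -> N + 1;
               error -> 0
           end,
    {{assign, {Name, Next}, NewExpr}, Versions#{Name => Next}}.

rename_expr({int, _} = E, _Versions) ->
    E;
rename_expr({var, Name}, Versions) ->
    %% maps:get/2 raises if the variable was never assigned.
    {var, {Name, maps:get(Name, Versions)}};
rename_expr({add, L, R}, Versions) ->
    {add, rename_expr(L, Versions), rename_expr(R, Versions)}.

%% The four-line example from the text: X = 42, Y = 1, X = X + 1, Y = Y + X.
demo() ->
    rename([{assign, x, {int, 42}},
            {assign, y, {int, 1}},
            {assign, x, {add, {var, x}, {int, 1}}},
            {assign, y, {add, {var, y}, {var, x}}}]).
```

Calling ssa_sketch:demo() yields assignments to {x,0}, {y,0}, {x,1} and {y,1}, which correspond to X0, Y0, X1 and Y1 in the hand-converted version above. The values themselves are never mutated; only the names are versioned.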
Pattern Matching and Multiple Assignment Can Coexist Peacefully
Erlang makes extensive use of an idea called pattern matching. Where the "=" operator in most languages is used exclusively to bind a variable on the left side to a value on the right, in Erlang it’s used to match two expressions. This means that if a variable is unbound it’s assigned, and if it’s bound it’s matched as part of the pattern. Patterns are a fundamental construct in Erlang and used as part of things like function declarations and case statements. Consider the following example (in Erlang):
X = 42,
Y = 0,
case {42, 24} of
{Y, Z} -> Y + Z;
{X, Z} -> X + Z;
_ -> X + Y
end.
It may be hard for non-Erlang programmers to tell what’s going on there. Perhaps I can start by telling you that the case statement returns "66", which is 42 + 24. This is because Erlang’s case statement is matching the expression {42, 24} against the three patterns listed in the case statement: the first is {Y, Z}, the next is {X, Z}, and the last is _ which acts as a catch-all pattern.
When Erlang matches a pattern containing variables, if the variable is bound it’s used as a part of the pattern. In the first case, {Y, Z}, since the variable Y is bound it’s used as part of the pattern, so the first case is effectively {0, Z}. Since 0 and 42 don’t match, the case statement moves onto the next pattern, {X, Z}, which is effectively {42, Z}. Here the pattern matches, but since the variable Z isn’t bound the match expression binds it to the value 24. The result is 66.
Erlang’s pattern matching relies on the idea of single-assignment variables in order to operate. So how could you have multiple assignment without breaking pattern matching?
Simple: introduce a unary operator for use in patterns which prevents variables from being rebound. This is precisely how Reia aims to solve the problem. Here’s the same example as above in a hypothetical "multiple assignment Erlang" which uses a unary * operator to indicate that a variable should not be rebound in a pattern matching expression:
X = 42,
Y = 0,
case {42, 24} of
{*Y, Z} -> Y + Z;
{*X, Z} -> X + Z;
_ -> X + Y
end.
The result is virtually identical, but here it’s explicitly clear which variables are bound and being used as part of the pattern matching expression, and which ones we wish to bind. Personally I like this approach better and think it makes it easier to read the pattern matching expression, as opposed to inferring from context which variables are bound and which ones aren’t, but YMMV…
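For comparison, the same distinction can already be spelled out in today's Erlang by making every pattern variable fresh and moving the comparison into a guard. The sketch below is only an assumption about one way a * operator could be lowered to ordinary Erlang, not a description of what Reia does; the module name pin_sketch and the helper variables A and B are made up for the example.

```erlang
-module(pin_sketch).
-export([implicit/1, explicit/1]).

%% Current Erlang behaviour: X and Y are already bound, so inside the
%% patterns they are compared against the value rather than rebound.
implicit(Pair) ->
    X = 42,
    Y = 0,
    case Pair of
        {Y, Z} -> Y + Z;
        {X, Z} -> X + Z;
        _ -> X + Y
    end.

%% The same logic with only fresh variables in the patterns. The guards
%% do explicitly what the hypothetical *Y and *X would ask for.
explicit(Pair) ->
    X = 42,
    Y = 0,
    case Pair of
        {A, Z} when A =:= Y -> Y + Z;
        {B, Z} when B =:= X -> X + Z;
        _ -> X + Y
    end.
```

Both functions return 66 for {42, 24}, as in the walkthrough above, and agree on other inputs as well. The guard version is more verbose, which is the gap a dedicated operator is meant to close.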
Side Effects Really Just Aren’t That Bad
I can see a typical Erlang programmer actually following the last two arguments and perhaps even conceding that they have a certain degree of validity. Here I expect to hit a wall. You, the hardcore Erlang programmer, will simply not buy this argument, and that’s fine, different strokes for different folks. But first let me just point out why it’s hypocritical to be a side effect-hating Erlang programmer.
Erlang’s entire concurrency model is built around side effects:
- When you send a message to another process, you’ve caused a side effect
- When you register or unregister a process, you’ve caused a side effect
- When you spawn a process, you’ve caused a side effect
- When a node connects to or disconnects from a distributed Erlang system, it causes a side effect (the value of erlang:nodes/0 changes)
and the list does go on… if you’re an Erlang programmer who thinks side effects are truly bad, perhaps you should be looking at concurrent languages which don’t have side effects at the heart of their operation.
All that said, I will admit: I have encountered errors where a variable has an unexpected value because it was accidentally rebound. I’d say that sort of thing happens to me about once a month, and usually entails about 15 minutes or so of debugging time. YMMV, but as far as I’m concerned, it really isn’t that bad.
In my case, I’m saying that single assignment would save me about 15 minutes of debugging time per month. What would I be trading that 15 minutes per month for? Well consider these examples, the first in Ruby:
x = func(x) if y
and an equivalent expression in Erlang:
X1 = if
Y -> func(X);
true -> X
end.
Here Erlang isn’t being helped by its unwieldy if syntax, but that syntax is made even more unwieldy by the lack of multiple assignment. Because we’re binding a completely new variable, X1, the if expression must return func(X) when the Y condition is true, and it must also explicitly return X’s original value when the Y condition is false, so that X1 is bound either way.
In the Ruby example, we don’t need to worry about the latter case, since it’s implicitly done for us with a side effect. In the end, single assignment causes more work for the programmer, both in this case and the sorts of examples Damien Katz gave above.
Is it really worth it? Does having a more verbose language that from a certain perspective is easier to reason about actually save you time in the long run? My answer is no…
Single Assignment is Weird
In the programming language I’m creating for Erlang’s VM, Reia, I’m allowing multiple assignment by compiling to SSA form. In making this decision for Reia the other arguments above hardly factored in whatsoever. There was one overall motivating factor in making the decision: single assignment is weird.
The overwhelming majority of programmers have only used languages with multiple assignment. Multiple assignment is the de facto standard, and single assignment is one of the things that makes Erlang weird to new programmers. Even Joe Armstrong concedes this:
However, if you try to assign a different value to the variable X, you’ll get a somewhat brutal error message:
4> X = 1234.
=ERROR REPORT==== 11-Sep-2006::20:32:49 ===
Error in process <0.31.0> with exit value:
{{badmatch,1234},[{erl_eval,expr,3}]}
** exited: {{badmatch,1234},[{erl_eval,expr,3}]} **
What on Earth is going on here?
This is practically the very first thing Joe must introduce in his book. If you hand Erlang to a novice, "what on Earth is going on here" will precisely describe the discovery of Erlang’s single assignment.
In Reia, I’m trying to keep the "what on Earth is going on here" feeling to a minimum.
about 6 hours later:
Yariv’s answer wasn’t bad for the particular case you gave – if you have a list of meaningless variable names, why not increment the meaningless bit by 10 instead of 1?
But there were two better suggestions in comments:
First, using baz(bar(foo(x))), which is clear and succinct.
Secondly, if you are going to be adding functions to that pipeline, perhaps what you are actually doing is applying a list of functions to an initial value, in which case folding the value over the list of functions is what you want to do.
Both of those solutions make adding functions simple, and make the code clearer.
about 6 hours later:
uh … why not …
f(X) -> X1 = foo(X), Xwhatever = fab(X1), X2 = bar(Xwhatever), baz(X2).
one addition, one change. Am I missing something?
I have the impression, though, that the single assignment restriction has to do with Erlang’s great safety in parallel processing. Isn’t that the case?
about 9 hours later:
Ron: No, it isn’t. Since different processes never have access to each other’s local variable bindings, rebinding these variables can not result in concurrency problems.
about 13 hours later:
You are wrong. Your thesis is that single assignment is a dinosaur that still exists because of lack of a smart enough compiler. You suggest multiple assignment with SSA is equivalent to single assignment. It’s not that simple.
Single assignment was most likely introduced in Erlang as a speed trade-off to balance the then inefficient immutability.
Single assignment offers speed-ups because the compiler can understand the program better. A program with Multiple assignment transformed to SSA form will not be equivalent to its handwritten Single Assignment form.
Single assignment forces the programmer to write code in a way that is optimal for the compiler. Multiple assignment limits the opportunities for optimization.
That said. Your project to add Multiple assignment to Erlang is interesting. Nice post too.
Yann,
about 13 hours later:
Couple of points:
X = 1 isn’t an assignment in Erlang and X isn’t a variable…
X = 1 is a pattern match
Lack of multiple assignment is great and it eliminates a category of bugs completely. You say that you estimate the hit to your productivity is about 15 minutes a month - when multiple assignment hits you badly it can be awful… The 22nd line of Fortran I wrote had a common data area variable that I re-initialised and it created a bug that took me 8 months to find during my PhD and nearly broke me mentally and physically - goodbye and good-riddance.
Your example of the IF statement shows up the real problem - you want Erlang to look like other languages. IF is actually a very rare construct in Erlang - I checked our source code and found that only 1 line in 8,000 was an IF statement - I wouldn’t be surprised to find it was 100 times more common in C or Ruby (I haven’t checked).
The use of pattern matching and function case heads makes:
When we dismiss calls for ‘multiple-assignment’ it is because we love pattern matching.
The arguments about side effects are another red herring. Side effects are necessary - but good coding practice in Erlang says that you should isolate your side effects in as few ‘dirty’ modules as possible and keep as much of your code ‘pure functional’ - this simplifies testing and reduces bugs. Sacrificing ‘pure functional’ for a C/Ruby like syntax is not a trade-off worth considering.
Also your title is totally misleading. What is the object of the cargo cult? The original cargo cult was when Polynesian islanders built straw and wood airfields thinking this would bring down the great steel birds (e.g. airplanes). So who are you suggesting is the object of the cargo cult (people copying Erlang and having ‘single-assignment’ in their languages?).
about 14 hours later:
Good post! You’re absolutely right. The let form in Scheme and monads in Haskell are good examples of multiple binding, since “redefinitions” introduce new lambdas where the redefined variables are function parameters which shadow previous names. There is nothing unfunctional about multiple assignment and pattern matching works just fine in the nested lambda context.
about 23 hours later:
Just what we need, another operator to confuse the issue.
There are basically two ways you can go with variables in patterns, either they are always new fresh variables and you have explicit tests when necessary, or they keep any value they might have had from before the match. Both of these are actually quite simple to fathom. For better or worse Erlang chose the second, partially because of its Prolog heritage.
Don’t have both. One of the initial goals of Erlang was to make it a simple language. In spite of additions since then it is actually still quite a simple language. Please let us keep it that way.
Write a new front-end if you really want to change something. It is not that difficult, I’ve done it.
1 day later:
Apologies in advance as I haven’t yet had a chance to read this post in full, but I wanted to post this before it slipped my mind :)
f(X) ->
    lists:foldl(fun(F, Last) -> F(Last) end, X,
                [fun foo/1, fun fab/1, fun bar/1, fun baz/1]).
Courtesy of Steven Vinoski.
3 days later:
A new “front end” is precisely what I’m working on: Reia. However, it’s a completely different language that compiles to Erlang forms.
3 days later:
One clear benefit I forgot to mention was that with this form of variable renaming you clearly “see” the progression of data and who may have modified it. It is also very clear which version of the data you are using. A typical use is when passing state through a function:
foo([A|As], State0) ->
    {X,State1} = bar(A, State0),
    {Y,State2} = baz(A, State1),
    {Rs,State3} = foo(As, State2),
    {[{X,Y}|Rs],State3}.
This code is very simple, but it is quite common to have different types of state at the same time, with different progressions through the code, which is clearly shown in this manner.
Anyway, are you seriously proposing that having to rename variables when modifying code is a problem worth worrying about? In all the Erlang code I have written this is one of the things which has caused the least amount of problems for me and has taken a trivial amount of time to deal with. I spend much more time when editing code in Emacs in erlang-mode removing the automatically generated newline after ‘->’. That is really annoying!
I honestly think that once you get past the stage of running around wailing “help help, here is something new which I have never seen before, woe is me” it all goes away and you never consider it again. Except when reading blogs where people worry about it.
Sorry for being a little sarcastic here but if this is really causing you problems then wait until you hit the really interesting stuff.
19 days later:
To me when I was looking around, there are a few aspects of languages I was looking for, and this was one thing I used to be wondering about.
This question for 1 CPU turns into an answer concerning multiple CPUs (concurrency-oriented programming). That is the only reason you can’t use a variable for more than 1 thing. An example would be Folding@Home or Seti@home doing massively distributed computation. If you want a variable on the other side of the world… god, that’s gonna suck! But if you have your own little module or workload for each thread dispatched by the main supervisor (VM) thread, programming on multiple cores is a breeze, and actually easy compared to any language that really allows multiple variable assignment. In the end I chose Haskell after going through Erlang, but Erlang is still awesome if the problem is particularly suited to it.
That’s my take: single assignment is only good on REAL multiple (4+) cores. If u think small and on 1 core, then multiple assignment doesn’t really matter if u can handle it.
19 days later:
Interesting. I haven’t done any coding with a language designed to explicitly handle concurrency like Erlang, but I must admit allowing only single assignments is rather strange behavior. Adding some syntactic sugar to allow multiple assignment is ok as long as developers are aware of how their code is really being transformed behind the scenes and that Erlang really only allows single assignment. Honestly, I’m hoping physicists will have a breakthrough to design faster clock cycle CPUs without overheating so unintuitive language designs and compromises like this can be avoided.
19 days later:
When one has become accustomed to writing in the Erlang style, single assignment is an intuitive and welcome constraint. It is fundamental to shaping the consistent Erlang coding style.
Complaining about Erlang’s single assignment is akin to complaining about having to write Java code inside classes. It may seem like a pain when compared to other languages that do not require it, but it is one of many constraints that shape a consistent Java coding style.
Intuitions about an unfamiliar language are a liability because they are based on conventional wisdom. Defy conventional wisdom and dig deeper to challenge any initial assumptions.
The X1, X2, X3 approach is a travesty. It does not scale, and it is embarrassing that the Erlang Documentation recommends it. Erlang’s list processing functions (such as foldl, map, foreach) and list comprehensions easily transform messy code using that approach into clean, easily scaled code.