Intentional Software…

Ok, I’ll come clean… this is my pet topic in software development, and one day my tag cloud will agree with me ;^)

To me the “Holy Grail” of software development is to have languages, tools and techniques that allow us to easily discover (and transcribe) the intent of software i.e. what the software is actually trying to *achieve*, not how it does it.

Why is this so important? Well, without understanding the intent of a piece of code, a developer has very little chance of successfully adding new features or fixing bugs. Changing code without understanding the intent (usually by just copying and pasting existing code) is just MSMD (“Monkey See, Monkey Do”) programming and even when combined with a process such as TDD, such an approach is going to be painfully slow and error-prone. You might well get your next test to pass, but the chances are it will be the wrong test!

Now, I know that these ideas are not new: the JetBrains paper on Language Oriented Programming was written back in 2004, and Charles Simonyi’s company Intentional Software was started even earlier, in 2002. But it seems to me that the mainstream world of software development is also heading this way with the rise of more declarative languages, APIs and DSLs (which I would say are excruciatingly trendy right now ;^)

Are you sure you want to delete this?

Well, if the title of this post was in a dialog box, you would have hit “Yes” by now purely on impulse, and then, if you had wanted to read the article after all, you would be frantically looking around for an “Undo” action…

Fortunately, some savvy HCI people in Web-ville have noticed that dialog boxes to confirm actions are actually a waste of time, because the user gets used to them appearing and becomes conditioned to just hitting the “Yes” or “Ok” button without thinking. The (IMHO) quite beautiful solution that they have come up with is to simply do away with the dialog, let you do the action, but then make the chance to undo it immediately very obvious… For a couple of examples, try deleting an event in Google Calendar or a task in Remember The Milk… Sweet!

Quack, you B*!#*rd!

Now, don’t get me wrong… I like dynamically typed languages (especially Python, not so much Ruby ;^), but I have to say that the whole “duck typing” thing is a bit overrated!

Take the following simple example:-

def pretty_print_person(person):
    """ Pretty print a person instance to stdout. """

    print "Name:",, "Age:", person.age


Although no types were harmed in the making of the above function, I can’t just pass any old object in as the “person” argument. If I try to pass in an integer, say 42, I get:-

AttributeError: 'int' object has no attribute 'name'

In other words there is an implied protocol (Python terminology) or interface (Java et al!) that is required of the “person” argument, namely that it has attributes called “name” and “age” (note that in Python that’s *all* it says, it says nothing about the types of those attributes).

Now if I am a single developer, working alone in my bedroom, on a small piece of code that nobody else in the world will ever need (or want!) to see, and I name all my arguments nicely so that the implied type is kinda obvious, and I have an amazing memory then this is probably fine, but what if I am working in a team of n developers (where n > 1 ;^)?

In the team situation, the implied type information that was in the head of the developer who created the function is thrown away. Now, I can hear the quacks of protest already… “Why not just add a comment?”. Fair cop. Here is some code taken from the “ActionController” module in Ruby on Rails.

  # Holds a hash of all the GET, POST, and Url parameters passed to
  # the action. Accessed like params["post_id"] to get the
  # post_id. No type casts are made, so all values are returned as
  # strings.
  attr_internal :params

  # Holds the response object that's primarily used to set additional
  # HTTP headers through access like response.headers["Cache-Control"] =
  # "no-cache". Can also be used to access the final
  # body HTML after a template has been rendered through
  # response.body -- useful for after_filters that wants to
  # manipulate the output, such as a OutputCompressionFilter.
  attr_internal :response

  # Holds a hash of objects in the session. Accessed like
  # session[:person] to get the object tied to the "person"
  # key. The session will hold any type of object as values, but the key
  # should be a string or symbol.
  attr_internal :session

  # Holds a hash of header names and values. Accessed like
  # headers["Cache-Control"] to get the value of the Cache-Control
  # directive. Values should always be specified as strings.
  attr_internal :headers

Here the developer has, very nicely, commented the attributes so that I can see, for example, that the “headers” attribute contains a hash of header names and values, and that the values should always be specified as strings. I can also see that the “response” attribute holds a response object, which in turn places a requirement on the protocol/interface supported by any values assigned to it.

All of this is very handy information indeed (assuming that the comments are correct and up-to-date of course), but why not provide the information in the code so that it is accessible beyond just the API documentation system? You never know, it might come in handy for:-

a) validation of values assigned to attributes/passed in as arguments
b) simple GUI generation
c) OR database mapping
d) component frameworks
e) web frameworks
f) insert your favourite tool here ;^)

Now, the hard-hearted amongst you might, at this point, just tell me to bugger-off back to Java or C++ or whatever statically-typed hell-hole I came from. Well, truth be told, I don’t want to. I like the terseness of most dynamically-typed languages (so, maybe Ruby overdid it there a tad ;^), I like the meta-programming/introspection capabilities, I like how I can get closer to being able to express *what* it is the code is intended to do as opposed to *how* it does it. I especially like being able to prototype without types and gradually “harden” them as I understand more about what is going on (which IIRC was a feature of Dylan, a programming language from Apple).

Now, I am obviously neither the first nor the only one to want this combination of static and dynamic types, and if you cast an eye around the Python community you will see that every person (and their dog!) involved in team development has, at some point, written or adopted a system for specifying type information. There are at least two stable and mature projects that have achieved much wider adoption:-

1) Traits
2) Zope Interfaces

Disclaimer: I used to work for Enthought Inc., the company behind Traits, but I didn’t write it, I have no vested interest in it, and it is free, open-source, and BSD licensed!

IMHO, using dynamically-typed languages in conjunction with optional static-type systems combines expressive power, readability and incredible tool potential, and offers a viable alternative to statically-typed (and usually compiled) languages for non-bedroom based development teams ;^)

Programming as Language Extension…

As the popularity of things like Domain Driven Design, DSLs, Language Workbenches etc., increases, I am reminded of what one of my professors said some 15 or so years ago – that all programming is actually language extension.

We start at the lowest level with our implementation language and we gradually build up concepts that extend it towards the highest level, which is the application domain. Now, obviously, in a non-language-workbench world we are bound by the syntax of the implementation language (and hence our extensions are in the form of types, functions etc), but we can still see the extensions as building layers of new “languages”, until we reach the final language that can (in theory) be shown (and explained) to a domain expert.

For me, the key is that, DSL or no-DSL, the outermost layer should preferably be readable by the domain-expert, and even better, be writeable too ;^)

Q. When is a Boolean not a Boolean?

A. In the hands of dodgy Python developers ;^)

I just came across an example of my least-favourite(?) anti-pattern in Python – using “implicit” boolean values in conditional expressions. This particular occurrence was found in sample code in the “Google App Engine”, but it could have come from lots of places ;^)

user = users.get_current_user()
if user:
    # Do something...
else:
    # Re-direct to login page
The “get_current_user” method returns None if there is no user currently logged in, hence the poorly written “if user” test.

Now this is (obviously) perfectly valid code because in Python, empty lists, dicts, 0 (zero), None etc all evaluate to False, whereas a list or dict with at least one item, a non-zero integer, a non-None reference to an object etc. all evaluate to True…

Well, almost… and therein lies the problem! If, for example, an object instance implements the special method “__len__” and happens to return zero then it too would evaluate to False. Maybe what you wanted, and maybe not, but in my experience this has caused some weird, wonderful and subtle bugs (the best kind ;^). IMHO it is much better to use explicit boolean expressions where, errr, booleans are expected, and hence the above example should be:-

user = users.get_current_user()
if user is not None:
    # Do something...
else:
    # Re-direct to login page
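The trap is easy to reproduce with a made-up class that implements “__len__”:

```python
class Basket(object):
    """A perfectly valid object that happens to be empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

basket = Basket()          # a real object...
assert basket is not None  # ...which is certainly not None...
assert not basket          # ...yet it evaluates to False, because len() == 0

# 'if basket:' silently skips a perfectly valid object;
# 'if basket is not None:' tests what was actually intended.
```

Nothing here is exotic: any container-ish object in your codebase can fall into this hole the day somebody adds a “__len__” to it.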

I’m not sure why some Python developers insist on using the above pattern – do they really think that the typing it saved them reduced the overall development time? If so, maybe they also think of themselves as typists, not developers ;^)

Waiting For The Great Leap Sideways…

Having been a developer for 20+ years, I occasionally look back and wonder about the great advances that the industry has made in this time. These, of course, are:-

1) Syntax highlighting
2) Test Driven Development

Ok, maybe a bit facetious, but sadly, not much. It seems to me that the technology that we use today is not much changed from when I started writing RTL-2 on PDP-11s back in the 80s, and when I try to think of things that have had a real impact on my day-to-day work, these are the two that spring to mind ;^)

In particular, I am constantly surprised by the fact that (for the most part) we still represent code in text files. I understand that developers have a deep-rooted fear of losing control of their beloved text, but this particular phobia has a number of consequences, notably in the tools that we use to create code.

Text editors have no notion of the intent (or even the structure) of the code and hence, all they actually do is help you type a bit faster… Whoopee do ;^)

IDEs try to go a step further (err, sideways) but tend to fall into the same trap of thinking that what we need is to be able to write code faster, and hence they become little more than glorified text editors. What we actually need is to be able to read and determine the intent of code faster.

As an example take the popular Eclipse IDE (I’m not picking on Eclipse – substitute pretty much any IDE here). When editing code the main focus of the developer’s attention is directed at the “editor area” which shows the textual representation of the code in one or more files. The task of editing is (optionally) supported by an “outline view” that shows the actual structure of the currently selected file (what class(es), attributes, methods etc). In a text-based world-view that makes perfect sense, but what I would like is to flip that around and have multiple “outline views” (as the focus of my work), supported by a single editor view that shows me the textual representation of the currently selected outline. To understand the intent of the code it is the structure and relationships expressed in the code that I need to discover – so why not have the tool help me do it!

As another example, take the ever-present “formatting wars”. Developers like to lay out their code in a myriad of ways, and all of them are right! The problem is that reading code written in a myriad of ways is neither a) efficient nor b) pleasant, and so many teams introduce coding standards to produce an acceptable team ‘style’ (often after lengthy and tedious debates about where a curly-brace should go, and occasionally after a code “Tzar” bolts a code formatter onto the front of the code repository – mentioning no names, Duncan ;^) When code is parsed, an Abstract Syntax Tree (AST) is produced – why not use that as the storage format and allow each developer to specify how the code is textually formatted for them, so that they can read it unhindered by somebody else’s weird and wonderful layout?

And these issues are just the tip of the very cold thing. Tools for testing, refactoring, metrics etc. could all benefit from a more integrated approach – at least, that is my hunch…

Old Punks, Young Crusties

In many technical professions, young practitioners (possibly straight out of university) come into the workplace full of enthusiasm and with the desire to use the latest and greatest developments in the field. Quite often, these ‘young punks’ encounter resistance from the ‘old crusties’ who have been in the game for years and years (and years!) and who stick to their way of doing things simply because they’ve always done it that way.

Recently however, I’ve noticed that in software development the situation seems to be reversed. For example, at one company, a number of the more experienced software engineers (‘old crusties’ like myself) were keen to try out some of the recent developments in the agile development world (pair-programming etc), and in particular, test driven development (TDD). Having tried TDD on a reasonably sized body of code, I was convinced that it was a fantastic way to work and that it improved the quality of, and my confidence in, the code, whilst simultaneously reducing the stress involved in producing it. I was also struck by how subtle the TDD mantra of 1) Write a failing test 2) Make it run 3) Remove duplication really is, and how it describes the process of development in an eerily accurate way. Anyway, I digress. As it turned out, a number of meetings were held to discuss the introduction of TDD, but to my surprise, the younger, less experienced developers (one or two years out of college) were extremely reluctant to even try it out, and at one point the discussion dissolved to the point where people were even questioning the idea of unit testing itself (I know, I know, don’t get me started). Now, on occasion I have been known to be a tad cynical myself (and then some), and so I welcome healthy questioning, but I was still shocked to see the next generation basically arguing that we should continue to write code the same way that we always have, with all of its attendant weaknesses and failures.

Thinking about it later, I realised why their position was not so shocking after all. The field of software engineering is all about trying to find better ways to develop code and to help us live with it over a long period of time. There are now many well-documented techniques to help us reduce the defect count, reduce the time taken to fix bugs and add new features, to refactor etc, but if you’ve never worked on a large body of code, and if you’ve never had to live with your code (or, more to the point, other people’s code) for more than, say, a semester, then obviously, the techniques to achieve these goals seem superfluous. To put it simply, if you’ve never had the illness, then you won’t have any interest in taking the medicine!

This problem can be attacked on a number of fronts. First, universities need to make sure that their computer science courses include a large software engineering component. Maybe students could be given a project that spans the entire length of their degree, allowing professors to change requirements, add new features and generally throw some real-life spanners in the works. Students could even have to swap bodies of code and documentation etc each semester, and they could be ranked by their peers on how easy (or not) it is to get started working on their code. Secondly, development organisations need to make sure that they have a clear and coherent approach to software engineering, and that every developer joining the organisation is fully exposed to it (for less experienced developers this would probably include close mentoring). Whether or not you agree with specific techniques such as TDD, you must have a story about *how* you write software that goes beyond giving each of your developers a machine to code on ;^)

They say that the study of history is vital if we want to avoid repeating past mistakes, and the field of software engineering is really just an accumulation of wisdom from the short, but often painful history of software development. It might be an idea to learn from it ;^)

Good Test, Bad Test…

Developers shouldn’t think of writing tests as being *like* writing code – they should think of it as *exactly* the same as writing code. IMHO, the “code” has two parts, the implementation and the tests, and each needs as much care and attention as the other… Anyhoo…

A test remarkably similar to the following cropped up recently:-

def a_test(self):
    self.assertRaises(SomeError, lambda:, y, z).blargle)

Now, to paraphrase the legendary Mr Clough – “It’s not the worst test I’ve ever seen, but it’s in the top 1” ;^)

How can such a small test be so bad? Well, it manages to pack a couple of critical errors into a single line (and that’s some going I have to say ;^):-

1) It is not clear whether it is the method call or the attribute access (or both) that is expected to raise the exception.

2) The implied API is clunky at best. If the reason for calling the method is to get hold of the “blargle” then just return the “blargle”! The only thing that this test makes clear is that the API is not clear!

If I was a betting man, I would bet that this test was written after the implementation, which might excuse the API weirdness (it could be the start of refactoring a legacy API), but not the lack of clarity…


The API vs DSL debate is an interesting one. IMHO, all good software is made up of layers, with each layer representing a domain. The outermost (application) domain might model a shipping company, and hence any API/DSL at this layer should easily describe how to manipulate ships, cargos, routes etc. The innermost layer(s) might model web services or relational databases etc, and so any API/DSL at these layers must model the low-level concepts that a developer would expect of these technologies. The key point is that at each layer the API/DSL should allow the user (be it an application domain expert or a hacker-geek dude!) to express the intent of what they want to do as easily as possible.

Now, my first guess is that if I was going to provide a DSL, I would probably only provide it at the application layer, and I would just create (hopefully) nice, clean APIs for the other layers (and of course, the DSL would really just be a thin wrapper around a nice clean application level API!)…

Of course, you could go DSL crazy and create one for each layer, although this seems like overkill to me, and brings up some pragmatic issues (which actually apply even to a single-DSL architecture):-

1) The ability of the user to learn a new language (even one targeted to the domain that they are in). This would obviously require the usual array of tutorials and reference manuals etc.

2) Testing

Currently, when I code (I use TDD) I write tests in the same language as the shippable code. Obviously, not every DSL can have extensions for testing, so would we use a low-level language, or have a DSL for testing?