C

Introduction

Talking about C is not easy. Almost all professional programmers have used it at some point and many have a strong attachment to it. I don’t want to start by saying that it’s a poor language, alienating much of my audience, but I figure I’m going to end up doing that anyway so I may as well get it out of the way at the beginning.

Compared to many languages that have come since, and even some that came before, C just isn’t a very good language.

There, I’ve said it.

History

Perhaps more than almost any other language since, C has a rich and famous history. It’s almost impossible to discuss it without also talking a little about the history of Unix.

In 1969, Ken Thompson finally got hold of a PDP-7 and decided to write an operating system for it. (It happened more often than you’d think back then.) That operating system was Unix.

A few years later, Dennis Ritchie designed and built a language based on B (itself based on a language called BCPL) and, creatively, called it C. Unlike its immediate ancestors, C had types and a number of other useful bits and pieces.

C was so useful that Unix was quickly rewritten in it. These days that sounds obvious, but until that point operating systems had always been written in assembler or machine code. This is an important part of computer history.

Utility

C became popular largely because it filled a very useful niche. At one extreme you have all the ‘real’ languages. At the time, real computer scientists would have used Algol68. Structured and clever, Algol was a great language but you couldn’t do anything very low level with it.

At the other extreme there is assembler which people had to resort to if they wanted to do anything close to the hardware. Assembler is only one step removed from machine code which makes writing reliable, bug-free code very difficult, especially when you’re building something as large and complex as an operating system.

C fits in the middle. Described by some as a high-level assembler, it allows you to do low level coding, accessing particular memory addresses and the like, and use high-level constructs such as functions and types.

The following books and papers helped me learn to hate C.

Practical C Programming” by Steve Oualline.

Writing Solid Code” by Steve Maguire.

Code Complete” by Steve McConnell.

Hello world. How many nasty ways can you write “Hello World” in C?

Pointers

Key to C’s ability to mix high- and low-level constructs are pointers. Most ‘serious’ languages have some concept of a them. Some call them references, some call them links, but they all, basically, refer to something that identifies a chunk of memory. Most other languages only use them when you have to, but they are C. No pointers, no language.

You want an array? That’s really a pointer. Pass by reference? A pointer. Strings? Ah, they’re arrays! (Which are pointers.)

The advantages of using pointers for just about everything in the entire language are mainly one sided: it makes writing compilers easier. For the poor souls that actually end up using the language all is not so rosy. As Steve McConnell puts it, “pointers are one of the most error-prone areas of modern programming” (Code Complete, section 11.9).

Some of the side-effects of using pointers are not immediately apparent, either. In most languages, arrays have bound-checking (the ability for the language to raise an error if you try to access an element that doesn’t exist). But C doesn’t really have arrays, it has pointers and a little syntactic sugar that makes it look like it has arrays. Pointers don’t know how much memory is being pointed to so you don’t get bound-checking. If you’re lucky your program causes a segmentation fault, if not you might corrupt other data or, on some operating systems, your program.

It takes all types

Another one of C’s biggest problem is it’s typing. As most people will already know, C allows you to put numbers into character variables, integers into floating points and any number of other nasty combinations. To C they’re all valid, but what they do are not always well defined or consistent. Even the same compiler sometimes does different things depending on the level of optimisation in use, the phase of the moon, etc.

The odd thing is that a weakly typed language doesn’t allow you to do more than one with strong types, it merely allows you to do the wrong thing more easily. For a language used for large-scale software engineering projects the risk of poor code is just too great and weak typing should be outlawed.

I’m sure that people are going to mention the myriad of warnings that modern compilers are able to produce, or that ‘lint’ has been available for nearly as long as the language has been. My counter-argument: I don’t see why you should have to add extra tools or read through pages of warnings in order to correct deficiencies in the source language!

The good bits

If C was truly as appalling as I’ve made out so far, no-one would actually be using it. The main ‘win’ for C, as far as I can see, is that it is small, well defined and widely available.

All three merits are in many ways different sides of the same coin (if you can imagine a three sided coin). To make the language well defined, it helps if it’s small. If lots of people are to use it, it needs to be simple enough that they aren’t put off (Ada anyone? Thought not.). Small and well-defined make it easier to write compilers too, meaning that it’s available on everything from the lowliest PC right up to mainframes.

The theory also goes that your programs should recompile on this range of machines too, but that’s not as true as we’d all like to think. If it was true, we wouldn’t need Java or hundreds of ‘#ifdefs’ throughout. My code tends not to be that low-level, but I wouldn’t like to make any promises about its portability. However, that doesn’t make C’s wide availability useless. Even if your program isn’t cross-platform, the skills required are. A C programmer can quickly write code for just about any machine.

Summary

I’ve probably written more lines of C code than just about any other language, so when I say that I don’t like it I hope that you can see that I’m not being narrow-minded or prejudiced.

As I’ve mentioned above there are some things that I like about C, it’s just that there is so much to hate about it and, even at the time it was written, there wasn’t an excuse for it!

The languages main features are its weak typing and over-use of pointers, both of which allow developers to make truly horrendous coding errors with ease.

If those were C’s only problems I might be able to forgive it, but they aren’t. C has more ways for both experienced and novice programmers alike to hang themselves than any other language I can think of (with the possible exception of Intercal). Even if it didn’t have the ‘=’ and ‘==’ operators to confuse, there’s still the wonderful ‘?:’ and a whole host of spectacularly error prone API calls (does all your code check the value that ‘malloc’ returns?)

No, C, as a language, is a dinosaur. It deserves to whither and die. If you write anything other than an operating system kernel and use C, switch to another language. You’ll be far more productive when you start battling the problem rather than the language.

Sri Lanka, 2001

Sri Lanka (nee Ceylon) is famous for its tea and Arthur C. Clarke, but, as I found out, there’s much more to it than that!

We started in Negombo, a beach resort a few miles away from the airport, moved round to take in the ‘Cultural Triangle’, down to Kandy, Adams Peak and a tea plantation. Finally, we headed to Unawatuna, a beach resort near to Galle (and the England – Sri Lanka test series) and then back to Negombo for the last night. We made plenty of stops along the route.

A combination of a new camera and the beauty of the place meant that I took as many pictures in Sri Lanka as I took in Georgia and Thailand put together! Here I present the highlights.

Click the small pictures below for a full size version.

All these pictures were taken on my Canon EOS300, mainly on Fuji Superior ISO400 film. I ran out of film towards the end of the trip so the last few are on Kodak Max (ISO400).

The same pictures that are on Kodak film have white lines down the middle. They are scratches on the negative that appear to have been put there during either developing or printing. I’m tempted to name the guilty company…

If the pictures have piqued your interest, there are a few web sites that you might want to visit:

  • The Lonely Planet guide is usually worth consulting.
  • If you prefer hard-copy (much more portable than your laptop!), you can buy a copy from Amazon (UK or US).

Miscellaneous Pictures, 2000

Until now, all the pictures I’ve put here have been based on a theme: the country I was in when I took them. I do take pictures at other times though, and there are often not enough of them to justify a complete page to themselves.

That’s what you’ll find on this page.

There are basically four groups. Firstly there are a few snaps from a quick trip around Durham, in the North-East of England. Then I was briefly in and around Montreal, Canada in May 2000. There are also a couple of pictures taken around Wimbledon during the summer. All these are taken on the Canon Sureshot. At the end you’ll find the highlights of the first roll of film through my EOS (Jessops ISO200). They’re all in and around London.

Click the small pictures below for a full size version.

Durham CathedralDurham's market square A couple of hours south of MontrealA view over Montreal
A square in MontrealNear Wimbledon Common Tower bridgeSt Pauls cathedral
Just outside St PaulsOutside my bedroom window

Output

Introduction

PL/SQL is good for many things. It has a structured, Ada-like syntax and can have SQL statements seamlessly embedded in it. Although not compiled right down to machine code, it’s even fairly quick.

But one thing that most people initially have difficulty doing is displaying information on the screen. The first thing I did was to enter what I wanted into a temporary table and use SQL*Plus to select from it. This works well for reports at the like, but it doesn’t work at all if you want to display something independently of a transaction or halfway through a function.

I was frustrated that there wasn’t a better way of feeding back information to end users. I was wrong to be frustrated: there is a better way.

Hello world!

The best example I can think of doing is the world famous “Hello, World!” program, so here goes:

CREATE OR REPLACE PROCEDURE hello_world AS
BEGIN
     DBMS_OUTPUT.PUT_LINE ('Hello, world!');
END hello_world;
/

In SQL*Plus you can run that procedure by typing “exec hello_world” and pressing return.

You may be surprised to see that it says “PL/SQL procedure successfully completed.” and nothing else; there’s no “hello world” text. Surely it’s lying, it hasn’t successfully executed anything, has it?!

As with most things Oracle, things are not quite as simple as they first appear. In this case you need to tell SQL*Plus that you’re going to use the DBMS_OUTPUT package. To do that, you need the following command:

set serveroutput on

You might want to put this in your “login.sql” file so that it gets executed every time you log in. You can also add the “size” parameter to the end to indicate how much output you’ll be producing. The most you’re allowed is around 1Mb so I usually type “set serveroutput on size 1000000”.

If you try executing the procedure now, you’ll find that it works correctly.

A longer example

For our second and final example, we’ll use one of the tables that Oracle installs by default. It’s called EMP and should be in the SCOTT schema (password TIGER). The important fact here is that there are a number of different columns types, NUMBER’s, VARCHAR2’s and DATE’s.

On to the example:

CREATE OR REPLACE PROCEDURE example2 AS
     CURSOR emp_cur IS
         SELECT e.empno, e.ename, e.sal, e.hiredate
         FROM   emp e
         WHERE  e.sal > 2500;
    BEGIN
     FOR current_employee IN emp_cur LOOP
         DBMS_OUTPUT.PUT_LINE ('-------------------');     -- string literal
         DBMS_OUTPUT.PUT_LINE (current_employee.empno);    -- number(4)
         DBMS_OUTPUT.PUT_LINE (current_employee.ename);    -- varchar2(10)
         DBMS_OUTPUT.PUT_LINE (current_employee.sal);      -- number(7,2)
         DBMS_OUTPUT.PUT_LINE (current_employee.hiredate); -- date
         DBMS_OUTPUT.PUT_LINE ('Year: ' ||
                               TO_CHAR(current_employee.hiredate, 'YYYY'));
     END LOOP;
 END;
 /

This code should be pretty much self-explanatory. We first start a loop over a table. Inside the loop we output the content of it. The first PUT_LINE statement shows that you can output a string literal (we know this as ‘Hello world’ is one too). The second shows that you can output numbers without first converting them to a string — the function is overloaded and will accept most standard types.

Following that theme, the third, fourth and fifth lines output a string, a different number and a date, in all cases using the default format string. The last PUT_LINE statement shows that you can output expressions, but note that it all needs to be of the same type (so in this case I convert a date to a string so I can concatenate it to another string).

On my machine, the output is as follows:

SQL> set serveroutput on
SQL> exec example2
 -------------------
 7566
JONES
2975
02-APR-81
Year: 1981
-------------------
7698
 BLAKE
2850
01-MAY-81
 Year: 1981
-------------------
7788
SCOTT
3000
19-APR-87
 Year: 1987
-------------------
7839
KING 5000
17-NOV-81
Year: 1981
-------------------
7902
 FORD
3000
03-DEC-81
Year: 1981
PL/SQL procedure successfully completed.
SQL>

Conclusion

As we’ve come to expect from Oracle over the years, simply outputting a little text to the screen isn’t quite as easy as you might wish for.

And it’s not without it’s problems. If you want to send more than a megabyte to the screen, you’re still going to have difficulty and all the playing around with “set serveroutput” is just plain tedious.

However, the important point is that you can do it and, programatically, it’s (basically) just a single command.

If you want more information on this kind of thing, have a look in chapter 6 of Oracle Builtin Packages, a whole book dedicated to this kind of thing.

Free Software HOWTO

v1.2, 17 January 2001

With the current Linux trend towards multi-million dollar IPO’s and “Open Source” software, much of the emphasis of “free” software has been lost leaving people new to the fold confused and not completely understanding all the implications. This HOWTO will, hopefully, reduce some of that confusion.

Introduction

What’s in here?

This document talks in non-technical terms about free software, what it means and why you should care about more than just the cost of your software.

Who is this HOWTO for?

People that have been hacking for years will already be fully au fait with the content of this document. Or at least you should be!

The Free Software HOWTO is aimed at people new to Linux, Open Source or free software.

New versions of this document

The official home of this HOWTO is here. You will always find the most up to date version here.

Disclaimer

You get what you pay for. I offer no warranty of any kind, implied or otherwise. I’ll help you where I can but legally you’re on your own.

Credits and Thanks

I welcome any constructive feedback on this HOWTO and any general software licencing issues, although my opinions are just that: a subjective view. You should understand what each licence means before committing to one for your own software or documentation..

Licence

This document is copyright 2000, 2001 Stephen Darlington. You may use, disseminate and reproduce it freely, provided you:

  • Do not omit or alter this copyright notice.
  • Do not omit or alter the version number and date.
  • Do not omit or alter the document’s pointer to the current WWW version.
  • Clearly mark any condensed, altered or versions as such.

These restrictions are intended to protect potential readers from stale or mangled versions. If you think you have a good case for an exception, ask me.

(This copyright notice has been lifted from Eric Raymond’s Distribution HOWTO.)

Overview

Introduction

Until now, you’ve probably never given much thought to software licences. No-one can blame you. They’re usually pages and pages of legalese telling you what you can and can’t do with the CD you just bought. If you actually sat down and read it all, you would probably never agree to it!

Licences, however, are at the heart of what free or open-source software is all about. Let’s take a look at a number of very broad categories of licence.

Commercial software

The first type is the commercial licence, like the one that Microsoft or Lotus might insist you agree with before using their software. The basic premise is that you don’t own the software, you have an agreement with the author that allows you to use it within certain guidelines. As the copyright owner, they can impose whatever restrictions they want. Common conditions include limits on the number of concurrent users, number of copies, and what you can use it for (for example, “non-commercial use” or a ban on reverse engineering).

The emphasis is what you can’t do, and all the power is in their court!

One thing to note, and this will become more relevant later on, is that none of this relates to the cost of the software (or, more properly, the licence). You can get commercial software, such as Microsoft Internet Explorer, for no cost and still be forced to abide by the publishers conditions.

Shareware

The next kind of software, shareware, is sometimes called free, although as we shall see, that’s not correct.

Shareware was started in the early eighties by Jim Button when a few of his friends started passing round some software he’d written and he started getting phone calls asking for support. Eventually he added a request for $10 to the startup screen and shareware was born.

In short, shareware authors allow you to try out their software for free but request payment for continued use. Many of the same restrictions as for commercial software remain, including the limitations on reverse engineering, concurrent use of the full version, etc, stand. Additionally, the “free” downloads are often broken in some way, perhaps limited functionality or splash screens.

Interestingly, Linus Torvalds describes shareware as combining “the worst of commercial software (no source) with the worst of free software (no finishing touches).”

Public Domain software

All the licences we’ve seen until now have been designed to reduce people’s ability to do what they want with the software.

At the opposite extreme is public domain software. This is software that has no copyright and, therefore, no restrictions on its use. You can copy it onto as many machines as you like, reverse engineer it, make modifications and distribute it as you feel fit.

This is the first kind of software that we’re come across that can rightly be called “free.”

Normally, whenever you write something you automatically own the copyright, even if you don’t add an explicit copyright message. For public domain software, the author throws away these rights allowing everybody to do what they like with the software.

Unfortunately, “anything” also includes selling it. Imagine spending a huge amount of time producing your masterpiece, giving it away and then finding that someone else was able to sell it and make their fortune with all your hard work! Worse, people don’t even have to give you credit for your work; they can legally take it, replace their name with yours and distribute it.

This might be why there isn’t a huge amount on public domain software.

At this point it might be worth looking back at free commercial software. Both public domain software and a free piece of commercial software cost the same, but the freedom you’re given to use it varies. A common phrase you hear with free software is “think free speech not free beer.” The difference between public domain and commercial software show the opposite extremes.

Free software

By now it should be clear that there are many different kinds of free software and not all are equal. The version of free that this section relates to is the one promoted by Richard Stallman of the Free Software Foundation.

But first, a little history.

The Free Software Foundation was formed in 1984 when a printer manufacturer refused to give Richard Stallman the source code (the computer instructions required to make a program). It had been leading up to this for some time — the increasing number of non-disclosure agreements, new software that banned sharing of information, etc. — and the printer manufacturer was just the straw that broke the camels back.

Stallman decided that he could not, in clear conscience, sign a non-disclosure agreement or work with a company that restricted his ability to share information. While most people would have given up at this point and gone to work, for obscene amounts of money, for a big company writing proprietary software, Stallman stuck by his principles and decided to make his own operating system, free of the constraints of commercial systems.

The process leading up to this is documented in more detail in Steven Levy’s excellent ‘Hackers.’ You’ll note that Levy calls Stallman ‘the last real hacker.’ Happily he was wrong, otherwise you wouldn’t be reading this!

As we can tell from the background, “free,” in this context, relates not to the initial cost but to freedom. Stallman was unwilling to surrender the right to make modifications or improvements to any software he used — and to do this you need the source code.

This may sound just like public domain software up to this point. The difference is that there are clauses in the licence that attempt to keep the software free no matter what changes are made. The most famous free software, Linux, uses the most famous free software licence, the GNU General Public Licence. It is sometimes also called Copyleft, as it very cleverly uses the current copyright laws to do the exact opposite of its original mandate.

The way it does this is by insisting that the code and anything derived from it is also released with the GPL licence. In some senses it is ‘viral’ in nature and it is this that is central to many people’s objections.

Also, it’s worth noting that the word ‘derived’ is a little too vague. Does a library linked to a GPL’d program need to be GPL’d also? Does a program running on a free operating system need to be GPL’d?

There’s no clear, obvious answer for either of these with the current version of the GPL. The new version (3) is intended to fix some of these shortcomings, but it’s viral nature will remain.

Open Source software

Open Source software is, in many ways, exactly the same as free software, despite what Richard Stallman says!

It was started in 1997 by Eric Raymond and Bruce Perens as a response to the increasing confusion over the use of the word “free” in relation to software. (Confusion that has continued, or I wouldn’t be writing this document!)

In essence, Open Source is a marketing or PR exercise to make free software more acceptable, more understood by the general public and the big corporates that, until that point, were comforted by the money they had to pay to get commercial software.

Like “free” software, the “Open Source” trade-mark does not mandate a single licence. (Freedom of choice is important, even when it comes to giving away software!) Technically any licence that meets all the requirements in the Open Source Definition can be termed Open Source. These, in summary, are no restrictions on the use of the software, access to the source code and the freedom to make modifications and distribute them.

The two most famous are the Berkeley System Distribution (BSD) Licence, which allows distribution without the source code, and the GNU General Public Licence, although there are many more.

The Sun Community Source Licence, for example, is not compliant because Sun Microsystems demand a fee for any commercial distribution and insists that derivations are still compatible in some arbitrary ways. (This is not intended to single out the SCSL as being particularly bad. However, the fact that it purports to be “open” when it isn’t is a disturbing trend.)

The implications

Overview

Free software has already had a significant effect on the computer industry. Free software is behind most of the critical parts of the Internet, it was used (unsuccessfully) as part of the defense in the Microsoft anti-trust trial and the spate of recent IPO’s has shown that there’s money too.

Unless you bought some shares, this all appears to be affecting other people. There is an impact, however.

I just use software, how does this all affect me?

It’s tempting to think that, if you’re not a ‘techie’ or a hacker, the difference between free commercial, public domain and Open Source software is minimal. But that’s not true, even as a user the difference affects you because the development of the software is affected.

I’ll outline a plausible scenario as an example.

Company X designs and releases a fantastic piece of software. It’s commercial software, but the publisher seems receptive to new ideas, indeed versions 1.1 and 2.0 are exactly what you were looking for and the upgrade costs were reasonable.

At this point any number of things can happen. Perhaps you find a bug in it, one critical to your business. But the publisher offers no warranty for the software and say they are not sure whether they will fix it. (Note that just because you paid for the software, you do not necessarily get better service or a guarantee of any kind. According to the licence, you can’t sue Company X if the software is not fit for the purpose it was sold for.)

Or maybe they go out of business. Or perhaps they start competing with your company and won’t sell you the next version. Or perhaps the features you want are not in the new version. There are a huge number of ‘if’s and ‘maybe’s, all outside your control.

Basically, if you use commercial, close-source software you are at the whim of the publisher. If they do something you don’t like, tough.

However, if you’d used Open Source software you’d have access to the source code and could fix or upgrade the program as you saw fit. And if you couldn’t do it, there are many programmers willing to support the new versions or you could hire someone to do it for you. In summary, you have much more control over future development of the product.

You’ll note that none of the advantages here are strictly related to cost. I think that’s something that people tend to focus on too much, possible due to the “free software” title. However, there are still advantages.

But first we need to get away from the initial cost of the software as that’s normally a small percentage of the total cost. Instead, let’s think in terms of support costs. Once you’ve installed the software, what costs are there? An obvious cost is that of upgrades. Less obvious is lost productivity due to software failure and support charges from the manufacturer.

In the case of free software, there are no upgrade costs (other than the time and inconvenience of doing it, which also applies to commercial software). Free software is usually regarded as more reliable then commercial software — see Fuzz Revisited for more information. And the support charges are optional: you can either deal with it in-house or hire any one of a number of support organisations.

I’m a developer. How does this affect me?

As a developer, you’ll already know the benefits of being able to access the source code for a program. You can fix it, see how it works and integrate bits of it into your new program. (Using parts of another program, however, isn’t quite that simple and is dependent on the licence of the original software.)

A common problem developers see is the loss of their livelihood. If everyone gives away their software, how can anyone make money? It’s a fair question and there’s no single, correct answer.

Perhaps the most common answer is that most software is developed in-house and is not distributed. None of this development will be affected, so if you have that kind of job you can rely on your salary for a while yet!

If you work for a consultancy, almost all the revenue comes not from selling software but from “professional services,” i.e., they charge for developing the software rather than the licence to use it. Again your job is safe.

Then there’s shrink-wrapped software. The Free Software Foundation would say that it should be free. So if you listen to them and you work for Microsoft your job is in danger unless they diversify into services.

The Open Source people would say that there’s nothing wrong with shrink-wrapped software, but point out that there are advantages in releasing the source.

As you can probably see, the risk to your jobs is small and there are many benefits. At least that’s the way I see it and I work writing software!

Where can I read more?

The most obvious place to read more is at the GNU website, after all they started it all.

But there are alternatives. Other important web-sites are as follows:

  • Open Source. This is the ‘official’ Open Source page. There’s lots of interesting stuff here, including a more detailed discussion on the effect of free software on different people Microsoft’s Halloween documents, their unofficial response to the Linux threat.
  • Eric Raymond’s web-site. Eric has written much about Open Source software, with much more depth and style than I can muster!
  • Slashdot. There aren’t more ‘HOWTO’s here, but Slashdot is a community of people interested in Open Source software. The discussions sometimes get childish, but you can learn a lot!

There are also a few books published (at least partially) on the subject:

  • OpenSources. Voices from the Open Source Revolution.. This book talks about all aspects of the Open Source community, including licencing. The main reference is Bruce Perens essay.
  • The Cathedral and the Bazaar. Again, this book is about general Open Source issues, but it includes a discussion on some of the non-obvious implications of the licences, particularly the reason why just because you can, you don’t frequently get many versions of a piece of software.
  • Hackers Steven Levy talks about the early days of the hacker community, including a good piece on Richard Stallman.

A matter of style

Introduction

The open source community does a lot of things right. The internals of a program is one of them. The people who write code do so because they are proud of what they do and want the respect of other people in the community. The beauty of the code is a very important aspect in this acceptance.

The same isn’t necessarily true in the commercial world. Time-scales are much more important than how the guts of a computer system looks and it’s generally not good to be seen spending a bit of time making your code look pretty.

This column is not going to say that style is more important than substance, but that this interest is something that can bring a large increase in productivity.

Indentation

I know people that hate me for this, but something that really bugs me about some code is the indentation style. I can handle a poorly thought-out, hacked algorithm if there’s a good reason (time usually), but I can never see a good excuse for badly formatted code.

Initially I thought I was being anal, just getting very hung-up on a not-very-important detail, but then I started noticing things. Usually the people that produced the worst formatted code also included the largest number of defects and, although they initially appeared to write code more quickly than me, finished late much more frequently.

Why does that matter?

There are two fundamental reasons why the ‘look’ of the code matters:

  1. Comprehension. How quickly can people understand the code?
  2. Structure. Is it obvious how the program is split into units? Is it obvious what the next statement to be executed is?

I guess they are quite similar, but I feel that it’s important to separate them out. Hopefully you’ll see why in a minute.

Let’s talk about comprehension first. Here’s a snippet of code:

if (x < foo (y, z))
   haha = bar[4] + 5;
 else
   {
     while (z)
       {
         haha += foo (z, z);
         z--;
       }
     return ++x + bar ();
   }

If you've read the GNU coding standards, you might recognise this as an example of good formatting. It isn't.

But there are some redeeming factors. Firstly they've used two spaces for each level of indentation and not a tab. Studies have shown that while people prefer (aesthetically speaking) tabs in programs, those same people actually understand the code more effectively with between two and four spaces. Also, the formatting of the mathematical expressions is clear, with good use of whitespace.

Now the bad. My main problem is with the use of the braces. The lesser crime is immediately after the if condition where they haven't used braces at all. That's bad because a novice programmer might add an extra line below the 'haha' assignment. It probably won't be a valid program any more, but it looks okay at first glance. Since maintenance is the largest part of the software life-cycle, anything that makes updating code more difficult is bad news.

Structure

The indentation of the second half is dreadful, though. Why have two levels of indentation to to indicate one block of code? Again, there are studies that show that formatting like this actually reduces the readability of the code.

My preferred method of formatting the same code would be:

if (x < foo (y, z)) {
   haha = bar[4] + 5;
}
else {
   while (z) {
     haha += foo (z, z);
     z--;
   }
   x++;
   return x + bar ();
}

The main difference is the formatting, but I've also altered the 'return' statement at the end. Side-effects (like using '++x' in an expression) seriously reduce readability as it's much more complex to figure out what it's trying to evaluate. And the solution only takes an extra line...

But why is this a better way to format the code? Simple: it shows the structure of the code more effectively. Since there are only three levels of indentation, your brain doesn't have to work as hard to figure out where, for example, the end of the "else" block is. (It's not so easy to see the advantage with ten lines of code. Try to imagine a page full of code with a large number of indentation levels.)

Some languages appear to be more tricky than the C example here, too. Where does the exception block in Java go? Should the procedure definitions in a package header be indented in PL/SQL?

If you don't really understand block-structure in modern programming languages and are just trying to follow the example of fellow developers, these are probably difficult questions. They're not supposed to be.

A computer will understand any syntactically valid program, but not necessarily in the same way that you or I would. This makes it vitally important that everyone involved has a common understanding of how the program is supposed to work. The first steps to doing this are structuring the code sensibly and making it easy for others to comprehend.

The last important comment is about timing: none of this takes any longer than building your code with poor formatting and structure. Why? Well, if you understand how it's structured it'll be more likely to work the first time. And if it doesn't, you should be able to find the buy more quickly because your code is easier to comprehend.

Summary

To summarise, the neatest, best formatted code takes no longer to write and is much easier to maintain than code that has bad 'style' (although it's much less likely to need fixing).

So the next time that you come across some dreadful, untidy code, try to make the person that wrote it start again. If they don't understand how their own code is structured, neither will anyone else.

You'll note that I keep mentioning studies, but don't reference the source. That's because I've read all about it in Code Complete (follow the link to by it from Amazon.com) and not from the horses mouth. I recommend you also read it for the full story.

Photography, opinions and other random ramblings by Stephen Darlington