Tag Archives: Languages

Swift Types

If you look at the Swift Language guide, you get the distinct impression that the type system is sleek and modern. However, the more you dig into it, the more eccentricities you find.

The one I’m going to look at today makes sense only if you look at the problem domain from a slightly skewed perspective. I’ve been trying to think whether this is a sensible, pragmatic way of designing a language or a mistake. Judge for yourself.

So, the feature. Let’s define a dictionary:

var test1 = [ "Foo" : "Bar" ]

Check the type and we find that it’s of type Dictionary<String,String>. The generics and type inference are doing exactly what you’d imagine.

test1["Test"] = "Works"

So basically it’s all good.

So, what type is this expression?

var test2 = [:]

And why does this not work?

test2["Test"] = "Doesn't work"

Let’s take a step back. What’s the problem? Well, [:] is an empty dictionary but gives us no clue what the type is. Remember, Swift dictionaries and arrays use generics, so the compiler only allows objects of a particular type to be added.

A good guess for the type would be Dictionary<AnyObject,AnyObject>. But a little fishing around tells you that’s not the case, because AnyObject is neither “Hashable” nor “Equatable” and keys need to be both.

The answer? test2 is an NSDictionary. That is, in this one circumstance, Swift extends outside its native dictionary type and decides to use a class found in Foundation.

Once you know that, it is clear that the second line should be:

test2.setValue("Does work now", forKey:"Test")

Maybe if you’re familiar with the guts of both Objective C and Swift this behaviour makes sense, but a language built-in returning a completely different type just because it can’t figure out the type feels broken to me.

In the end I think I’ve convinced myself that, while it might be convenient to allow this syntax, it’s a bad idea to saddle the language with these semantics so early on. In a few years when no one uses Objective C or when Swift is no longer fully tied to Cocoa, will this make sense?

I would prefer to see it being a compiler error with the correct approach being explicit with the type:

var test2:Dictionary<String,String> = [:]

Thoughts?

Swift Hate

I’m seeing a surprising amount of vitriol aimed at Swift, Apple’s new programming language for iOS and Mac development. I understand that there can be reasoned debate around the features (or lack thereof), the syntax and even the necessity of it, but there can be little doubt about the outcome: if you want to consider yourself an iOS developer, it’s a language that you will need to learn.

The only variable I can think of is when you learn it.

I think it’s perfectly reasonable to delay learning it as you have code to deliver now and because Swift is, well, very beta currently.

But asking what the point of Swift is isn’t constructive. Asking what problems can be solved exclusively by Swift makes no sense at all: you can do almost anything in most programming languages. Just because Intercal is Turing complete doesn’t mean that you’d want to use it for any real work. What varies between languages is what’s easy and what’s hard.

Objective-C undoubtedly makes some things easier than Swift. It’s a more dynamic language and its C foundations open up a lot of low-level optimisations that probably are not there in higher level languages.

But that same flexibility comes with a price: segmentation faults and memory leaks; pointers; easy-to-get-wrong switch statements; a lack of bounds checking. It also inherits a lot of ambiguity from the original C language specification, which makes certain automatic optimisations impossible.
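
For a concrete flavour of the “easy-to-get-wrong switch statement”, here’s a minimal sketch in plain C (values invented) of the classic fallthrough mistake, the kind of thing Swift’s switch simply doesn’t allow:

#include <stdio.h>

int main(void)
{
    int code = 1;

    switch (code) {
    case 1:
        printf("one\n");
        /* missing break: control falls straight through... */
    case 2:
        printf("two\n");   /* ...so this prints as well, almost certainly by accident */
        break;
    default:
        printf("other\n");
    }
    return 0;
}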

How many applications require low-level optimisations more than safety? (And that’s ignoring that the biggest optimisations are usually found in designing a better algorithm or architecture.) How often is it better to have five lines of code instead of one? Every line is a liability, something that can go wrong, something that needs to be understood, tested and maintained.

Whatever its failings, it’s already clear that Swift is more concise and safer than Objective C.

There are absolutely applications where Objective C, C or C++ would be a better choice than Swift. It’s the old 80-20 rule applied to a programming language. And, for those resistant to learning a new language, the 80% is not on “your” side.

Right now, some of this requires a bit of a leap of faith. Swift clearly isn’t finished. You can either learn it now and potentially have some say on what the “final” version looks like, or you can learn it afterwards and just have to accept what’s there. But, either way, you’ll probably be using it next year. Get used to it.

C++

Introduction

I don’t want to start off on the wrong foot again, but I’m afraid I might have to. If you read my discussion of the C programming language you may imagine that I’d like C++. After all, C++ fixes some of C’s idiosyncrasies, adds object orientation and a whole host of new features.

You’d be wrong though. In many ways I consider C++ to be a step backwards from its parent and this piece will hopefully explain why.

The big things in life

Identifying the main thing wrong with C++ is easy when you start making a list of features. I don’t mean a list trying to identify things it does badly, but a genuine feature list, stuff like object orientation, exceptions, strong-ish typing, multiple inheritance… Well I’ve only just started, but there’s a huge list.

And that is the problem. C++ has tried to incorporate just about every interesting software engineering development that has been made over the last twenty-five years. In some ways that’s a very good thing: it allows programmers to build code in the most appropriate way, whichever way that might be.

The problem is that there’s more than one way to skin any particular cat. Just about any approach is fine for a small program with a single developer, but when you have a team writing code with no consistency in approach, you get the situation where no-one is able to understand the whole. There is no one head big enough.

While There’s More Than One Way To Do It is a great motto for Perl, as a language it has a very different objective. Most Perl programs are ‘hacks,’ small programs designed to solve a particular problem. C++ is a hard-core software engineering language; large teams of developers are common. The same approach used for small programs just doesn’t work for bigger systems. I can build a thousand-line program at the keyboard, but a ten-million-line system? Anyone who thinks they can is deluding themselves. Even on the off-chance that they aren’t, other people need to understand it too. No-one is around forever and no-one is indispensable (except in the case of bad management, but that’s a different story).

Counter Arguments

People often cite C++’s similarity to C as a major plus. If you’ve already learned C, then C++ is easy, right? Just a few extra commands, use “class” instead of “struct” and you’re well away. Except some of the worst C++ code I’ve ever seen has come from people who think like that. Using “//” to start your comments rather than “/*” doesn’t make you a C++ programmer!

There are, however, some benefits for C programmers using C++ compilers. They tend to be less forgiving of bad code and often give better diagnostics and error messages. But so do Java and C#, only more so. And the jump from C to Java is probably easier than moving from C to C++.

Conclusion

If we think right back to the beginning of the development of programming languages, we remember that they were designed to simplify things; they were designed so that you could think about the problem rather than what the machine would do.

For the audience that they were aimed at, many of the earlier languages did just that. Fortran allowed scientists to write programs (in fact it’s still being used). Cobol put a greater focus on the business than had ever been the case.

And this is where C++ falls down. Its audience is software engineers, people who write very large and complex applications. Yet its complexity actually hinders development. With a large team, “write-only” code, programs that no-one can understand once they have been constructed, becomes not just possible but almost guaranteed. There are so many ways of doing the same thing, so many ways to shoot yourself in the foot, that the odds of it being both bug-free and maintainable are almost zero.

C++ does have its plus points, though. It is an excellent language to show how smart you are. If you can understand the entire language and write huge, complex and error-free programs in your sleep, you are clearly much more clever than I am.

Myself, I prefer to fight the problem rather than the development language.

Programming Languages Home

You tend to find two things relating to programming languages on the Internet. The first is a long list of all the languages, often with links to a representative web site. The second category is the “representative” web sites themselves, detailed descriptions of a particular language.

I hope to make this part of ZX81.org.uk the middle ground. I won’t cover every language ever designed, nor will I cover them in vast amounts of detail. Instead I’ll look at each language’s main features, advantages and flaws.

What makes me qualified to write these critiques? Well, I’ve used all of the languages covered so far on at least one non-trivial project and at university I did a lot of programming language design and compiler development courses.

The two languages I’ve covered so far are the major Unix stalwarts Perl and C.

C

Introduction

Talking about C is not easy. Almost all professional programmers have used it at some point and many have a strong attachment to it. I don’t want to start by saying that it’s a poor language, alienating much of my audience, but I figure I’m going to end up doing that anyway so I may as well get it out of the way at the beginning.

Compared to many languages that have come since, and even some that came before, C just isn’t a very good language.

There, I’ve said it.

History

Perhaps more than almost any other language before or since, C has a rich and famous history. It’s almost impossible to discuss it without also talking a little about the history of Unix.

In 1969, Ken Thompson finally got hold of a PDP-7 and decided to write an operating system for it. (It happened more often than you’d think back then.) That operating system was Unix.

A few years later, Dennis Ritchie designed and built a language based on B (itself based on a language called BCPL) and, creatively, called it C. Unlike its immediate ancestors, C had types and a number of other useful bits and pieces.

C was so useful that Unix was quickly rewritten in it. These days that sounds obvious, but until that point operating systems had always been written in assembler or machine code. This is an important part of computer history.

Utility

C became popular largely because it filled a very useful niche. At one extreme you have all the ‘real’ languages. At the time, real computer scientists would have used Algol68. Structured and clever, Algol was a great language but you couldn’t do anything very low level with it.

At the other extreme there is assembler which people had to resort to if they wanted to do anything close to the hardware. Assembler is only one step removed from machine code which makes writing reliable, bug-free code very difficult, especially when you’re building something as large and complex as an operating system.

C fits in the middle. Described by some as a high-level assembler, it allows you to do low-level coding, accessing particular memory addresses and the like, and to use high-level constructs such as functions and types.
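
To make the “high-level assembler” point concrete, here’s a small hedged sketch; the register address is invented for illustration (the sort of thing a device datasheet would give you), and on a normal hosted operating system the write would simply fault:

#include <stdint.h>

/* Hypothetical memory-mapped status register at a made-up address. */
#define STATUS_REG ((volatile uint32_t *) 0x40021000u)

/* A high-level construct (a typed function) wrapping a low-level poke
   at a specific memory address -- only meaningful on bare metal. */
void set_status_bit(uint32_t bit)
{
    *STATUS_REG |= (1u << bit);
}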

The following books and papers helped me learn to hate C.

“Practical C Programming” by Steve Oualline.

“Writing Solid Code” by Steve Maguire.

“Code Complete” by Steve McConnell.

Hello world. How many nasty ways can you write “Hello World” in C?
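
As one hedged example of the sort of thing that page collects, this is perfectly legal C and deliberately unpleasant:

#include <stdio.h>

int main(void)
{
    char *s = "Hello World\n";
    int i = 0;

    while (s[i])
        putchar(i++[s]);   /* i[s] is, by definition, the same as s[i] */

    return 0;
}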

Pointers

Key to C’s ability to mix high- and low-level constructs are pointers. Most ‘serious’ languages have some concept of them. Some call them references, some call them links, but they all, basically, refer to something that identifies a chunk of memory. Most other languages only use them when you have to, but pointers are C. No pointers, no language.

You want an array? That’s really a pointer. Pass by reference? A pointer. Strings? Ah, they’re arrays! (Which are pointers.)
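
A minimal sketch of that equivalence (the names are mine, purely for illustration):

#include <stdio.h>

/* "Pass by reference" in C is really passing a pointer. */
static void increment(int *value)
{
    (*value)++;
}

int main(void)
{
    int numbers[3] = { 1, 2, 3 };
    int *p = numbers;            /* an array name decays to a pointer */
    char *greeting = "hello";    /* a string is just a pointer to chars */

    increment(&numbers[0]);      /* pass an address, not a value */
    printf("%d %d %c\n", numbers[0], *(p + 1), *greeting);   /* prints: 2 2 h */
    return 0;
}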

The advantage of using pointers for just about everything in the entire language is mainly one-sided: it makes writing compilers easier. For the poor souls who actually end up using the language, all is not so rosy. As Steve McConnell puts it, “pointers are one of the most error-prone areas of modern programming” (Code Complete, section 11.9).

Some of the side-effects of using pointers are not immediately apparent, either. In most languages, arrays have bounds-checking (the ability for the language to raise an error if you try to access an element that doesn’t exist). But C doesn’t really have arrays; it has pointers and a little syntactic sugar that makes it look like it has arrays. Pointers don’t know how much memory is being pointed to, so you don’t get bounds-checking. If you’re lucky, your program causes a segmentation fault; if not, you might corrupt other data or, on some operating systems, your program.
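
Here’s a hedged sketch of the kind of thing the language happily accepts; a modern compiler might warn, but nothing stops it running, and what actually happens is anyone’s guess:

#include <stdio.h>

int main(void)
{
    int scores[4] = { 0, 0, 0, 0 };

    /* Writes one element past the end of the array. There is no runtime
       bounds check: this is undefined behaviour, so it might crash,
       quietly corrupt a neighbouring variable, or appear to work. */
    scores[4] = 99;

    printf("%d\n", scores[0]);
    return 0;
}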

It takes all types

Another one of C’s biggest problems is its typing. As most people will already know, C allows you to put numbers into character variables, integers into floating points and any number of other nasty combinations. To C they’re all valid, but what they do is not always well defined or consistent. Even the same compiler sometimes does different things depending on the level of optimisation in use, the phase of the moon, etc.
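
A small hedged sketch (values invented) of the sort of conversions a C compiler will accept without a murmur at its default warning level:

#include <stdio.h>

int main(void)
{
    int big = 300;
    double pi = 3.14159;
    long large = 123456789L;

    char c = big;      /* 300 doesn't fit in a char; the extra bits just vanish */
    int i = pi;        /* fractional part silently discarded: i == 3 */
    float f = large;   /* more digits than a float can hold exactly */

    printf("%d %d %f\n", c, i, f);
    return 0;
}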

The odd thing is that a weakly typed language doesn’t allow you to do any more than one with strong types does; it merely allows you to do the wrong thing more easily. For a language used for large-scale software engineering projects, the risk of poor code is just too great and weak typing should be outlawed.

I’m sure that people are going to mention the myriad of warnings that modern compilers are able to produce, or that ‘lint’ has been available for nearly as long as the language has been. My counter-argument: I don’t see why you should have to add extra tools or read through pages of warnings in order to correct deficiencies in the source language!

The good bits

If C was truly as appalling as I’ve made out so far, no-one would actually be using it. The main ‘win’ for C, as far as I can see, is that it is small, well defined and widely available.

All three merits are in many ways different sides of the same coin (if you can imagine a three sided coin). To make the language well defined, it helps if it’s small. If lots of people are to use it, it needs to be simple enough that they aren’t put off (Ada anyone? Thought not.). Small and well-defined make it easier to write compilers too, meaning that it’s available on everything from the lowliest PC right up to mainframes.

The theory also goes that your programs should recompile on this range of machines too, but that’s not as true as we’d all like to think. If it were true, we wouldn’t need Java or the hundreds of ‘#ifdefs’ scattered throughout so much C code. My code tends not to be that low-level, but I wouldn’t like to make any promises about its portability. However, that doesn’t make C’s wide availability useless. Even if your program isn’t cross-platform, the skills required are. A C programmer can quickly write code for just about any machine.

Summary

I’ve probably written more lines of C code than just about any other language, so when I say that I don’t like it I hope that you can see that I’m not being narrow-minded or prejudiced.

As I’ve mentioned above there are some things that I like about C, it’s just that there is so much to hate about it and, even at the time it was written, there wasn’t an excuse for it!

The language’s main features are its weak typing and over-use of pointers, both of which allow developers to make truly horrendous coding errors with ease.

If those were C’s only problems I might be able to forgive it, but they aren’t. C has more ways for experienced and novice programmers alike to hang themselves than any other language I can think of (with the possible exception of Intercal). Even if it didn’t have the ‘=’ and ‘==’ operators to confuse, there’s still the wonderful ‘?:’ and a whole host of spectacularly error-prone API calls (does all your code check the value that ‘malloc’ returns?).
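
Two of the classics, sketched briefly (names invented); both compile and both have bitten most C programmers at some point:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int errors = 5;

    /* Assignment where a comparison was intended: the condition evaluates
       to 0 (false), and 'errors' is quietly zeroed as a side-effect. */
    if (errors = 0)
        printf("no errors\n");

    /* malloc can return NULL; code that never checks works fine right up
       until the day memory runs out. */
    char *buffer = malloc(64);
    if (buffer != NULL) {
        strcpy(buffer, "checked this time");
        printf("%s (errors = %d)\n", buffer, errors);
        free(buffer);
    }
    return 0;
}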

No, C, as a language, is a dinosaur. It deserves to wither and die. If you write anything other than an operating system kernel and use C, switch to another language. You’ll be far more productive when you start battling the problem rather than the language.

A matter of style

Introduction

The open source community does a lot of things right, and the internals of a program are one of them. The people who write code do so because they are proud of what they do and want the respect of other people in the community. The beauty of the code is a very important part of winning that acceptance.

The same isn’t necessarily true in the commercial world. Time-scales are much more important than how the guts of a computer system look, and it’s generally not considered a good thing to be seen spending time making your code look pretty.

This column is not going to argue that style is more important than substance, but rather that an interest in style can bring a large increase in productivity.

Indentation

I know people that hate me for this, but something that really bugs me about some code is the indentation style. I can handle a poorly thought-out, hacked algorithm if there’s a good reason (time usually), but I can never see a good excuse for badly formatted code.

Initially I thought I was being anal, just getting very hung-up on a not-very-important detail, but then I started noticing things. Usually the people who produced the worst-formatted code also introduced the largest number of defects and, although they initially appeared to write code more quickly than me, finished late much more frequently.

Why does that matter?

There are two fundamental reasons why the ‘look’ of the code matters:

  1. Comprehension. How quickly can people understand the code?
  2. Structure. Is it obvious how the program is split into units? Is it obvious what the next statement to be executed is?

I guess they are quite similar, but I feel that it’s important to separate them out. Hopefully you’ll see why in a minute.

Let’s talk about comprehension first. Here’s a snippet of code:

if (x < foo (y, z))
   haha = bar[4] + 5;
 else
   {
     while (z)
       {
         haha += foo (z, z);
         z--;
       }
     return ++x + bar ();
   }

If you've read the GNU coding standards, you might recognise this as an example of good formatting. It isn't.

But there are some redeeming factors. Firstly they've used two spaces for each level of indentation and not a tab. Studies have shown that while people prefer (aesthetically speaking) tabs in programs, those same people actually understand the code more effectively with between two and four spaces. Also, the formatting of the mathematical expressions is clear, with good use of whitespace.

Now the bad. My main problem is with the use of the braces. The lesser crime is immediately after the if condition where they haven't used braces at all. That's bad because a novice programmer might add an extra line below the 'haha' assignment. It probably won't be a valid program any more, but it looks okay at first glance. Since maintenance is the largest part of the software life-cycle, anything that makes updating code more difficult is bad news.
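
In the snippet above the trailing 'else' would at least turn such an edit into a compile error; without one the code compiles and silently does the wrong thing, which is arguably worse. A hedged sketch (names invented):

#include <stdio.h>

int main(void)
{
    int temperature = 10;

    /* A well-meaning extra line added to the braceless if... */
    if (temperature > 30)
        printf("warning: hot\n");
        printf("shutting down\n");   /* indented like the if, but always runs */

    return 0;
}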

Structure

The indentation of the second half is dreadful, though. Why have two levels of indentation to indicate one block of code? Again, there are studies that show that formatting like this actually reduces the readability of the code.

My preferred method of formatting the same code would be:

if (x < foo (y, z)) {
   haha = bar[4] + 5;
}
else {
   while (z) {
     haha += foo (z, z);
     z--;
   }
   x++;
   return x + bar ();
}

The main difference is the formatting, but I've also altered the 'return' statement at the end. Side-effects (like using '++x' in an expression) seriously reduce readability as it's much more complex to figure out what it's trying to evaluate. And the solution only takes an extra line...

But why is this a better way to format the code? Simple: it shows the structure of the code more effectively. Since there are only three levels of indentation, your brain doesn't have to work as hard to figure out where, for example, the end of the "else" block is. (It's not so easy to see the advantage with ten lines of code. Try to imagine a page full of code with a large number of indentation levels.)

Some languages appear to be more tricky than the C example here, too. Where does the exception block in Java go? Should the procedure definitions in a package header be indented in PL/SQL?

If you don't really understand block-structure in modern programming languages and are just trying to follow the example of fellow developers, these are probably difficult questions. They're not supposed to be.

A computer will understand any syntactically valid program, but not necessarily in the same way that you or I would. This makes it vitally important that everyone involved has a common understanding of how the program is supposed to work. The first steps to doing this are structuring the code sensibly and making it easy for others to comprehend.

The last important comment is about timing: none of this takes any longer than building your code with poor formatting and structure. Why? Well, if you understand how it's structured, it'll be more likely to work the first time. And if it doesn't, you should be able to find the bug more quickly because your code is easier to comprehend.

Summary

To summarise, the neatest, best-formatted code takes no longer to write and is much easier to maintain than code that has bad 'style' (although it's also much less likely to need fixing).

So the next time that you come across some dreadful, untidy code, try to make the person that wrote it start again. If they don't understand how their own code is structured, neither will anyone else.

You'll note that I keep mentioning studies but don't reference the source. That's because I've read all about them in Code Complete and not from the horse's mouth. I recommend you also read it for the full story.