Tag Archives: computer

Blocks, both technical and mental

Blocking content from the Internet is getting a lot of press of late. The last couple of weeks have seen the Pirate Bay being blocked by a number of large ISPs and debate over whether the blocking of “adult” content should be opt-in or opt-out.

Unfortunately the enthusiasm to “protect the children” and “protect the copyright holders” seems to have pushed aside much of the debate of whether we should be doing this at all or whether it’s practical.

Whether we should be doing it or not is political. I have my opinions[1] but what I want to concentrate on here is whether or not blocking such content is actually possible.

There are a number of different ways of vetting content. They’re not necessarily mutually exclusive, but they’re all deeply flawed.

First, a common claim from politicians is that the Internet is just like TV and cinema:

Perry said that she has been accused of censorship over the campaign, but argued that the internet was no different to TV and radio and should be regulated accordingly.

No, no it isn’t. There are a handful of TV channels, even taking cable and satellite into account, and a relatively small number of movies released every week. It’s practical to rate movies. TV programmes are distributed centrally, so pressure can be placed on a small number of UK-based commercial entities when they do naughty things.

The Internet is very different. Firstly, counting the number of web pages is rather harder. This is what Wikipedia has to say:

As of March 2009, the indexable web contains at least 25.21 billion pages.[79] On July 25, 2008, Google software engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs.

Note that even the smaller number is from three years ago. I’d bet that it’s not smaller now. Clearly the same system of rating and regulation isn’t going to work on that scale. And even if it were possible to rate each of these sites, the UK government has little leverage over foreign websites.

There are basically three ways to automate the process: white list, black list and keyword scanning.

A white list says “you can visit these websites.” Even assuming no new websites are ever added and no new content is ever created, rating those 25 billion pages is not practical. I don’t think we want an official approved reading list.
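To put a rough number on “not practical,” here is a back-of-envelope sketch. The page count is the Wikipedia figure quoted above; the ten seconds per page and the length of a working year are purely my assumptions, picked to be generous to the raters.

```python
# Back-of-envelope estimate. Every rate below is an assumption for illustration.
PAGES = 25.21e9                          # indexable pages, per the figure quoted above
SECONDS_PER_PAGE = 10                    # assumed time for one person to rate one page
WORK_SECONDS_PER_YEAR = 8 * 3600 * 250   # assumed 8-hour days, 250 working days a year


def reviewer_years(pages: float, seconds_per_page: float) -> float:
    """Person-years needed to rate every page exactly once."""
    return pages * seconds_per_page / WORK_SECONDS_PER_YEAR


if __name__ == "__main__":
    years = reviewer_years(PAGES, SECONDS_PER_PAGE)
    print(f"{years:,.0f} reviewer-years")
```

Under those assumptions the answer comes out at roughly 35,000 person-years, and that is for a single pass over a web that never changes.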

A black list is the opposite: “you can visit anything except these pages.” We have the same scale problem as with white lists and a few more. Much of the Internet is “user contributed” and it’s not hard to create new sites. If my site is blocked, I can create a new one with the same content very, very quickly. Basically, there’s just no way to keep on top of new content.
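The two kinds of list can be sketched in a few lines. This is a minimal illustration, not how any real ISP filter works: the host names are placeholders and the function names are made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical lists. The host names are placeholders, not real rulings.
WHITE_LIST = {"approved.example"}
BLACK_LIST = {"banned.example"}


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL ('' if there is none)."""
    return (urlsplit(url).hostname or "").lower()


def allowed_by_white_list(url: str) -> bool:
    # White list: only listed hosts may be visited.
    return host_of(url) in WHITE_LIST


def allowed_by_black_list(url: str) -> bool:
    # Black list: everything may be visited except listed hosts.
    return host_of(url) not in BLACK_LIST


if __name__ == "__main__":
    for url in ("http://banned.example/page",
                "http://banned-mirror.example/page",
                "http://brand-new-blog.example/"):
        print(url, allowed_by_white_list(url), allowed_by_black_list(url))
```

The mirror of the banned site passes the black list because nobody has listed it yet, while the brand-new blog fails the white list for exactly the same reason.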

Keyword scanning is exactly as it sounds. Your internet traffic is scanned and if certain keywords are spotted, the page is blocked. It’s automated and dynamic, but what keywords do you look for? “Sex”? Well, do you want to block “sex education” websites? “Porn”? That would block anti-porn discussion as well as the real thing.

The scanners can be a lot more sophisticated than this but the fundamental problem remains: there’s no way to be sure that they are blocking the correct content. Both good and bad sites are blocked, and still with no guarantee that nothing untoward gets through.
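Here is a deliberately naive keyword scanner to show both failure modes. It is a sketch under my own assumptions (whole-word matching, the two keywords mentioned above, invented sample pages), not a description of any real filtering product.

```python
import re

# The two keywords discussed above. A real filter would have a far longer list.
BLOCKED_KEYWORDS = {"sex", "porn"}


def is_blocked(page_text: str) -> bool:
    """Block the page if any whole word in it is a blocked keyword."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    return any(word in BLOCKED_KEYWORDS for word in words)


# Invented sample pages for illustration.
SAMPLE_PAGES = {
    "sex education": "A guide to sex education for secondary schools",
    "anti-porn campaign": "Why our campaign group opposes porn",
    "evasive spelling": "Adult content here: p0rn and s3x, free",
}

if __name__ == "__main__":
    for label, text in SAMPLE_PAGES.items():
        print(label, "->", "blocked" if is_blocked(text) else "allowed")
```

The two harmless pages are blocked and the one that swaps a couple of letters for digits gets through. Adding more keywords or spelling variants moves the line, but it doesn’t remove either kind of mistake.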

In all cases, if children can still access “adult” content with relative ease — both deliberately and accidentally — what’s the point?

Of course I’m not in favour of taking content without paying for it or exposing children to inappropriate material. But, to use a cliche, the genie is out of the bottle. As with the reaction to WikiLeaks, there is little point in pretending that nothing has changed or that the same techniques and tools can still be used to fight these problems.

Instead, if you’re a publisher you need to make your content legally available and easier to access than the alternative. iTunes has shown that people are willing to pay. So far, you’ve mostly shown that you’d rather treat paying customers as criminals. That’s not helping.

As for protecting children, it all comes back to being a responsible parent. Put the computer in the living room. Talk to them. Sure, use white or black lists or filtering, just be aware that it can never be 100% effective and that not everyone has children that need protecting. Whatever the Daily Mail and your technically unaware MP say, you can’t just say “the connection is being checked, problem solved.”

  1. I’m basically anti-censorship and in favour of personal responsibility. There are already laws covering the distribution of obscene materials, so why should there be restrictions on legal materials?

Spectrum

You’ve probably seen that it’s the Sinclair Spectrum’s thirtieth birthday today. There are lots of great retrospectives — this is probably the best single source I read — so I’m not going to rehash all that. But I thought it might be worth a few words of my own.

Like many Brits my age, the Spectrum was my first computer. Technically it was the family computer, but after a few weeks I was pretty much the only one who used it.

I remember some of the first few hours using it. I remember, for example, ignoring most of the preamble in the manual and diving straight into typing in a programming example. Those who have used a Spectrum will realise that you can’t just type in code; you need to use special keyboard combinations to get the keywords to appear on screen. I didn’t know about that.

After a while I managed to persuade it to let me type in the code. The computer didn’t really understand it and I didn’t know why. I can’t remember whether I found the right section in the manual before having to go to bed but even in my confusion I knew that I was hooked.

And really that was the start of my career in IT. I started out really wanting to play games but I ended up spending more and more time programming and less and less loading Bomb Jack. Usually I saw something neat in a program and thought “How do they do that?” Then I’d try to figure out what they’d done. I was quite proud of making some text slowly fade into view and then gradually disappear again. Obviously that was after the obligatory:

10 PRINT "Stephen was here!"
20 GO TO 10

I remember all that surprisingly well. Which makes the following line all the more shocking:

It may have been startlingly modern once, but at 30, the Sinclair Spectrum is as close in time to the world’s first commercial computers of the 50s as it is to the latest iPad.

The first commercial computers were created in the early fifties. The first computers (at least computers you’d kind of recognise as computers) were only built a few years before that. Computers are so powerful and connected these days that it’s difficult for me to remember what I even did with them back then. I wonder what we’ll make of the iPad in a few decades? I’m sure it’ll look just as dated.

The last thing I wanted to mention was about that weird, unique program entry system. In short, each key had a number of different keywords printed on it. The J key, for example, had LOAD (used to load a program from tape), VAL, VAL$ and the minus sign. When you entered code, the editor would be in one of a number of modes and, in addition to shift, there was a key called “Symbol Shift.”

I’ve never seen a sensible explanation for this. They all seem to say it was “easier” or a cue for users so they’d know all the valid keywords to Sinclair Basic. I never bought this. Is it really easier to remember a bunch of non-standard keyboard shortcuts rather than just type? Don’t think so.

And then when I was at university I did a course on compilers, the software that is used to turn human readable code into the binary gibberish that a computer can actually run.

The interesting bit was all around the grammars and recursive descent parsers and the mental challenge of writing a program that wrote other programs. But the first step of the process is called lexical analysis, which takes the jumble of letters that is your program and converts it into “words” that can be processed more easily, so PRINT is a keyword, 123.4 is a number and “Hello” is some text.
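For anyone who hasn’t met one, a lexer can be sketched in a few lines of a modern language. This is a minimal illustration in Python, nothing like what would fit in a Spectrum ROM: the tiny keyword subset, the token names and the function name are all my own choices for the example.

```python
import re

# One alternative per kind of "word". Order matters: keywords are tried before names.
TOKEN_RE = re.compile(r'''
    (?P<STRING>"[^"]*")                     # text in double quotes
  | (?P<NUMBER>\d+(?:\.\d+)?)               # 10 or 123.4
  | (?P<KEYWORD>(?:GO\s+TO|PRINT|LET)\b)    # tiny illustrative subset of BASIC keywords
  | (?P<NAME>[A-Za-z][A-Za-z0-9$]*)         # variable names such as a or a$
  | (?P<SYMBOL>[-+*/=<>(),;:])              # operators and punctuation
  | (?P<SPACE>\s+)                          # whitespace, thrown away below
''', re.VERBOSE)


def lex(line: str) -> list[tuple[str, str]]:
    """Split one line of BASIC-like text into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(lex('10 PRINT "Hello"'))
    print(lex("20 GO TO 10"))
```

Even this toy version needs a table of patterns and a scanning loop, all of which takes up space.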

Given the resource limitations of writing a whole operating system and BASIC interpreter in 16k, my guess is that it was easier to write a strange keyboard entry system than a lexer.

Can anyone comment on the accuracy of this guess?

But back to nostalgia. From hazy memories, to university, to wild speculation and the iPad. We’ve come a long way. But it was the Sinclair Spectrum that started it all for me. Thank you Clive Sinclair!

My delicious.com bookmarks for November 23rd through November 30th

  • The BBC Micro turns 30 – Pretty much every Brit around my age will remember the Model B. It felt so… professional after using the Sinclair Spectrum!
  • Thanksgiving Is Un-American – Socialism and illegal immigration… Why Thanksgiving is un-American.
  • Coders are creatives too: Where’s our love? – "How did a person whose greatest educational achievement is crayoning without going over the lines get termed 'a creative', when the people who built our world are dismissed as geeks and bottom feeders?"

My delicious.com bookmarks for November 16th through November 22nd

  • Coders are creatives too: Where’s our love? – "How did a person whose greatest educational achievement is crayoning without going over the lines get termed 'a creative', when the people who built our world are dismissed as geeks and bottom feeders?"
  • Happy 40th birthday, Intel 4004! – In a way this started the whole microcomputer… I hate to say "revolution" but I can't think of a better word.
  • Steve Jobs: The parable of the stones – "It's the disease of thinking that a really great idea is 90% of the work. And if you just tell all these other people 'here's this great idea,' then of course they can go off and make it happen. And the problem with that is that there's just a tremendous amount of craftsmanship in between a great idea and a great product."

My delicious.com bookmarks for October 23rd through October 27th