Tuesday, April 13, 2010

Witch doctors should be available on the NHS

One of my relations wrote to his MP opposing the MP's position on funding homeopathy through the NHS. Here's the interesting bit of the MP's reply:

Thanks for your e-mail. There are many people who consider that homeopathy is beneficial to them, and would thus disagree with both the Committee's conclusions and the view you express. In the grand scheme of the billions spent by the NHS, the cost of homeopathy is small - and if people sense that homeopathy is helping them get better, then that is sufficient reason why I think the present arrangements should continue.

It's instructive to reread this email with homeopathy replaced by witch doctors.

Thanks for your e-mail. There are many people who consider that witch doctors are beneficial to them, and would thus disagree with both the Committee's conclusions and the view you express. In the grand scheme of the billions spent by the NHS, the cost of witch doctors is small - and if people sense that witch doctors are helping them get better, then that is sufficient reason why I think the present arrangements should continue.


Tuesday, April 06, 2010

It's the thought that counts

After I finally replaced the old HP Procurve 420 Access Point at the office with an Airport Extreme, HP came up with a solution to my problem. They decided to send me, free of charge, a brand new access point (since the 420 had been end-of-lifed).

This was very kind of them and now, sitting on my desk, I have a brand new HP Procurve MSM310 Access Point. It came with all the trimmings: two antennae, a power adapter and a serious steel wall mounting bracket. Compared to the Apple device it looks seriously industrial.

Weirdly, the four parts (the access point, the power adapter, the antennae and the wall bracket) came in four separate packages, the funniest of which contained just the two small antennae. Good business for DHL I suppose.

Now, I don't know if this device actually fixes the original Bonjour problem I was having, and I'm unlikely to find out. Despite the London address, HP sent a power adapter with a US plug.

Ah well, it's the thought that counts.


Thursday, March 25, 2010

Goodbye HP Procurve Access Point 420

Over a year ago I discovered a bug in an HP ProCurve Wireless Access Point 420 that we were using in our office. After being treated badly by HP I finally got support from them by blogging my frustration and ending up on the front page of Google search results for procurve support.

Eventually, weeks later, HP acknowledged the problem with the device. But this story doesn't have a happy ending.

In November 2009 HP informed me that the product was being end-of-lifed.

Out of curiosity I called HP Procurve Support and asked them about the status of my case and they couldn't even look up my case number. Searching around with my name they did manage to find me and my case was active and open. The latest update was on March 19, 2010 and the case had been escalated to Level 3 Support.

My previous experience with HP support wasn't good, but this time Derek was great. He tracked down the ancient case, updated me with information and updated my contact information. An excellent experience. Unfortunately, waiting over a year for this to be fixed had become intolerable. (If anyone from HP is reading this, email me and I'll tell you Derek's email address since he deserves a special mention.)

I replaced it today with an Airport Extreme. Configuration was a breeze with Apple's Airport Utility. And, here's a little-known fact, the Airport Extreme can act as a layer 2 bridge, which means it can successfully extend our existing network without doing NAT and messing up our Bonjour packets (which were the source of the original HP bug).

And, joy of joys, it can perform WPA2 Enterprise authentication against our RADIUS server.

It's nice that HP is working to track down this bug, but 13 months is a little too long to wait for a fix. Sorry.


Sunday, February 14, 2010

A bad workman blames his tools

One of the most depressing things about being a programmer is the realization that your time is not entirely spent creating new and exciting programs, but is actually spent eliminating all the problems that you yourself introduced.

This process is called debugging. And on a daily basis every programmer must face the fact that as they write code, they write bugs. And when they find that their code doesn't work, they have to go looking for the problems they created for themselves.

To deal with this problem the computer industry has built up an enormous amount of scar tissue around programs to make sure that they do work. Programmers use continuous integration, unit tests, assertions, static code analysis, memory checkers and debuggers to help prevent and help find bugs. But bugs remain and must be eliminated by human reasoning.

Some programming languages, such as C, are particularly susceptible to certain types of bugs that appear and disappear at random. These are sometimes called heisenbugs because as soon as you go searching for them they vanish.

These bugs can appear in any programming language (and especially when writing multi-threaded code where small changes in timing can uncover or cover race conditions). But in C there's another problem: memory corruption.

Whatever the cause of a bug, the key steps in finding and eliminating it are:

  1. Find the smallest possible test case that tickles the bug. The aim is to find the smallest and fastest way to reproduce the bug reliably. With heisenbugs this can be hard, but even a fast way to reproduce it some percentage of the time is valuable.

  2. Automate that test case. It's best if the test case can be automated so that it can be run again and again. This also means that the test case can become part of your program's test suite once the bug is eliminated. This'll stop it coming back.

  3. Debug until you find the root cause. The root cause is vital. Unless you fully understand why the bug occurred you can't be sure that you've actually fixed it. It's very easy to get fooled with heisenbugs into thinking that you've eliminated them, when all you've done is covered them up.

  4. Fix it and verify using #2.
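Steps 1 and 2 can be as simple as a short script that runs the program over and over and reports how often it fails. Here's a minimal Python sketch of that idea. The failure_rate helper and the stand-in command are hypothetical, and it assumes that a non-zero exit code means the bug fired.

```python
import subprocess
import sys


def failure_rate(cmd, runs=20):
    """Run cmd repeatedly and return the fraction of runs that exit non-zero."""
    failures = 0
    for _ in range(runs):
        # Assumption: a non-zero exit code means the bug was triggered.
        if subprocess.run(cmd).returncode != 0:
            failures += 1
    return failures / runs


if __name__ == "__main__":
    # Stand-in for the real program under test: a child process that always fails.
    always_fails = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(failure_rate(always_fails, runs=3))
```

Even for a heisenbug that only fires some of the time, a number like this tells you whether a change made the bug rarer, more common, or (apparently) gone.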

Yesterday, a post appeared on Hacker News entitled When you see a heisenbug in C, suspect your compiler’s optimizer. This is, simply put, appalling advice.

The compiler you are using is likely used by thousands or hundreds of thousands of people. Your code is likely used by you. Which is more likely to have been shaken out and stabilized?

In fact, it's a sign of a very poor or inexperienced programmer if their first thought on encountering a bug is to blame someone else. It's tempting to blame the compiler, the library, or the operating system. But the best programmers are those who control their ego and are able to face the fact that it's likely their fault.

Of course, bugs in other people's code do exist. There's no doubt that libraries are faulty, operating systems do weird things and compilers do generate odd code. But most of the time, it's you, the programmer's fault. And that applies even if the bug appears to be really weird.

Debugging is often a case of banging your head against your own code repeating to yourself all of the impossible things that can't ever happen in your code until one of those impossible things turns out to be possible and you've got the bug.

The linked article contains an example of exactly what not to conclude:

“OK, set your optimizer to -O0,”, I told Jay, “and test. If it fails to segfault, you have an optimizer bug. Walk the optimization level upwards until the bug reproduces, then back off one.”

All you know from changing optimization levels is that optimization changes whether the bug appears or not. That doesn't tell you the optimizer is wrong. You haven't found the root cause of your bug.

Since optimizers perform all sorts of code rearrangement and speed-ups, changing optimizer levels is very likely to change the presence or absence of a heisenbug. That doesn't make it the optimizer's fault; it's still almost certainly yours.

Here's a concrete example of a simple C program that contains a bug that appears and disappears when optimization level is changed, and exhibits other odd behavior. First, here's the program:

#include <stdlib.h>
#include <unistd.h>

int a()
{
    int ar[16];

    ar[20] = (getpid()%19==0);
}

int main( int argc, char * argv[] )
{
    int rc[16];

    rc[0] = 0;

    a();

    return rc[0];
}

Build this with gcc under Mac OS X with the following simple Makefile (I saved it in a file called odd.c):


odd: odd.o

And here's a simple test program to run it 21 times and print the return code each time:


#!/bin/bash
for i in {0..20}; do
    ./odd ; echo -n "$? "
done

If you run that test program you'd expect a string of zeroes, because rc[0] is never set to anything other than zero in the program. Yet here's sample output:

$ ./test
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

If you are an experienced C programmer you'll see how I made that 1 appear (and why it appears at different places), but let's try to debug with a quick printf:

rc[0] = 0;

printf( "[%d]", rc[0] );

a();

Now when you run the test program the bug is gone:

$ ./test
[0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0
[0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0

Weird, so you move the printf:

rc[0] = 0;

a();

printf( "[%d]", rc[0] );

and get the same odd result of a disappearing bug. And the same thing happens if you turn the optimizer on even without the printfs (this is the opposite of the situation in the linked article):

$ make CFLAGS=-O3
gcc -O3 -c -o odd.o odd.c
gcc odd.o -o odd
$ ./test
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This all came about because the function a() allocates a 16 integer array called ar and then promptly writes past the end of it either 1 or 0 depending on whether the PID of the process is divisible by 19 or not. It ends up writing on top of rc[0] because of the arrangement of the stack.

Adding printfs or changing optimization level changes the layout of the code and causes the bad write to not hit rc[0]. But beware! The bug hasn't gone, it's just writing on some other bit of memory.

Because C programs are susceptible to this sort of error it's vital that good tools are used to check for problems. For example, the static code checker splint and the memory analyzer valgrind help eliminate tons of nasty C bugs. And you should build your software with the maximum warning level (I prefer warn-as-error) and eliminate every warning.

Only once you've done all that should you start to suspect someone else's code. And even when you do, you need to follow the same steps to reproduce the bug and get to the root cause. Most of the time, unfortunately, bugs are your fault.


Wednesday, January 13, 2010

Utter crap reporting from The Daily Telegraph

Here's The Daily Telegraph on the Avatar depression story. The reporting is crapola.

On one website, Avatar Forums, the topic "Ways to cope with the depression of the dream of Pandora being intangible" has more than 1,000 posts.

Ivar Hill, a 17-year-old fan from Sweden, wrote on a similar site: “When I woke up this morning after watching Avatar for the first time yesterday, the world seemed grey. It was like my whole life, everything I’ve done and worked for, lost its meaning … It just seems so meaningless. I still don’t really see any reason to keep doing things at all. I live in a dying world.”

As reported before there are far fewer than 1,000 posts in that thread. Not sure why the subs changed 'gray' to 'grey'. That's a copy/paste quote from someone. Surely a (sic) would be more appropriate.

Also, when they say "a similar site", they actually mean "another thread".

But the real mistake is here:

Stacy Kaiser, a psychotherapist, said obsession with the film was masking more serious problems in the fans' lives. "They’re seeing Avatar, they're lonely people, a lot of them don’t have a lot going on in their lives right now," she said. "The movie opened up a portal for them to express their depression.”

Dear Heidi Blake (author of the Telegraph story), that quote is from Jo Piazza from CNN.com (she used to be a gossip writer for the New York Daily News) who wrote the original story that got this ball rolling. When you were watching the report on CNN you managed to write down the wrong name. It was the GOSSIP JOURNALIST and not the psychotherapist who said that.

So basically this entire Daily Telegraph story is a rewrite of a CNN.com story followed by a misquote from a CNN TV story. Wow! It actually involved zero reporting.


Monday, January 11, 2010

CNN.com jumps the shark by writing a story about a forum containing 129 people

So CNN.com has a stunningly ridiculous story called Audiences experience 'Avatar' blues. Now, I'm not saying being depressed is ridiculous, that's a very serious issue, but the CNN.com story is rubbish.

Here's the critical paragraph:

On the fan forum site "Avatar Forums," a topic thread entitled "Ways to cope with the depression of the dream of Pandora being intangible," has received more than 1,000 posts from people experiencing depression and fans trying to help them cope. The topic became so popular last month that forum administrator Philippe Baghdassarian had to create a second thread so people could continue to post their confused feelings about the movie.

So, dig into that forum and you'll find three interesting things:

1. There are actually only 576 messages (not more than 1,000). Hey, what's an error of 40% between friends? I realize that 576 doesn't sound as impressive as 1,000.

2. 129 different people (or at least registered users) have posted to that thread. Hmm. I realize that 129 people in some forum discussing a topic doesn't sound as impressive as 1,000 posts.

3. 60% of the posts were made by just 16 of the 129 people. In fact, 50% are the work of 10 people. Also, 72 of the 129 people posted only once. I realize that 16 people actively discussing a topic doesn't sound as good as 1,000 posts.

So a small number of people discussing feeling depressed after seeing Avatar is enough for a front-page CNN.com story!?

Now, how many of the 129 people (or the 16 if you prefer) were already feeling depressed before they saw the movie?

UPDATE: I've updated the figures (see comments).

UPDATE: The story made it to CNN on TV. Most interesting part is where the story's author says that she talked to the people in the thread and they were "lonely to begin with" and they are "lonely people". They "don't have a lot going on in their lives right now" and that the movie "didn't create depression".


Friday, January 08, 2010

How I got 50,000 page views by simply being me

Today on Hacker News there's a post about a user who 'engineered' two postings that resulted in 60,000 page views. That post annoyed me because the last thing people like is to know that they've been manipulated.

The other day I blogged about being a geek with an Ikea train set and for some reason that post really captured the imagination of a certain part of the Internet.

I hadn't expected that post to be so popular, and I certainly didn't tailor it to any community. It was just me being me.

But within hours it was on the top of Hacker News, on the front page of Reddit and Wired, and being tweeted widely.

I try to make my blog genuine: if you follow it then you'll be getting a raw feed of me, and that could cover all sorts of topics. On the other hand there are many blogs that pander (to varying degrees) to different communities. Part of the reason I follow very few RSS feeds is that much blog writing is vapid self-promotion.


Tuesday, January 05, 2010

What ever were Southern Railway thinking?

Wow, just wow. I got this flyer in the post over Christmas which uses the stereotype of a Mexican with poor English to advertise cheap train service on Southern Railway.

Here's the outside:

And here's the inside:

Apparently I'm meant to be attracted by "I spend less on ticket so I spend more in los sales". It took me forever to realize that 'los sales' was meant to be 'the sales' and not 'the salts'. Also, not to be pedantic or anything, but there's an accent on 'Adiós'.

Here's an idea, Southern Railway... next year why don't you feature a black man speaking pidgin English?


Monday, December 21, 2009

The bipolar world

An article on a web site I'd not previously seen called American Thinker says the following about me:

(There are also efforts by true believers to justify the code. Try following the logic of the post in that last link.)

So, I'm a climate change 'true believer' am I? You mean because I blogged something that doesn't agree with your interpretation of the facts I must be from the other side?

Well, guess what. I don't believe in this bipolar world of yours where you're with us or against us, pro-choice or pro-life, or, frankly, any of the other ridiculous black or white notions beloved of people who get involved in politics (of any kind).

If I were a 'true believer', pray tell, why would I have analyzed raw data from the Met Office and found an error in it, gone on TV in the UK and criticized the quality of code taken from CRU, and blogged about all the errors in it?

My take on global warming is... unless you can demonstrate to me that it's false I'm going to believe the scientists who've been working on it. Pretty much the same way I do about any other bit of science. That's how science works, unlike politics.


Thursday, December 17, 2009

Data Visualization Disease

A few days ago I moaned about an inaccurate and uninterpretable visualization appearing in a book touting its own excellence at visualization. Now, I'm pointed to a visualization of the recently released Met Office land surface temperature record that makes similar mistakes.

Folks, data visualization isn't about pretty colours, or slapping some data into a CSV and asking Excel to make you a line graph. It's about thinking about how the data needs to be interpreted and then creating an appropriate visualization. Many of the 'infoporn' graphics that adorn the blogs and magazines of the digerati (a pejorative term) are little more than the fantasies of a graphic designer sprinkled with some magical 'data' or 'statistics' pixie dust.

But these designers shouldn't be messing around with magic like that. They aren't trained to handle it, Hermione.

Here's the first graph from the blog. It appears to show that it's 10C hotter now than in the 1800s. Holy cow, Batman, the Earth's on fire!

It's all wrong.

All they've done is averaged the temperature readings from across the globe to try to get a sense of global warming. Averages are fun because any fool can calculate them, but pity the fool who averages without thinking. Some questions:

1. Did they ask themselves about the distribution of temperature readings across the globe to ensure that the average correctly reflected the entire Earth's surface? For example, are there lots of thermometers clustered close to each other that might bias the average?

2. Did they ponder the fact that there's much more land in the northern hemisphere, hence many more readings, hence without weighting the average is dominated by northern climes?

3. Did they ask themselves if an average is what you want? Is it reasonable to take the temperature in London in December and the temperature in Sydney in December and average them? Given that it's winter up north and summer down south what does an average tell you?

4. Did they ever ask themselves why the standard deviation is so freakin' huge (see the 2008 numbers in the graph above)?

No, they made a CSV file and graphed it. And since they get some 'warming' out of it they are happy.

This is what I call Data Visualization Disease. You grab some data, you think of a fancy (or not so fancy) way to show it. You shade it in pastel colours you picked by wandering around Habitat, label it in a sans-serif font, and you're a God of visualization.

What they should have done is taken the thermometer readings, calculated a long term average for each location, calculated the difference between each reading and the average (to understand how much temperature has changed, not the absolute values), mapped those onto a grid laid across the Earth's surface, averaged (perhaps with variance adjustment) values from all the thermometers in each grid square to get a grid anomaly value, then produced a weighted average for the hemispheres based on weighting by the cosine of the latitude (since the grid box area varies with latitude) to get hemisphere averages.

Then they could have plotted that.
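To make that recipe concrete, here's a minimal Python sketch of the anomaly, gridding and cosine-weighting steps. It's an illustration under stated assumptions, not the Met Office's method: the 5 degree grid size, the function names and the use of each station's whole record as its long-term average are all choices made for this sketch, and the variance adjustment is left out.

```python
import math
from collections import defaultdict

GRID_DEG = 5.0  # grid box size in degrees; an assumption for this sketch


def station_anomalies(readings):
    """Subtract the station's own long-term mean from each of its readings."""
    baseline = sum(readings) / len(readings)
    return [r - baseline for r in readings]


def weighted_mean_anomaly(stations, t):
    """stations is a list of (lat, lon, readings); t indexes into readings.

    Returns the cos(latitude)-weighted mean of the per-grid-box anomalies.
    """
    boxes = defaultdict(list)
    for lat, lon, readings in stations:
        box = (math.floor(lat / GRID_DEG), math.floor(lon / GRID_DEG))
        boxes[box].append(station_anomalies(readings)[t])

    total = 0.0
    weight_sum = 0.0
    for (row, _col), values in boxes.items():
        box_mean = sum(values) / len(values)      # one value per grid box
        centre_lat = (row + 0.5) * GRID_DEG       # latitude of the box centre
        weight = math.cos(math.radians(centre_lat))
        total += weight * box_mean
        weight_sum += weight
    return total / weight_sum
```

Calling weighted_mean_anomaly once with the northern stations and once with the southern ones would give the two hemisphere figures. Note how two thermometers sitting in the same grid box count once, not twice, which is exactly what a naive average gets wrong.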

But there's no infoporn in doing that, that sounds like actual work, and worse, thinking. Phew! No, thanks. Pass the Crayola.

Update: since writing this rant I've seen that the blog I'm criticizing has listened to the complaints of people who pointed out similar problems.


Friday, December 04, 2009

Facebook's creepy privacy

Yesterday I received an email from Facebook that I assumed was some sort of scam. In fact, it was totally genuine and I received it because someone I know is using Facebook to promote their business.

Here's the email:

I know three of those people, but the three people in the red box are unknown to me. And the three people I know are not people I am friends with on Facebook. I'm guessing that what happened here is that these people have my email address in their address book. Since I'm a fairly public person it wouldn't be a surprise if they've emailed me at some time in the past, and Facebook has looked in their system to see who had my email address imported and used it to target me.

Now when I first used Facebook I used their system of uploading information from my Google Mail address book to find friends. Little did I know (and I would not have expected) that Facebook would retain that information after I'd used it for the purpose that I gave it to them for, and later use it to tell other people about me.

Digging into the Facebook Privacy document I found the following (I added emphasis):

To make Suggestions. We use your profile information, the addresses you import through our contact importers, and other relevant information, to help you connect with your friends, including making suggestions to you and other users that you connect with on Facebook. If you want to limit your visibility in suggestions we make to other people, you can adjust your search visibility privacy setting, as you will only be visible in our suggestions to the extent you choose to be visible in public search listings. You may also block specific individual users from being suggested to you and you from being suggested to them.

I don't think this is acceptable; it's creepy.

When I imported my address book I assumed it was being done just to help me find people at that moment. I did not assume they were going to store this information for future use and then use it to target other people.


Thursday, December 03, 2009

Whoops. There's a third bug in that code.

So, I'm sitting on the bus this morning executing CRU's IDL code in my head when I suddenly realized that there's another more subtle bug in the exact same code I was looking at the other day.

Here's the critical loop once more:
 for i=0.0,nel do begin
x=x<179.9 & x=x>(-179.9)
y=y>(-89.9) & y=y<89.9
; avoids a bug in IDL that throws out an occasional
; plot error in virtual window
if error_value ne 0 then begin


So, it's plotting those little 32-sided polygons on a flat map of the world and it's making the adjustment to the size so that when the polygon is near the top (or bottom) of the world it gets larger to correctly cover the required area.

But what happens if it plots a polygon near the 'edge of the world'? For example, what happens if it plots a polygon at 85 degrees of latitude and 170 degrees of longitude?

First, here's a picture of a polygon plotted at 85 degrees of latitude but well away from the 'edge of the world'.

Now look at the same polygon at 170 degrees of longitude. See the problem? It doesn't wrap around to the other side. Oops. Since the world is a sphere you'd expect the polygon to reappear on the left hand side of this picture showing the area of influence of the meteorological station being plotted.

So some information is lost for data being plotted near the 180 degrees line. Admittedly, that's in the middle of the Pacific Ocean (although it does cut through some land mass). But if there are any ocean temperature measurements at the 'edge of the world' then bits of their data aren't being taken into account.
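The difference between what the code does (pin the x values to the edge of the map) and what a sphere needs (wrap them around) can be shown in a few lines. This is a Python sketch, not IDL; the function names are made up for illustration, and clamp_longitude is only meant to mimic the x=x<179.9 & x=x>(-179.9) line, where IDL's < and > are its minimum and maximum operators.

```python
def clamp_longitude(x):
    """Mimic x=x<179.9 & x=x>(-179.9): pin the value to the map edge."""
    return max(min(x, 179.9), -179.9)


def wrap_longitude(x):
    """Map any longitude into [-180, 180) so it reappears on the other side."""
    return (x + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    for lon in (170.0, 185.0, -185.0):
        print(lon, clamp_longitude(lon), wrap_longitude(lon))
```

Wrapping the coordinates is only half of a fix: a polygon that straddles the 180 degrees line would presumably also have to be split into two pieces before being filled, one on each side of the map.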

I wonder what, if any, impact these three bugs have on the output of this program.

PS. There's actually a fourth problem with this code. The number 110.0. It's being used to convert from kilometres to degrees of longitude and latitude. The same number is used for both even though the Earth isn't a perfect sphere.

The code is using a value of 39,600 km for the circumference of the Earth, whereas the mean value is actually 40,041 km. But, hey, what's an error of 1% between friends?
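The arithmetic behind those two figures is quick to check. A small Python sketch, using only the 110.0 constant from the code and the 40,041 km mean value quoted above (the variable names are just for illustration):

```python
KM_PER_DEGREE = 110.0          # constant used in the code
MEAN_CIRCUMFERENCE_KM = 40041  # mean value quoted above

# 360 degrees at 110 km each gives the circumference the code implies
implied_circumference = KM_PER_DEGREE * 360
shortfall = MEAN_CIRCUMFERENCE_KM - implied_circumference
relative_error = shortfall / MEAN_CIRCUMFERENCE_KM

if __name__ == "__main__":
    print(implied_circumference, shortfall, round(relative_error * 100, 1))
```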


Tuesday, December 01, 2009

We should probably feel sorry for Ian 'Harry' Harris at CRU

Reading through the code and then through his HARRY_READ_ME.TXT you can see a man up against something that was slightly outside his ability. I don't mean that in a nasty way; what was needed was a professional programmer and not a professional scientist.

In the midst of the file we find the following plaintive exclamations:
Something is very poorly. It's my programming skills, isn't it.

So, once again I don't understand statistics. Quel surprise, given that 
I haven't had any training in stats in my entire life, unless you count
A-level maths.

and.. yup, my awful programming strikes again.

So, good news - but only in the sense that I've found the error. 
Bad news in that it's a further confirmation that my abilities are
short of what's required here.


Monday, November 30, 2009

Bugs in the software flash the message 'Something's out there'

The more I look at the software used by the folks at CRU, the more I think: "these guys seriously need to hire a professional programmer." The code is mostly an undocumented, untested tangled mess of little programs. Ugh.

Oh, and it's buggy.

My old colleague Francis Turner found a lovely example of something that's the work of either a genius or a fool (or perhaps a mad genius). To calculate information about the influence of one weather station on another (which requires working out how far apart they are by the great circle route between them) the code draws little coloured circles (actually little 32-sided polygons) on a virtual white screen and then goes looking for non-white pixels to identify areas for which data is missing.

Here's a snippet (full source):
 for i=0.0,nel do begin
x=x<179.9 & x=x>(-179.9)
y=y>(-89.9) & y=y<89.9
; avoids a bug in IDL that throws out an occasional
; plot error in virtual window
if error_value ne 0 then begin


The first bug appears to be in IDL itself. Sometimes the polyfill function will throw an error. This error is caught by the catch part and enters the little if there.

Inside the if there's a bug: the line i=i+1. This is adding 1 to the loop counter i whenever there's an error. This means that when an error occurs one set of data is not plotted (because the polyfill failed) and then another one is skipped because of the i=i+1.

Given the presence of two bugs in that code (one which was known about and ignored), I wonder how much other crud there is in the code.

To test that I was right about the bug I wrote a simple IDL program in IDL Workbench. Here's a screen shot of the (overly commented!) code and output. It should have output 100, 102, 103 but the bug caused it to skip 102.
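The same skipping behaviour can be mimicked outside IDL. Here's a minimal Python sketch: a while loop stands in for the IDL for loop (so the counter can be changed from inside the body), and a set of 'failing' values stands in for polyfill throwing an error. The function name is hypothetical, and the choice of 101 as the failing value is an assumption based on the expected output of 100, 102, 103.

```python
def plotted_values(values, failing, buggy):
    """Walk values the way the IDL loop does and return those that get plotted.

    failing holds the values for which the (simulated) polyfill raises an error.
    """
    plotted = []
    i = 0
    while i < len(values):
        if values[i] in failing:
            # Error handler: nothing is plotted for this value.
            if buggy:
                i += 1  # the stray i=i+1 skips the next value as well
        else:
            plotted.append(values[i])
        i += 1  # the loop's own increment
    return plotted


if __name__ == "__main__":
    print(plotted_values([100, 101, 102, 103], {101}, buggy=True))
    print(plotted_values([100, 101, 102, 103], {101}, buggy=False))
```

With the extra increment, a failure on 101 also costs you 102; without it, only the value that actually failed is lost.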

Also, and this is a really small thing, the code error_value=0 is not necessary because the catch resets the error_value.

BTW Does anyone know if these guys use source code management tools? Looking through the released code I don't see any reference to SCM.


If, like me, you are trying to actually read and understand the code you might like to know the following:

1. The value 110.0 there is, I believe, the number of km in a degree of longitude at the equator (40,000km of circumference / 360 = 111.1). It is used to convert the 'decay' distance in xkm to degrees.

2. The (1.0/cos(!pi*pts1(i,0)/180.0)) is used to deal with the fact that km 'go further' in terms of degrees longitude when you are not on the equator. This value elongates the polygon so that it 'correctly' covers the stations.

3. The entire thing is plotted on a 144 x 72 display because there are 2.5 degrees in each square grid and so 360 degrees of longitude / 2.5 = 144 and 180 degrees of latitude / 2.5 = 72.
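Those three points boil down to a little arithmetic, sketched here in Python rather than IDL. The km_to_degrees helper is made up for illustration; the 110.0 constant, the cosine factor and the 2.5 degree grid are the ones described above.

```python
import math

KM_PER_DEGREE = 110.0  # the constant from the code
GRID_DEG = 2.5         # size of one grid square in degrees


def km_to_degrees(xkm, lat_deg):
    """Return (degrees of latitude, degrees of longitude) spanned by xkm at lat_deg."""
    dlat = xkm / KM_PER_DEGREE
    # Same form as (1.0/cos(!pi*pts1(i,0)/180.0)): stretch longitude away from the equator
    dlon = dlat * (1.0 / math.cos(math.pi * lat_deg / 180.0))
    return dlat, dlon


columns = int(360 / GRID_DEG)  # squares across the map
rows = int(180 / GRID_DEG)     # squares up the map

if __name__ == "__main__":
    print(km_to_degrees(1100.0, 0.0), km_to_degrees(1100.0, 60.0), columns, rows)
```

At 60 degrees of latitude the same distance covers roughly twice as many degrees of longitude as at the equator, which is why the polygons get wider towards the poles.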

In the HARRY_READ_ME.TXT file there's commentary about a problem where the sum of squares (which should always be positive) suddenly goes negative. Here's what it says:
17. Inserted debug statements into anomdtb.f90, discovered that
a sum-of-squared variable is becoming very, very negative! Key
output from the debug statements:

OpEn= 16.00, OpTotSq= 4142182.00, OpTot= 7126.00
DataA val = 561, OpTotSq= 2976589.00
DataA val = 49920, OpTotSq=-1799984256.00
DataA val = 547, OpTotSq=-1799684992.00
DataA val = 672, OpTotSq=-1799233408.00
DataA val = 710, OpTotSq=-1798729344.00
DataA val = 211, OpTotSq=-1798684800.00
DataA val = 403, OpTotSq=-1798522368.00
OpEn= 16.00, OpTotSq=-1798522368.00, OpTot=56946.00
forrtl: error (75): floating point exception
IOT trap (core dumped)

..so the data value is unbfeasibly large, but why does the
sum-of-squares parameter OpTotSq go negative?!!

Probable answer: the high value is pushing beyond the single-
precision default for Fortran reals?

The 'probable answer' is actually incorrect. If you examine the code you'll find that DataA is declared as integer type, which is a signed 4-byte number in most Fortran compilers. That means its maximum positive value is 2147483647, but the program is doing 49920-squared, which is 2492006400 (344522753 bigger than the maximum value).

When this happens the number wraps around and goes negative. It then gets added to OpTotSq (which is a real) and the entire thing goes negative.
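The wrap-around can be reproduced outside Fortran. This Python sketch emulates a signed 32-bit integer and a single-precision real (Python's own integers don't overflow, so the helpers below are stand-ins made up for illustration). It assumes the square is computed as an integer and then added to a running total of 2976589.00, the value of OpTotSq in the log just before the bad data value.

```python
import struct


def wrap_int32(n):
    """Reduce n to the signed 32-bit range, as two's-complement overflow would."""
    return (n + 2**31) % 2**32 - 2**31


def to_float32(x):
    """Round x to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


square = wrap_int32(49920 * 49920)            # the integer square overflows
total = to_float32(to_float32(2976589.0) + square)

if __name__ == "__main__":
    print(square, total)
```

Under those assumptions the total comes out at -1799984256.0, the same figure the debug output shows for OpTotSq, which fits integer overflow in the squaring rather than a problem with single-precision reals.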

Here's the code:
    do XAYear = 1, NAYear
if (DataA(XAYear,XMonth,XAStn).NE.DataMissVal) then
end if
end do

Oddly, the solution given is to change the incoming data file to eliminate the value that causes the algorithm to go wrong. i.e. the actual algorithm was not fixed.

Action: value replaced with -9999 and file renamed:


The 'very artificial correction' flap looks like much ado about nothing to me

The other day I posted that some code from the CRU Hack that people were shouting about was actually dead code. It was subsequently pointed out to me that there was another version in which the code was not dead.

So, I figured I'd have a look at it and see what was going on and try to read the scientific papers behind this. I've come to the conclusion "move along, there's nothing to see here".

Firstly, here's a couple of segments of the code that does the 'artificial correction':
yrloc=[1400,findgen(19)*5.+1904]
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,$
2.5,2.6,2.6,2.6,2.6,2.6]*0.75 ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'

Since I don't have the datasets that underlie this code I can't actually execute it, but I can follow it. Here's a line-by-line explanation of what's happening:

yrloc=[1400,findgen(19)*5.+1904]: this creates a list of numbers which are stored in yrloc. The list starts with the number 1400 and then consists of 1904, 1909, 1914, 1919, etc. all the way up to 1994. findgen(19) creates the list 0, 1, 2, up to 18. The *5. multiplies these numbers by 5 to obtain the list 0, 5, 10, up to 90. The +1904 adds 1904 to them to get the list 1904, 1909, 1914 up to 1994.

So the final value of yrloc is [1400, 1904, 1909, 1914, 1919, 1924, 1929, 1934, 1939, 1944, 1949, 1954, 1959, 1964, 1969, 1974, 1979, 1984, 1989, 1994]. The square brackets around the numbers make it into a list (which is called a vector in this computer language).

valadj=[0., 0., 0., 0., 0., -0.1, -0.25, -0.3 ,0. ,-0.1 ,0.3 ,0.8 , 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]*0.75 creates a list called valadj where each element of the list in square brackets is multiplied by 0.75 (that's the *0.75 bit).

So the final value stored in valadj is [0, 0, 0, 0, 0, -0.075, -0.1875, -0.225, 0, -0.075, 0.225, 0.6, 0.9, 1.275, 1.875, 1.95, 1.95, 1.95, 1.95, 1.95]

if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!' This is checking that yrloc and valadj have the same number of elements (i.e. the two lists are the same length). This is important because the rest of the code relies on lining these two lists up and treating them as pairs.
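For readers who don't speak IDL, here's a rough Python equivalent of those three lines. It's a sketch of the same arithmetic, not a translation anyone at CRU wrote; the list comprehension stands in for findgen(19) and the ValueError stands in for IDL's message.

```python
# yrloc: 1400 followed by 1904, 1909, ... 1994 (19 values, 5 years apart)
yrloc = [1400.0] + [i * 5.0 + 1904 for i in range(19)]

# valadj: the raw adjustments, each scaled by 0.75
raw = [0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
       0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]
valadj = [v * 0.75 for v in raw]

# The rest of the code treats the two lists as pairs, so they must line up.
if len(yrloc) != len(valadj):
    raise ValueError("Oooops!")

pairs = list(zip(yrloc, valadj))
print(pairs[0], pairs[-1])
```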

For example, the 1400 in yrloc corresponds to the first 0 in valadj and 1994 in yrloc corresponds to the final 1.95 in valadj. Here's a simple plot of the data with yrloc on the horizontal (x-axis) and valadj on the vertical (y-axis):

Now it gets a little bit more complicated.

yearlyadj=interpol(valadj,yrloc,x). The interpol function is doing a linear interpolation of the data in valadj and yrloc and it uses that to look up the adjustment necessary for the years that are mentioned in x. You'll have to go read the original code to see where x comes from but basically there's code to read maximum latewood density (MXD) values from a file and transform that into two lists: x (which contains the years) and densall. As above these have the same number of entries and there's a correspondence between the entries similar to the yrloc and valadj explained above.

What the interpol function is doing is saying: given the linear interpolation of the datapoints we know about (the 'artificial correction'), make a list of the corrections necessary for the years in x. It works like this; here's a graph of the linear interpolation from 1400 to 1994.

Now, suppose that x contains data for 1653. To find the adjustment necessary the program essentially looks at the graph above, finds 1653 on the x-axis and then the corresponding value on the y-axis (which, in this case, is 0 since there's no adjustment between 1400 and 1919). Or if it needs a value for 1992 it finds 1992 on the x-axis and finds the corresponding point on the y-axis, which will be between the datapoints for 1989 and 1994 along the line that joins them; in this case it will find an adjustment of 1.95 since the graph is flat at that point. Final example: suppose x contains data for the year 1965. Follow the x-axis to 1965 and then up to the line between the datapoints at 1964 and 1969. Here the line is a slope and the value at the intersection is 1.395.

So the program ends up with yearlyadj filled with the adjustments necessary for the data in densall corresponding to the years in x. Finally, the program makes the adjustment with the line densall=densall+yearlyadj. On a year-by-year basis the values in yearlyadj are added to the values in densall. The first value of yearlyadj is added to the first value of densall, the second value of yearlyadj is added to the second value of densall and so on.
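The look-up-and-add step can be sketched in Python like this. The interpolate function is my own stand-in for IDL's interpol and only covers years inside the range of yrloc (I haven't tried to reproduce whatever interpol does outside it). The densall values are placeholders, since I don't have the MXD data.

```python
from bisect import bisect_right

yrloc = [1400] + [1904 + 5 * i for i in range(19)]
valadj = [v * 0.75 for v in [0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                             0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]]

def interpolate(year):
    """Linearly interpolate valadj at `year`; only valid inside yrloc's range."""
    if not yrloc[0] <= year <= yrloc[-1]:
        raise ValueError("year outside the range covered by yrloc")
    # Index of the first known year strictly greater than `year`
    # (clamped so that the final year uses the last segment).
    i = min(bisect_right(yrloc, year), len(yrloc) - 1)
    x0, x1 = yrloc[i - 1], yrloc[i]
    y0, y1 = valadj[i - 1], valadj[i]
    return y0 + (y1 - y0) * (year - x0) / (x1 - x0)

x = [1653, 1965, 1992]        # the three example years from the text
densall = [1.0, 1.0, 1.0]     # placeholder values, NOT real MXD data

yearlyadj = [interpolate(year) for year in x]
densall = [d + adj for d, adj in zip(densall, yearlyadj)]

print(yearlyadj)
print(densall)
```

The three adjustments come out as described above: nothing for 1653, a point on the slope for 1965 and the flat top for 1992.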

Now this program is dated September 1998 and is by Ian Harris at CRU. And if we look there's a paper by Harris (with others) published in Philosophical Transactions of the Royal Society B. The paper is Trees tell of past climates: but are they speaking less clearly today?. The entire paper is about how the record in tree-rings doesn't match up in recent decades with actual temperatures, but it did in the past.

There's a nice picture of this in the paper:

Ignore the dotted line. The thin line shows the average summer temperatures, the thick line the predicted temperature from the maximum latewood density. They are going along fine until around 1930 when the actual temperature starts to exceed the predicted temperature and the gap gets (mostly) greater and greater. Real divergence occurs in the mid-1960s.

Now pop back to my first graph. There's a rough correspondence between the correction being made and the gap in CRU's graph. It's certainly not perfect but it looks to me like what's happening here is a scientist trying to make two datasets correspond to each other starting with a rough estimate of what it would take to make them work correctly.

And the paper in fact cautions that such divergence is a cause for concern about relying on the historical tree-ring record until we understand why there is such divergence. To quote the paper:
Long-term alteration in the response of tree growth to climate forcing must, at least to some extent, negate the underlying assumption of uniformitarianism which underlies the use of twentieth century-derived tree growth climate equations for retrodiction of earlier climates.

And the conclusion calls for understanding what's happening here.

So, if there's a smoking gun it was published in Philosophical Transactions of the Royal Society B where the authors point out clearly their concerns about the data and the need to understand why. In doing so they are bound to want to remove the divergence by understanding what happens. A first stab at that is this 'artificial correction'.

In later years, they went looking for the divergence using principal component analysis (this can be seen in this file and this file). Their code tells you what it's doing:
; Reads in site-by-site MXD and temperature series in
; 5 yr blocks, all correctly normalised etc. Rotated PCA
; is performed to obtain the 'decline' signal!

So, they appear to have replaced their hacked-up adjustment above with actual analysis to try to understand what part of the MXD data is caused by some unknown 'decline' causing the difference. The 1998 paper speculates on what might be causing this difference, but the PCA is done just to find it statistically without knowing why.

So, given that I'm a total climate change newbie and haven't been involved in what looks like a great deal of political back and forth I'm going to take this with a grain of salt and say this looks OK to me and not like a conspiracy.

But all the talk of not releasing data and hiding from FOIA requests does make me uneasy. Science is better when other scientists can reproduce what you are doing, just as I once did.


Friday, November 27, 2009

How to fail at data visualization

Today the BBC News made me aware of an upcoming book and associated blog about data visualization. Flicking through the blog I was instantly repelled by the horrible data visualizations presented as something to be proud of.

For example, consider the following snippet:

It purports to show you which countries are the most dangerous to fly to by charting the 'density' of fatal aircraft accidents by destination country. This visualization is wrong on so many levels that it's hard not to laugh:

1. The term 'density' is not defined and in fact he isn't showing a density at all, just the raw total numbers by country.

2. The underlying data is incorrect. The diagram specifically says that it uses 'fatal accidents' drawn from this database. Unfortunately, the person doing the visualization has used the total number of 'incidents' (fatal and non-fatal accidents). For example, he gives a value of 75 for the number of fatal accidents in Ecuador, whereas the database gives 38. The same applies for all the other countries.

3. He uses circles with dots in the middle to represent the size. So is the value being plotted proportional to the radius of these circles or the area? Not clear. For example, try to compare Russia and Canada; they look about the same. Now look at Russia and India, India looks smaller to me. So what's the truth?

Using his incorrect figures we have Russia with 626 accidents, India with 456 and Canada with 452. So Russia is a 1.38x more dangerous destination than Canada. Can you spot that from the diagram?
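To put numbers on the radius-or-area question, here's a quick sketch using the Russia and Canada figures quoted above. The two readings of the circles give quite different visual ratios, which is exactly why the encoding needs to be stated.

```python
from math import sqrt

russia, canada = 626, 452   # the (incorrect) figures used in the visualization

value_ratio = russia / canada

# If the value is encoded as the circle's AREA, the radii differ by only
# the square root of the ratio.
radius_ratio_if_area = sqrt(value_ratio)

# If the value is encoded as the circle's RADIUS, the areas differ by the
# square of the ratio.
area_ratio_if_radius = value_ratio ** 2

print(round(value_ratio, 2), round(radius_ratio_if_area, 2), round(area_ratio_if_radius, 2))
```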

4. Take a look at Europe. Can you figure out which countries he's indicating in the diagram? It's almost impossible.

5. The US comes out as the top country in terms of fatal accidents with 2613 (actually, it's 1088) but that fails to take into account the one important thing: how many flights are there, or even how many people actually fly? The US could be the most dangerous if there are few flights and lots are fatal, but it could even be the safest if there are lots and lots of flights. What you actually want to answer is 'what's the probability of me dying if I fly to the US?' One way to calculate that would be total number of flights with a fatality / total number of flights; another way would be total number of fatalities / total number of passengers. Either way just knowing the total number of crashes doesn't tell you much.

So the diagram charts the wrong statistics, uses the wrong underlying data, and then presents it in a way that's hard to interpret. And this is what gets a book published?


Monday, November 09, 2009

Parsing HTML in Python with BeautifulSoup

I got into a spat with Eric Raymond the other day about some code he's written called ForgePlucker. I took a look at the source code and posted saying it looks like a total hack job by a poor programmer.

Raymond replied by posting a blog entry in which he called me a poor fool and snotty kid.

So far so good. However, he hadn't actually fixed the problems I was talking about (and which I still think are the work of a poor programmer). This morning I checked and he's removed two offending lines that I was talking about and done some code rearrangement. The function that had caught my eye initially was one to parse data from an HTML table which he does with this code:

def walk_table(text):
    "Parse out the rows of an HTML table."
    rows = []
    while True:
        oldtext = text
        # First, strip out all attributes for easier parsing
        text = re.sub('<TR[^>]+>', '<TR>', text, re.I)
        text = re.sub('<TD[^>]+>', '<TD>', text, re.I)
        # Case-smash all the relevant HTML tags, we won't be keeping them.
        text = text.replace("</table>", "</TABLE>")
        text = text.replace("<td>", "<TD>").replace("</td>", "</TD>")
        text = text.replace("<tr>", "<TR>").replace("</tr>", "</TR>")
        text = text.replace("<br>", "<BR>")
        # Yes, Berlios generated \r<BR> sequences with no \n
        text = text.replace("\r<BR>", "\r\n")
        # And Berlios generated doubled </TD>s
        # (This sort of thing is why a structural parse will fail)
        text = text.replace("</TD></TD>", "</TD>")
        # Now that the HTML table structure is canonicalized, parse it.
        if text == oldtext:
            break
    end = text.find("</TABLE>")
    if end > -1:
        text = text[:end]
    while True:
        m = re.search(r"<TR>\w*", text)
        if not m:
            break
        start_row = m.end(0)
        end_row = start_row + text[start_row:].find("</TR>")
        rowtxt = text[start_row:end_row]
        rowtxt = rowtxt.strip()
        if rowtxt:
            rowtxt = rowtxt[4:-5]  # Strip off <TD> and </TD>
            rows.append(re.split(r"</TD>\s*<TD>", rowtxt))
        text = text[end_row+5:]
    return rows

The problem with writing code like that is maintenance. It's got all sorts of little assumptions and special cases. Notice how it can't cope with a mixed case <TD> tag? Or how there's a special case for handling a doubled </TD>?

A much better approach is to use an HTML parser that knows all about the foibles of real HTML in the real world (Raymond's main argument in his blog posting is that you can't rely on the HTML structure to give you semantic information---I actually agree with that, but don't agree that throwing the baby out with the bath water is the right approach). If you use such an HTML parser you eliminate all the hassles you had maintaining regular expressions for all sorts of weird HTML situations, dealing with case, dealing with HTML attributes.

Here's the equivalent function written using the BeautifulSoup parser:

def walk_table2(text):
    "Parse out the rows of an HTML table."
    soup = BeautifulSoup(text)
    return [ [ col.renderContents() for col in row.findAll('td') ]
             for row in soup.find('table').findAll('tr') ]

In Raymond's code above he includes a little jab at this style saying:

# And Berlios generated doubled </TD>s
# (This sort of thing is why a structural parse will fail)
text = text.replace("</TD></TD>", "</TD>")

But that doesn't actually stand up to scrutiny. Try it and see. BeautifulSoup handles the extra </TD> without any special cases.
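If you don't have BeautifulSoup to hand, the same point can be illustrated with the html.parser module from Python's standard library. To be clear, this is a different parser from the one I used above, the class and function names (TableRows, walk_table3) are made up for this sketch, and it collects only the text of each cell rather than the inner markup that renderContents returns.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text chunks of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names and hands attributes over separately,
        # so mixed case and attributes need no special handling here.
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        # A doubled </td> arrives here with no open cell and is simply ignored.

def walk_table3(text):
    "Parse out the rows of an HTML table using the standard library parser."
    parser = TableRows()
    parser.feed(text)
    parser.close()
    return parser.rows

# Mixed-case tags, attributes and a doubled </TD> in one small sample.
sample = '<table><Tr class="a"><tD>one</TD></TD><td align="left">two</td></tr></table>'
rows = walk_table3(sample)
print(rows)
```

There are no special cases for case, attributes or the stray closing tag; the parser deals with all three.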

Bottom line: parsing HTML is hard, don't make it harder on yourself by deciding to do it yourself.

Disclaimer: I am not an experienced Python programmer, there could be a nicer way to write my walk_table2 function above, although I think it's pretty clear what it's doing.


Thursday, October 22, 2009

Nerd is the new normal

When I was writing The Geek Atlas there was a big debate about the title. My original title included the word nerd, but O'Reilly quickly overruled it. In the end we went with a title that O'Reilly themselves suggested: The Geek Atlas.

And then, just the other day, a US TV station did a report called "Nerd is the new normal".


Friday, October 09, 2009

The DisplayLink USB video adapter just works

The other day I wrote a post about the hell of Apple video adapters. Out of the blue I received an email telling me about DisplayLink. The company offered to send me a review unit to see if it would help with my problem.

I accepted and received one a few days later. Here it is.

There are three components: a standard USB to mini-USB cable; the DisplayLink box itself (top right), which has two connectors, mini-USB on one side and a DVI connector on the other; and a DVI to VGA converter (top left).

The adapter works by simply plugging it into the USB port of a computer. After installing the appropriate driver (Windows or Mac OS X) you can then connect a display or projector to the adapter and it appears as if it were directly connected to your machine.

For example, I connected the adapter to a MacBook in our conference room which is normally connected directly to our Epson projector. Instead I attached the connector to the DisplayLink. The Epson appeared in the list of displays detected by Mac OS X (the only poor part of the experience was downloading the correct driver which was OS version specific).

I then tried it on a MacPro and on a MacBook Air. Worked like a charm. Then I switched to a laptop with Windows Vista installed. Once again, it worked perfectly. The video was crisp and smooth.

I'm not a gamer so I didn't try it with fast moving graphics (although I did watch both YouTube and iPlayer streaming video).

The DisplayLink adapter is available through a variety of OEMs; you can't get it directly from the manufacturer, so I can't tell you the cost of the model I was using. But, as an example, the Diamond BVU195 HD USB Display Adapter costs about $80.

The only problems I see with it are: it got very hot with extended operation, and it only partially solves my problem. Now I have to ensure that all the machines that I might connect to the projector have the right software on them. But, at least, I don't have to get into trouble when someone steals the one Apple video adapter I need.

Overall, for our needs (office applications) the adapter is a nice solution. We'll probably be sticking with it.


Friday, October 02, 2009

The hell of Apple video connectors

At my day job I made the fateful decision to go with an all Apple shop. Everyone has an Apple MacPro maxed out with dual monitors (we don't buy Apple monitors or RAM because of the excessive price). (Some of the team have switched their machines to run Windows XP or Windows Vista and one team member uses Ubuntu; the only person on PC hardware is the CEO).

For mobile users we have a range of laptops including the MacBook Air and different generations of MacBook Pro and one MacBook. We even have an Xserve as our main server.

But there's a price to pay... when connecting laptops to the projector in the meeting room we are in adapter hell.

To solve this problem I have one of each adapter with little stickers on them. One end of each of these is a standard 'VGA' style connector for the projector.

The red is for MacBook Air, because it has a micro DVI connector.

The green is for the MacBook, because it has a mini DVI connector.

The yellow is for the recent MacBook Pros, because they have a mini DisplayPort connector.

The blue is for the old style MacBook Pros, because they have an Apple DVI connector.

Oddly all these machines manage to have the same MagSafe power connector, but I'm in video connector hell. When is someone going to do a wireless video standard?


Friday, September 18, 2009

Yet more rubbish UI design on a telephone

What is it about telephones that inspires such awful design? Is it any wonder that the iPhone is a success? It must have been like shooting fish in a barrel to design a phone that works (if you happen to be Apple).

My desk phone is a Cisco IP Phone 7960 Series which is apparently "designed to meet the communication needs of professional workers in enclosed office environments--employees who experience a high amount of phone traffic in the course of a business day".

Right now I've got voicemail. This is shown by a big red thing glowing on the handset and a message that says "You Have VoiceMail" on the display, and a flashing envelope symbol.

Right next to the flashing envelope is a soft button. Since it's pointing to the envelope, doesn't it seem like that should take me to voicemail? Well, it did to me, at least, but nope, it's the equivalent of pressing the "PickUp" button and asks me what number I want to dial.

And then, why, when I get into voicemail, do I have to use the number keys to navigate and not all the nice soft keys on the phone? For crying out loud.


Tuesday, August 25, 2009

How to trick Apple Numbers into colouring individual bars on a bar chart

Suppose you have a bar chart like this:

and it's made from a data table like this:

And you are really proud of the sales of the XP1000 and want to change the colour of its bar to red. In Apple Numbers you can't do that because the bar colour is based on the data series.

But you can fool Apple Numbers by creating two data series like this:

Then choose a Stacking Bar chart after selecting the two series of data in the data table and you'll get a chart like this:

You can change the colour of any of the series by clicking on the Fill button on the toolbar. And you can extend that beyond two series to colour the individual bars as needed.


Tuesday, August 18, 2009

The Gay Agenda

One of the adverse effects of my Alan Turing Petition is that some commentators see it as part of a 'gay agenda'. Here's a comment from someone:

This is sad. Turing is being used by sectors of the UK gay lobby as a political wedge to bash an already-weak Labour government. Despite the good intentions, we all know that the newsbite would be "Brown apologises to gay war hero". That is wrong on so many levels.

I've seen other similar quotes implying that what's behind the petition is a 'gay agenda' or some local government trying to be PC. All there is behind this 'campaign' (that's the newspapers' word, not mine) is one person: me.

Last night while talking on BBC Radio Manchester the interviewer asked me a question about why Alan Turing isn't better known in the gay community. It was then that I had to admit (since I wasn't planning to talk about sexuality) that I'm not gay.

That probably comes as a shock to some people who don't understand that my petition isn't motivated by a hidden agenda. I think Alan Turing's treatment was appalling. I think we lost a great, great man when he died at 41 who had much more to contribute and I think that Britain has not adequately recognized this great man, or the manner of his decline. If he hadn't died so young we probably would have knighted him and celebrated his genius.

I do not expect that the British Government will apologize. They are damned if they do because they'll really need to apologize to all the other men prosecuted for gross indecency and then there's probably a list of other nasty things in the past that people could ask for an apology for.

But if they do want to honour Alan Turing (and others who were prosecuted), then I suggest that they fund Bletchley Park and The National Museum of Computing in his and their honour.


Wednesday, August 05, 2009

Unmarked surveillance vehicles in Central London

I see these vehicles all the time. Today two were parked near my office (the drivers appeared to be having lunch and a chat):

However, they give me the creeps for two reasons: they are totally unmarked and they are doing automatic number plate recognition.

Since they are unmarked it's impossible to tell who's controlling them. Are they police vehicles looking for the evil-doers, or are they from the local council looking for people breaking parking laws? No idea (I think it's the latter).

ANPR creeps me out because it's part of the larger surveillance state that's evident since I returned to the UK.

The four cameras on the vehicles are marked with "PIPS Technology / A Federal Signal Company" and that appears to refer to these guys. As a geek I find the ability to spot and read multiple number plates while traveling at a relative velocity of 155 MPH very neat.

I'd just rather you didn't do that. Or if you do that please tell me why and where the data is going.


Just give me a simple CPU and a few I/O ports

Back when I started programming computers came with circuit diagrams and listings of their firmware. The early machines I used like the Sharp MZ-80K, the BBC Micro Model B, the Apple ][ and so on had limited instruction sets and an 'operating system' that was simple enough to comprehend if you understood assembly language. In fact, you really wanted to understand assembly language to get the most out of these machines.

Later I started doing embedded programming. I wrote a TCP/IP stack that ran on an embedded processor inside a network adapter card. Again it was possible to understand everything that was happening in that piece of hardware.

But along the way Moore's Law overtook me. The unending doubling in speed and capacity of machines means that my ability to understand the operation of the computers around me (including my phone) has long since been surpassed. There is simply too much going on.

And it's a personal tragedy. As computers have increased in complexity my enjoyment of them has plummeted. Since I can no longer understand the computer I am forced to spend my days in the lonely struggle against an implacable and yet deterministic foe: another man's APIs.

The worst thing about APIs is that you know that someone else created them, so your struggle to get the computer to do something is futile. This is made worse by closed source software where you are forced to rely on documentation.

Of course, back in my rose tinted past someone else had made the machine and the BIOS, but they'd been good enough to tell you exactly how it worked and it was small enough to comprehend.

I was reminded of all this reading the description of the Apollo Guidance Computer. The AGC had the equivalent of just over 67KB of operating system in ROM and just over 4KB of RAM. And that was enough to put 12 men on the moon.

Even more interesting is how individuals were able to write the software for it: "Don was responsible for the LM P60's (Lunar Descent), while I was responsible for the LM P40's (which were) all other LM powered flight". Two men were able to write all that code and understand its operation.

12 men went to the moon using an understandable computer, and I sit before an unfathomable machine.

Luckily, there are fun bits of hardware still around. My next projects are going to use the Arduino.


Monday, August 03, 2009

Please don't use pie charts

I don't like pie charts. I don't like them because they fail to convey information. They do that because people have a really hard time judging relative areas instead of lengths. Wikipedia mentions some of the reasons why pie charts are generally poor.

I'd go a little further and say that pie charts are really only useful when a small number of categories of data are far, far greater than others. Like this image from Wikipedia of the English-speaking peoples:

Yep, there are lots of Americans.

Once you get data that isn't widely different or you have lots of categories your pie chart would be better as either a bar chart, or as simply a data table. Here's a particularly bad pie chart from a blog about Microsoft Office. It depicts the number of features added in various releases.

Literally everything is wrong with this pie chart. The data being presented is the number of features added per release. Releases occur chronologically. So an obvious choice would be a bar chart or a line chart for cumulative information with time going from left to right. Instead we have to follow the chart around clockwise (finding the right starting point) to follow time.

And since the releases didn't come out at equal intervals it would be really nice to compare the number of features added with the amount of time between releases.

The pie chart has no values on it at all. We don't get the actual number of features, or even the percentage added. So we are left staring at the chart trying to guess the relative sizes of the slices. And that's made extra hard by the chart being in 3D. For example, how do Word 2000 and Word 2003 compare?

But if you still must use pie charts, I beg you not to use 3D pie charts. Please, they are simply an abomination. Making them 3D just makes them even harder to interpret.


Network Solutions renames their services for added obscurity

I logged into my Network Solutions account this morning for a bit of Monday domain name management to be treated to a page which contained the following:

You see Network Solutions has decided that the service called "Domain" was much too obscure and difficult to understand and so it's much clearer if we now call it "nsWebAddress". Huh?

Also, "Web Site" was so obscure that it was better changed to "nsSpace" or "nsBusinessSpace". And you know those "SSL Certificates"? Well, that was way too confusing, so let's call them "nsProtect".

I'm sure that someone in Network Solutions' marketing department got really excited about all these changes. I'm also betting that they don't actually use their own product.

The best part comes when you actually run the gauntlet of offers and get to your account. Obviously "nsWebAddress" was so crystal clear that they felt the need to put "(Domains)" after it. Pure, pure genius.


Wednesday, July 01, 2009

How to do customer service

I've previously complained about poor technical support that I received from Hewlett-Packard. That particular incident isn't over yet... the issue has been escalated a couple of times, HP has told me they are end-of-lifeing the product, ... I'll write that up when it comes to a resolution.

But it's not all moaning! Two companies that have provided excellent customer service recently are Apple and Bugaboo. I dealt directly with Apple myself; a friend with small children told me about the Bugaboo goodness.

First off, Apple. I own a MacBook Pro that I bought in mid-2007. Unfortunately, it suddenly started to suffer from the NVIDIA GeForce 8600M GT problem a couple of months ago. The upshot was that my machine would boot but couldn't find a display adapter (or at least it found the Intel display adapter, not the NVIDIA one).

I verified that I could ssh into the machine and ran System Profiler on the command-line. A quick search by serial number showed that my machine was susceptible to this problem and that Apple offered free service.

So, I called AppleCare. I never bought AppleCare for this machine and for this problem I didn't need it. I described my problem in detail to the technician including the steps that I'd taken to try to resolve it (including resetting the PRAM and SMC) and he did something great. He completely avoided going through any script, realized that I knew what I was talking about and immediately set the machine up for repair.

Next step was an appointment with the Genius Bar. This was the most annoying part because Apple's Concierge software is poorly designed. But once at the Genius Bar I got my appointment within about 10 minutes of the allotted time. The technician immediately verified that I had the NVIDIA problem and that I was eligible for a motherboard replacement.

While I was chatting with him I mentioned that my iPhone headphones had a fault and I wanted to buy some new ones. He asked me how long I'd had the iPhone (about 3 months) and simply went and got me a new pair, for free, just like that.

Then he told me to expect that my MacBook Pro would take about a week to repair. I left the Apple Store and went into work. That evening Apple called me to tell me the laptop was ready.


Now Bugaboo. My friend Bill has two small kids and one of them is always in a Bugaboo Cameleon stroller. These are really high-end and expensive bits of kit. But they are very, very well made.

Now Bill's Bugaboo's brakes had developed a fault. They didn't always work and it was a minor annoyance. Little did Bill know that Bugaboo had identified this as a common fault and recalled the Cameleon.

Happily, Bill had filled out the warranty card for the stroller and sent it back when he bought it. One day a small package arrived unannounced containing a kit to fix the brakes. The kit worked perfectly.


In both cases, Apple and Bugaboo, we were dealing with premium brands and got premium support. Apple's ability to just give me new headphones made my experience wonderful, and Bugaboo simply sending the repair kit to Bill made him a loyal customer for life (he just needs to have some more kids).


Tuesday, June 30, 2009

The 1944 US Presidential Election was fraudulent

OK, it wasn't really, but I thought I'd run the Scacco/Beber analysis on that election and see what it comes up with. Guess what.

If you look at the last two digits of the vote counts by state for Roosevelt and Dewey you discover that 59.38% of the pairs are non-adjacent, non-repeated. If the numbers were truly random you'd expect 70%. That's way worse than the 62.07% in the Iranian election.
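For anyone who wants to repeat this on another election, here's a minimal sketch of the counting step. I'm assuming that 0 and 9 count as neighbours, which is what the 70% expectation requires. The function names are my own and the counts below are made up purely to exercise the code; they are NOT the 1944 results.

```python
def is_distinct_nonadjacent(count):
    """True if the last two digits of `count` differ and are not neighbours.

    Assumption: 0 and 9 are treated as neighbours (so ...09 and ...90 are adjacent).
    """
    last = count % 10
    second_last = (count // 10) % 10
    gap = abs(last - second_last)
    return gap not in (0, 1, 9)

def share_distinct_nonadjacent(counts):
    """Fraction of vote counts whose last two digits are distinct and non-adjacent."""
    hits = sum(is_distinct_nonadjacent(c) for c in counts)
    return hits / len(counts)

# Made-up counts for illustration only.
example_counts = [1264, 4517, 2023, 8833, 7190]
share = share_distinct_nonadjacent(example_counts)
print(share)
```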

If you then do the old Z-Test you get a Z value of -2.49 with a p-value of 0.013. That's well below the 0.05 significance level so you can reject the null hypothesis. The final digits are not random.

Is this fraud?

Is there any suggestion that the state-level numbers in the 1944 US election were invented by people?

If not, how can anyone claim that this test indicates fraud in the Iranian election?

Now run the other bit of their test looking at the frequencies of the last digit. You get 'too many' 7s (expected 10%, got 16%) and 'too few' 1s (expected 10%, got 5%).

I'm telling you, man, what's the chance of that happening, and the non-adjacent, non-repeating digits thing? (It's about 0.17% according to simulation) I mean, come on, that's gotta be fraud.

Oh, wait, it's not.


The Iranian Election Detector

OK, I thought I was done criticizing the Washington Post Op-Ed about how statistics leave 'little room for reasonable doubt' that the Iranian election was fraudulent. But then Hannah Devlin at The Times did her own analysis and it got me thinking about the errors in that article again.

Firstly, my previous post talks about the right way to determine whether the digits are random or not, I'm not going to go over that again, but I am going to go back over some of the actual figures that are presented in the article.

So begin with this quote:

But that's not all. Psychologists have also found that humans have trouble generating non-adjacent digits (such as 64 or 17, as opposed to 23) as frequently as one would expect in a sequence of random numbers. To check for deviations of this type, we examined the pairs of last and second-to-last digits in Iran's vote counts. On average, if the results had not been manipulated, 70 percent of these pairs should consist of distinct, non-adjacent digits.

Not so in the data from Iran: Only 62 percent of the pairs contain non-adjacent digits. This may not sound so different from 70 percent, but the probability that a fair election would produce a difference this large is less than 4.2 percent.

And there's a footnote:

This is a corollary of the fact that last digits should occur with equal frequency. For an arbitrary second-to-last numeral, there are seven out of ten equally likely last digits that will produce a non-adjacent pair. Note that we treat both 09 and 10 as adjacent.

Firstly, I believe they mean to say that they treat 09 and 90 as adjacent (not 09 and 10). That means that for any digit there are two possible adjacent digits out of ten, in other words 20% of digit pairs are adjacent, so 80% of digit pairs are non-adjacent.

In their article they say 70% 'distinct, non-adjacent'. OK, so their definition of non-adjacent means that you need to exclude repeats as well (so 23, 32 and 33 are all to be excluded).
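The 70% figure is easy to check by brute force. Here's a minimal sketch that enumerates all 100 possible digit pairs, assuming (as the footnote seems to intend) that 0 and 9 count as adjacent. The function name is mine, not the article's.

```python
def is_distinct_non_adjacent(second_to_last: int, last: int) -> bool:
    """True if the two digits differ and are not neighbours.

    Assumption: 0 and 9 are treated as neighbours (a gap of 9).
    """
    gap = abs(second_to_last - last)
    return gap not in (0, 1, 9)


# Count the qualifying pairs among all 100 equally likely combinations.
qualifying = sum(
    is_distinct_non_adjacent(a, b) for a in range(10) for b in range(10)
)
print(f"{qualifying} of 100 pairs are distinct and non-adjacent")
```

For every second-to-last digit exactly three last digits are ruled out (the digit itself and its two neighbours), which is where seven in ten comes from.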

They then present the argument that a figure of 62% or less will only happen in 4.2% of fair elections. Nowhere do they explain how they derived this figure, so I decided to run a simulation. (Hannah Devlin argues that this number is incorrect in her article, worth a read)

I ran a simulation of 1,000,000 elections, each generating 116 vote counts, and looked at the last and second-to-last digits of every count. Then I calculated the percentage of fair elections in which 62% or fewer of those digit pairs are distinct and non-adjacent, as seen in the Iranian election. The figure is 2.66%. 2.66% of fair elections would produce the result (or 'worse') seen in Iran.

The difference, 4.2% vs 2.66%, comes about because the figure that they must have used is not 62%, but 62.07%. That is the actual number, to two decimal places, that comes from analyzing the digit distribution in the Iranian election results.
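You don't strictly need a simulation to see the gap between the two cut-offs. The sketch below is not my original simulation code. It assumes each of the 116 counts independently has a 70% chance of ending in a distinct, non-adjacent pair, so the number of such pairs is binomial. With 116 counts, "62% or less" means at most 71 pairs (61.2%), while 62.07% is exactly 72 pairs.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


N_COUNTS = 116   # vote counts in the Iranian data
P_QUALIFY = 0.7  # chance that a pair is distinct and non-adjacent

at_most_71 = binomial_cdf(71, N_COUNTS, P_QUALIFY)  # strictly "62% or less"
at_most_72 = binomial_cdf(72, N_COUNTS, P_QUALIFY)  # "62.07% or less"

print(f"P(at most 71 of 116) = {at_most_71:.4f}")
print(f"P(at most 72 of 116) = {at_most_72:.4f}")
```

Under these assumptions the first probability should land in the neighbourhood of the simulated 2.66% and the second near the article's 4.2%. A single extra vote count on the boundary accounts for the whole difference.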

(Email me if you want my source code)

So, what does that tell you? That in almost 3 in 100 fair elections we would have seen the result in Iran. Or if you use their numbers 4 in 100. Either way that's pretty darn often. In the 20th century there were 26 general elections in the UK. Given their 4/100 number is 1/25 we shouldn't be at all surprised if one of those general elections looked fraudulent!
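To put a number on "shouldn't be at all surprised", here's a quick sketch. It assumes the 26 elections are independent and that each has the same 1 in 25 chance of tripping the test, which is a simplification.

```python
P_FLAG = 1 / 25  # false positive rate taken from the article's 4% figure
ELECTIONS = 26   # UK general elections in the 20th century, as counted above

# Expected number of fair elections that would look fraudulent.
expected_flags = ELECTIONS * P_FLAG

# Chance that at least one fair election looks fraudulent.
p_at_least_one = 1 - (1 - P_FLAG) ** ELECTIONS

print(f"expected false alarms: {expected_flags:.2f}")
print(f"chance of at least one: {p_at_least_one:.1%}")
```

So on average you'd expect about one false alarm, and the chance of seeing at least one is roughly two in three.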

Now, we expect that the percentage of non-adjacent digits is normally distributed. And, in fact, my little simulation shows a nice little normal distribution centered on 70 with a standard deviation of 4.27.

So, we've got normally distributed data, a mean and a standard deviation and a sample (62.07%). Hey, time for a Z-test!

For this situation the Z value is -1.86 which yields a p-value of 0.063 for a two-tailed test (I'm doing two-tailed here because what I'm interested in is the deviation away from the mean, not the specific direction it went in). That's above the 0.05 value typically used for statistical significance and so we can't from this sample determine that there's statistical significance in the 62.07% figure.
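The Z-test arithmetic fits in a few lines. This is a sketch using only the standard library, with the mean (70) and standard deviation (4.27) taken from the simulation described above. The helper name is made up for illustration.

```python
from math import erfc, sqrt


def z_test_two_tailed(sample: float, mean: float, sd: float) -> tuple[float, float]:
    """Return the Z value and the two-tailed p-value under a normal model."""
    z = (sample - mean) / sd
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


z, p = z_test_two_tailed(sample=62.07, mean=70.0, sd=4.27)
print(f"Z = {z:.2f}, two-tailed p = {p:.3f}")
```

Swapping in a different sample percentage (59.38, say) is a one-argument change.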

So, I'd say that based on the figures given I can't find statistical significance. So I don't learn anything from that about the Iranian election.

Given that the Z-test on their 'non-adjacent, non-repeated' digits test doesn't find statistical significance, and that my previous piece showed that the chi-squared test on the other claim in their paper didn't find statistical significance either (that was on the randomness of the last two digits), you might be scratching your head wondering how the authors made the claim that this was definitely fraud (their words: 'But taken together, they leave very little room for reasonable doubt.')

Well, what they do is take the probability of seeing the 62% or less number in a fair election (4.2%) and multiply it by the probability of seeing the specific variance they see in the digits 7 and 5 in a fair election (4%) to come up with 1.4% likelihood of this happening in a fair election:

More specifically, the probability is .0014 that a fair election (with 116 vote counts) has the characteristics that (a) 62% or fewer of last and second-to-last digits are non-adjacent, and (b) has at least one numeral occurring in 17% or more of last digits and another numeral occurring in 4% or fewer of last digits.

That's a very specific test. In fact, it's so specific that I'm going to name it the "Iranian Election Detector". It's a test that's been crafted from the data in the Iranian election results. It's not the test that they started with (which is all about randomness of digits, and adjacency).

So, let's accept their 1.4% figure and delve into it... that's 1.4 in 100 elections. That's roughly 1 in 71. So, they are saying that their test would give a false positive in 1 in 71 elections.

How is that 'leaving little room for reasonable doubt'?


Thursday, June 25, 2009

The Scacco/Beber analysis of the Iranian election is bogus

OK, I wasn't going to write another blog entry about the 2009 Iranian election, but the article in the Washington Post that supposedly gives statistical evidence for vote fraud just won't die in the blogosphere and just got a boost from a tweet by Tim O'Reilly.

The trouble is the analysis is bogus.

The authors propose a simple hypothesis: the last and second-to-last digits of vote counts should be random. In statistical terms this is often called uniformly distributed, which just means that they are each equally likely. So you'd expect to see 10% 0s, 10% 1s, 10% 2s, and so on.

Of course, you only expect to see that if you had an infinite number of vote counts because the point about random processes is that they only 'even out' to the expected probabilities in the long run. So if you've got a short run of numbers you have to be careful because they won't actually be exactly uniform.

To confirm that try tossing a coin six times. Did it come up with exactly 3 heads and 3 tails? Probably not, but that doesn't mean it's unfair.
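To put a number on "probably not": assuming a fair coin and independent tosses, the chance of exactly 3 heads in 6 tosses is the number of ways to pick which 3 tosses are heads divided by the number of possible sequences.

```python
from math import comb

TOSSES = 6
HEADS = 3

# Ways to choose which tosses are heads, over all 2**6 equally likely sequences.
p_exactly_three = comb(TOSSES, HEADS) / 2**TOSSES

print(p_exactly_three)  # 0.3125
```

So a perfectly fair coin fails to split 3/3 more than two times out of three.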

Now, given some run of numbers (vote counts for example), the right thing to do is ask the statistical question "Could these numbers have occurred from a random process?" If they couldn't then you can go looking for some other reason (e.g. fraud).

The question "Could these numbers have occurred from a random process?" is given the ugly name the 'null hypothesis' by stats-heads. That just means the thing you are testing.

More concretely, the Scacco/Beber null hypothesis is "the last and second-to-last digits in the vote counts are random". What you want to know is with what confidence you can reject this, and for Scacco/Beber rejecting means fraud.

Now, what you don't do is go count the last and second-to-last digits, look for some that have counts that deviate from what you expect (the exactly 10% figure) and then try to work out how often that happens. That's like tossing a coin a few times, noticing that heads has come up more than 50% of the time and then starting to think the coin is biased.

Unfortunately, that's essentially what Scacco/Beber did. They picked on two numbers that lay outside their expected value and went off to calculate how frequently that would occur. That's cherrypicking the data.

What you do do is apply a chi-square test to figure out whether the numbers you are seeing could have been generated by a random process. And you use that test because it gives you the probability with which you can reject your null hypothesis.

To prevent you, dear reader, from having to run the test I've done it for you. I took their data and wrote a little program to do the calculation against the last and second-to-last digits. Here's the program:

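use strict;
use warnings;

use Text::CSV;
my $csv = Text::CSV->new();

# Tallies of each digit in the last (%la) and second-to-last (%sl) position
my %la;
my %sl;

foreach my $i (0..9) {
    $la{$i} = 0;
    $sl{$i} = 0;
}

my $count = 0;

# Assumes columns 1 to 4 of each row of i.csv hold the vote counts
open my $in, '<', 'i.csv' or die "Cannot open i.csv: $!";
while (my $line = <$in>) {
    chomp $line;
    $csv->parse($line) or next;
    my @cols = $csv->fields();
    for my $i (@cols[1..4]) {
        # Skip header cells and anything without at least two digits
        next unless defined $i && $i =~ /^\d{2,}$/;
        my @d = reverse split( //, $i );
        $la{$d[0]}++;
        $sl{$d[1]}++;
        $count++;
    }
}
close $in;

print "Count: $count\n";

# Expected tally per digit if the digits are uniformly distributed
my $e = $count/10;

my $slchi = 0;
my $lachi = 0;

foreach my $i (0..9) {
    print "$i,$e,$sl{$i},$la{$i}\n";

    # Chi-square statistic: sum of (observed - expected)^2 / expected
    $slchi += ( $sl{$i} - $e ) * ( $sl{$i} - $e ) / $e;
    $lachi += ( $la{$i} - $e ) * ( $la{$i} - $e ) / $e;
}

print "slchi: $slchi\n";
print "lachi: $lachi\n";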

Here's a little CSV table that you can steal to do your own analysis:

Digit,Expected Count,Second-to-last Count,Last Count

And true enough I get the same figures as Scacco/Beber. The number 7 does occur 17% of the time in the last digit, and the number 5 only occurs 4% of the time. But, I don't care. What I want to know is, is the null hypothesis wrong? Could these results have occurred from a random process? And with what likelihood?

So here's where I avoid staring at the numbers (which can get to be borderline numerology) and do the chi-square test.

For the last digit the magic chi-square number is (drum roll, please): 15.55 and for the second-to-last digit it's 9.33. Then I go to my chi-square table and I look at the row for 9 degrees of freedom (that corresponds to the 10 possible digits; if you want to know why it's 9 and not 10 go read up on the subject) and I see that the critical value is 16.92.

If either of my numbers exceeded 16.92 then I'd have high confidence (greater than 95%) that the digit counts were not random. But neither does. I cannot with confidence reject the null hypothesis, I cannot with confidence say that these numbers are not random, and I cannot with confidence, therefore, conclude that the vote counts are fraudulent.
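If you don't have a chi-square table to hand, the tail probability can be computed directly. The sketch below uses a textbook closed form for the chi-square upper tail that only works for an odd number of degrees of freedom (which 9 happens to be). It is not what my Perl program does, and the function name is made up for illustration.

```python
from math import erfc, exp, pi, sqrt


def chi_square_p_value(statistic: float, dof: int) -> float:
    """Upper-tail probability of a chi-square variable (odd dof only)."""
    if dof < 1 or dof % 2 == 0:
        raise ValueError("this closed form only handles odd degrees of freedom")
    tail = erfc(sqrt(statistic / 2))  # the dof = 1 case
    series = 0.0
    odd_product = 1.0  # running product 1 * 3 * 5 * ... * (2r - 1)
    for r in range(1, (dof - 1) // 2 + 1):
        odd_product *= 2 * r - 1
        series += statistic ** (r - 0.5) / odd_product
    return tail + sqrt(2 / pi) * exp(-statistic / 2) * series


for label, stat in [("last digit", 15.55), ("second-to-last digit", 9.33)]:
    p = chi_square_p_value(stat, dof=9)
    verdict = "reject" if p < 0.05 else "cannot reject"
    print(f"{label}: chi-square = {stat}, p = {p:.3f}, {verdict} the null hypothesis")
```

Feeding in the critical value 16.92 gives a p-value of almost exactly 0.05, which is just the table lookup run in reverse.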

What this means is that there is no 'statistically significant' difference between the Iranian results and randomness. So, what we learn is that this statistical analysis tells us nothing.

It doesn't mean that the numbers weren't fiddled, it just means that we haven't found evidence of fiddling.

PS In the notes added to their annotated version of the article Scacco/Beber mention that they did the chi-square test and got a p-value of 0.077. This is above the 'statistical significance' cut off of 0.05 and so their results are (as I find) not statistically significant.

To put 0.077 in context it means that there's a 7.7% chance that the digits are random. Sounds small but 7.7 is approximately 8 in 100 or 4 in 50 or 2 in 25 or ... 1 in 12.5. i.e. in 1 in every 12.5 fair elections we shouldn't be surprised to see the sort of figures we saw in Iran. That's pretty often! That's why chi-square tells us not to find non-randomness in the Iranian results.

30 June 2009 Update: I've withdrawn the paragraph above because that interpretation of the p-value is arguably inaccurate and if you are a statistician you'd probably shout at me about it. Doesn't change the fact that the data says the Iranian result is not statistically significant; it just says that my attempt to do a 'layman's version' is faulty.

To come up with a better layman's version I ran a little simulation to find out how often you'd expect to see one digit occurring more than 17% of the time with another occurring less than 4% of the time (as in the Iranian election). The answer is about 1.48% of the time, or in about 1 in 67 fair elections.
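A simulation along these lines can be sketched as follows. This is not my original code, and the cut-offs are assumptions: with 116 counts I've read "more than 17%" as a digit appearing at least 20 times and "less than 4%" as a digit appearing at most 4 times. Nudging either threshold by one moves the answer noticeably.

```python
import random
from collections import Counter


def extreme_digit_rate(trials: int, n_counts: int = 116,
                       high: int = 20, low: int = 4, seed: int = 1) -> float:
    """Fraction of simulated fair elections whose last digits contain one
    digit appearing at least `high` times and another at most `low` times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Last digits of a fair election: uniform and independent.
        tally = Counter(rng.choices(range(10), k=n_counts))
        counts = [tally[d] for d in range(10)]  # absent digits count as 0
        if max(counts) >= high and min(counts) <= low:
            hits += 1
    return hits / trials


rate = extreme_digit_rate(20_000)
print(f"{rate:.2%} of simulated fair elections look this 'suspicious'")
```

With only 20,000 trials the estimate is noisy in the second decimal place. Raise `trials` if you want something closer to a million-election run.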


Britannica.com makes me want to weep

I got a marketing mail from Britannica.com trying to entice me back after I canceled my subscription. So, I figured I'd just go take a quick look at a random Britannica entry and remind myself of what I was missing. Nightmare.

On the Britannica.com home page they were mentioning that their article about the US Voyager program was featured and I could see it for free. So I clicked.

This featured article contains 503 words that give the briefest of introductions to Voyager. The related articles are all about the planets that Voyager passed, and there's a connection to a general article about space exploration. There's absolutely no drill down to explore Voyager in any depth.

Of course, I whizzed over to Wikipedia and looked up the same subject. The main article contains 2,009 words and links to in-depth articles about Voyager 1 and Voyager 2. And there are links to interesting articles about their voyages, their power systems, the Voyager Golden Record and more.

And Wikipedia links you straight to the definitive source for Voyager information: NASA's Voyager Program page. Britannica doesn't; instead it links to a small collection of images of the Voyager craft from NASA's web site.

So, basically Britannica.com's article is close to useless because it's a dead-end and a short dead-end at that. In contrast, Wikipedia's article is rich, links to even more information and lets me get to source material.

And if that's not enough Britannica.com's page is infested with distracting ads. The worst of these are the weird keyword-linked ads buried right inside the article itself.

It looks like you might be able to click on, say, solar system in the article to drill down. Far from it! Hover over solar system and you get the following irrelevant, useless, pop-up ad.

Pure genius, Britannica.com. Pure, pure genius.

Now, Britannica.com's article does contain some drill down, but some of it is useless. For example, the Voyagers each contain a phonograph record with a recording of sounds from Earth (language, music, etc.). On the Britannica.com page the words phonograph record are a link. Click through and they will tell you what a phonograph record is, not about the ones on board the Voyagers. Thanks, I'm old enough to know what a phonograph record is.

So, Britannica.com, now you know why I donate money to Wikipedia, and don't buy your service.