Wednesday, April 14, 2010

The right people with the wrong idea

A couple of months ago I was hanging around the Y Combinator offices scouting start-ups when Paul Graham introduced me to a guy called Paul Biggar. Paul had this crazy idea about offering a better commenting system for news web sites. A sort of Disqus just for newspapers. I didn't think it was a very good idea, but I offered to help anyway because Paul seemed like the right person with the wrong idea.

I've spent my entire career in technology start-ups (and, briefly, in venture capital) and if there's one thing that's invariant it's that it's better to find the right people with the wrong idea than the wrong people with the right idea. The right people will be able to change their idea to fit the market, the wrong people will rarely capitalize on the right idea.

So it was no surprise to me that Paul came back with a new idea: a news web site built around journalist brands with revenue sharing between the journalist and the web site. Now that seemed like an interesting idea. It's a sort of anti-Economist where the name of the journalist matters more than the brand of the overall site. I said I'd help and emailed journalist and writer friends to get them to look into it. I also sat down over the Easter weekend and wrote three articles for the future web site.

Yesterday, the resulting web site called NewsTilt went live. My three stories are:

Ode to the Number 11 bus.

If you're visiting London, stop before you spend £50 ($76) on a sightseeing tour and consider taking a bus used by Londoners to get to work. It might not sound like the best idea for an out of town visitor, but at £1.20 ($1.80) per person a trip along the 7 miles of the number 11 bus route will let you see the sights in true London double-decker bus style.


Long haul heaven

For many people the thought of a long haul flight is enough to fill them with dread and loathing. They loathe the indignities of airport security, the stale food and staler air, the cramped seats and cramped conversation. But I love a good long haul from London to San Francisco, or Miama to Buenos Aires. I love it because when I step onto a 747, an L-1011 or an A340, I'm entering my mile-high monastery.


The missing element in travel: science

Many people wouldn't consider wrapping their head around some science to be an ideal way to spend a holiday. But science and enjoyment aren't incompatible. Here are seven ideas to get you out and about, and make you think.

Labels:

Monday, April 05, 2010

The Offal Tower

OK, that's not the headline that The Guardian chose, but here's me writing in The Guardian's Comment is Free section:

At the unveiling of Anish Kapoor's design for the Orbit tower it was compared to the Colossus of Rhodes and the Tower of Babel. But the history of those follies isn't auspicious. The Colossus of Rhodes was destroyed by an earthquake after standing for only a few decades, and the Tower of Babel was, the book of Genesis tells us, constructed to glorify those that constructed it.

I can't help wondering to what extent the ArcelorMittal Orbit is being built for the glory of Boris Johnson, Kapoor and Lakshmi Mittal. And as details emerge of its Olympic corporate entertainment role, it looks less and less like a work of art. But setting the motivation of the creators aside, the worst comparison of all is with the Eiffel Tower.

The rest is here

Labels:

Monday, March 29, 2010

Squaring two digit numbers in your head

All my life I've done mental arithmetic the 'wrong way': I've calculated from left-to-right, instead of right-to-left. So when I do something like 24 + 35 I'll see the 50 first and then the 9. This even applies when there's a carry and I'll do something like 36 + 56 as 80 + 12. I do the same thing for multiplication as well.

Turns out I'm not so weird after all (well, apart from the finding doing mental arithmetic fun bit). I've been reading Secrets of Mental Math: The Mathemagician's Guide to Lightning Calculation and Amazing Mental Math Tricks and the author, Michael Shermer, is just like me: he works from left-to-right.

He, like me, has found this to be a good system because it lets you discard digits early and not hold some enormous calculation in your head. For example, in the calculation 124 + 353 you can immediately say "four hundred" before doing the rest of the sum. This seems to free up headspace (at least for me) and let's me do the rest of the calculation. I'd do it like this: 124 + 353 = 400 + 24 + 53 = 400 + 70 + 4 + 3 = 477.

The books is filled with tricks for doing all sorts of mental calculations, including a nice section about estimation (I've always been an estimator) and working out things like tips and sales taxes. But the most fun part to me was a trick to let you do two-digit squares in your head really, really fast.

Quick, what's 272. Of course, the brute force way to do that is to calculate 27 x 27 which is a bit of a pain because it involves doing something like 27 x 20 + 27 x 7 = 540 + 189 = 729. But there's a much faster way.

Observe that 272 = 30 x 24 + 32. Since you probably know that 32 = 9 this means you have to calculate 30 x 24 + 9 which is relatively easy because the multiplication involves a multiple of ten which means it's really 3 x 24 and then add a zero.

So the rule is that if you want to square number X you first round it to the nearest multiple of 10, called that X + r, and then calculate X - r (i.e. round the same amount in the opposite direction). You calculate (X + r) x (X - r) and add back the square of the amount you rounded by, r2, which will be 1, 4, 9, 16 or 25.

This works because ( X + r ) x ( X - r ) + r2 = X2 - rX + rX - r2 + r2 = X2.

Example: 672 is 70 x 64 + 32 which is fairly easy to do in your head. And naturally the same trick works no matter how many digits you have, it's just that the multiplication gets harder.

The trick is especially impressive/easy with numbers near 100 because the multiplication becomes dead easy. For example, 962 is 100 x 92 + 42 which you (or at least I) can almost instantly see is 9216.

Labels:

Wednesday, March 24, 2010

An appeal for unremarkable women

Today across the blogosphere Ada Lovelace will be honoured by writing about her and other remarkable women in technology. But today I prefer to appeal to and for the unremarkable.

At the day job we've been hiring people continuously for a year for programming jobs. My entire team of 12 is all men, and in the last 12 months we've interviewed two women. Looking into all the candidates who've sent in an application, men overwhelm women with a ratio of 24:1. Even if we just hired people randomnly the chance of hiring a woman would be tiny.

I find this situation tragic. We've hired with blindness to the age of our employees or their backgrounds. In a team of 12 there are people who hail from Mexico, Northern Ireland, Greece (via Canada), and Poland. But not a single woman.

And yet I've known women who love these implacable machines as much as I. I've talked to women programmers about the deepest guts of computers, about algorithms and problems to be solved. Many of those women have recognized something that people outside the industry overlook: much of computer programming is a craft.

The people who made the smart phone in your hand, or built the system that waits patiently in your car for an accident to happen so it can fire the airbags in the right order, or made your web-based email and your music player, or built the software on the Voyager 1 spacecraft that keeps it looking longingly back at Earth as it leaves our grasp: those people are incredibly creative.

They have made beautiful things: beautiful things that everybody sees, and beautiful computer structures hidden out of sight but playing a part in all of our lives. They have struggled against constraints of time, space and resources, and they have made. Although software may seem ephemeral because it has no physical form, these people, men and women, have sculpted electricity to form numbers to form programs and make software that touches the physical world.

So, my appeal when thinking about Ada Lovelace is to all the women who, like me, will never be a world famous genius, who will never look like computer engineer Barbie, but, like the men I've worked with in the computer industry, love what they do. Don't think that computers are just for men to program and debug, but come place your fingers on a keyboard and share with me the joy of making.

And, perhaps, one day, I won't remark a woman in a team of computer programmers, because she'll be unremarkable. Or at least I'll only remark her because she's remarkable for her ability, not her missing Y.

Labels:

Friday, March 19, 2010

Responding to jacquesm's challenge

The other day on Hacker News a user posted an anonymous comment. Regular Hacker News participant jacquesm wanted to unmask the writer and posted a challenge to unmask the user.

He also emailed me because he thought I might be the anonymous user. I agreed to help him with a little bit of text mining. Jacques had a nice database of all Hacker News submissions and comments and gave me a 250Mb SQL file suitable for loading into MySQL. Unfortunately Jacques' database only had comments up until September 2009 so if the user who wrote the anonymous comment has joined recently then it won't be possible to identify them.

I quickly whipped up a naive Bayesian text classifier in Perl (similar to this one that I wrote about in Dr Dobbs five years ago) with one category per Hacker News user. I took the text of the anonymous comment and fed it to the classifier. The classifier used the complete set of comments for each user to train each category and then scored the anonymous comment against those categories.

Culling the list for users who have commented recently the 10 most likely users (according to a text classification) are (in order of likelihood): sh1mmer, stcredzero, gojomo, patio11, andreyf, anigbrowl, teej, physcab, thorax, run4yourlives.

Now none of those users actually commented on the thread in question. Assuming it was someone who was commenting and then switched to a different account the most likely candidates are (again in order): petercooper and jacquesm.

So did this approach work?

PS As a quick test of my classifier I ran it against one of jacquesm's own comments and got the following people in order of likelihood: jacquesm, geoscripting, kunqiana, jerf, mixmax.

Labels: ,

Thursday, March 11, 2010

My bio

Occasionally I get asked for some sort of official bio. Here's one people can use:

John Graham-Cumming is computer programmer and author. He studied mathematics and computation at Oxford and stayed for a doctorate in computer security. As a programmer he has worked in Silicon Valley and New York, and the UK and France. His open source POPFile program won a Jolt Productivity Award in 2004.

He is the author of a travel book for scientists called The Geek Atlas and has written articles for The Times, The Guardian, The Sunday Times, The San Francisco Chronicle and New Scientist.

He is CTO of Causata. He can be found on the web at jgc.org and on Twitter as @jgrahamc.


If you've heard of him at all, it's likely because in 2009 he successfully petitioned the British Government to apologize for the mistreatment of British mathematician Alan Turing.

Labels: ,

Wednesday, March 03, 2010

A welcome bunch of amateurs

Here's me writing in The Guardian's Comment is Free section:

We're all the children of amateurs: amateur parents. There's no government department that will certify you as a parent (thankfully), nor a university department where you get your PhD in being a daddy, nor a professional body ready to strike you off for not following mothering standards. But any parent who's held a newborn child in their arms has unconsciously taken the amateur's oath: "I may not be a professional, but I'm going to do whatever it takes to act like one."

You can read the rest here.

Labels:

Tuesday, February 16, 2010

The magic of sub-editors

In the print version of my Times article today there's been significant cutting to get it to fit into the space available. This is the magic work of sub-editors.

Here's the full text of the article with the words that remained in the sub-edited version (which appeared in the paper):

The history of science is filled with stories of amateur scientists who made significant contributions. In 1937 the American amateur astronomer Grote Reber built a pioneering dish-shaped radio telescope in his back garden and produced the first radio map of the sky. And in the 19th century the existence of dominant and recessive genes was described by a priest, Gregor Mendel, after years of experimentation with pea plants.

But with the advent of powerful home computers, even the humble amateur like myself can make a contribution.

Using my laptop and my knowledge of computer programming I accidentally uncovered errors in temperature data released by the Met Office that form part of the vital records used to show that the climate is changing. Although the errors don’t change the basic message of global warming, they do illustrate how open access to data means that many hands make light work of replicating and checking the work of professional scientists.

After e-mails and documents were taken from the Climatic Research Unit at the University of East Anglia late last year, the Met Office decided to release global thermometer readings stretching back to 1850 that they use to show the rise in land temperatures. These records hadn’t been freely available to the public before, although graphs drawn using them had.


Apart from seeing Al Gore’s film An Inconvenient Truth I’d paid little attention to the science of global warming until the e-mail leaks from UEA last year.

I trusted the news stories about the work of the IPCC, but I thought it would be a fun hobby project to write a program to read the Met Office records on global temperature readings and draw the sort of graphs (a graph) that show(ing) how it’s hotter now than ever before.

Since my training is in mathematics and computing I thought it best to write self-checking code: I’m unfamiliar with the science of climate change(climate science) and so having my program perform internal checks for consistency was vital to making sure I didn’t make a mistake.

To my surprise the program complained about average temperatures in Australia and New Zealand. At first I assumed I’d made a mistake in the code and used (having checked the results with) a pocket calculator to double check the calculations.

The result was unequivocal: something was wrong with the average temperature data in Oceania. And I also stumbled upon other small errors in calculations.

About a week after I’d told the Met Office about these problems I received a response confirming that I was correct: a problem in the process of updating Met Office records had caused the wrong average temperatures to be reported. Last month the Met Office updated their public temperature records to include my corrections.

Labels: ,

Thursday, February 11, 2010

Is your new technology crappy enough?

Email developed from a file transfer based protocol to the widely used SMTP (which was created in 1982). As a protocol it leaves much to be desired: senders can be forged, it's a playground for spammers, the ability to send anything other than plain text (ASCII) messages had to be added with duct tape at a later date, its error messages are cryptic and it has few facilities to deal with the complexities of back and forth exchanges of messages on the same subject. But email has one redeeming feature: its forward and store nature. The very essence of email is the ability to send a message, get it delivered to the recipient and allow that recipient to read the message when they choose to.

Instant messaging (starting with services like talk on Unix and other systems) are similarly poor. They allow two users (and sometimes more) to exchange text based message in real time. In general they do a poor job of doing anything else. Sending files or pictures often ends in disappointment as firewalls between recipients block the transaction. But for all its poor functionality IM is wildly popular, precisely because it allows simple communication in real time.

The same goes for SMS messaging. It's too short, character based and difficult to type (yes, Steve, even on an iPhone). Yet it took off. Although it is credited as having taken off because SMS charges were much lower than calling between mobile phones in Europe, I think there's another reason. SMS allows the recipient to decide what to do with the message. In that way, SMS is a polite technology. Unlike the telephone that interrupts the person answering it, SMS allows the essence of communication to get through quickly. (It is any wonder that MMS isn't similarly popular? If you send me a picture of your child doing something cute, we almost certainly want to have a conversation about it to satisfy human needs for affirmation.)

Skype is poor. For a long time I tried to use it for business purposes and it was too unreliable. Yet it's very successful at allowing people to talk cheaply for personal calls. The secret of its success, IMHO, is its ability to make the firewall problem that plagues IM go away. Even though calls vary wildly in quality, the user interface is difficult to use (and Skype insists on adding functionality that distracts from its core purpose), solving the firewall problem is its key value. Doing that made free voice and video calling possible.

Twitter is much derided, yet very, very popular. Why? It serves a need: asymmetric communication. Unlike Skype, IM, SMS or email, Twitter allows a person to graphcast: to send messages to people who have chosen to be attached to them. This functionality didn't exist in any previous form. This turns out to be very useful for marketing, and very useful for casual connections between people (without the people agreeing on the connection). In fact, Twitter shares an important characteristic with the web: it's permissionless.

Similarly, the web is rubbish. The web's greatest problem is that links rot, yet, in fact, that's the web's greatest triumph. Without trying to enforce connections between sites, the web allowed anyone to publish whatever they want... fast. Other competing hypertext systems were too complex and required too much central control to be successful. Twitter and the web work because there's no asking permission to follow or link.

Which brings me to Google Wave. What's Google Wave's single greatest piece of functionality that solves a real need? I can't see one. Google Wave looks like exactly what the designers said it was: "what email would look like if we designed it today". The trouble with designing systems after bathing in the mineral waters of Silicon Valley is you get functionality overload (as an aside, this is what makes Apple so successful: it's not what Apple puts in their products as much as what they leave out).

I wonder what the telephone would look like if it were designed in Mountain View today. "Well, Mr Bell, I'm sorry but your invention will never take off: the person on the other end is indistinct, I have to remember a number to get to them, I can't tell if they are even there to answer the call, I can't see them when I'm speaking, and how are we going to exchange pictures? Sorry, come back when you've invented telepresence".

So, next time you are inventing a communication technology (or, possibly, any technology), ask yourself: "Is this crappy enough to be successful?"

Labels:

Wednesday, February 10, 2010

A year without TV

So it's been a year. A year ago I moved house and didn't unpack the TV. It was a nice TV: a 42" flat panel display which when connected to the right sort of cable receiver could be used to watch broadcast TV. With the TV in a box, I never subscribed to any cable or satellite service. My house is without a TV receiver of any kind.

And I don't miss it. Only occasionally do I get that urge to switch on the goggle box and just watch some nonsense to se changer les idées (take my mind off things).

The real revelation came when I was staying in a hotel and for the first time ever simply didn't switch the hotel TV on. I knew that there was nothing worth seeing on it.

That's not to say that I haven't watched TV programmes. I'm an avid follower of 24 which I've watched by buying season pass via the iTunes Store. I've made use on occasion of the BBC iPlayer watching a total of 11 programmes in the last year (thank you iPlayer for keeping a record). The most interesting of those programs were The Secret Life of Chaos and Chemisty: A Volatile History.

And I've watched lots of movies through the DVD subscription service LoveFilm (the most recent of which was Skin).

Amusingly, I succumbed to the idea of watching rubbish on TV and decided to watch the most recent episode of So you think you can dance. It was a wonderful reminder of why I don't have a TV. In fact, it was like being parachuted into a strange world filled with consistently ugly, shallow people wearing too much make up. It was only by being away from TV for so long that I saw it like that, I'm sure if I'd been watching TV all along I would never have noticed (just like the proverbial frog being slowly heated in a pan of water).

But the thing I miss the least is TV news. It's all about panic and fear and not about analysis. I seriously wonder how much harm TV news is doing to society.

The nice thing is that the Internet can kill TV without killing TV programmes. I'm very happy to pay to rent DVDs and pay to buy individual programs. If the BBC hadn't made the programmes cited above available freely I would have been very tempted to pay for them individually.

I realize that some readers will wonder why I would pay for content when I could probably download it and violate copyright for free. The main reason is guaranteed quality. I probably could spend my time searching for those programmes on some torrent site, but just as I don't want to waste time channel hopping for something good to watch, I don't want to waste my time downloading torrents only to find they are corrupt, incomplete or overdubbed in Urdu.

It's a simple trade-off: I'll give copyright holder $X if they'll guarantee that I get a high quality copy of their programme when I want it.

Another proposed solution is the PVR. This is a bizarre solution which when compared to paying to download programmes from the Internet seems almost ridiculous. It works like this. You pay to receive a random selection of TV programmes broadcast at times you do not select. You then pay to have a device that you must programme to wake up and record those programmes so that you can then watch them when you want. You have to tell the PVR what programmes you want to watch before you know about them; you can't subsequently decide to watch something.

Labels:

The Rank Amateur

In 1937 an amateur American astronomer named Grote Reber completed construction of a 9 meter radio telescope in his back garden. By 1940 he had verified that there were radio signals coming from the heavens and by 1943 he had completed a radio frequency map of the sky. Reber, with his enormous hand-built dish, kick started radio astronomy and eventually sold his invention to the US government.

One of the biggest advances in the understanding of genetic inheritance was made by Gregor Mendel. Mendel was a Augustinian priest. In 1866 his paper on plant hybridization (Mendel had spent years observing and experimenting with pea plants) showed the existence of dominant and recessive genes. Mendel's discoveries went largely unnoticed until rediscovered two professional scienstists.

Amateurs like Reber and Mendel have made enormous contributions to science ever since science was called natural philosophy. So it's dismaying to see a professor from Oxford University write in The Guardian: "The most effective people at finding errors in scientific reasearch are scientists: it was professional glaciologists, after all, who exposed the error in the IPCC 2007 case study of Himalayan glaciers." To exclude the amateur is to deny a large part of the history of science.

Another amateur who made a big impact on science is Albert Einstein. Although Einstein had received a scientific education (after having been refused entry to the prestigious ETH Zurich) he was unable to find a research post and did all his pioneering work while working for the Swiss Patent Office.

I'm no Reber, Mendel or Einstein, but don't rule us amateurs out. In December 2009 the Met Office released thousands of records of temperature readings from around the globe stretching from the present day to 1850. These records form a vital part of the evidence that the globe is warming and the climate changing.

I thought it would be a fun hobby project to use those records to reproduce the worrying charts that show the increase in global temperatures. Since I'm a professional computer programmer I wrote software to process the Met Office data. You can see the result in this YouTube video.

Because I was working with unfamiliar data I put special functions into my program to ensure that I wasn't making any mistakes. To my surprise these functions began reporting that there was something wrong with temperature data in Australia and New Zealand. I whipped out a pocket calculator and checked that my program wasn't mistaken and then reported the problems to the Met Office. They quickly acknowledged that I was right.

Last month the Met Office released an update to CRUTEM3 and HADCRUT3, the critical data sets used to track global warming. The new version contains corrections for all the errors I reported.

Making a distinction between professional and amateur in science is artificial: what matters is the 'what' of science not the 'who'. And amateurs have by their very nature something that professionals don't need to have: passion. Without the comfort of a tenured position, a subsidized bed in an ivory tower, or a well funded laboratory, the only thing keeping amateurs going is a love for their subject.

PS In the comments a professional scientist wrote to object to my final paragraph. I urge you to read his comment since it makes good points. In my defense I didn't mean to say that professional scientists lack passion, just that that's all the amateurs have got. His point is that professionals need to have passion because the funding environment for science is so bad that they're certainly not in it for the money or security!

Labels: ,

Tuesday, February 09, 2010

Wing Kong

Lufthansa is running a competition to name one of their A380 jets. You can submit your own entry here.

My suggestion is Wing Kong. If you like it you can vote for me on their site.

Wing Kong: it's big, it's powerful, it comes from a strange, dark place (well, Toulouse). It's also just a little bit romantic.

Labels:

Monday, February 08, 2010

24 years of email

I first got an email address with an Internet @ in it in 1986. It was [email protected], or for those of you on JANET it was [email protected] (happily I only briefly used bang paths). In 24 years I think there have been three major end-user innovations: address books, MIME and email searching.

Address Books

Initially, I didn't need an email address book. Most of the people I was emailing were on the same domain (often the same machine) and so everything after the @ was irrelevant. And the number of people on email world-wide was so small that remembering their email addresses was easy (I don't mean remembering them all, just remembering the ones I needed to talk to).

And most people's domains hadn't reached the point where just using initials was unworkable. So most email addresses consisted of their initials. That made them short and rememberable. I don't recall anyone with a ridiculous address like [email protected]

But things changed: the Internet got bigger, people's addresses got more complex, I was communicating with more and more people. Hence address books.

MIME

The ability to send more than just plain text inside an email (even if it is actually being transmitted as 7-bit ASCII) was big. Prior to the introduction of MIME in 1992 there were some limited ways to send binary content in email (mostly using uuencode) but it was an ugly mess and mail clients often didn't know what to do with the contents and you were forced to save the mail to a file and manually unpack it.

Happily, MIME made that problem go away.

Email Searching

As email got considerably more widespread it became necessary to put it into folders to try and keep a handle on the volume. This led to the sort of trees of folders that are seen in programs like Microsoft Outlook. This is, IMHO, a less than optimal solution. The right solution is the sort of high-speed email searching offered by Google Mail. With it folders are completely irrelevant.

In fact foldering was such a pain that it was part of the reason I invented POPFile.

The Bad

Two bad things have happened since I started using email: spam (first spam was in 1978 on ARPANET, but I don't recall any unwanted messages during the late 1980s at all) and HTML email. HTML email has been a spammers playground and for messages I want to receive (i.e. everything other than marketing) it's almost useless.

Minor irritations are: vacation responders, people who don't edit replies sending me gigantic threads embedded in a message.

8 years

That's one major innovation every 8 years. With Google Mail being released in 2004 we've got another 2 years to wait for the next one. What do you think it will be? For me it has to be something to do with threading. That's still pretty messy, and Google Wave doesn't seem to have improved it. I don't think the little > is cutting it anymore.

Labels:

Wednesday, February 03, 2010

A compliment from The Times

The Times has kindly mentioned this blog as one of its Top 30 Science Blogs saying:

John Graham-Cumming is one of the few people out there who makes the nuts and bolts of computer programming actually sound interesting. Expect anything from an analysis of the statistical likelihood of election fraud in the last Iranian election to the unveiling of flaws in the Met Office’s global climate models.

Thanks!

Labels:

Thursday, January 28, 2010

John's Amazing Diet Secrets Revealed!

Now, at last, I can reveal the top diet secrets that doctors have been keeping from you! Yes, this is how I lost an AMAZING 9.9kg (21.8 pounds) in just 6 months doing absolutely no exercise at all.

Put down those weights, step off the StairMaster and follows these amazing simple steps to a better figure:

EAT LESS, EAT WELL

In April 2008 I decided that my 82.5kg (181.9 pounds) was too much for my height. Ideally I should have been 72kg (158.7 pounds), so I needed to lose 10.5kg (23.1 pounds) to be my ideal weight.

It's this image of me that really decided me to lose weight. Nothing like the shape of that stomach in public.

So, I read up on dieting and digested The Hacker's Diet. I ignored almost all the advice except for one thing: you can stuff your face with calories far faster than you can burn it off. That revelation lead me to the Colarie.

A single can of Coke is about 147 calories. How much exercise does it take to burn that off? Well just running it would take something like 15 minutes. And I thought why run? Just don't stick the stuff in your face in the first place.

So, I eliminated all soda and drank cold water instead. I also stopped ever eating sweets (candies). But that wasn't enough, I decided that I needed to eat less.

So, I simply looked at whatever food I was eating and ate about 50% of it and left the rest. I didn't order desserts, I stopped having sugar in coffee.

And, boy, was I hungry at first. After a while I wasn't hungry any more and I think I enjoyed the taste of food more. After a while the amount of food I was eating was simply less than before and I was no longer forcing myself to stop.

That was helped by eating slowly. By doing that I discovered when I was full and stopped eating. I didn't stop eating particular foods (other than the particular nasties like Coke, desserts and sugary snacks), I ate a wide variety of things (including foie gras and other delicacies). I just ate less of it.

The weight fell off me. Here's the magic chart:


I never quite made it to my goal, my weight stabilized at around 72.5kg (158.7 pounds) and remained there. Now whenever it goes up a bit I know what to do.

(BTW I'm not saying don't exercise, there are lots of good reasons to do exercise, but I don't think weight loss is one of them. Do it because you enjoy it, do it because it clears your head, etc.)

WEIGH YOURSELF

Weighing yourself is vital. I made it a ritual so that I got consistent results. Your weight will vary throughout the day so I had a simple technique:

1. Same time, same day each week: I weighed myself on Saturday mornings, immediately after getting up (after visiting the bathroom) and before eating.

2. Same situation: I always weighed myself naked so that there were no inconsistencies.

3. Once: I weighed myself once on that Saturday morning and recorded the result. I didn't fret about the result. I didn't reweigh myself.

KEEP TRACK

I kept all the weight data in a spreadsheet. From that I could draw an encouraging chart, and calculate how many more weeks were needed for me to reach my goal. And I could calculate my BMI (which seems like a bogus figure to me).

That's it: eat less, eat well, keep track.

PS I did do some exercise. During this period I didn't own a car and walked everywhere or took public transport. I was working at home and frequently would walk outside to get some sun and clear my mind.

Labels:

Tuesday, January 26, 2010

The squawking Squaw King was stabbed in a stab bed

Yesterday I tweeted: Realized that 'assisted' is 'ass is ted'. Are there other non-compound words in English which consist entirely of other words? and people replied with is land and cut lass.

Naturally, I couldn't resist writing a small amount of code to figure out other word sequences within words. Using a short program and a 57,000 word English dictionary of common words I had the answer: 12,870 words. That means 23% of English words have this property.

Of course, many are rather boring because they are just compound words. But others are more fun:

I have secretions of secret ions and a seepage (see page 21), but sematically my sematic ally says I am fatalistic and fatal is tic bite. But the fellow fell, ow! And asked, do we seal ants with sealants? I went to the palace to see my pal (ace) and said "Serge! Ants". He called for "Sergeants!". But with an antelope the ant elope.

I smelt an aroma: the rapist! Yet, it was just the aromatherapist.

You can get the full list here.

Update Here's the code

# ----------------------------------------------------------------------------
# Small program to find words that consist entirely of other words
# concatenated. An example is 'fatalistic' which is 'fatal is tic'
#
# Written by John Graham-Cumming
# ----------------------------------------------------------------------------

use strict;
use warnings;

# The first argument to the program is the filename of a dictionary of
# words, this dictionary will be searched for words consisting of word
# sequences. It should be simply one word per line.
#
# It is loaded into the %words hash.

my $dict = $ARGV[0];
my %words;

if ( open F, "<$dict" ) {
while (<F>) {
chomp;
$words{$_} = 1;
}
close F;
} else {
die "Cannot open dictionary file $dict\n";
}

# Check every word in the dictionary using the recursive function
# check_word. Note that I don't sort the words here since that might
# take a long time. Sorting can be done on the output.

foreach my $w (keys %words) {
my $sub = check_word($w);

if ( $sub ne '' ) {
print "$w ($sub)\n";
}
}

# check_word extracts ever longer subsequences of the word to be
# checked and sees if they are themselves words (by checking in
# %words). If a word is found then the remainder of the word is sent
# to a recursive call to check_word.
#
# For example, suppose we do check_word( fatalistic ), the code will
# check the following:
#
# check_word: fatalistic; found so far:
# f?
# fa?
# fat?
# check_word: alistic; found so far: fat
# a?
# check_word: listic; found so far: fat a
# l?
# li?
# lis?
# list?
# check_word: ic; found so far: fat a list
# i?
# listi?
# al?
# ali?
# alis?
# alist?
# alisti?
# fata?
# fatal?
# check_word: istic; found so far: fatal
# i?
# is?
# check_word: tic; found so far: fatal is
#
# This function returns an empty string if the word does not consists
# of other words, or a string containing the word broken down into
# space separated words
#
# e.g. check_word('fatalistic') returns ' fatal is tic'
# check_word('potato') returns ''

sub check_word
{
my ( $w, # The word to check
$depth ) = @_; # Contains the words found so far, or
# undefined when first called

if ( !defined( $depth ) ) {
$depth = '';
} else {
if ( defined( $words{$w} ) ) {
return "$depth $w";
}
}

for my $i (1..length($w)-1) {
my $fragment = substr($w,0,$i);
if ( defined( $words{$fragment} ) ) {
my $sub = check_word(substr($w,$i), "$depth $fragment");
if ( $sub ne '' ) {
return $sub;
}
}
}

return '';
}

Labels:

Tuesday, January 19, 2010

Stay classy, SoftwareFX, stay classy


Seriously, the Haitian earthquake is not a lead generation exercise.

Labels:

Thursday, January 14, 2010

Mendelian Randomization: getting genes to run randomized trials for you

One of the core elements of my day job is dealing with causal relations: we try to understand what cause caused an effect. An area where much work has been done in understanding causal relationships is medicine where randomized controlled trials are used to understand the relationship between taking a medicine and some outcome.

But some things are hard to perform a trial on. It's all very well if you have a medicine to try out, but what if you want to know if, for example, having low serum cholesterol is associated with an increased risk of cancer?

That's not an idle question, the Framingham Heart Study, was thought to have shown a relationship between serum cholesterol and cancer (e.g. The serum cholesterol-cancer relationship: an analysis of time trends in the Framingham Study). But the question is: given that there appears to be a relationship, is it causal? Does low serum cholesterol cause cancer?

It could be that it's the other way around (called reverse causation: Low Cholesterol May Be Marker of Undiagnosed Cancer): if you are likely to get cancer you are likely to have low serum cholesterol. Or it could be that there's a confounding factor: something causes both low serum cholesterol and cancer.

It turns out that genetics, and specifically the fact that genes are randomly assigned during meiosis (in humans, for example, half the genes come from the mother and half from the father). Gregor Mendel's law of independent assortment says that the alleles of genes are chosen randomly from the possible alleles when a baby is being formed from the genetic material of mother and father.

This Mendelian Randomization means that it's possible to have nature perform a randomized trial for you. If you can find an allele that affects the trait you are trying to understand you can use it to sample the population to look for a cause and effect relationship.

In the case of low serum cholesterol there's a specific allele associated with Apolipoprotein E. The variant Apo E2 is associated with low serum cholesterol. And because of Mendel's law of independent assortment it will be assigned randomly in the population.

In 1986 Martijn B. Katan published a letter in The Lancet pointing out that Apo E2 causes a rare disease where patients have almost zero serum cholesterol.

Since Apo E2 is randomly assigned by Mendel's laws it's enough to look at the population and examine cancer rates and their relationship to the presence of the Apo E2 gene. So a 'trial' can be run by selecting a control group from the population and examining the rate of Apo E2 in that control. Then a group with cancer is tested for Apo E2.

If there's really a connection between low serum cholesterol and cancer then the cancer group should have a higher prevalence of Apo E2 than the control. You can think of the presence of Apo E2 being random across the population, if it's less than random in the cancer group (i.e. there's more or less than expected) then a causal relationship can be inferred. One way to see that is to look at a causal diagram of the relationships.


The arrows in the diagram represent causal relationships.

1. There's an arrow from Apo E2 to serum cholesterol because it is known that this allele causes low serum cholesterol.

2. The hypothesis is expressed in the arrow from low serum cholesterol to cancer. It's that arrow that's being determined.

3. There are other factors (age, diet, location, illnesses) which could affect both serum cholesterol and cancer.

4. There's no arrow leading to Apo E2 because it is completely determined by Mendel's laws. There's also no arrow from Apo E2 directly to the other factors because they are not affected by Apo E2.

5. There's no arrow directly from Apo E2 to cancer because there's no known direct relationship between the two.

(Note that these assumptions have to be justified. For example, #1 needs biological justification, as does #5).

With those relationships in place it's just a matter of performing the statistical test on the control group and cancer group to see if more Apo E2 is present in the cancer group (there's more on that in Mendelian randomization as an instrumental variable approach to causal inference).

This technique has been used to show a causal relationship between alcohol intake and blood pressure (see Alcohol Intake and Blood Pressure: A Systematic Review Implementing a Mendelian Randomization Approach) and to show no causal relationship between a mother's BMI and the fatness of her offspring (see Exploring the Developmental Overnutrition Hypothesis Using Parental–Offspring Associations and FTO as an Instrumental Variable).

And what of low serum cholesterol and cancer? A study (Apolipoprotein E Genotype, Plasma Cholesterol, and Cancer: A Mendelian Randomization Study) from 2009 concludes: "These findings suggest that low cholesterol levels are not causally related to increased cancer risk."

Thanks, Mendel!

Labels:

Sunday, January 10, 2010

Blog Greatest Hits 2009

At the start of May 2009 I added Google Analytics to jgc.org and so it's easy to find out the most popular stories on my site for 2009. Here are the top 11 by page views:

  1. Just give me a simple CPU and a few I/O ports

  2. Letter to Her Majesty The Queen: Strange that this one should be so high

  3. Tonight, I'm going to write myself an Aston Martin: An entry from 2008 that's a perennial favourite

  4. Parsing HTML in Python with BeautifulSoup

  5. Double-checking Dawkins: An entry from 2007 that gets constant traffic

  6. Data Visualization Disease

  7. Hello John, it's Gordon Brown: The end of the Alan Turing petition campaign

  8. How I got 50,000 page views by simply being me

  9. What is jsHub?

  10. If you build it, they will ignore it: How I marketed The Geek Atlas

  11. In which I switch on a 30 year old computer and it just works

Labels:

Saturday, January 09, 2010

Simplifying my OLPC XO-1 Temperature Sensor

Back in 2008 I wrote about a little circuit to turn the OLPC XO-1 Measure application into a digital thermometer. That circuit required two 9 volt batteries, 11 components and a PCB (plus connectors)

A few weeks ago I got asked about making a commercial version of the probe which naturally led to thinking about how to simplify the circuit. I've now got the entire circuit down to a single component that costs 50p in bulk. I've eliminated all the rest (except the connectors) and the circuit is entirely powered by the OLPC itself.

I actually tried a total of four designs for this circuit.

Design 1: The Original The original circuit looked like this:


It works, but it's a bit awkward since it requires those external batteries.

Design 2: Dump the op-amp One simple thing to do is just make a parallel adder with a few resistors and a reference voltage (the original 0.45v from Design 1) from a voltage divider and not worry about all the stability that the op-amp brings.

Here's that circuit under test. It works, but it can be made even simpler.


Design 3: Diode bias After mentioning this project on Hacker News jacquesm suggested trying an even simpler circuit: the original LM35 temperature sensor with a diode inserted between its ground pin and ground to bias ground to the forward voltage drop of the diode (typically 0.6v) which would bring the output voltage for household temperatures into the range of the OLPC.

That worked nicely (I had a 1N4007 lying around which has a forward voltage of up to 1.1v) and here's the circuit.


Design 4: Switch the sensor And then I discovered that there's a pin compatible replacement for the LM35 called the TMP36 which output 0.75v at 25C with 0.01v per C. That means that at 0C it outputs 0.5v and at 100C it output 1.75v. That puts its output inside the range the OLPC can sense. And it can run on the 5V from the USB port. And it has low output impedance.

So the final circuit has a single component. Here it is under test:


And here it is soldered to a connector ready for connection to the OLPC via my original USB/Mic In connector cable.


So if you want to make a really simple temperature probe for the OLPC XO-1 then get a TMP36.

Now all I need to do is find a supplier of stainless steel probes to put the TMP36 in and I can start making them.

PS Ever since I had eye surgery and stopped wearing glasses I've been worried about splashing solder in my eyes. So I wear a pair of Panther Vision LED Lighted Safety Glasses which protect the eyes and let you see what you are doing at the same time.

Don't you wish your boyfriend was hot like me

Labels: ,

Wednesday, January 06, 2010

The Ikea Lillabo Processing code

By popular demand... here's the code, written in Processing that actually draws the train sets. I hadn't released it because I didn't think it was very interesting, but you are welcome to it.
// --------------------------------------------------------------------------
//
// Small program to draw pictures of Ikea Lillabo track layouts using
// instructions derived from my Perl program.
//
// Written by John Graham-Cumming (http://www.jgc.org/)
//
// Released under the General Public License, version 2
//
// --------------------------------------------------------------------------

// This is the cursor position (x, y) coordinates and angle to the
// horizontal

float x, y, a;

// The length in pixels of a single straight piece

float len = 40;

// See the Perl program for a full explanation, but there are 8 curves
// in a circle and from that the radians of curve arc, the length of the
// straight line between the curve ends and the curve angle to the
// horizontal can be calculated.

float curves_in_circle = 8;
float radians_in_curve = 2 * PI / curves_in_circle;
float curve_angle = radians_in_curve / 2;
float curve_length = len * 2 * cos(PI/2 - radians_in_curve/2);

// The Processing equivalent of main()
void setup() {

// Set up the basic parameters for the drawing: a 1000x1000 canvas,
// with a white background. None of the elements drawn will be filled
// and the lines will be four pixels wide.

size(1000,1000);
background(255,255,255);
strokeWeight(4);
noFill();

// These are the nine possible layouts discovered by the Perl program
// and were copy/pasted here. Clearly this would be better done
// dynamically with this program reading the Perl program's output.

int layouts = 9;
String s[] = new String[layouts];

s[0] = "AAAACCAAAASSAAB";
s[1] = "CCCCCCSSAAAAAAB";
s[2] = "CAAAAACASSAAAAB";
s[3] = "CAAAAASCASAAAAB";
s[4] = "CAAAAASSCAAAAAB";
s[5] = "AAACAACAAASSAAB";
s[6] = "ACAAAASACSAAAAB";
s[7] = "ACAAAASSACAAAAB";
s[8] = "AACAAASSAACAAAB";

// (row, col) is the row and column position of the next layout to draw
// starting from the top left of the canvas. Since we know there are
// 9 the loop below lays them out 3x3. h is the height of space
// reserved for a layout.

int row = 0;
int col = 0;
int h = 250;
int w = h + 50;

for ( int j = 0; j < layouts; j++ ) {

// Start 200 pixels from the top left corner and with an initial
// angle of 0

a = 0;
x = 200 + w * col;
y = 200 + h * row;
col++;

if ( col == 3 ) {
col = 0;
row++;
}

for ( int i = 0; i < s[j].length(); i++ ) {
switch(s[j].charAt(i)) {
case 'B':
bridge();
break;

case 'C':
clockwise();
break;

case 'A':
anticlockwise();
break;

case 'S':
straight();
break;
}
}
}
}

// Function to draw a piece and update (x, y) and a
void draw_piece( float l, // Length of piece to be drawn
float ang ) // The angular change due to the piece
{
// If the ang is zero then this is a straight piece so use line(), if
// non-zero then it's a curve and so use arc()

if ( ang == 0 ) {

// (dx, dy) is the end of the piece truncated (the 0.8 multiplier)
// to leave a little gap between pieces.

float dx = x + l * 0.8 * cos(a + ang);
float dy = y + l * 0.8 * sin(a + ang);
line( x, y, dx, dy );
} else {
int h = (ang<0)?-1:1;

// (ox, oy) is the location of the centre of the circle on which the
// arc we are drawing lies. s and e are the starting and ending angles
// of arc to draw. Note that s must be less than e. Note the 1.5 here
// is used to shorten the arc to leave a small gap between pieces.

float ox = x - h * len * cos(PI/2-a);
float oy = y + h * len * sin(PI/2-a);
float s = a;
float e = a + ang * 1.5;
if ( s > e ) {
float t = e;
e = s;
s = t;
}

// The PI/2 adjustment here is needed because the angles in s and e are
// derived from a which is to the horizontal and the arc() function needs
// angles to the vertical

ellipseMode(CENTER);
arc( ox, oy, len*2, len*2, s - h * PI/2, e - h * PI/2 );
}

// Update (x,y) and a to be at the end of the new piece that's been
// added and with the correct orientation.

x += l * cos(a + ang);
y += l * sin(a + ang);
a += 2 * ang;
}

// Four functions to draw the four pieces giving them different colours.

void bridge()
{
stroke(255,0,0);
draw_piece(2*len,0);
}

void straight()
{
stroke(0,255,0);
draw_piece(len,0);
}

void clockwise()
{
stroke(255,0,255);
draw_piece(curve_length,curve_angle);
}

void anticlockwise()
{
stroke(0,0,255);
draw_piece(curve_length,-curve_angle);
}

Labels:

Tuesday, January 05, 2010

More fun with toys: the Ikea LILLABO Train Set

As further proof of my unsuitability to be a child minder (see previous post) I found myself playing with an Ikea LILLABO 20-piece basic set train.


The train set has 16 pieces of track (12 curves, two straight pieces and a two part bridge) and 4 pieces of train. What I wondered was... how many possible looping train tracks can be made using all 16 pieces?

The answer is... 9. Here's a picture of the 9 different layouts.


The picture was generated using a little program written in Processing. The bridge is red, the straight pieces are green and the curves are blue or magenta depending on whether they are oriented clockwise or anticlockwise. The curved pieces can be oriented in either way.

To generate those layouts I wrote a small program which runs through all the possible layouts and determines which form a loop. The program eliminates duplicate layouts (such as those that are mirror images of each other).

It outputs a list of instructions for building loops. These instructions contain 15 characters C (clockwise curve), A (anticlockwise curve), S (straight piece) and B (the entire bridge). Here are the building instructions for all the shapes in the picture above:

AAAACCAAAASSAAB
CCCCCCSSAAAAAAB
CAAAAACASSAAAAB
CAAAAASCASAAAAB
CAAAAASSCAAAAAB
AAACAACAAASSAAB
ACAAAASACSAAAAB
ACAAAASSACAAAAB
AACAAASSAACAAAB

PS You can actually create more possible layouts than these because the Ikea pieces have a lot of play in them and can be abused to form other shapes. But where's the fun in that?

PPS Ikea can actually double the fun by adding just two more pieces. If they added two more curved pieces the number of possible tracks doubles to 18. If they added four curved pieces then the number of possible tracks goes to 130.

PPPS Here's the code
use strict;
use warnings;

# ----------------------------------------------------------------------------
# Program to determine all the possible valid closed loop train tracks
# possible using the Ikea Lillabo 20-piece basic train set (see
# http://www.ikea.com/gb/en/catalog/products/30064359). Note that
# although this is 20-piece there are only actually 16 pieces of track
# (the train itself is in four pieces).
#
# Written by John Graham-Cumming (http://www.jgc.org/)
#
# Released under the General Public License, version 2
#
# v1.0 - First public release
# ----------------------------------------------------------------------------

# Universal constant derived from 1 Kings 7:23 :-)

my $PI = 3.1415927;

# The length of a single straight piece. For the purposes of the
# program this is set at one but could be made into a real value if
# you wanted to output actual track lengths.

my $unit = 1;

# The number of curved pieces that make a full circle. Determined
# empirically. And the angle of the segment determined by a single
# curved piece in a circle.

my $curves_in_circle = 8;
my $radians_in_curve = 2 * $PI / $curves_in_circle;

# The 15 possible pieces: a straight edge (which has $unit length) and
# is used as the measurement for everything else, a bridge (which is
# twice the length of the straight edge; it is actually supplied in
# two pieces but for the purposes of this program is considered to be
# a single piece) and a curve (using $radians_in_curve above the
# length of the straight line between the ends of the curve is
# calculated).
#
# Each entry consists of three parts:
#
# length: the length in a straight line between the ends of the piece
# at its centre.
#
# angle: the angle (in radians) between the straight line through the
# piece and a tangent to the curve at the piece's start. This only
# applies to curved pieces where the straight line is the line joining
# its two endpoints.
#
# count: how many of these pieces are supplied.
#
# Note that curves can be placed in either a clockwise or
# anticlockwise direction. The program detects this by looking at the
# angle. If it's non-zero then the piece has two orientations.

my %pieces = (
Bridge => { length => $unit * 2,
angle => 0,
count => 1 },

Straight => { length => $unit,
angle => 0,
count => 2 },

# Here's a curved piece, the angle a is $radians_in_curve, the
# length l of a side is $unit. So length is the distance between
# the points labelled s and f. Bisect the angle a you get a right
# angle triangle with hypotenuse of length $unit and angle at the
# vertex of $radians_in_curve/2. So the angle b is $PI/2 -
# $radians_in_curve/2. By simple trigonometry the length is twice
# $unit * cos($PI/2-$radians_in_curve/2).
#
# s
# C
# . . C
# . b C
# l . . C
# . C
# . . C
# . a C
# . . . . . . . C
# o f
#
# To calculate the angle to the tangent at point s (the angle c),
# note that the angle formed by os and the tangent is a right angle
# (since os comes from the centre of the circle). So b+c is $PI/2
# but b is $PI/2 - $radians_in_curve/2 and so c is
# $radians_in_curve/2
#
# s
# .
# . . .
# . b c .
# l . . .
# . .
# . . .
# . a .
# . . . . . . . . . . .
# o f
#
Curve => { length => 2 * $unit * cos($PI/2-$radians_in_curve/2),
angle => $radians_in_curve/2,
count => 16 }
);

# This structure contains the position of the end of the current
# placed track (the X and Y coordinates) and the angle of the end of
# the piece of track in radians to the horizontal. Initially this is
# defined as location (0,0) and with an angle of 0.
#
# The goal will be to place pieces such that the cursor ends up at the
# %origin.

my %origin = (
angle => 0,
x => 0,
y => 0
);

# Some basic test tracks to make sure that the code does what I
# expect. This hash contains track layouts and the expected final
# position of the cursor for tests

my %tests = (
  B => { x => 2 * $unit, y => 0, angle => 0 },
  S => { x => $unit, y => 0, angle => 0 },
  C => { x => 0.707, y => 0.293, angle => 0.785 },
  A => { x => 0.707, y => -0.293, angle => -0.785 },
  AA => { x => 1, y => -1, angle => -1.570 },
  CC => { x => 1, y => 1, angle => 1.570 },
  BSS => { x => 4 * $unit, y => 0, angle => 0 },
  CCCC => { x => 0, y => 2 * $unit, angle => 0 },
  AAAA => { x => 0, y => -2 * $unit, angle => -3.14 },
  CCCCCCCC => \%origin,
CCCCCCCCCCCCCCCC => \%origin,
AAAAAAAA => \%origin,
AAAAAAAAAAAAAAAA => \%origin,
BCCCCCCSSAAAAAA => \%origin,
CCCCSCCCCS => \%origin,
CCCCACCACCCCACCA => \%origin
);

foreach my $t (sort keys %tests) {
my %ac = %origin;
my @as = split( '', $t );
build_track( \%ac, \@as );
if ( !cursor_match( \%ac, $tests{$t} ) ) {
die "Test $t failed: $ac{x} $ac{y} $ac{angle}\n";
}
}

# To avoid getting the same track in different directions we keep
# track of tracks that we've found for testing. The keys of this hash
# are the strings that describe the track layout (e.g. CCCCCCSSAAAAAA
# without the bridge piece).

my %found;

# Since there are only two interesting pieces of track (curve and
# straight) the algorithm for trying out all the combinations of
# pieces is broken into two parts. The outer loop runs through all
# combinations of curved pieces joined together with each piece
# getting to go clockwise or anticlockwise. This is done by
# generating all binary numbers between 0 and 2^12-1 (i.e. all 12 bit
# binary numbers) and using the bits to indicate whether a track piece
# goes clockwise (a 0) or anticlockwise (a 1).
#
# To avoid seeing the same track twice in symmetrical layouts the
# curved pieces are constrained to always finish with an anticlockwise
# piece. This is done by making the top bit in this loop be 0 by
# reducing the top number to 2^11-1

foreach my $curve (0..2**($pieces{Curve}{count}-1)-1) {

# Create an array containing the instructions for inserting
# pieces. The array is alpha and has entries B, C, A and S. A means
# add curved piece anticlockwise, S means add straight piece, C
# means a curved piece clockwise and B a bridge. This array will be
# fed into move_cursor to build the final track.
#
# The following piece of code uses the sprintf function to extract
# the bits from the $curve value into an array and then transform a
# 0 bit to A and a 1 bit to C.

my @instructions = map {$_?'C':'A'}
split( '', sprintf( "%0$pieces{Curve}{count}b", $curve ) );

# Then for each combination of curved pieces it's just a question of
# inserting the straight pieces in all possible positions. If there
# are 12 curved pieces then there are 13 possible places each
# straight piece can be inserted.
#
# To make it possible to modify the number of straight pieces the
# program doesn't assume that there are only two (which could just
# have been done with a pair of loops). So an array is initialized
# that will contain the positions of the n straight pieces and it is
# used to create the combinations.
#
# The initial position of the straight pieces is 0 and then 0
# (assuming just two) indicating that they are inserted before the
# first curved piece.

my @straight;

foreach my $i (0..$pieces{Straight}{count}-1) {
$straight[$i] = 0;
}

# Now run through all the possible valid straight positions and
# insert them into the curved positions

while (1) {
my @in = @instructions;

# Add the straight pieces to the list of 'instructions' at the
# appropriate places and then place the pieces. This creates a
# new array, @build, containing the track layout.

my @build;
my $s = 0;
my $c = 0;
while ( ( $s < $pieces{Straight}{count} ) ||
( $c < $pieces{Curve}{count} ) ) {
if ( $s < $pieces{Straight}{count} ) {
if ( $straight[$s] == $c ) {
push @build, ('S');
$s += 1;
next;
}
}

if ( $c < $pieces{Curve}{count} ) {
push @build, ($in[$c]);
$c += 1;
}
}

# Now add the bridge at end. Since there's only one bridge this
# simplification is made and it makes no difference to the outcome
# since all other combinations of straights and curves will be
# generated.

push @build, ('B');

my %cursor = %origin;
build_track( \%cursor, \@build );

# Finally see if this track layout results in a loop. The test
# for this is simple: is the last piece (represented by the
# %cursor) at the %origin position and orientation?
#
# If it matches in one direction then pop off the bridge, reverse
# the track, push back the bridge and try to rebuild. This will
# eliminate tracks that don't actually match up because the final
# piece approaches the bridge from the wrong direction.

if ( cursor_match( \%cursor, \%origin ) ) {
pop @build;
%cursor = %origin;
my $build1 = join('',@build);
@build = reverse @build;
my $build2 = join('',@build);
push @build, ('B');
my $build1a = $build1;
$build1a =~ tr [AC] [CA];
my $build2a = $build2;
$build2a =~ tr [AC] [CA];

if ( !exists( $found{$build1} ) ) {
build_track( \%cursor, \@build );
if ( cursor_match( \%cursor, \%origin ) ) {
my $build3 = undef;
if ( $build1 =~ /^(.*)SS(.*)$/ ) {
$build3 = join('',reverse split('',$1)) .
'SS' . join('',reverse split('',$2));
}
if ( !defined( $build3 ) ||
!exists( $found{$build3} ) ) {
print @build, "\n";
$found{$build1} = 1;
$found{$build2} = 1;
$found{$build1a} = 1;
$found{$build2a} = 1;
if ( defined( $build3 ) ) {
$found{$build3} = 1;
}
}
}
}
}

# Move the straight pieces to their next positions, this is where
# the while loop will exit once all the positions have been tried.

foreach my $i (0..$pieces{Straight}{count}-1) {
my $j = $pieces{Straight}{count}-1-$i;
if ( $straight[$j] < $pieces{Curve}{count} ) {
$straight[$j] += 1;
last;
} else {
$straight[$j] = -1;
}
}

foreach my $i (1..$pieces{Straight}{count}-1) {
if ( $straight[$i] == -1 ) {
$straight[$i] = $straight[$i-1];
}
}

my $terminate = 1;

foreach my $i (0..$pieces{Straight}{count}-1) {
if ( $straight[$i] != $pieces{Curve}{count} ) {
$terminate = 0;
last;
}
}

if ( $terminate ) {
last;
}
}
}

# Subroutine used to place a piece of track. It updates a %cursor
# hash passed to it
sub move_cursor
{
my ( $c, # Reference to the %cursor hash to be updated
$p, # Name of the piece be added
$h ) = @_; # Handedness for curved piece (1 or -1), omit if
# non-curved

if ( !defined( $h ) ) {
$h = 0;
}

$c->{x} += $pieces{$p}{length} * cos($c->{angle} +
$h * $pieces{$p}{angle});
$c->{y} += $pieces{$p}{length} * sin($c->{angle} +
$h * $pieces{$p}{angle});

if ( $h != 0 ) {
$c->{angle} += 2 * $h * $pieces{$p}{angle};
}
}

# Subroutine to build an entire track from a list of track pieces
# (with single character identifiers)
sub build_track
{
my ( $c, # Reference to the %cursor to update
$in ) = @_; # Reference to the list of build instructions

# This maps the characters in the @$in array into piece names and
# handedness (for the curved pieces). Note that direction is
# missing for the straight pieces and the undefined value is used in
# the following loop and passed to move_cursor for those
# pieces. move_cursor knows how to deal with this.

my %m = (
'A' => { piece => 'Curve', direction => -1 },
'B' => { piece => 'Bridge' },
'C' => { piece => 'Curve', direction => 1 },
'S' => { piece => 'Straight' } );

foreach my $i (@$in) {
move_cursor( $c, $m{$i}{piece}, $m{$i}{direction} );
}
}

# Determine whether two cursors represent the same position and
# orientation. Return 1 if so, 0 otherwise.
sub cursor_match
{
my ( $c, $d ) = @_; # References to two %cursor hashes

my $dx = round($$c{x}-$$d{x});
my $dy = round($$c{y}-$$d{y});
  my $da = round(($$c{angle}-$$d{angle})/$PI);
$da -= int($da);

return ( ( $dx == 0 ) &&
( $dy == 0 ) &&
( $da == 0 ) );
}

# Perform rounding on a number to two decimal places
sub round
{
my ( $a ) = @_;

return int($a*100 + 0.5*($a <=> 0))/100;
}

PPPS Here's how I determined that a full circle was made up of eight curved pieces with a radius of one straight piece:



Update I received a message from Ikea about this:

Dear John

Thank you for sharing our passion for childrens products. It is very good for us to get input like this. I will forward your idea to the product developer who is responsible for our toys and maybe this improvements will be something for us to concider.
Thanks again for taking your time to give us feed back.

Best regards
Maria Thörn
Range manager childrens IKEA

Labels: