Monday 23 June 2008

One little, two little, three little Endian?

The date and time are now, roughly, 13:08, 23rd Jun 2008. Seems correct to you?

It makes no sense.

Why do we habitually mix endian-ness? The term denotes whether the most significant part (in this case the year) is placed on the left, with successively less significant parts to its right (big-endian), or on the right, with successively less significant parts to its left (little-endian). We write numbers big-endian: 1,234 is read as 'one thousand, two hundred and thirty-four', not as 'four, thirty, two hundred and one thousand'. (The name, incidentally, comes from Gulliver's Travels, where two rival factions were at war over which end of a boiled egg should be broken first.)
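To make the idea concrete, here is a small Python sketch that writes out the digits of one number in either order. The function names are made up for this illustration. The value never changes; only the convention for which end comes first does.

```python
def to_digits(n, order="big"):
    """Return the decimal digits of a non-negative integer n.

    order="big" puts the most significant digit first;
    order="little" puts the least significant digit first.
    """
    digits = [int(ch) for ch in str(n)]  # str() writes numbers big-endian
    return digits if order == "big" else digits[::-1]


def from_digits(digits, order="big"):
    """Rebuild the integer from its digits, given the order they are in."""
    if order == "little":
        digits = digits[::-1]  # normalise to most-significant-first
    value = 0
    for d in digits:
        value = value * 10 + d
    return value


print(to_digits(1234, "big"))               # [1, 2, 3, 4]
print(to_digits(1234, "little"))            # [4, 3, 2, 1]
print(from_digits([4, 3, 2, 1], "little"))  # 1234
```

Either order works perfectly well, provided writer and reader agree on which one is in use.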

Big-endian and little-endian make an equal amount of sense - either is an arbitrary choice. Mixed endian-ness, however, makes no sense. No-one these days would say 'one thousand, two hundred, four-and-thirty', because that would be reading the digits out of order. Yet we are happy to do this with dates! Hour:minute:second followed by day/month/year runs small:smaller:smallest, then big/bigger/biggest. (The US system is even worse - 07/11/08 is the 11th of July, so the numbers are ordered bigger/big/biggest!) I, for one, would write the current time as 2008/06/23 13:16:23, which maintains a consistent endian-ness throughout. This blogging software, alas, does not support this option (neither does Microsoft Excel!), even though it is one of only two possible formats which make logical sense (the other being the fully little-endian 23:16:13 23/06/2008, which few people would use!).
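For anyone who wants to see the orderings side by side, here is a rough sketch using Python's standard datetime module. The variable names are my own, and the timestamp is the one quoted above; treat it as an illustration rather than a recommendation for any particular software.

```python
from datetime import datetime

# The moment quoted in the post: 23 June 2008, 13:16:23
moment = datetime(2008, 6, 23, 13, 16, 23)

# Consistently big-endian: largest unit first, all the way down
big_endian = moment.strftime("%Y/%m/%d %H:%M:%S")

# Consistently little-endian: smallest unit first, all the way up
little_endian = moment.strftime("%S:%M:%H %d/%m/%Y")

# The usual mixture: big-endian time followed by little-endian date
mixed = moment.strftime("%H:%M:%S %d/%m/%Y")

print(big_endian)     # 2008/06/23 13:16:23
print(little_endian)  # 23:16:13 23/06/2008
print(mixed)          # 13:16:23 23/06/2008

# A practical side effect of the big-endian form: because every field is
# zero-padded, sorting the strings as plain text also sorts them by time.
stamps = [
    datetime(2008, 7, 11, 9, 0, 0),
    moment,
    datetime(2007, 12, 31, 23, 59, 59),
]
as_text = [s.strftime("%Y/%m/%d %H:%M:%S") for s in stamps]
sorted_text = sorted(as_text)
print(sorted_text[0])  # 2007/12/31 23:59:59
```

The sorting behaviour relies on four-digit years and zero-padded fields; the mixed format has no such property.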

The same concept applies to web addresses, which are little-endian (news.bbc.co.uk) up to the first slash, where they magically become big-endian (/sport1/hi/cricket/default.stm). Tim Berners-Lee himself has said that he wishes he'd made web addresses consistently big-endian (e.g. uk.co.bbc.news/sport/).
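As a thought experiment, the reordering is easy to sketch in Python. The function name big_endian_address is hypothetical, and the sketch assumes a full URL with a scheme such as http:// so that the standard library can pick out the host name. The result is purely illustrative; no browser would accept it.

```python
from urllib.parse import urlsplit


def big_endian_address(url):
    """Rewrite a URL so the host labels run from most to least significant.

    Assumes the URL includes a scheme (e.g. "http://"), otherwise
    urlsplit() cannot identify the host name.
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")  # e.g. ["news", "bbc", "co", "uk"]
    host = ".".join(reversed(labels))   # e.g. "uk.co.bbc.news"
    return host + parts.path            # the path is already big-endian


print(big_endian_address("http://news.bbc.co.uk/sport1/hi/cricket/default.stm"))
# uk.co.bbc.news/sport1/hi/cricket/default.stm
```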

It is too late now...

3 comments:

Anonymous said...

It is a fallacy, surely, that something must be logical to make sense?

I am entirely able to make sense of the time and date as written by your blog. It makes sense.

WordplayGuild said...

I agree with your point, but I don't believe it contradicts mine.

I think we disagree on the interpretation of "makes sense". In this instance I am using it in the sense of "it makes no sense to choose this scheme from first principles", whereas I believe your excellent point is using it in the sense of "this scheme succeeds in imparting its meaning to the reader". I am not questioning whether a reader is able to make sense out of a mixed-endian date - clearly people can, or no-one would use it. I suggest that this is one of those cases where the widely-used method has evolved into its current state, and is not one which any rational person would choose by design in the absence of such widespread legacy usage.

Perhaps I should have been more explicit, but I rather think the post is a little on the long and dry side as it is...

Anonymous said...

Growing up on M68k and then PowerPC, I found little-endian byte-ordering to be horrendous to work with. Even now, I have to think about every memory dump for an extra second because I'm on Intel.

The US date thing is a huge pain - I'm lucky that my birth *day* is > 12, so it's obvious when it's 'wrong', assuming people range-validate their date fields. And the URL thing, absolutely agreed. Imagine if file paths were doc.myfile/writings/My Documents/? Yuck.