rhu: (Default)
[personal profile] rhu

I'm serious about this question, although you may not believe it.

As you probably know, computers store everything as sequences of binary digits (bits). It is often necessary to translate those bit sequences into another format that people can read or that can be transferred from one computer to another. So, for example:

Decimal encoding simply renders them as a number, such as 127. This is great when the typical user needs to understand the number, such as when answering "How many minutes should the egg timer wait before beeping?"
Hexadecimal encoding is similar but uses the letters A-F along with the digits 0-9 so that each character represents a grouping of four bits.
Base-64 encoding translates them into strings of seemingly random characters such as QmFzZS02NCBlbmNvZGluZwo= which are great for sending large attachments through email systems that might munge raw binary data.
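For the programmers reading along, here is a small Python sketch of those three encodings applied to the same bytes, using only the standard library. The sample bytes are my own choice for illustration; they happen to be the text that the Base-64 string above decodes to (with a trailing newline).

```python
import base64

# Sample bytes chosen for illustration.
data = b"Base-64 encoding\n"

# Decimal: the first byte shown as an ordinary number.
as_decimal = int.from_bytes(data[:1], "big")

# Hexadecimal: two characters per byte, each character covering four bits.
as_hex = data.hex().upper()

# Base-64: four characters for every three bytes, padded with "=".
as_base64 = base64.b64encode(data).decode("ascii")

print(as_decimal)
print(as_hex)
print(as_base64)
```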

Now, here's my idea and my question. Often at my company, we have moderately long bit strings (32 to 128 bytes is typical) that sometimes need to be read over the phone. It occurred to me today that how we display and enter these numbers is arbitrary and transient, so we aren't limited to the encodings I listed above. In particular, I was wondering:

How many monosyllabic words could be used to encode binary data with no ambiguity, accounting for regional pronunciations?

What I'm thinking is: assume that we can make a list of 256 such words: "cat," "blue," etc. Then each byte of the bit string could be represented by a single word, and users would be less likely to transpose digits or confuse a "B" and a "P" on the phone. Our GUIDs or license keys would become like spam headers:

blue scoff cat pie shoot wing
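To make the mechanics concrete, here is a minimal Python sketch of one-word-per-byte encoding and decoding. The word list is a hypothetical placeholder: 256 made-up syllables generated from a few consonants and vowels, not the hand-picked, phone-safe words the scheme would really need (several of these would be easy to mishear). The names `WORDS`, `encode`, and `decode` are mine, purely for illustration.

```python
from itertools import product

# Hypothetical placeholder list: 8 onsets x 4 vowels x 8 codas = 256 syllables.
# A real list would be chosen by hand for clarity over the phone.
ONSETS = ["b", "d", "f", "k", "l", "m", "s", "t"]
VOWELS = ["a", "e", "i", "o"]
CODAS = ["b", "d", "g", "k", "m", "n", "p", "t"]
WORDS = [o + v + c for o, v, c in product(ONSETS, VOWELS, CODAS)]

# Reverse lookup: word -> byte value.
INDEX = {word: value for value, word in enumerate(WORDS)}


def encode(data: bytes) -> str:
    """Render each byte as the word at that position in the list."""
    return " ".join(WORDS[b] for b in data)


def decode(phrase: str) -> bytes:
    """Turn a spoken or typed phrase back into bytes; unknown words raise KeyError."""
    return bytes(INDEX[word] for word in phrase.lower().split())


key = bytes([0, 17, 255])
phrase = encode(key)
print(phrase)
print(decode(phrase) == key)
```

Because every word maps to exactly one byte value, a word that is not on the list is rejected outright instead of being silently accepted as a different value.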

Would such a scheme be practical? Could one construct a list of 256 or even 1,024 words chosen to provide error-resistant spoken encoding?
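One way to compare the two list sizes is to count how many words a key would take. A 256-word list carries 8 bits per word and a 1,024-word list carries 10, so the arithmetic can be sketched as below. The function name `words_needed` is my own, and the sketch assumes the list size is a power of two and that the final word may be partly padding.

```python
def words_needed(num_bytes: int, list_size: int) -> int:
    """Words required to speak num_bytes of data using a list of list_size words."""
    # Whole bits carried by one word: floor(log2(list_size)).
    bits_per_word = list_size.bit_length() - 1
    total_bits = num_bytes * 8
    # Ceiling division: the last word may carry padding bits.
    return -(-total_bits // bits_per_word)


for size in (16, 32, 128):  # a GUID is 16 bytes; 32 to 128 is the range mentioned above
    print(size, words_needed(size, 256), words_needed(size, 1024))
```

Under these assumptions a 16-byte GUID drops from 16 words to 13, and a 128-byte key from 128 words to 103, so the larger list saves about a fifth of the words at the cost of finding four times as many unambiguous ones.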

Andrew M. Greene

January 2013
