Auditory encoding of binary data
Feb. 15th, 2006 05:36 pm
I'm serious about this question, although you may not believe it.
As you probably know, computers store everything as sequences of binary digits (bits). It is often necessary to translate those bit sequences into another format that a person can read or that can be transferred from one computer to another. So, for example:
Decimal encoding simply renders them as a number, such as 127. This is great when the typical user needs to understand the number, such as when answering "How many minutes should the egg timer wait before beeping?"
Hexadecimal encoding is similar but uses the letters A-F along with the digits 0-9 so that each character represents a grouping of four bits.
Base-64 encoding translates them into strings of seemingly random characters such as QmFzZS02NCBlbmNvZGluZwo= which are great for sending large attachments through email systems that might munge raw binary data.
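To make the three encodings concrete, here is a minimal Python sketch that renders the same bytes each way. The helper names (to_decimal, to_hex, to_base64) are made up for illustration; the standard-library calls are real.

```python
import base64


def to_decimal(data: bytes) -> str:
    # Treat the whole byte string as one big-endian unsigned integer.
    return str(int.from_bytes(data, "big"))


def to_hex(data: bytes) -> str:
    # Two hex characters per byte, i.e. one character per four bits.
    return data.hex().upper()


def to_base64(data: bytes) -> str:
    # Standard Base64 alphabet, with '=' padding.
    return base64.b64encode(data).decode("ascii")


if __name__ == "__main__":
    one_byte = bytes([127])
    print(to_decimal(one_byte))
    print(to_hex(one_byte))
    print(to_base64(b"Base-64 encoding\n"))
```

The last call should reproduce the Base-64 string quoted above, which is just the text "Base-64 encoding" followed by a newline.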
Now, here's my idea and my question. Often at my company, we have moderately long bit strings (32 to 128 bytes is typical) that sometimes need to be read over the phone. It occurred to me today that how we display and enter these numbers is arbitrary and transient, so we aren't limited to the encodings I listed above. In particular, I was wondering:
How many monosyllabic words could be used to encode binary data with no ambiguity, accounting for regional pronunciations?
What I'm thinking is: assume that we can make a list of 256 such words: "cat," "blue," etc. Then each byte of the bit string could be represented by a single word, and users would be less likely to transpose digits or confuse a "B" and a "P" on the phone. Our GUIDs or license keys would become like spam headers:
blue scoff cat pie shoot wing
Would such a scheme be practical? Could one construct a list of 256 or even 1,024 words chosen to provide error-resistant spoken encoding?
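A rough sketch of the mechanics, assuming a 256-entry table already exists. I don't have a vetted word list, so the table below is a placeholder of 256 made-up three-letter syllables (16 onsets x 4 vowels x 4 codas); it makes no claim to being unambiguous when spoken, and a real list would be hand-picked for that. The names WORDS, encode and decode are hypothetical.

```python
# Placeholder vocabulary: NOT a curated, phone-safe word list.
ONSETS = "bdfghjklmnprstvz"  # 16 choices
VOWELS = "aeio"              # 4 choices
CODAS = "bdkn"               # 4 choices

# 16 * 4 * 4 = 256 distinct syllables, one per possible byte value.
WORDS = [o + v + c for o in ONSETS for v in VOWELS for c in CODAS]
INDEX = {word: value for value, word in enumerate(WORDS)}


def encode(data: bytes) -> str:
    """Render each byte as one word from the table."""
    return " ".join(WORDS[b] for b in data)


def decode(text: str) -> bytes:
    """Map each word back to its byte; unknown words raise KeyError."""
    return bytes(INDEX[word] for word in text.lower().split())


if __name__ == "__main__":
    key = bytes.fromhex("00ff10a5")
    spoken = encode(key)
    print(spoken)
    print(decode(spoken) == key)
```

An unknown word fails loudly in decode, which gives some error detection for free: a misheard word is likely to fall outside the table. A 1,024-word table would carry 10 bits per word instead of 8, at the cost of packing bits across byte boundaries.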
(no subject)
Date: 2006-02-15 11:21 pm (UTC)
(no subject)
Date: 2006-02-16 02:46 am (UTC)
The mnemonic encoding presented here is a method for converting binary data into a sequence of words suitable for transmission or storage by voice, handwriting, memorization or other non-computerized means.
The encoding converts 32 bits of data into 3 words from a vocabulary of 1626 words. The words have been chosen to be easy to understand over the phone and recognizable internationally as much as possible....
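The arithmetic behind "32 bits into 3 words" can be checked directly: 1626^3 = 4,298,942,376, which is just above 2^32 = 4,294,967,296, while 1625^3 falls short. The sketch below shows only that arithmetic, splitting a 32-bit value into three base-1626 indices. It is not the quoted project's actual algorithm or word order, and the function names are hypothetical.

```python
VOCAB_SIZE = 1626  # vocabulary size quoted above


def to_indices(value: int) -> tuple[int, int, int]:
    """Split a 32-bit unsigned value into three base-1626 digits."""
    if not 0 <= value < 2**32:
        raise ValueError("value must fit in 32 bits")
    value, low = divmod(value, VOCAB_SIZE)
    high, mid = divmod(value, VOCAB_SIZE)
    return high, mid, low


def from_indices(high: int, mid: int, low: int) -> int:
    """Recombine three word indices into the original value."""
    return (high * VOCAB_SIZE + mid) * VOCAB_SIZE + low


if __name__ == "__main__":
    # Three words are enough for 32 bits, and 1625 words would not be.
    print(VOCAB_SIZE**3 >= 2**32 > (VOCAB_SIZE - 1) ** 3)
    indices = to_indices(0xDEADBEEF)
    print(indices, from_indices(*indices) == 0xDEADBEEF)
```

Each index would then select one word from the 1626-word vocabulary.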
(no subject)
Date: 2006-02-16 02:41 pm (UTC)