Optimal codes for human beings

Piotr Zieliński

October 3, 2006

Abstract

Standard T9 coding system is suboptimal. Dasher requires constant feedback. This paper presents a coding system that is human-friendly, easily memorizable, and optimal in some precisely defined sense.

1 Standard T9 coding system

The T9 is one of the best known text-entry systems for mobile phones. Each letter is assigned to a button in the following way:

1:.,!?   2:abc   3:def
4:ghi    5:jkl   6:mno
7:pqrs   8:tuv   9:wxyz
0:space

To enter a word, the user presses the buttons corresponding to the letters forming that word, and the computer deduces the word. For example:

4663       ⇒  good
346637     ⇒  dinner
995674663  ⇒  xylophone

To distinguish the sequence of buttons, 4663, from the actual sequence of letters, good, I will call the former code and the latter word. The process of translating a code into a word is called disambiguation.
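The mapping from a word to its code can be sketched in a few lines of Python. The names `KEYPAD`, `LETTER_TO_DIGIT` and `t9_code` are illustrative and do not come from any actual T9 implementation; the sketch assumes lowercase words made of the letters a to z only.

```python
# Keypad layout from the table above: button -> letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Invert the layout: letter -> button.
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

def t9_code(word):
    """Return the sequence of digit buttons pressed to enter `word`."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

print(t9_code("good"), t9_code("dinner"), t9_code("xylophone"))
```

Going from word to code is a plain lookup; the hard direction, from code to word, is the disambiguation discussed next.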


Disambiguation can sometimes fail. For example, good and home both have the same code 4663, so no disambiguation algorithm can reliably distinguish these two words. There are many other such pairs of words.

So what happens? After entering a code, say 4663, the computer just guesses the most probable word, that is, good, which the user can immediately accept by pressing space. If good is not what we want, we can press a special button next; the word changes to home, which again we can accept by pressing space. If we don't like that one either, pressing next more times displays other words with code 4663: gone, hood, hoof, ...

Let's now look at the disambiguation process described above from the point of view of coding theory. Each word is uniquely identified by its code and the number of times the next button has to be pressed. For example:

4663     ⇒  good
4663N    ⇒  home
4663NN   ⇒  gone
228      ⇒  act
228N     ⇒  cat
228NN    ⇒  bat
228NNN   ⇒  abu
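As a rough illustration of this view, the following Python sketch groups a tiny word list by digit code and lets each N step to the next most frequent word. The frequency counts in `FREQ` are hypothetical, chosen only so that the ordering matches the table; `decode` and `encode` are illustrative names.

```python
from collections import defaultdict

LETTER_TO_DIGIT = {ch: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}.items() for ch in letters}

def t9_code(word):
    return "".join(LETTER_TO_DIGIT[ch] for ch in word)

# Hypothetical frequency counts (assumption, not measured data).
FREQ = {"good": 70, "home": 60, "gone": 50, "hood": 5, "hoof": 2,
        "act": 40, "cat": 30, "bat": 20, "abu": 1}

# Group words by digit code, most frequent first.
BY_CODE = defaultdict(list)
for w in sorted(FREQ, key=FREQ.get, reverse=True):
    BY_CODE[t9_code(w)].append(w)

def decode(code):
    """Resolve e.g. '4663NN': the digits pick a group, each N steps through it."""
    digits = code.rstrip("N")
    presses = len(code) - len(digits)
    return BY_CODE[digits][presses]

def encode(word):
    """Inverse of decode: digit code plus the required number of Ns."""
    digits = t9_code(word)
    return digits + "N" * BY_CODE[digits].index(word)

print(decode("4663NN"), encode("cat"))
```

Because `encode` and `decode` are inverses on this word list, every word has exactly one code, which is the property the coding-theory view relies on.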

2 Full prefix T9 coding system

T9 coding is sometimes inefficient. For example, in order to enter xylophone you have to press 995674663, even though after entering 9956, xylophone is already the most probable word. To deal with this problem, we will make the next button iterate over all words whose code starts with the sequence of digits we have just entered:

4663          ⇒  good
4663N         ⇒  home
4663NN        ⇒  gone
4663NNN       ⇒  immediately
4663NNNN      ⇒  immediate
4663NNNNN     ⇒  homes
4663NNNNNN    ⇒  inner
4663NNNNNNN   ⇒  honest
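A minimal sketch of this rule in Python: instead of matching the digit sequence exactly, the next button walks through every word whose code starts with the digits entered so far. The frequency counts are hypothetical and serve only to reproduce the order shown in the table; `candidates` and `resolve` are illustrative names.

```python
LETTER_TO_DIGIT = {ch: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}.items() for ch in letters}

def t9_code(word):
    return "".join(LETTER_TO_DIGIT[ch] for ch in word)

# Hypothetical frequency counts (assumption, not measured data).
FREQ = {"good": 80, "home": 70, "gone": 60, "immediately": 50,
        "immediate": 40, "homes": 30, "inner": 20, "honest": 10}

# Words sorted from most to least frequent.
RANKED = sorted(FREQ, key=FREQ.get, reverse=True)

def candidates(prefix):
    """All words whose T9 code starts with `prefix`, most frequent first."""
    return [w for w in RANKED if t9_code(w).startswith(prefix)]

def resolve(code):
    """Resolve a full prefix code such as '4663NNN'."""
    digits = code.rstrip("N")
    return candidates(digits)[len(code) - len(digits)]

print(resolve("4663NNN"), resolve("46637"))
```

With this word list, `resolve("46637")` skips straight to homes, because only homes and inner have codes starting with 46637.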

In this system, a single word can have more than one code:

46NNNNNNNNNNNNNNNNN   ⇒  good
466                   ⇒  good
4663                  ⇒  good

68NNNNNNNNNNNNNNNNNN  ⇒  output
688NN                 ⇒  output
6887N                 ⇒  output
68878                 ⇒  output

995NNNNNNNN           ⇒  xylophone
9956                  ⇒  xylophone
995674                ⇒  xylophone
995674663             ⇒  xylophone

From the coding theory point of view, assigning multiple codes to the same word is just a waste of space. However, this property might be useful in codes designed for humans. After entering 9956, the system should recognize that the user means xylophone. On the other hand, refusing to accept 995674 as xylophone in the name of coding-theoretical perfection is just silly.

This is not to say that all codes in the table above are equally useful. We can safely assume that, to type good, nobody will enter 46NNNNNNNNNNNNNNNNN if 466 does the same. Similarly, since 688NN, 6887N, 68878 all consist of five symbols and resolve into output, we can optimize our coding system by assigning other words to the first two. In this optimized coding system, the sequence 688, 688N, 688NN, 688NNN, ... no longer contains all words with 688 as the standard T9 prefix (it does not contain output, for example). For this reason, I will call any such code a (partial) prefix T9 coding system. My goal is to find the optimal one.
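The redundancy of the full prefix system can be made concrete with a short Python sketch that lists every code of a word, one per non-empty prefix of its standard T9 code, and picks the shortest. The word list and frequency counts are hypothetical, so the number of Ns differs from the table above; `codes_for` is an illustrative name.

```python
LETTER_TO_DIGIT = {ch: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}.items() for ch in letters}

def t9_code(word):
    return "".join(LETTER_TO_DIGIT[ch] for ch in word)

# Hypothetical frequency counts (assumption, not measured data).
FREQ = {"in": 90, "is": 85, "he": 80, "go": 60,
        "good": 50, "home": 40, "gone": 30}
RANKED = sorted(FREQ, key=FREQ.get, reverse=True)

def codes_for(word):
    """Every full prefix code of `word`, from shortest digit prefix to longest."""
    full = t9_code(word)
    codes = []
    for k in range(1, len(full) + 1):
        prefix = full[:k]
        # Position of `word` among all words sharing this digit prefix.
        rank = [w for w in RANKED if t9_code(w).startswith(prefix)].index(word)
        codes.append(prefix + "N" * rank)
    return codes

codes = codes_for("good")
print(codes, min(codes, key=len))
```

Only the shortest code is likely to be used in practice; the others occupy slots that a partial prefix system could give to different words.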

3 Prefix-optimal T9 codes

The optimal prefix T9 coding system minimizes the average number of button presses while still being human-friendly. Formally, we require the code to be prefix-optimal. The definition of prefix-optimality is recursive: a coding system is prefix-optimal with respect to prefix $c_1 c_2 \ldots c_n$ iff

1. It is prefix-optimal with respect to all prefixes $c_1 c_2 \ldots c_n c_{n+1}$ with $c_{n+1} \in \{2, \ldots, 9\}$,
2. Out of all coding systems satisfying condition 1, it takes the minimum number of button presses to enter an average English text containing only words that have $c_1 c_2 \ldots c_n$ as their standard T9 prefix.

A coding system is prefix-optimal iff it is prefix-optimal with respect to the empty prefix.

As the above definition suggests, prefix-optimal T9 codes can be constructed recursively bottom-up, starting from very long prefixes. Once prefix-optimal codes have been constructed for all extensions of the current prefix, the optimal code for the current prefix can be determined using a dynamic programming algorithm. Here are some parts of the resulting code, the prefix-optimal code on the left, the full prefix T9 code on the right.

Prefix-optimal         Full prefix
ε      ⇒  the          ε      ⇒  the
N      ⇒  and          N      ⇒  of
NN     ⇒  to           NN     ⇒  and
NNN    ⇒  this         NNN    ⇒  a
NNNN   ⇒  there        NNNN   ⇒  in
NNNNN  ⇒  their        NNNNN  ⇒  to

Note that many short words, such as of, a, and in, do not appear in the sequence ε, N, NN, ... of the prefix-optimal code. This is because entering of as 63 or even as 6 takes so few button presses that the alternative in the form NN... is not necessary.


Prefix-optimal         Full prefix
6      ⇒  of           6      ⇒  of
6N     ⇒  not          6N     ⇒  on
6NN    ⇒  one          6NN    ⇒  not
6NNN   ⇒  more         6NNN   ⇒  or
6NNNN  ⇒  most         6NNNN  ⇒  one

The same here: entering on and or as 66 and 67, respectively, is so easy that assigning codes 6N... to them is just wasting space.
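The step performed at a single prefix can be illustrated with a brute-force Python sketch. It is not the dynamic programming algorithm used for the results above, whose details are not given here; it simply tries every ordering of a handful of words. All numbers are hypothetical: `freq` holds made-up word frequencies, and `longer_cost` holds the assumed cost of the cheapest code each word already has under longer prefixes (for example, 4 for on entered as 66). The cost of a code with d digits and n presses of next is taken to be 2d + n.

```python
from itertools import permutations

def f(d, n):
    """Cost of a code with d digits and n presses of next (assumed: 2d + n)."""
    return 2 * d + n

def best_order(prefix_len, freq, longer_cost):
    """Try every ordering of the words over the slots prefix, prefixN, prefixNN, ...

    A word placed in slot n costs min(its cheapest longer code, f(prefix_len, n)),
    because the user is assumed to pick whichever code is cheaper.
    """
    best_total, best = None, None
    for order in permutations(freq):
        total = sum(freq[w] * min(longer_cost[w], f(prefix_len, n))
                    for n, w in enumerate(order))
        if best_total is None or total < best_total:
            best_total, best = total, order
    return best_total, best

# Hypothetical data for prefix 6 (one digit, so prefix_len = 1).
freq = {"of": 30, "on": 8, "not": 6, "one": 4}
longer_cost = {"of": 4, "on": 4, "not": 6, "one": 6}

total, order = best_order(1, freq, longer_cost)
# Keep only the slots that actually beat the word's longer code.
useful = [w for n, w in enumerate(order) if f(1, n) < longer_cost[w]]
print(total, order, useful)
```

With these made-up numbers the search places of, not and one in the first three slots and leaves on to its longer code, which is the kind of behaviour described above. Exhaustive search is only feasible for a few words per prefix.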

3.1 The cost function

Prefix-optimal codes minimize the cost of entering an average English text. The dynamic-programming algorithm used to find the code presented above assumes that the cost of entering a code can be represented as a function f(d, n), where d is the number of digits in the code, and n is the number of Ns. (For code 4663NN, d = 4 and n = 2.)

For example, f(d, n) = d + n minimizes the total number of button presses. This is probably what we want for cases where pressing buttons is the most costly operation, such as with eye-controlled input by disabled people. For other users, the thinking process might actually be more costly than pressing the buttons. Once such a user looks at the list of words, it is probably easier for him to select a word from the list by pressing next than to decide which digit button to press. In this case, f(d, n) = 2d + n, which was used in the examples above, can be more appropriate.

The good thing is that once an appropriate cost function (not necessarily linear) has been designed, the above algorithm can compute the optimal code automatically. For example, setting f(d, n) = d results in the full prefix code, whereas f(d, n) = n leads to the standard T9 code.
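A small Python sketch shows how the two linear cost functions treat the same codes. The helper names `split_code`, `presses`, `thinking` and `cheapest` are illustrative; the three codes are the five-symbol codes for output from Section 2.

```python
def split_code(code):
    """Return (d, n): the number of digits and the number of Ns in a code."""
    digits = code.rstrip("N")
    return len(digits), len(code) - len(digits)

def presses(d, n):
    return d + n        # every button press costs the same

def thinking(d, n):
    return 2 * d + n    # choosing a digit costs twice as much as pressing next

def cheapest(codes, f):
    """The code with the lowest cost under f (the first one wins a tie)."""
    return min(codes, key=lambda c: f(*split_code(c)))

output_codes = ["688NN", "6887N", "68878"]
print([presses(*split_code(c)) for c in output_codes])
print([thinking(*split_code(c)) for c in output_codes])
print(cheapest(output_codes, thinking))
```

Under d + n the three codes tie at five presses, whereas 2d + n prefers the one with the fewest digits.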

3.2 Probability adjustments

The method, as described above, leads to anomalous situations:

2   ⇒  and
2N  ⇒  a

Therefore, pressing 2 results in and, not a, which might be really confusing.


The algorithm prefers and to a because the former word occurs in an average English text more often than the latter. However, if the user wants to type and, he usually inputs 263, instead of pressing 2 and looking at the word list. Therefore, the probability of a provided that the user has just input 2 and is looking at the word list is higher than that of and. Making the algorithm use such conditional probabilities instead of absolute probabilities fixes the problem. In our examples, I simply assumed that looking at the word list doubles probabilities of all exact matches. The best method of calculating these conditional probabilities requires further research. Of course, the a-and problem described above is highly subjective. It is definitely a nuisance for beginners, but expert users might actually consider it a feature rather than a problem.
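The doubling rule can be sketched in Python as follows. The frequency counts are hypothetical, chosen so that and is more frequent than a but less than twice as frequent; `ranked` and its `boost` parameter are illustrative names.

```python
LETTER_TO_DIGIT = {ch: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}.items() for ch in letters}

def t9_code(word):
    return "".join(LETTER_TO_DIGIT[ch] for ch in word)

def ranked(prefix, freq, boost=2.0):
    """Order the words under `prefix`, multiplying exact matches by `boost`."""
    def weight(w):
        exact = t9_code(w) == prefix
        return freq[w] * (boost if exact else 1.0)
    cands = [w for w in freq if t9_code(w).startswith(prefix)]
    return sorted(cands, key=weight, reverse=True)

# Hypothetical frequency counts (assumption, not measured data).
freq = {"and": 28, "a": 21, "be": 6}

print(ranked("2", freq, boost=1.0))  # absolute probabilities
print(ranked("2", freq))             # exact matches doubled
```

With `boost=1.0` the list starts with and; with the doubling applied, a moves to the front because its code is exactly 2.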
