Southern Man

Saturday, July 23, 2011

Three Fails of AutoCorrect

WARNING: This post is not for the easily offended. Well, one could say that about much of the 'net. But you've been warned.

Pretty much everyone knows about this site (from which all of the following images were brazenly stolen, er, are presented under the auspices of Fair Use), in which people post their humorous text-completion fails. All are funny, some hilariously so. More than a few are unprintable, if only for the heartfelt cursing that completes many of the submissions. But few recognize that this unintentional humor is the direct result of poorly written software. In other words, the good folks who wrote the text-completion feature for the iPhone might be fair programmers but they're poor interface designers, and while their efforts have generated many a laugh they've done their employers a disservice. Really - how often does autocorrect do what you want, and how often do you curse out loud when it does something incorrect or inappropriate? How much time does autocorrect really save you, and how much time do you spend un-doing what autocorrect incorrectly did?

Fortunately, Southern Man is here to tell you three things that are broken in autocorrect and how to fix them.

(1) The autocorrect dictionary is much too large.
The reason is that autocorrect was written by programmers who were enamored with solving the (admittedly interesting) problem of proximate-key pattern-matching against a large database of similar words in real time, and who put no thought into whether their database was appropriate - in other words, they solved the problem the way programmers would but not the way a linguist would. For most people, four or five hundred words suffice for 95% of their everyday conversational needs (a fact that is handy to know when learning a new language). The purpose of autocorrect is to complete words in text messages. Thus, the autocorrect dictionary should contain, at most, a few thousand words and no more. Sadly, this will never happen because no project leader will ever approve it. Off topic, but college textbooks have the same problem; they are much too large and contain far too much information because no reviewer or editor would ever approve of removing correct information from a textbook. Texts on the three subjects Southern Man teaches - physics, mathematics, and computer science - are frequent offenders here.
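
For the programmers in the audience, here is a minimal sketch of what point (1) might look like in Python. Everything in it is an assumption made for illustration: the ten-word vocabulary, the crude keyboard grid that ignores row stagger, and the function names. It is certainly not how any real phone does it. The idea is simply that a typo is only ever corrected to a word from a small, everyday vocabulary, and anything else is left alone.

```python
# Hypothetical sketch: a tiny vocabulary plus proximate-key matching.
# The word list and the keyboard model are made up for illustration.
COMMON_WORDS = ["the", "and", "you", "that", "have",
                "what", "this", "home", "come", "love"]

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_neighbors(rows):
    """Map each key to the keys next to it on a simple, unstaggered grid."""
    neighbors = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            near = set()
            for dr in (-1, 0, 1):
                rr = r + dr
                if not 0 <= rr < len(rows):
                    continue
                for dc in (-1, 0, 1):
                    cc = c + dc
                    if (dr, dc) != (0, 0) and 0 <= cc < len(rows[rr]):
                        near.add(rows[rr][cc])
            neighbors[key] = near
    return neighbors


NEIGHBORS = build_neighbors(QWERTY_ROWS)


def is_near_miss(typed, word):
    """True if both have the same length and each wrong letter is an adjacent key."""
    if len(typed) != len(word):
        return False
    return all(t == w or t in NEIGHBORS.get(w, ())
               for t, w in zip(typed, word))


def suggest(typed, vocabulary=COMMON_WORDS):
    """Return a correction from the small vocabulary, or the input unchanged."""
    typed = typed.lower()
    if typed in vocabulary:
        return typed
    matches = [w for w in vocabulary if is_near_miss(typed, w)]
    # Only correct when exactly one everyday word fits; otherwise leave it alone.
    return matches[0] if len(matches) == 1 else typed


print(suggest("hone"))      # corrected to a common word
print(suggest("zimbabwe"))  # not in the small vocabulary, so left untouched
```

With a vocabulary this small there is simply nothing obscure to "correct" into, which is the whole point.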

One of tens of thousands of perfectly fine words that shouldn't be in this dictionary.

One would like to see the code that allows this to happen. But, another word that probably doesn't belong in the vernacular of casual texting.
(2) The autocorrect dictionary contains many words that most people would never intentionally use in a text message.

Such as proper nouns...

No offense to the good doctor or the noted scientist or to Southern Man's favorite Russian composer, but your names are rarely the subjects of text messages. Unjust, perhaps, but that's the way it is.

Names of countries also make unexpected appearances simply because they're in the dictionary. How often do you intend to write "Yugoslavia" or "Zimbabwe" in a text message?

Has anyone in the English-speaking world ever intentionally sent the word "führer" in a text message?

Or the names of epic Wagner operas? Although that's actually not a bad subtitle for Deathly Hallows Part II, for which mid morning on June 14 was way too late to get tickets to the midnight shows.

...and sexually-explicit terms. Admittedly removing these would cost us a fair number of laughs. The challenge here was to find examples clean enough to post but Southern Man has your best interests at heart and willingly waded through hundreds of inappropriate submissions to select these few:

While there's some merit to this particular correction (regardless of administration) it's unlikely that anyone is going to ever intentionally text the word "whorehouse."

Another fine word that one rarely encounters in casual conversation. Perhaps the world would be a happier place if it was.
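
One way point (2) could be handled, sketched in Python under some loud assumptions: the dictionary carries category tags for each word, and the tag names, the sample entries, and the function name are all invented here for illustration. Words tagged as proper nouns or explicit terms are kept out of the pool that autocorrect is allowed to suggest.

```python
# Hypothetical tagged dictionary; tags and entries are illustrative only.
TAGGED_WORDS = {
    "home": set(),
    "hello": set(),
    "yugoslavia": {"proper_noun"},
    "zimbabwe": {"proper_noun"},
    "whorehouse": {"explicit"},
}

# Categories that should never be offered as a correction (an assumption).
EXCLUDED_TAGS = {"proper_noun", "explicit"}


def suggestion_pool(tagged_words, excluded=EXCLUDED_TAGS):
    """Return the words that carry none of the excluded tags, sorted."""
    return sorted(word for word, tags in tagged_words.items()
                  if not (tags & excluded))


print(suggestion_pool(TAGGED_WORDS))
```

Nothing here stops a user from typing those words on purpose; the filter only keeps autocorrect from volunteering them.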
(3) The autocorrect dictionary should rank all words by common usage (and modify these rankings as they are used by the actual user) and refrain from suggesting lesser-used words.

An unlikely choice for non-farmers...

Like many men Southern Man may frequently experience both states but will generally text about only one of them and the frequency stats in the dictionary should recognize this. Come to think of it, when texting a girlfriend either will do just fine.
But give the user a chance to delete words that they've added, intentionally or not...

And now that song is stuck in Southern Man's head for the rest of the day.
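
Point (3) can be sketched in a few lines of Python, again under stated assumptions: the class name, the baseline counts, the threshold, and the weight given to the user's own typing are all hypothetical numbers picked for the example. Each word gets a score from general usage plus the user's personal usage, words below a threshold are never suggested, and the user can make the software forget a word it has learned.

```python
from collections import Counter


class UsageRanker:
    """Hypothetical ranker: general frequency plus per-user frequency."""

    def __init__(self, base_counts, min_count=20, user_weight=10):
        self.base = dict(base_counts)   # assumed general-usage counts
        self.user = Counter()           # how often this user typed each word
        self.min_count = min_count      # below this score, never suggest
        self.user_weight = user_weight  # the user's own habits count extra

    def record(self, word):
        """Note that the user deliberately typed this word."""
        self.user[word] += 1

    def forget(self, word):
        """Drop everything learned about a word from this user."""
        self.user.pop(word, None)

    def score(self, word):
        return self.base.get(word, 0) + self.user_weight * self.user[word]

    def rank(self, candidates):
        """Return candidates at or above the threshold, most used first."""
        kept = [w for w in candidates if self.score(w) >= self.min_count]
        return sorted(kept, key=lambda w: (-self.score(w), w))


ranker = UsageRanker({"home": 900, "hone": 2})
print(ranker.rank(["hone", "home"]))   # the rare word is not suggested
ranker.record("hone")
ranker.record("hone")
print(ranker.rank(["hone", "home"]))   # now it is, but ranked below "home"
ranker.forget("hone")
print(ranker.rank(["hone", "home"]))   # and now it is gone again
```

The threshold is what keeps lesser-used words from ever being offered, and `forget` is the escape hatch for words that were added by accident.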
All programmers who write text completion software now know what they need to do to fix autocorrect. Let us know when it's ready. Kardashian. Oops, meant KTHXBYE. Damn You Auto Correct!


