
I hear a lot of predictions about the future of mobile user interfaces (not surprising: it’s my field). One that always hits me wrong is the prediction that “everything” will move away from the 12-key input pad to voice activation. This tag line, from Vlingo’s website, promotes that assumption (also not surprising: it’s their field): “Why tap when you can talk?”
SMS, the world’s most popular means of communication by far (see below), embodies the polar opposite interface: everything is contained in layers within 12 keys.
SpinVox stakes a claim right in between Vlingo and SMS: they offer voice-to-text translation of voicemail messages and personal notes. The best of both worlds, so to speak; a combination of voice and text usage. SpinVox claims that it’s seven times faster to speak than to type. I don’t know what kind of tests that’s based on, but let’s accept it as true.
The real question we have to ask is, What do people prefer?
This semester, T-Mobile is tapping graduate students at the Illinois Institute of Technology’s Institute of Design for research insights. To investigate how people use their mobile devices, the students are staking out their local Starbucks and asking subjects to document their phone usage with digital cameras.
The project, which will last until May and could shape future mobile plans and phones, has already spawned one insight: Young people prefer text messaging to voicemail because it’s more direct. “It’s about how we deal with information … such a rich field for designers to explore,” says Associate Professor Vijay Kumar, who heads the workshop.
(from Innovative Cellphones, Forbes.com, March 19, 2008)
First major fact: No one input means is perfect for every use case. Forget what every input technology company website tells you.
Second major fact: 350 billion voicemail messages are left globally per year. (source: SpinVox presentation at MEX) I couldn’t find any numbers on the number of voice calls completed globally, but I expect it’s less than…
Third major fact: 2.8 trillion SMS text messages are projected to be sent this year. (source: Tomi Ahonen consulting) I’ll hang on a second while you go back and re-read that last statistic. Yep, 2.8 trillion.
Personally, it’s hard for me to imagine a case where I’d use voice activation of anything, except possibly in a car. I’d hate it in the car, but might use voice control to keep my hands on the steering wheel. Even that’s a stretch. But let’s leave me out of it.
People do prefer text to voice in many situations. Why? Why tap when you can talk? Let me count the ways…
- Cost. SMS is usually cheaper per message than cellular calls are per minute.
- Privacy/Discretion. Often, you don’t want to be heard by your neighbors (eg, in a meeting). Or — sorry to break the news — they don’t want to hear you (eg, in a movie theater).
- Perceived time. I don’t know what the reality is, but people may consider texting (or emailing from a computer) to be less time consuming than talking by phone. Why? Partly because no time is spent on courtesy chit-chat; partly because time spent entering text isn’t registered the same way that time spent listening to other people talk is registered; partly, perhaps, it is an illusion.
- Insecurity. Call screening used to be considered anti-social and arrogant. Now it’s a normal part of life. But calling a mobile phone implies near-certainty that the other party will hear the call come in. What if the other party rejects your call? Even if the reason is a good one — he’s in a meeting, in the bathroom, sleeping, watching a movie, has his hands full — the sense is still that “something else” was more important than talking to me. Sending a text message alleviates the need to experience that little subconscious emotional tension while the phone rings. You send the message; the other party will respond within a reasonable amount of time (or won’t).
- Avoidance of intimacy. Having a vocalized conversation encourages a level of personal connection that isn’t necessary in a text message. There’s a protocol of “how are you?” and “how’s your Mom?” and “thanks for the update; I’ll get back to you.” People are often lazy, selfish, or not interested enough to invest that kind of energy when “im 10 mis l8” will convey the required message.
- Burden of attention. Slightly different from the previous issue of the energy demands of intimacy, this has more to do with attention and focus. SMS allows you to interact as much as you want to, for as long as you want to, when you want to. You can limit your attention. In a conversation, you need to be “on”, to listen, to provide feedback — in other words, it takes attention.
If you are watching TV or reading emails or playing your PSP while the other person is talking, they will notice and be offended. In an SMS or IM exchange, you can pay attention only when you’re the one talking (and doesn’t that make for the most interesting conversations?). No feigning interest required. In that sense, SMS is the ultimate self-centered communication medium.
In fact, you might consider micro-blogging services (eg, Twitter, Facebook status), which combine broadcasting with SMS, to be the truly ultimate environments for self-centered communication. That might explain their popularity.
And with this, I think I have a reasonable answer to the question: Why tap when you can talk?

A new study adds an unexpected method to the list of ways to spur memories about our past: body position. That’s right: just holding your body in the right position means you’ll have faster, more accurate access to certain memories. If you stand as if holding a golf club, you’re quicker to remember an event that happened while you were golfing than if you position your body in a non-golfing pose. […]
Dijkstra’s team believes that the effect may be due to the way memories are stored in the brain: one theory of memory suggests that memories are composed of linked sensory fragments — odors, sights, sounds, and even body positions. Simply activating one or more of those fragments makes the entire memory more likely to be retrieved. In any case, if you’re trying to recall a particular incident in your life, putting your body in the right position might help you remember it faster and more accurately. The key appears to be your body position when the memory occurred.
[via Cognitive Daily — one of my new blog loves]
This is a really fascinating topic, and the first time I’ve seen it studied methodically.
What is the long term impact of our growing involvement in virtual life and digital communication?
If you see all movies in the same general context (on your iPod or computer screen), rather than surrounding them with a trip to a theatre, candy, theatre seats…
If your reading material consists of Word and eBook files, without the sizes, fonts and thickness of books…
If your letters are all emails or instant messages, without stamps, handwriting or envelopes…
If your conversations are all via SMS, with no body language or tonal cues beyond emoticons…
…and if all that happens within a very limited number of positions and locations (on the sofa, on a train, in the kitchen, at the desk), then what impact does that have on your personal internal journal of your life?
Is that why we need our cell phones to journal our [digital] lives, with automatic uploads of pictures to Flickr, video to YouTube, status to Facebook, and stream of consciousness to Twitter? Is it because we might experience life as a giant blur of digital events that in retrospect are difficult to tease apart from one another… or even difficult to call up as memories?
What does it take to make a digital experience memorable in the way a fully physical experience is? I doubt that color or even sound is enough. New trends in haptic and scent generation might help… but I suspect that the real trick is to tie the digital experience to the surrounding environment (or at least to a discrete digital environment): tying the mobile ad to the location, to the store, to the time of day, to the people in the room, for example.
A whole new meaning for Location Based Services.
July 16th, 2008