Unlocking the Power of Spoken Language
—
Actions on Google
Table of contents
Communicate what was understood
Offer examples that illustrate what people can say, and how
Avoid stating the obvious
Give users credit and save extra guidance for those who need it
The advantage of speech-enabled services is that people already know how to talk. A well-designed voice user interface (VUI) is intuitive: commands don’t have to be taught, unlike the meaning of a button in a visual interface or the keys on a touchtone phone system. Still, sometimes we do have to let people know what they can say, either because they’ve asked for help or because they’re unsure how to proceed (especially if they’re new users).
Here are some pointers on building a VUI that can carry on better conversations.
Communicate what was understood
If a person asks a question, or asks how to perform a task or action, the VUI should communicate what the system (the “recognizer”) understood (or parsed) about the question. That way people know they’ve been heard, which bolsters their trust in the speech technology. These are what we call implicit confirmations of the user’s intent:
User: Who made the statue David?
VUI: David was created by Michelangelo.
Remember, due to the linear nature of an audio-only interface, people have no way of skipping ahead. They have to hear one word. Before. They hear. The next. Contrast this with how a screen might return a written response all at once, and can therefore get away with saying “Michelangelo, artist” (most likely accompanied by an image). The VUI, on the other hand, needs to signal to the user that what they’re hearing is in fact what they asked for. Put another way, new information is presented last, which is known as the end-focus principle.
Offer examples that illustrate what people can say, and how
A VUI should also offer people a range of ways they can phrase their queries, or examples that illustrate levels of complexity, as in these exchanges:
User: How do I cancel an alarm?
VUI: If you have only one alarm and you want to cancel it, just say "Cancel my alarm." Otherwise, try saying things like "Cancel my 7 a.m. alarm" or "Cancel all my alarms."
User: How do I set an alarm?
VUI: To set an alarm, try saying things like "Set an alarm for 6 a.m.," or "Wake me up in two hours," or "Set a ‘medication’ alarm for every day at 7 p.m."
Avoid stating the obvious
Uninformative messages make your voice persona (which is also your brand) sound hapless and lacking in depth. People don’t appreciate any device underestimating their intelligence. Let’s take the example of a person who asks, “How do I get the news?”
Consider the entirely unhelpful:
VUI: To get the news, say “Get the news.”
And swapping words for variation doesn’t add value, either:
VUI: To listen to the news, say “Hear the news.”
The person probably already knows that phrasing or has tried framing the query just that way before. People are likely trying to explore what else they can do. Instead, build in some intuitive instruction to help them infer the broader realm of possibilities:
VUI: If you’re interested in recent headlines, you can say, for instance, “Tell me the latest news.” Or try asking for a specific category, like technology or sports news.
Still, a VUI often needs to tell people what to say, without anyone explicitly asking for help. Here, too, you should avoid stating the obvious.
For example, instead of using a literal instruction like this one:
VUI: Editing your shopping list is easy. To add an item, just say “Add,” followed by the item you’d like to add. To remove an item, just say “Remove,” followed by the item you’d like to remove.
Consider being intuitively obvious instead:
VUI: By the way, to edit your list, you can say something like “Add toothpaste” or “Take off the ice cream.”
Give users credit and save extra guidance for those who need it
A VUI shouldn’t try to “teach” people what to say to protect them from veering off the so-called “happy path.” Instruction is irrelevant for those who aren’t having problems, which should be most people if you’ve designed an intuitive VUI. Instead, give instructions in fallback paths and in repair (error) prompts, as illustrated in the examples below. This way, you optimize relevance for people who don’t need help, but offer help when someone seems to be stuck.
Don’t assume that everyone needs help knowing how to ask for what they want:
User: Start a metronome.
VUI: What tempo did you want to start with? You can say, for instance, “110 beats per minute.” Or you can give me a tempo like “Allegro” or “Moderately fast.” (We can always speed it up or slow it down later.)
User: 92 beats per minute.
VUI: Okay, 92 beats per minute. Here you go.
This is bad behavior on the part of the VUI: it asks a question and then immediately continues to talk without handing over the turn to the user. This approach also requires the person either to wait for the lengthy initial message to finish or to interrupt (if barge-in is enabled), which forces them to be a poor conversation participant as well.
Instead, consider the sequential, time-consuming nature of speech, and yield the speaking turn back to the user:
User: Start a metronome.
VUI: Sure, what tempo?
User: [No reply]
VUI: You can say, for instance, “110 beats per minute.” Or you can give me a tempo like “Allegro” or “Moderately fast.” (We can always speed it up or slow it down later.)
User: 92 beats per minute.
VUI: Okay, 92 beats per minute. Here you go.
It may seem like fewer steps to offer suggestions right away. But in addition to actually taking longer, doing so presumes the user is a novice, which probably isn’t the case. Remember that people know what they want. Give them a chance before jumping in to help.
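The metronome exchange, with guidance held back for a fallback path, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TempoDialog class, its open and handle methods, and the TEMPO_PROMPTS list are hypothetical names invented here, not part of any Actions on Google API, and silence is modeled simply as a reply of None.

```python
from typing import Optional

# Hypothetical prompt tiers (assumption, not a real API): index 0 is the
# short opening question; later entries add examples only after silence.
TEMPO_PROMPTS = [
    "Sure, what tempo?",
    'You can say, for instance, "110 beats per minute." '
    'Or you can give me a tempo like "Allegro" or "Moderately fast."',
]


class TempoDialog:
    """Tracks how often the user stayed silent and escalates guidance."""

    def __init__(self) -> None:
        self.no_input_count = 0

    def open(self) -> str:
        # Ask briefly, then yield the turn to the user.
        return TEMPO_PROMPTS[0]

    def handle(self, reply: Optional[str]) -> str:
        # None or blank text stands in for a no-input event.
        if reply is None or not reply.strip():
            self.no_input_count += 1
            tier = min(self.no_input_count, len(TEMPO_PROMPTS) - 1)
            return TEMPO_PROMPTS[tier]
        # Implicitly confirm what was heard; no tempo parsing is attempted.
        heard = reply.strip().rstrip(".!?")
        return f"Okay, {heard}. Here you go."


dialog = TempoDialog()
print(dialog.open())
print(dialog.handle(None))
print(dialog.handle("92 beats per minute."))
```

A user who answers the opening question right away never hears the examples; only the silent user triggers the longer prompt.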
Best practices
Remember these guidelines when creating a voice experience:
→ Avoid stating the obvious
→ Communicate what the system understood
→ Offer meaningful examples when letting people know what they can say
→ Give instructions only if needed
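As a rough illustration of the guideline "Communicate what the system understood," the sketch below builds a spoken reply that restates the understood query first and puts the new information last, and contrasts it with a terse on-screen reply. The ParsedQuery type and both function names are hypothetical, introduced only for this example; a real recognizer would supply its own structures.

```python
from dataclasses import dataclass


@dataclass
class ParsedQuery:
    """Hypothetical result of parsing "Who made the statue David?"."""
    subject: str   # what the user asked about, e.g. "David"
    relation: str  # how the answer relates to it, e.g. "was created by"


def spoken_response(query: ParsedQuery, answer: str) -> str:
    # Restate what was understood, then end on the new information.
    return f"{query.subject} {query.relation} {answer}."


def screen_response(answer: str, label: str) -> str:
    # A screen shows everything at once, so it can skip the restatement.
    return f"{answer}, {label}"


query = ParsedQuery(subject="David", relation="was created by")
print(spoken_response(query, "Michelangelo"))
print(screen_response("Michelangelo", "artist"))
```

Keeping the answer in the final position means a listener who misheard the question's echo can tell immediately that something went wrong, before the new information arrives.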
© 2016 Google Inc. All rights reserved. Google and the Google logo are trademarks of Google Inc. All other company and product names may be trademarks of the respective companies with which they are associated.