411 Voice Interfaces: a user's perspective
By SVETLANA STENCHIKOVA
Several companies have recently launched
toll-free directory assistance (411) services driven by speech recognition. I
decided to try the voice search offerings from three of them, Free 411,
TellMe/Microsoft, and GOOG-411, in a real-life situation. This article describes
my informal exploration of these systems.
If you live in a metropolitan
area, you can probably relate to this scenario. One evening, after seeing a film
at a Turkish film festival at Lincoln Center in New York City, we were in the mood
for some Turkish food, but none of us had any idea how to find a Turkish
restaurant nearby. We decided to give these 411 services a try.
I also tried a more traditional
scenario supported by the current 411 system: finding a business by its name.
To be fair to the systems, I tried two types of name searches. The first search
was for a UPS store, which has more than 30 locations in NYC; I wanted to
find one on the Upper West Side. The second search was for a place that, to the
best of my knowledge, has a single location: a local hair salon in New York City
called "Cuts for Girls".
I found a nice variety in how the
systems handle the 411 assistance task. They differed in their confirmation
strategies, in their strategies for narrowing down the choices, and in how they
present information.
Free 411 (1-800-FREE-411)
Free 411 from Jingle Networks
greets you in a friendly voice after a long advertisement. The system
first asks the user for the desired city and state and then offers a search by
category: residential, business, or government, a natural set of options in my
opinion. Explicit confirmations make you feel confident that you are on the right
track. The system allows you to say a location. I tried "Columbus Circle" and it
was successfully recognized.
When I entered the category
"restaurant", the interface started listing a whole range of places, from a
fast-food delivery place to a coffee lounge. This strategy may work if you are
searching in an area with limited options, but in Manhattan, where there are
dozens of restaurants on each block, you clearly need some kind of filter. So I
decided to correct my category to "Turkish restaurant", but unfortunately
"start over" and "go back" were not on the menu. I found myself stuck with the
only option of hearing all restaurant choices near Columbus Circle. When I
called the system a second time and went through the same menus, "Turkish
restaurant" was unfortunately not understood. The system was quick to give up,
telling me "there are no listing in your category" and hanging up.
Searching for a business by
name was a mixed success. First, asking Free 411 for UPS found only 2 stores in
New York City; it is very hard to believe that there are so few locations. When
I searched for the hair salon "Cuts for Girls", the system successfully handled
a misunderstanding with an explicit confirmation. Then I was notified of winning
a Cancun vacation and was repeatedly offered a transfer to the resort
management. In the end the system read me the number, but unfortunately it was
the wrong one.
Live Search 411 (1-800-CALL-411)
Call 411 is a voice service
created by TellMe and Microsoft for business search.
This interface has a loud voice
that keeps talking when interrupted (although it does handle
barge-ins). The interface allows the user to narrow down by category and
location, and it recognized both correctly. The lack of confirmation kept me in
suspense: did it catch the "upper west side"? My suspense was resolved only at
the last step, after I chose an option and heard a description with an address
that was indeed on the Upper West Side. However, if I were not familiar with the
area, I would still be unsure whether the address I had heard was in the correct
neighborhood.
Once the search is narrowed down to a
category and location, the system starts listing places by name. Listing
places by name makes sense if the name means something to you, when you
can say "aha - that's the place!" However, when you hear the names for the first
time they may not mean anything to you, and some extra information, such
as a street location, would be useful.
Navigation between options was
also not that easy. After selecting one option and hearing the full description,
I could get it texted or repeated, but I could not figure out how to hear more
options in this category without starting over!
Searching for a specific business
was more productive. This system found 9 UPS stores in New York City, 7
more than Free 411! It did not let me narrow down by neighborhood, but hearing 9
choices is bearable. The fourth option turned out to be the one closest to me.
My hair salon was found quickly and successfully, with no Cancun vacations this
time.
Google 411 (1-800-GOOG-411)
When I talked to GOOG-411, I
could almost picture its graphical user interface in a browser. The implicit
confirmations were helpful and unobtrusive. Google first asks you for a
category and then sometimes asks you to narrow down by location. The decision
to ask for a more specific location may depend on how many options are available
for the category. I found narrowing down a location problematic: I tried
"Lincoln Center", "Columbus Circle", and various intersections, but no matter
what I chose, the interface just ignored my input. Luckily, it is
possible to enter a zip code, which is a fault-tolerant method, but only if you
happen to know the zip code.
When the location and category
were recognized correctly, GOOG-411's information presentation was the most
intuitive of the three.
When I searched for "Cuts for
Girls", the place I wanted was not the system's first choice, but it was
the second. After I said "go back", the system smartly offered me help with
spelling the name. This shows a nice adaptive strategy. When I asked to hear
more options, the system gave me only one more, but luckily it was the correct
place! I was curious to see if there were other options, but the system rushed
to connect me.
When I searched for a UPS store, the
system was quick to offer the first result, without asking me to narrow down to
a neighborhood. I wonder how it decides which result is the "top" one among the
matches. Is it based on popularity? Then the system listed 8 stores
by street name. Two of the options were on Broadway, a street that
stretches for several miles along Manhattan, so more specific location
information would be helpful.
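To illustrate what "more specific location information" could mean, here is a minimal sketch of one possible approach: read out the cross street only when two or more results share a street name. The `Listing` type, its `cross_street` field, and the sample addresses are hypothetical and made up for illustration; nothing here reflects how GOOG-411 or the other services actually work.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    street: str
    cross_street: str  # hypothetical field; assumes the directory stores it


def spoken_locations(listings):
    """Return one spoken location per listing, adding the cross street
    only when the street name alone is ambiguous within the result list."""
    street_counts = Counter(item.street for item in listings)
    spoken = []
    for item in listings:
        if street_counts[item.street] > 1:
            spoken.append(f"{item.street} at {item.cross_street}")
        else:
            spoken.append(item.street)
    return spoken


# Made-up sample data, for illustration only.
results = [
    Listing("UPS Store", "Broadway", "West 70th Street"),
    Listing("UPS Store", "Broadway", "West 104th Street"),
    Listing("UPS Store", "Columbus Avenue", "West 82nd Street"),
]
for phrase in spoken_locations(results):
    print(phrase)
```

Keeping the short form for unambiguous streets avoids making every prompt longer than it needs to be.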
Conclusions
If I had a zip-code GPS
(surely this can be determined from a phone call?), GOOG-411 would be my choice
of voice interface whenever I need to find a place nearby. For searching by
business name, I would probably choose CALL-411, although it would be nice to be
able to narrow down by neighborhood. In my opinion, it is important to adapt the
system's strategy in the directory assistance application to 1) the properties
of the searched business, e.g. its popularity, 2) the type of result, e.g. the
number and locations of results, and 3) the area where the user is trying to
search, e.g. densely or sparsely populated. Another issue, and an open research
topic, is natural navigation of the extracted information.
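The adaptation suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the returned action labels, and the thresholds are hypothetical (the default of 9 simply echoes the observation that hearing 9 choices is bearable), and the sketch covers only the number of results and the density of the area, not business popularity.

```python
def next_prompt(num_results, dense_area, has_location, max_spoken=9):
    """Pick the next dialogue move for a directory search.

    All thresholds and action names are illustrative assumptions.
    """
    if num_results == 0:
        return "offer_spelling_or_start_over"
    if num_results == 1:
        return "read_listing"
    # In a dense area, assume even a shorter list is hard to tell apart.
    limit = max_spoken // 2 if dense_area else max_spoken
    if num_results > limit and not has_location:
        return "ask_for_neighborhood"
    return "list_with_street_details"


# A chain with 30 stores in a dense city: ask where the caller is first.
print(next_prompt(30, dense_area=True, has_location=False))
# Nine matches in a sparsely populated area: short enough to read out.
print(next_prompt(9, dense_area=False, has_location=False))
```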
I am looking forward to seeing the
evolution of voice information systems and, of course, to new kinds of
voice interfaces.
Since this article was written,
other informal comparisons of these systems (such as
this one) have been appearing in the blogosphere. Why not try your own
evaluation?