
IEEE Signal Processing Society
Speech & Language Technical Committee


411 Voice Interfaces: a user's perspective

By SVETLANA STENCHIKOVA

Several commercial companies have recently launched toll-free directory assistance (411) services driven by speech recognition. I decided to try the voice search offerings from three of them, Free 411, TellMe/Microsoft, and Google (GOOG-411), in a real-life situation. This article describes my informal exploration of these three systems.

If you live in a metropolitan area, you can probably relate to this scenario. One evening, after seeing a film at a Turkish film festival at Lincoln Center in New York City, we were in the mood for some Turkish food, but none of us had any idea how to find a Turkish restaurant nearby. We decided to give these 411 services a try.

I also tried a more traditional scenario supported by the current 411 system: finding a business by its name. To be fair to the systems, I tried two types of name searches. The first search was for a UPS store, a chain with more than 30 locations in NYC; I wanted to find one on the Upper West Side. The second search was for a place that, to the best of my knowledge, has a single location: a local hair salon in New York City called "Cuts for Girls".

I found a nice variety in how the systems handle the 411 assistance task. They differed in their confirmation strategies, in how they narrowed down the choices, and in how they presented information.

Free 411 (1-800-FREE-411)

Free 411 from Jingle Networks greets you in a friendly voice after a long advertisement. The system first asks the user for the desired city and state and then offers a search by category: residential, business, or government, a natural set of options in my opinion. Explicit confirmations make you feel confident that you are on the right track. The system also allows you to say a location. I tried "Columbus Circle" and it was successfully recognized.

When I entered the category "restaurant", the interface started listing a whole range of places, from a fast-food delivery place to a coffee lounge. This strategy may work if you are searching in an area with limited options, but in Manhattan, where there are tens of restaurants on each block, you clearly need some kind of filter. I decided to correct my category to "Turkish restaurant", but unfortunately "start over" and "go back" were not on the menu. I found myself stuck with the only option of hearing all restaurant choices near Columbus Circle. When I called the system a second time and went through the same menus, "Turkish restaurant" was unfortunately not understood. The system was quick to give up, telling me "there are no listing in your category" and hanging up.

Searching for a business by name was a mixed success. First, asking Free 411 for UPS found only 2 stores in New York City; it is very hard to believe that there are so few locations. While I was searching for the hair salon "Cuts for Girls", the system successfully handled a misunderstanding with an explicit confirmation. Then I was notified that I had won a Cancun vacation and was offered, multiple times, a transfer to the resort management. In the end the system read me the number, but unfortunately it was the wrong one.

Live Search 411 (1-800-CALL-411)

Call 411 is a voice service created by TellMe and Microsoft for business search.

This interface has a loud voice that does not stop talking when interrupted (although it does handle barge-ins). The interface allows the user to narrow down by category and location, and it recognized both the location and the category correctly. The lack of confirmation kept me in suspense: did it catch the "upper west side"? My suspense was resolved only in the last step, after I chose an option and heard a description with an address that was indeed on the Upper West Side. However, if I were not familiar with the area, I would still be unsure whether the address I had heard was in the correct neighborhood.

Once the search is narrowed down to a category and location, the system starts listing places by their names. Listing places by name makes sense if the name means something to you, when you can say "aha - that's the place!" However, when you are hearing the names for the first time, they may not mean anything to you, and some extra information, such as a street location, would be useful.

Navigation between options was also not that easy. After selecting one option and hearing the full description, I could get it texted or repeated, but I could not figure out how to hear more options in this category without starting over!

The search for a specific business was more productive. This system found 9 UPS stores in New York City - that's 7 more than Free 411! It did not let me narrow down by neighborhood, but hearing 9 choices is bearable. The fourth option turned out to be the one closest to me. My hair salon was found quickly and successfully, no Cancun vacations though.

Google 411 (1-800-GOOG-411)

When I talked to GOOG-411 I could almost picture their graphical user interface in a browser. The implicit confirmations were helpful and unobtrusive. Google first asks you for a category and then sometimes asks you to narrow down by location. The decision to ask for a more specific location may depend on how many options are available for the category. I found narrowing down a location problematic: I tried "Lincoln Center", "Columbus Circle", and various intersections, but no matter what I chose, the interface would just ignore my input. Luckily it is possible to input a zip code, which is a fault-tolerant method, but only if you happen to know the zip code.

When the location and category were recognized correctly, GOOG-411's information presentation was the most intuitive of the three.

When I searched for "Cuts for Girls", the place I wanted was not the system's first choice, but it was the second. After I said "go back", the system smartly offered me help by letting me spell the name. This shows a nice adaptive strategy. When I asked to hear more options, the system gave me only one more, but luckily it was the correct place! I was curious to see if there were other options, but the system rushed to connect me.

When I searched for a UPS store, the system was quick to offer the first result without asking me to narrow down to a neighborhood. I wonder how it decides which result is the "top" one among the matches. Is it based on popularity? Then the system gave a listing of 8 stores identified by street name. Two of the options were on Broadway, a street that stretches for several miles along Manhattan, so more specific location information would be helpful.

Conclusions

If I had a GPS that told me the zip code, GOOG-411 would be my choice of voice interface whenever I need to find a place nearby (surely the location can be determined from the phone call?). For searching by business name, I would probably choose CALL-411, although it would be nice to be able to narrow down by neighborhood. In my opinion, it is important for a directory assistance application to adapt its strategy to 1) the properties of the searched business, e.g. its popularity, 2) the type of result, e.g. the number and locations of the results, and 3) the area where the user is searching, e.g. densely or sparsely populated. Another issue, and an open research topic, is natural navigation through the extracted information.
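To make the idea of an adaptive strategy concrete, here is a minimal sketch of how a system might pick its next dialog move from the three factors named above. It does not describe how any of the three services actually works. The names `SearchContext` and `choose_strategy`, the popularity score, and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchContext:
    num_results: int         # how many listings matched the query
    dense_area: bool         # True for a densely populated area such as Manhattan
    top_popularity: float    # hypothetical popularity score of the best match, 0.0 to 1.0


def choose_strategy(ctx: SearchContext,
                    max_list_len: int = 9,
                    popularity_cutoff: float = 0.8) -> str:
    """Pick the next dialog move. Thresholds are illustrative guesses."""
    if ctx.num_results == 0:
        # Ask again instead of giving up and hanging up.
        return "reprompt"
    if ctx.num_results == 1 or ctx.top_popularity >= popularity_cutoff:
        # One clear winner: offer it directly.
        return "offer_top_result"
    if ctx.num_results > max_list_len:
        # Too many candidates to read aloud: narrow by location first.
        # In a dense area a neighborhood is the useful unit; elsewhere a city.
        return "ask_neighborhood" if ctx.dense_area else "ask_city"
    # Short list: read the names together with street locations.
    return "list_with_streets"


# A chain with 30 matches in a dense area and no dominant location.
print(choose_strategy(SearchContext(num_results=30, dense_area=True,
                                    top_popularity=0.2)))
# prints "ask_neighborhood"
```

The default list length of 9 simply mirrors my own experience that hearing 9 choices is bearable. A real system would need to tune such thresholds on user data.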

I am looking forward to seeing the evolution of voice information systems and, of course, to new kinds of voice interfaces.

Since this article was written, other informal comparisons of these systems have been appearing in the blogosphere. Why not try your own evaluation?


 