
IEEE Signal Processing Society
Speech & Language Technical Committee


ConQuest Conquers Interspeech

By SATANJEEV BANERJEE

Who is Presenting When? Ask ConQuest!

At the recently concluded INTERSPEECH 2006 ICSLP speech conference in Pittsburgh, Pennsylvania, finding out which paper was being presented when was as easy as making a phone call.

A team of students at Carnegie Mellon University designed and implemented ConQuest, a spoken dialog system to answer all your Conference Questions. You could ask the system which sessions were scheduled each day, which papers were in each session, and when a particular person's paper was being presented. As an added attraction, you could vote for your favorite paper in the conference, just like on American Idol. A similar dialog system, DiSCoH, has been built by AT&T, ICSI and Edinburgh University to provide conference information, such as location and dates, for the IEEE/ACL SLT 2006 Workshop.

"Our goal was to get more experience in building an end-to-end real-world dialog system," says team member Stefanie Tomko, "and to collect data in a new domain."

ConQuest, like many other dialog systems built at CMU, was based on CMU's open-source Olympus architecture, which uses Sphinx as the speech recognizer, Theta as the speech synthesizer, and RavenClaw as the dialog manager. Theta is proprietary software from Cepstral LLC. When asked which part of building ConQuest he found most interesting, project leader Dan Bohus said he was pleasantly surprised that seemingly difficult problems, like recognizing paper titles, had simple solutions that worked well. "We just lexicalized each paper title," said Bohus.

"On the other hand, we landed up spending a lot of time on making sure that the different data sources like the database of paper titles, the language model, the dictionary, were always synchronized," he continued. "We hadn't anticipated how much time that would take."
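The article does not say how the team lexicalized titles or kept its data sources in step. As a rough illustration only, the sketch below assumes that lexicalizing means turning each paper title into a single vocabulary token, and that the language model vocabulary and the pronunciation dictionary are both generated from one title list so they cannot drift apart. The function names, the token format, and the ARPAbet-style pronunciations are all hypothetical and are not taken from ConQuest or Olympus.

```python
import re


def title_words(title):
    """Split a title into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", title.lower())


def lexicalize(title):
    """Collapse a whole title into one token (hypothetical format)."""
    return "_".join(title_words(title)).upper()


def build_resources(titles, word_prons):
    """Derive the vocabulary and the dictionary from the same title list.

    word_prons maps a lowercase word to its phone string; it is an
    illustrative stand-in for a real pronunciation lexicon.
    """
    vocab = []
    dictionary = {}
    for title in titles:
        words = title_words(title)
        missing = [w for w in words if w not in word_prons]
        if missing:
            # Fail loudly instead of letting the resources fall out of sync.
            raise KeyError(f"no pronunciation for {missing} in {title!r}")
        token = lexicalize(title)
        vocab.append(token)
        dictionary[token] = " ".join(word_prons[w] for w in words)
    return vocab, dictionary


# Made-up example data, not from the conference program.
titles = ["Spoken Dialog Systems"]
word_prons = {
    "spoken": "S P OW K AH N",
    "dialog": "D AY AH L AO G",
    "systems": "S IH S T AH M Z",
}
vocab, dictionary = build_resources(titles, word_prons)
```

Regenerating both outputs whenever the title list changes is one simple way to avoid the synchronization work Bohus describes, though the team may well have handled it differently.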

ConQuest was advertised through a demonstration on Sunday afternoon, and through posted signs and leaflets reminding people to try the system out. During the four days of the conference, a total of 175 calls were made to the system. The team plans to release this data to the research community.

And was the system useful to the callers?

"It was useful to me!" exclaims Ravi Mosur, Senior Systems Scientist at Carnegie Mellon University and a longtime speech recognition expert who knows better than most how difficult speech recognition is, especially in noisy conference conditions. "I wanted to know when someone's paper was scheduled, and the system told me!"

Even though the conference is over, the team plans to continue working on ConQuest.

"We want to generalize the system so it can be rapidly adapted to different conferences," says team member Rohit Kumar.

In other words, keep your phone handy: ConQuest may be coming soon to a conference near you!

The ConQuest team (in alphabetical order): Dan Bohus, Venkatesh Keri, Gopala Krishna, Rohit Kumar, Sergio Grau Puerto, Antoine Raux and Stefanie Tomko, all current students or visiting researchers at Carnegie Mellon University.


 