ConQuest Conquers Interspeech
By SATANJEEV BANERJEE
Who is Presenting When? Ask ConQuest!
At the recently concluded INTERSPEECH 2006 ICSLP
speech conference in Pittsburgh, Pennsylvania, finding out which paper was being
presented when was as easy as making a phone call.
A team of students at Carnegie Mellon University designed and implemented
ConQuest, a spoken dialog system to answer all your Conference Questions.
You could ask the system what sessions were scheduled each day, which papers
were in each session and when a particular person's paper was being presented.
As an added attraction, you could vote for your favorite paper at the
conference, just like on American Idol. A similar dialog system, DiSCoH, has
been built by AT&T, ICSI and Edinburgh University to provide conference
information such as location, dates, etc., for the IEEE/ACL SLT 2006 Workshop.
"Our goal was to get more experience in building
an end-to-end real-world dialog system," says team member Stefanie Tomko, "and
to collect data in a new domain."
ConQuest, like many other dialog systems built at
CMU, was based on CMU's open-source Olympus
architecture, which uses Sphinx as the speech recognizer, Theta as the speech
synthesizer, and RavenClaw as the dialog manager. Theta is proprietary software
from Cepstral LLC. When asked which part of the experience of building ConQuest
he found most interesting, project leader Dan Bohus said he was pleasantly
surprised that seemingly difficult problems, like recognizing paper titles, had
simple solutions that worked well. "We just lexicalized each paper title," said
Bohus.
"On the other hand, we landed up spending a lot
of time on making sure that the different data sources like the database of
paper titles, the language model, the dictionary, were always synchronized," he
continued. "We hadn't anticipated how much time that would take."
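To make the two points above concrete, here is a minimal sketch in Python. It is not the ConQuest code, and it is only one plausible reading of "lexicalized": each paper title is collapsed into a single vocabulary token, and every derived resource is built from one title list so the pieces cannot drift apart. The function names, the token format, and the sample titles are all assumptions made for illustration.

```python
import re


def lexicalize(title: str) -> str:
    """Collapse a title into one token (hypothetical format: lowercase words joined by '_')."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "_".join(words)


def build_resources(titles):
    """Derive every lookup table from the same title list so they stay in sync.

    Returns two dicts keyed by the lexicalized token:
      token -> original title   (stand-in for the paper database)
      token -> list of words    (stand-in for what a dictionary entry would be built from)
    """
    token_to_title = {}
    token_to_words = {}
    for title in titles:
        token = lexicalize(title)
        if not token:
            continue  # skip titles with no usable characters
        token_to_title[token] = title
        token_to_words[token] = token.split("_")
    return token_to_title, token_to_words


# Hypothetical titles, not taken from the conference program.
titles = [
    "Error Handling in Spoken Dialog Systems",
    "A Study of Noise-Robust Features",
]
db, lexicon = build_resources(titles)
```

Because both tables come out of one pass over one list, adding or renaming a paper changes them together, which is the kind of synchronization the team says took more time than expected.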
ConQuest was advertised through a demonstration
on Sunday afternoon, and through posted signs and leaflets reminding people to
try the system out. During the four days of the conference, a total of 175 calls
were made to the system. The team plans to release this data to the research
community.
And was the system useful to the callers?
"It was useful to me!" exclaims Ravi Mosur,
Senior Systems Scientist at Carnegie Mellon University and a longtime speech
recognition expert who knows better than most how difficult speech recognition
is, especially in noisy conference conditions. "I wanted to know when someone's
paper was scheduled, and the system told me!"
Even though the conference is over, the team
plans to continue working on ConQuest.
"We want to generalize the system so it can be
rapidly adapted to different conferences," says team member Rohit Kumar.
In other words, keep your phone handy: ConQuest
may be coming soon to a conference near you!
The ConQuest team (in alphabetical order): Dan
Bohus, Venkatesh Keri, Gopala Krishna, Rohit Kumar, Sergio Grau Puerto, Antoine
Raux and Stefanie Tomko, all current students or visiting researchers at
Carnegie Mellon University.