
IEEE Signal Processing Society
Speech & Language Technical Committee


JHU 2007 Summer Workshops Tackle Language and Speech Topics

By BRIAN MAK

The Center for Language and Speech Processing (CLSP) of Johns Hopkins University continues to hold its annual JHU Summer Workshop on Human Language Technology, this year for 6 weeks from July 16th to Aug 23rd, 2007. During the workshop, speech and language scientists from universities, industry, and government research institutes will work with graduate and undergraduate students on two specific areas, in the hope of advancing the state of the art in speech and language technologies:

  • Exploiting Lexical & Encyclopedic Resources For Entity Disambiguation
  • Recovery from Model Inconsistency in Multilingual Speech Recognition

We conducted an "email interview" with the two team leaders and asked them (1) how they came up with these topics, (2) how they found their workshop team, and (3) what outcomes they were expecting. Below are brief descriptions of the two projects and the comments from the project leaders.



Exploiting Lexical & Encyclopedic Resources For Entity Disambiguation

Group Leader: Massimo Poesio

Description: To improve entity disambiguation by developing better techniques for tracking entities and for extracting their properties. A particular focus will be improving entity tracking by using lexical and encyclopedic knowledge extracted both from structured lexical databases and from semi-structured repositories such as Wikipedia.

Comments from Massimo Poesio:

The topic of the project was born out of two considerations that have informed our work on coreference resolution for a long time; the decision to have it this year was taken because of a number of recent advances.

The first consideration is that the lack of commonsense knowledge, or the inability to use it, is clearly still the main problem facing systems doing semantic interpretation tasks such as coreference resolution and entity disambiguation. In the '80s, we tried to solve the problem by hand-coding the required knowledge and the inference rules. While this type of work greatly increased our understanding of the required inference processes, the methods that were developed could only be applied to systems working on very specific domains. In the '90s the emphasis in HLT shifted to work with large amounts of data; as a result, those early methods were abandoned in order to concentrate on achieving as much as possible using surface information. In the meantime, however, a great deal of effort was put into developing techniques for acquiring the desired knowledge automatically. This effort is beginning to pay off: recent methods for extracting knowledge from Web-sized corpora and from resources like Wikipedia, including those developed by myself, Ponzetto, Strube, and Versley (all participants in the workshop), have been shown to lead to improvements in coreference over standard methods using only surface information. In particular, Wikipedia is proving a tremendous source of encyclopedic knowledge.

The second consideration is that ultimately HLT systems are only incorporated in real applications when they are shown to improve performance in those applications. In other work we showed that coreference technology is reaching the point where it can lead to improvements in the performance of automatic summarizers. Entity disambiguation is a task in which we can expect an even bigger improvement from the use of coreference resolvers.

The advances that made us decide to have the workshop include, in addition to the already mentioned improvements in our techniques for extracting lexical and encyclopedic knowledge, improvements in machine learning technology and the availability of new resources. Work by Moschitti and Yang on using tree kernels for semantic interpretation, by Yang on models of training instances, and by Andrew McCallum's group (including Rob Hall and Michael Wick, who are participating in the workshop) on global models for entity disambiguation is resulting in models much more appropriate for entity disambiguation, particularly for using lexical and encyclopedic knowledge. Finally, the recent release of the OntoNotes corpus, annotated for coreference, will provide a much more solid foundation for work on coreference, and new corpora for entity disambiguation have also been released. Our group is also busy creating resources for cross-document coreference under the coordination of David Day and Janet Hitzeman, who are also participating (David Day is co-chair).

Based on these considerations, the team in a sense chose itself. (A number of other groups will participate in the effort as external collaborators.) The outcomes we hope for are, first of all, a clear demonstration that automatically extracted commonsense knowledge does help coreference resolution and, secondly, a demonstration that coreference leads to improvements in entity disambiguation. As bonuses, we expect to deliver a publicly available system for coreference resolution that incorporates the best ideas around, and new resources such as annotated corpora.


Recovery from Model Inconsistency in Multilingual Speech Recognition

Group Leader: Hynek Hermansky

Description:
Current ASR has difficulty handling unexpected words, which are typically replaced by acoustically acceptable words with high prior probability. Identifying the parts of the message where such a replacement could have happened may allow for corrective strategies. The project will focus on the detection and description of out-of-vocabulary and mispronounced words in the 6-language CallHome database. Additionally, to describe the suspect parts of messages, a language-independent recognizer of speech sounds will be developed and applied to the phonetic transcription of the identified suspect parts of the recognized message.
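
To make the idea of locating suspect parts of a message concrete, here is a minimal sketch of one possible approach. It is not the workshop's method, and everything in it is an assumption for illustration: it supposes that two per-frame phone posterior streams are available, one from a recognizer constrained by vocabulary and language model ("top-down") and one from an unconstrained phone recognizer ("bottom-up"), and it flags frames where the two disagree strongly. The function names, the threshold, and the toy numbers are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) for two discrete distributions over the same phone set."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def flag_suspect_frames(bottom_up, top_down, threshold=1.0):
    """Return indices of frames where the two posterior streams diverge.

    bottom_up, top_down: lists of per-frame posterior vectors (hypothetical
    inputs; real systems would produce these from acoustic models).
    threshold: divergence above which a frame is treated as suspect
    (an arbitrary illustrative value).
    """
    return [
        t
        for t, (p, q) in enumerate(zip(bottom_up, top_down))
        if kl_divergence(p, q) > threshold
    ]


# Toy data: 3 phone classes, 4 frames. The numbers are invented.
bottom_up = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]
top_down = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]

suspect = flag_suspect_frames(bottom_up, top_down)
print(suspect)  # [2]
```

In this toy example only the third frame is flagged, because the constrained stream strongly prefers a phone that the unconstrained stream considers unlikely. A run of such frames would mark a region worth re-transcribing phonetically.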

Comments from Hynek Hermansky:

The project was proposed since we believe that current ASR is too top-down heavy and behaves like an idiot who is quite willing to give a wrong answer regardless of the data evidence. Clearly, the issue of OOV words, mispronunciations, etc. is big and important - would not you agree?

As an outcome, we hope to know more about this by the end of the summer than we know now and have an idea how to work towards ASR that would rely on top-down constraints only when appropriate and also be able to tell when the data are inconsistent with the prior knowledge. A proposed architecture for such a system is shown in Figure 1 below.

The team was formed from interested people who participated in the preparatory meeting (Chin, Geoff and me) and colleagues who were judged appropriate.


Figure 1: A new framework in developing acoustic models
 


 