
IEEE Signal Processing Society
Speech & Language Technical Committee


Saras Institute Records the History of Speech and Language Technology

BY ANTONIO ROQUE

 

The History of Speech and Language Technology Project, coordinated by the non-profit Saras Institute, is in the process of recording interviews with seminal researchers in the field and making content from these interviews available on the web.  The Saras Institute states that this effort is being done "so that future generations can better appreciate the insights and perspectives that created this technology."

Dr. Janet Baker, head of the Saras Institute, says, "There has been a significant level of interest in these materials and we are eager to make them available. All of the work that has been done to date with Saras Institute, including making these interviews, has been done on a volunteer basis." Dr. Baker continues, "Saras Institute has also collected an extensive archive of historical technology artifacts, from research materials and historical videos to products and marketing 'tchotchkes'. Several popular presentations including live demonstrations and exhibits of these materials have been made in recent years at international technical conferences and institutes."

A number of interviews have already been recorded by the Saras Institute, and several are in the process of being transcribed by ISCA's Student Advisory Committee, which welcomes additional volunteers; interested parties may contact volunteers [at] isca-students [dot] org.

Two interview excerpts are included below.  The interviews were conducted in 2005 by Dr. Baker, and in them Dr. Sadaoki Furui and Dr. Isabel Trancoso discuss their early years in the field.  We look forward to providing additional excerpts in the coming months.

Sadaoki Furui

Q: We're very interested, as I had mentioned, in trying to understand sort of how people got into this field, and why they do what they do... what brought you into speech technology?

A: I was studying signal processing at the university graduate school and I used many signals and one of them was speech. So I started studying signal but my interest moved from signal to speech. So I wanted to find a place to work on speech research because speech is so interesting and also so difficult so I thought speech was very interesting area for my life's subject of research. So I was looking for some places to work in on speech in Japan and definitely NTT - Nippon [Telegraph and] Telephone was the leader of the speech research at the time, late 1970, it was 35 years ago.

And worldwide, AT&T was the leader at the time, and NTT and AT&T were working together collaborating, they had very good collaboration. That was very lucky for me. After starting my career at NTT I had the chance to have communication with AT&T people and I got a chance to come to the US for just one year, 1978-1979 staying at Bell Labs Acoustics Research Laboratory and ... Jim Flanagan and that was my great time to make some more steps of my research career.

Q: Professor [Fumitada] Itakura who was good enough to speak with us yesterday mentioned that he was the first person to go from NTT over to Bell Labs and work with Jim Flanagan's group and so forth.

A: Yes, yes, always he was and he is my leader, and when I started my career in 1970 at NTT he was my leader, he was my supervisor.

Q: Yes, and you went from that position to taking charge of that laboratory!

A: (laughs) Right!

 

Isabel Trancoso

Q: So one of the questions that we are asking everybody is: how did you get into the field?

A: Believe it or not, I started working on speech encryption, for the Portuguese Army! It was a very simple project, but it had to be developed in microprocessors, so it was assembly code...! And we had to split the input speech into frames, frequency-invert some of them according to permutation code, and send it over. That was the way I got involved in speech! So it was a kind of final graduation project.

Then I started working on a PhD thesis, and the topic was speech coding, narrow-band speech coding, and I presented a paper at ICASSP, it was a paper on trying to improve multipulse which had just been presented a couple of years before, and Bishnu [Atal] was in the audience, he thought the work showed some promise, so he invited me to spend a year at Bell Labs - actually it turned out to be less than that for family reasons - and it was great. I mean the first three months were awful, I missed my family like crazy, and the work was not progressing that much, but then Bishnu showed me a coder that he had which he called his 'impractical coder', he had finished working with it because it took like a hundred times processing real-time... in a Cray! So... But the quality was so amazing for the standards at that time that I said 'Bishnu I'd rather work on that,' and so that turned out the best decision I ever made. I had a fantastic time working with him.

Q: What year was that?

A: 84-85. So it was really great, he's a fantastic person to work with, he knows when we have time to explore new things and when it's time to get down to business and really produce something, because we were working under contract for the Department of Defense. So I mean everything turned out all right.

And then I came back to Portugal and working on coding didn't become that relevant. I had a project with some companies, some international companies still, but we didn't have the resources to follow all the testing, all the evaluation involved and standards and things like that, so working on coding became less and less relevant, and I got very enthusiastic in working for Portuguese.

That meant that I-- there was nothing. There were no databases, nothing, so I spent a lot of years building resources - too many, by the time the resources were ready we were far behind in technology I should say - but I got a fantastic group of people, I mean we it's the best working environment I could dream with. I teach at University and we work in Portuguese. In terms of paper production it was not the good decision to do. I mean it's very hard. By the time we produce something and submit it to a journal, our most typical review is 'why don't you test on a standard corpora?' the standard corpora obviously not being in Portuguese. I mean even if we tried in the past to do things that were comparable in dimension, like having a Portuguese version of Wall Street Journal and things like that, Portuguese version of the SPEECHDAT telephone corpus, everything comparable, even so it gets difficult. So that has, you know... but then I got very wonderful cooperation with linguists and more recently a wonderful cooperation with computer science people so we have I guess all the right ingredients.

Q: In Portugal? 

A: Yes. So we have the right ingredients now to expand and work on things that are more spoken-language oriented rather than signal-processing oriented.


 