Saras Institute Records the History of Speech and Language Technology
BY ANTONIO ROQUE
The History of Speech and Language Technology Project, coordinated by the non-profit Saras
Institute, is in the process of recording interviews with
seminal researchers in the field and making
content from these interviews available on the web. The Saras Institute states
that this effort is being done "so that future generations can better appreciate
the insights and perspectives that created this technology."
Dr. Janet Baker, head of the
Saras Institute, says, "There has been a significant level of interest in these
materials and we are eager to make them available. All of the work that has been
done to date with Saras Institute, including making these interviews, has been
done on a volunteer basis." Dr. Baker continues, "Saras Institute has also
collected an extensive archive of historical technology artifacts, from research
materials and historical videos to products and marketing 'tchotchkes'. Several
popular presentations including live demonstrations and exhibits of these
materials have been made in recent years at international technical conferences
and institutes."
A number of interviews have
already been recorded by the Saras Institute, and several are in the process of
being transcribed by ISCA's Student Advisory Committee, which welcomes
additional volunteers; interested parties may contact volunteers [at] isca-students
[dot] org.
Two interview excerpts are
included below. The interviews were conducted in 2005 by Dr. Baker, and in them
Dr. Sadaoki Furui and Dr. Isabel Trancoso discuss their early years in the
field. We look forward to providing additional excerpts in the coming months.
Sadaoki Furui

Q:
We're very interested, as I had mentioned, in trying
to understand sort of how people got into this field, and why they do what they
do... what brought you into speech technology?
A:
I was studying signal processing at the university
graduate school and I used many signals and one of them was speech. So I started
studying signal but my interest moved from signal to speech. So I wanted to find
a place to work on speech research because speech is so interesting and also so
difficult so I thought speech was very interesting area for my life's subject of
research. So I was looking for some places to work in on speech in Japan and
definitely NTT - Nippon [Telegraph and] Telephone was the leader of the speech
research at the time, late 1970, it was 35 years ago.
And worldwide, AT&T was the
leader at the time, and NTT and AT&T were working together collaborating, they
had very good collaboration. That was very lucky for me. After starting my
career at NTT I had the chance to have communication with AT&T people and I got
a chance to come to the US for just one year, 1978-1979 staying at Bell Labs
Acoustics Research Laboratory and ... Jim Flanagan and that was my great time to
make some more steps of my research career.
Q:
Professor [Fumitada] Itakura who was good enough to
speak with us yesterday mentioned that he was the first person to go from NTT
over to Bell Labs and work with Jim Flanagan's group and so forth.
A:
Yes, yes, always he was and he is my leader, and when
I started my career in 1970 at NTT he was my leader, he was my supervisor.
Q:
Yes, and you went from that position to taking charge
of that laboratory!
A:
(laughs) Right!
Isabel Trancoso

Q:
So one of the questions that we are asking everybody
is: how did you get into the field?
A:
Believe it or not, I started working on speech
encryption, for the Portuguese Army! It was a very simple project, but it had to
be developed in microprocessors, so it was assembly code...! And we had to split
the input speech into frames, frequency-invert some of them according to
permutation code, and send it over. That was the way I got involved in speech!
So it was a kind of final graduation project.
Then I started working on a
PhD thesis, and the topic was speech coding, narrow-band speech coding, and I
presented a paper at ICASSP, it was a paper on trying to improve multipulse
which had just been presented a couple of years before, and Bishnu [Atal] was in
the audience, he thought the work showed some promise, so he invited me to spend
a year at Bell Labs - actually it turned out to be less than that for family
reasons - and it was great. I mean the first three months were awful, I missed
my family like crazy, and the work was not progressing that much, but then
Bishnu showed me a coder that he had which he called his 'impractical coder', he
had finished working with it because it took like a hundred times processing
real-time... in a Cray! So... But the quality was so amazing for the standards
at that time that I said 'Bishnu I'd rather work on that,' and so that turned
out the best decision I ever made. I had a fantastic time working with him.
Q:
What year was that?
A:
84-85. So it was really great, he's a fantastic person
to work with, he knows when we have time to explore new things and when it's
time to get down to business and really produce something, because we were
working under contract for the Department of Defense. So I mean everything
turned out all right.
And then I came back to
Portugal and working on coding didn't become that relevant. I had a project with
some companies, some international companies still, but we didn't have the
resources to follow all the testing, all the evaluation involved and standards
and things like that, so working on coding became less and less relevant, and I
got very enthusiastic in working for Portuguese.
That meant that I-- there
was nothing. There were no databases, nothing, so I spent a lot of years
building resources - too many, by the time the resources were ready we were far
behind in technology I should say - but I got a fantastic group of people, I
mean we-- it's the best working environment I could dream with. I teach at
University and we work in Portuguese. In terms of paper production it was not
the good decision to do. I mean it's very hard. By the time we produce something
and submit it to a journal, our most typical review is 'why don't you test on a
standard corpora?' the standard corpora obviously not being in Portuguese. I
mean even if we tried in the past to do things that were comparable in
dimension, like having a Portuguese version of Wall Street Journal and things
like that, Portuguese version of the SPEECHDAT telephone corpus, everything
comparable, even so it gets difficult. So that has, you know...
but then I got very
wonderful cooperation with linguists and more recently a wonderful cooperation
with computer science people so we have I guess all the right ingredients.
Q:
In Portugal?
A:
Yes. So we have the right ingredients now to expand
and work on things that are more spoken-language oriented rather than
signal-processing oriented.