Voices Across Birmingham
BY MARTIN RUSSELL
One of the characteristics of British
English is the range and diversity of its regional accents
and dialects. Although it is acknowledged that inter-accent
variation has implications for speech technology, most of
the evidence to date is anecdotal. The dearth of hard
experimental evidence on the effect of accent on, for
example, automatic speech recognition performance is due
mainly to an absence of suitable data. For this reason, in
2003 Aurix Limited funded the
University of Birmingham
to create the "Accents of the British Isles" (ABI) corpus,
comprising approximately 100 hours of transcribed recordings
of accented English from 14 different locations in the
British Isles (Elgin (Scottish Highlands), Glasgow,
Newcastle, Ulster, Dublin, Liverpool, Hull (East Yorkshire),
Burnley (Lancashire), Denbigh (North Wales), Birmingham,
Lowestoft (East Anglia), Truro (Cornwall) and Inner
London). At each town or city the goal was to record ten
men and ten women who were born in that location and had
lived there all of their lives. ABI also contains
recordings of 'Standard Southern English'.
In 2006 the University created a spin-out
company, The Speech Ark, as a vehicle for creating further
corpora. Its first project, ABI-2, was to record a new
corpus which extends the original ABI corpus to include
thirteen new regional accents (Edinburgh, Hartlepool, Leeds,
Stoke-on-Trent, Coalville (Leicestershire), Shrewsbury
(Shropshire), Hereford, Caernarfon (North Wales), Cardiff,
Bristol, Yeovil (West Country), Gornal (Black Country) and
Southend-on-Sea). Together, the two ABI corpora contain
almost 200 hours of recordings. They provide a unique
resource for speech science and technology research and a
‘snapshot’ of British English regional accents at the start
of the 21st century.
The Speech Ark's most recent project is
"Voices across Birmingham". The objective is to record 200
hours of telephone conversational speech between people from
the West Midlands. This is a multicultural community:
the most recent census indicates that in 2001, 20% of the
population had a broadly Asian ethnic background.
One of the challenges of "Voices across Birmingham" is to
represent this diversity. Once the Birmingham corpus is
complete, The Speech Ark plans to create similar corpora for
other regions of the British Isles.
For further information see
The Speech Ark.