The ink isn’t quite dry on the contracts yet, but we took the opportunity
of having the Speech Science and Technology Conference
at Macquarie University to run a short workshop on the Virtual Laboratory. The
goal of the workshop was partly to publicise the project but also
to get some insight into the workflow that people were using in their research.
The SST Conference attracts a range of researchers with a common interest in
the analysis of speech acoustics. It is interesting in that it covers everything
from the technical end of speech and speaker recognition to the linguistics of
spoken language. It is often a place where we hear about cross-disciplinary work
between these areas, where computer scientists collaborate with linguists
and psychologists. As such, it is an excellent place to talk about the
goals of the HCSvLab.
We managed to get around 30 people to join us during the lunch break
and had a really useful discussion about corpora and tools and what we might
be able to do in the project to facilitate research in speech science.
We started by asking about the data that was being used in the research
being presented at the conference. Almost everyone said that they had
collected at least some data themselves as part of their study. In some cases
this was compared with data from existing corpora, but only a few people relied
solely on data collected by others. It was agreed that there might be more
scope for sharing or re-using data if it were easier to do; however,
there are well-known issues with ethics and spoken language data that
mean this isn’t going to be an easy task.
It was interesting that only a small number of tools were used to annotate
and analyse the data. People mentioned Emu
and ELAN for annotation, with R and Matlab
being used for data analysis. While a few other tools were mentioned, these
covered the majority of the work that was being reported at the conference. There
were suggestions that some way of sharing and documenting snippets of code
for R or Matlab would be useful; one of the pain points mentioned was code that
no longer worked for some unknown reason.
Another significant pain point was the manual annotation or coding of
speech data. While the technologists in the room were quick to suggest
the use of speech recognition systems to do transcription or
forced-alignment, this was moderated by some of the acoustic phonetics
researchers who were wary of the reliability of these systems. The point
was made, though, that if it were much easier to use these
systems to generate annotations than it is now, more people would try them
out and evaluate the results.
Another idea that was raised was crowdsourcing some part of the research process. It
was agreed that most of the annotation tasks in this field would require too much
expertise to be done by untrained people, but one might ask people to make judgements
about the similarity or acceptability of stimuli, which would be useful in some studies.
Overall this was a really useful kick-off meeting for the project. We gained
some insight into the workflow that goes into research in this field and got
some enthusiastic feedback from researchers about the project. Hopefully by the
time the next SST comes around (in two years in Christchurch, NZ) we’ll be
listening to papers that have been written with the help of the HCSvLab.
First HCSvLab Workshop at SST2012 by Steve Cassidy is licensed under a Creative Commons Attribution 3.0 Unported License.