Notes from the Interspeech Roundtable

International Senior: Hiroya Fujisaki, Local Host: Giuliano Bocci

 

First, each participant introduced himself or herself and gave a brief overview of his or her work. Andrea deMarco's main area of interest is automatic speaker identification. David Martinez is also interested in identification, but his focus is on language identification; he is also interested in noise-robust techniques. John Taylor is doing research on pitch estimation. Keng-hao Chang's interests are in the area of human-machine interaction; he is building a mental health monitor based on the human voice. Yan Tang is focusing on speech enhancement in noisy conditions. Charlotte Wollermann is interested in audiovisual prosody and focus. Last, but not least, our local host Giuliano Bocci's interests comprise the interplay between syntax and prosody in Italian. Hiroya Fujisaki's main research areas cover speech communication and language processing, natural language processing, and human and artificial intelligence, among others. He also developed a model of the process of fundamental frequency control in speech.

 

After the round of introductions we realized that we were all from different countries, and this brought us to the topic of entrainment, which Julia Hirschberg addressed in her keynote talk. Entrainment can often be observed when we travel: we try to speak like the other person, who comes from a different culture. The aim is to be understood by the communication partner, so successful communication can be seen as the overall goal.

 

The next topic was focus. Defining focus is not trivial, since natural language has many different types of focus and focus may therefore serve different functions. The marking of focus is language-dependent, so there are various strategies, and dialect can also play a role.

 

Another topic of discussion was the difference between pitch and fundamental frequency. Pitch is temporal, relative and perceptually relevant. In contrast, the fundamental frequency is precise, but not necessarily perceptually important. We also talked about algorithms for pitch extraction.
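
The notes do not record which pitch extraction algorithms were discussed. Purely as an illustration of the general idea, here is a minimal Python sketch of one common approach, autocorrelation-based estimation of the fundamental frequency of a single frame. The function name `estimate_f0`, the 50 to 500 Hz search range, and the synthetic test tone are assumptions made for this example, not something taken from the discussion.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=50.0, f0_max=500.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Illustrative sketch only: picks the lag with the largest
    autocorrelation inside the allowed period range.
    Returns None if the frame is silent or too short.
    """
    n = len(samples)
    if n == 0:
        return None

    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(samples) / n
    x = [s - mean for s in samples]

    energy = sum(v * v for v in x)
    lag_min = int(sample_rate / f0_max)            # shortest period considered
    lag_max = min(int(sample_rate / f0_min), n - 1)  # longest period considered
    if energy == 0.0 or lag_min < 1 or lag_min > lag_max:
        return None

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Un-normalised autocorrelation: fewer terms are summed at longer
        # lags, which favours the shortest matching period and helps
        # against picking a multiple of the true period (octave errors).
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    return sample_rate / best_lag


if __name__ == "__main__":
    # Assumed test signal: a 50 ms pure tone at 200 Hz, sampled at 8 kHz.
    rate = 8000
    tone = [math.sin(2 * math.pi * 200.0 * i / rate) for i in range(400)]
    print(estimate_f0(tone, rate))
```

For the synthetic tone the estimate should come out at about 200 Hz. Note that this measures fundamental frequency, the physical quantity; it says nothing about perceived pitch, which is the distinction made above. Real speech would also need framing, voicing detection, and smoothing, which are left out here.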

 

Discussing the fundamental frequency led us to the phenomenon of accentuation. We talked about the ABI (Accents of the British Isles) corpus, which contains recordings from different accent regions of the British Isles. We also discussed that there are different accent types, e.g. word accent and sentence accent. Humans can use accentuation to highlight particular information, but what about animals? Animals do not have linguistic knowledge. However, it is known that variation of the fundamental frequency can also be observed in animals, e.g. the fundamental frequency of their song plays a role when male songbirds attract female songbirds.