Students meet seniors event - group "Applications: Robotics, speech, industry"

We had students from different institutes and a senior whose work strongly reflected the demands of industry. We discussed our concerns and interests related to speech processing and robotics. This gave us a valuable opportunity to exchange ideas, get to know each other's work, and find potential opportunities for collaboration. At a large-scale conference like InterSpeech, such a conversation is indeed necessary for students to benefit directly from the research community. The following are some of the main areas of discussion:

Robotics, and human-robot interaction in particular, involves various challenges related to complex systems, which is why our group spanned a wide variety of disciplines. Some issues are common to all of them, including training, the intelligibility and quality of speech in noisy environments, speech recognition, reverberation and separation. Solving one problem sometimes depends strongly on the results of others.

Given the large amount of active research producing new ideas, Dr. Mauro Falcone raised two interesting and practical points confronting the industry. The first is: how backward compatible are our solutions? Researchers often focus on optimizing and increasing the performance of their systems. The industry, however, is more concerned with how a new solution can be implemented on top of existing systems. Given the cost of rebuilding infrastructure, services and related components, researchers should keep in mind the need for easy integration and backward compatibility. The second point was the current gap between what is possible and what has been achieved. Many laboratory achievements show significant potential, but in most countries researchers and end-users have little contact with each other. Consequently, we are missing opportunities to ensure our systems meet their requirements.

In the last part of the discussion, we covered two related topics. The first was our wish for an accessible and widely recognized database for speech and video. We found the AMI corpus currently adequate, though it has limitations. Last but not least, we perceived a similar need for a 'standardized library' of implementations of existing methods. This would be useful for comparing and evaluating new methods. If no code or software is provided, contacting the authors of a method is currently the only option.