Jul 27, 2011
Public Summary Month 7/2011
In the BABIR experiment, Aldebaran Robotics and Vocally aim to improve the quality of interaction between a human and a domestic robot through the introduction of low-level audio processing. A vocal interface (speaking and listening) is the most intuitive way for people to communicate, which is why a co-worker or companion robot needs good hearing and a good voice. Of course, the primary sense for a robot is vision, which enables action; much has been developed in this domain and much remains to be done. But if the new generation of robots is to come closer and closer to humans, communication becomes more and more important. It is therefore necessary that robots can speak and, above all, listen.
Today, speech recognition units perform poorly as soon as the acoustic environment is not perfectly controlled. The challenge we are taking up is to develop a speech recognition unit that works robustly on Nao, a 57 cm walking humanoid robot with limited embedded computing power, which may be up to 4 m from the speaker in a domestic and possibly noisy environment.
The BABIR partners are currently working on the specifications. The scenario they have agreed on is the following: the Nao robot is walking around when someone calls it. The robot must be able to distinguish the call despite the surrounding noise and its own internal noises (fan, mechanical noises, and the impact of its feet on the ground when walking).
A list of words and expressions to be recognized by Nao, in both French and English, has been fixed. In the first stage there are 18 of them, such as Yes, No, Nao, Don’t move, Silence, Higher, Thank you… After recognizing one of these expressions, Nao will act accordingly.
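To make the keyword-to-action idea concrete, here is a minimal sketch of how a recognized expression could be dispatched to a robot behavior. Everything in it is an assumption for illustration: the action names, the confidence threshold, and the `dispatch` function are hypothetical and do not reflect BABIR's actual implementation.

```python
# Hypothetical sketch: map a small fixed vocabulary (as in BABIR's 18-word
# list) to robot actions, ignoring low-confidence recognitions. The action
# strings and the 0.5 threshold are illustrative assumptions only.

ACTIONS = {
    "nao": "turn toward speaker",
    "don't move": "freeze all joints",
    "silence": "mute loudspeakers",
    "thank you": "nod",
}

def dispatch(keyword, confidence, threshold=0.5):
    """Return the action for a recognized keyword, or None when the
    recognition confidence is too low or the word is unknown."""
    if confidence < threshold:
        return None
    return ACTIONS.get(keyword.lower())

print(dispatch("Nao", 0.9))      # → turn toward speaker
print(dispatch("Silence", 0.3))  # → None (below confidence threshold)
```

In a noisy domestic setting, a confidence threshold of this kind is one simple way to trade missed commands against false triggers; the real unit would tune that trade-off against the robot's own fan and footstep noise.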