Description:
We describe a speech system for commanding robots in human-occupied outdoor military supply depots. To operate in such environments, the robots must be as easy to interact with as humans are; that is, they must reliably understand ordinary spoken instructions, such as orders to move supplies, as well as commands and warnings spoken or shouted from distances of tens of meters. These design goals preclude close-talking microphones and “push-to-talk” buttons, which are typically used to isolate commands from the sounds of vehicles, machinery, and irrelevant speech. We used multiple microphones to provide omnidirectional coverage. We developed a novel voice activity detector to detect speech and select the appropriate microphone to listen to. Finally, we developed a recognizer model that could successfully recognize commands heard amid other speech in a noisy environment. When evaluated on speech data collected in the field, this system performed significantly better than a more computationally intensive baseline system, reducing the effective false alarm rate by a factor of 40 while maintaining the same level of precision.
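The description above does not say how the voice activity detector works or how it chooses a microphone. Purely as an illustration of the general idea, the sketch below uses a simple energy threshold as a stand-in. It is not the novel detector described here. The function names, the per-microphone noise floor input, and the 6 dB margin are all hypothetical.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame of samples, in dB (floored to avoid log(0))."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def select_microphone(frames, noise_floor_db, margin_db=6.0):
    """Return the index of the channel most likely to carry speech, or None.

    frames: one list of samples per microphone, all covering the same time window.
    noise_floor_db: per-microphone noise estimate in dB (assumed to be given).
    margin_db: hypothetical threshold a channel must exceed above its noise floor.
    """
    best_index, best_excess = None, margin_db
    for index, (frame, floor) in enumerate(zip(frames, noise_floor_db)):
        excess = frame_energy_db(frame) - floor
        # Keep the channel whose energy rises furthest above its own noise floor.
        if excess >= best_excess:
            best_index, best_excess = index, excess
    return best_index


if __name__ == "__main__":
    quiet = [0.001] * 160  # about -60 dB
    loud = [0.1] * 160     # about -20 dB
    print(select_microphone([quiet, loud], [-60.0, -60.0]))
```

A plain energy threshold like this one would be expected to trigger on vehicles and machinery as readily as on speech, which is the kind of false alarm a purpose-built detector has to suppress.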