Voice Recognition using Dictionaries

Feb 26, 2022

Objective: Use a dictionary of commands to move to or look at objects.

Voice commands are used to move the Main Camera to an object or to look at an object.

Create a new scene and add three objects: a Cube, a Sphere and a Cylinder. Now select the Main Camera and add a C# script named VoiceCommands. The script requires five namespaces. This article uses Windows speech recognition, which Unity exposes through the UnityEngine.Windows.Speech namespace.
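The original listing of namespaces is not reproduced here. As an assumption based on the types used later in the article, a plausible set of five using directives would be:

```csharp
// Assumed set of namespaces; the article states only that five are required.
using System;                      // Action delegates
using System.Collections.Generic;  // Dictionary<TKey, TValue>
using System.Linq;                 // ToArray() on the dictionary keys
using UnityEngine;                 // MonoBehaviour, GameObject, Transform
using UnityEngine.Windows.Speech;  // KeywordRecognizer, PhraseRecognizedEventArgs
```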

Two variables must be declared. The first, _keywordRecognizer, is a KeywordRecognizer, which listens for a fixed list of spoken phrases. The second, the dictionary actions, is used to create command keys that are each linked to a value, except that here the value is an Action. Action is a delegate type in the System namespace, and it allows a method to be stored as the value.
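A minimal sketch of the two declarations follows. The article does not show the exact value type; Action<string> is assumed here so that each method can receive the spoken phrase, which matches the later description of Teleport and LookAt.

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceCommands : MonoBehaviour
{
    // Listens for a fixed list of spoken phrases (Windows only).
    private KeywordRecognizer _keywordRecognizer;

    // Spoken phrase -> method to run when that phrase is heard.
    // Action<string> is an assumption; a plain Action would also work
    // if the phrase were stored in a field instead of passed as an argument.
    private readonly Dictionary<string, Action<string>> actions =
        new Dictionary<string, Action<string>>();
}
```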

In void Start() the commands are added to the actions dictionary. The key, a string, is the exact phrase that must be spoken, and the value is either the Teleport or the LookAt method, depending on the key. Once the dictionary is populated, its keys are used to create the array of recognized phrases for the _keywordRecognizer. The OnPhraseRecognized event calls the RecognizedSpeech method whenever one of the keys in actions is spoken. Finally, _keywordRecognizer.Start() initiates the listening process for Windows speech.

Populate the dictionary and initialize Windows Speech.
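The setup described above might look like the following sketch. The command phrases are hypothetical; they are written so that the last word matches the default name of a scene object. The three methods at the bottom are empty stubs so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceCommands : MonoBehaviour
{
    private KeywordRecognizer _keywordRecognizer;
    private readonly Dictionary<string, Action<string>> actions =
        new Dictionary<string, Action<string>>();

    private void Start()
    {
        // Hypothetical phrases: the key is what must be spoken,
        // the value is the method that handles it.
        actions.Add("move to Cube", Teleport);
        actions.Add("move to Sphere", Teleport);
        actions.Add("move to Cylinder", Teleport);
        actions.Add("look at Cube", LookAt);
        actions.Add("look at Sphere", LookAt);
        actions.Add("look at Cylinder", LookAt);

        // The dictionary keys become the phrases the recognizer listens for.
        _keywordRecognizer = new KeywordRecognizer(actions.Keys.ToArray());
        _keywordRecognizer.OnPhraseRecognized += RecognizedSpeech;
        _keywordRecognizer.Start();
    }

    // Empty stubs, present only so this sketch compiles in isolation.
    private void RecognizedSpeech(PhraseRecognizedEventArgs speech) { }
    private void Teleport(string command) { }
    private void LookAt(string command) { }
}
```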

The RecognizedSpeech() method receives a PhraseRecognizedEventArgs, named speech, and uses its text property, the recognized phrase, as the key for the dictionary. Invoke is then used to call the action value, that is, the method linked to that key in the dictionary.
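A sketch of such a handler is shown below, again assuming the dictionary holds Action<string> values. It uses TryGetValue instead of indexing the dictionary directly; that is a defensive substitution of mine, not something the article specifies.

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceCommands : MonoBehaviour
{
    private readonly Dictionary<string, Action<string>> actions =
        new Dictionary<string, Action<string>>();

    private void RecognizedSpeech(PhraseRecognizedEventArgs speech)
    {
        // speech.text holds the recognized phrase.
        Debug.Log(speech.text);

        // Look up the method registered for this phrase and call it,
        // passing the phrase along so the method can pick out its target.
        if (actions.TryGetValue(speech.text, out Action<string> action))
        {
            action.Invoke(speech.text);
        }
    }
}
```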

The Teleport() and LookAt() methods are similar in that each receives the speech.text string as command. The command is split with Split() into a words array, and the last word of the phrase is used to find the target object. The Main Camera then either moves to the target, with an offset so that the object can be seen, or turns to look at the target.
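The two methods could be sketched as follows. The offset value and the FindTarget helper are hypothetical, introduced only for illustration. The sketch also assumes the script is attached to the Main Camera, so transform refers to the camera, and that the last word of each phrase exactly matches an object name, since GameObject.Find is case-sensitive.

```csharp
using UnityEngine;

public class VoiceCommands : MonoBehaviour
{
    // Hypothetical offset that keeps the camera back from the target.
    [SerializeField] private Vector3 _offset = new Vector3(0f, 1f, -5f);

    private void Teleport(string command)
    {
        GameObject target = FindTarget(command);
        if (target != null)
        {
            // Move the camera near the target rather than inside it.
            transform.position = target.transform.position + _offset;
        }
    }

    private void LookAt(string command)
    {
        GameObject target = FindTarget(command);
        if (target != null)
        {
            // Rotate the camera to face the target.
            transform.LookAt(target.transform);
        }
    }

    // Hypothetical helper: the last word of the phrase names the object.
    private static GameObject FindTarget(string command)
    {
        string[] words = command.Split(' ');
        return GameObject.Find(words[words.Length - 1]);
    }
}
```

GameObject.Find returns null when no active object has that name, so the null checks leave the camera untouched if a phrase names something that is not in the scene.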

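Using an Action as the value for a dictionary is a powerful tool: adding a new voice command takes only one more dictionary entry rather than another branch in an if or switch statement.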


Written by Hal Brooks
