II. NAVIGATION SOFTWARE


Voice navigation software is essentially a command-and-control application, as previously explained in chapter II of this paper. It allows the user to open and close applications and to perform many menu-driven commands within specific applications. The two navigation software packages implemented and evaluated for this study are Microsoft's Voice Pilot 2.0, a part of the Windows Sound System software package, and Command Corp.'s IN³ Voice Command for SPARCstation. The latter was installed on a SPARCstation running SunOS 4.1.3.

A. Microsoft Windows Voice Pilot 2.0

Voice Pilot works with the Microsoft Windows 3.x operating systems. It is compatible with all MS Windows compatible applications. Once installed, the application is fairly easy to use. It comes with several "wizards" - macros that automate or simplify the setup or usage of an application - which enhance its simplicity (Figure 9). These wizards aid in creating voice commands and new vocabularies, setting user preferences, and training voice commands.

B. Evaluation

The evaluation of both IN³ and VoicePilot consisted of giving navigational commands and taking note of all errors that occurred. Usability and the ease of training the vocabulary and adding commands were also taken into consideration while evaluating both software packages.

1. VoicePilot Version 2.0

VoicePilot performed reasonably well in a moderately quiet environment. Moderately quiet means in this case that the environment was less quiet than that of a normal office. Even in this environment, the navigational accuracy of VoicePilot was nowhere close to the level that would be required in a noisy shipboard environment.

Figure 16 depicts the range of accuracy of VoicePilot over a period of six trials. VoicePilot was evaluated by actually navigating the supported Windows applications (MS Word, WordPerfect, and Program Manager) using 114 trained commands. The maximum accuracy reached by VoicePilot was 77.77%. Most users would probably desire a minimum of 90% accuracy. Below that level, it would be easier to navigate by hand.

Figure 16. VoicePilot accuracy
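
The accuracy figures discussed above are percentages of attempted commands that were carried out correctly. A minimal Python sketch of that arithmetic is given here, together with a check against the 90% level suggested as a user minimum. The function names and the sample counts are hypothetical illustrations and are not data from the trials.

```python
def accuracy_percent(correct: int, attempted: int) -> float:
    """Return recognition accuracy as a percentage of attempted commands."""
    if attempted <= 0:
        raise ValueError("attempted must be a positive number of commands")
    if not 0 <= correct <= attempted:
        raise ValueError("correct must lie between 0 and attempted")
    return 100.0 * correct / attempted


def meets_threshold(correct: int, attempted: int, threshold: float = 90.0) -> bool:
    """True when accuracy reaches the given threshold (90% by default)."""
    return accuracy_percent(correct, attempted) >= threshold


if __name__ == "__main__":
    # Hypothetical trial: 80 of 100 spoken commands handled correctly.
    print(accuracy_percent(80, 100))
    print(meets_threshold(80, 100))
```

Keeping the threshold as a parameter makes it easy to test a stricter requirement, such as one a shipboard environment might demand.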

The errors made by VoicePilot were categorized into three types:

Commands that were unrecognized or not heard by VoicePilot,
Commands that could not be corrected within the VoicePilot correction dialogue, and
Commands that were incorrectly recognized by VoicePilot, which resulted in unwanted actions being performed by the software.

The percentage of each error type as a share of all errors is shown in Figure 17.

Figure 17. Error type percentages
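
A breakdown of this kind can be produced by tallying a log of errors by category. The sketch below assumes each error was recorded with one of three labels corresponding to the categories listed above; the label names and the sample log are hypothetical and do not reproduce the values in Figure 17.

```python
from collections import Counter

# Hypothetical labels for the three error categories described in the text.
ERROR_TYPES = ("unrecognized", "uncorrectable", "wrong_action")


def error_type_shares(errors):
    """Return each error type's share of all logged errors, in percent."""
    counts = Counter(errors)
    unknown = set(counts) - set(ERROR_TYPES)
    if unknown:
        raise ValueError(f"unknown error labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {kind: 0.0 for kind in ERROR_TYPES}
    return {kind: 100.0 * counts[kind] / total for kind in ERROR_TYPES}


if __name__ == "__main__":
    # Hypothetical error log from one trial.
    log = ["unrecognized", "wrong_action", "wrong_action", "uncorrectable"]
    for kind, share in error_type_shares(log).items():
        print(f"{kind}: {share:.1f}%")
```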

The number of errors made by VoicePilot that resulted in some unwanted action was very high, as shown in Figure 17. Though no major setbacks were experienced, the potential for serious consequences is considerable. Although the majority of the errors were corrected, many commands could not be corrected using VoicePilot's correction dialogue window. There appears to be no true pattern of improvement: the same commands can be incorrectly recognized time after time, even after corrections are made, and sometimes the command that should replace what VoicePilot recognized is not listed among the choices of commands.

a. Adding Vocabularies And Commands

Adding new vocabularies was simple and quick with VoicePilot. All the user needed to do was to open the application for which the new vocabulary was to be used and then open VoicePilot. After opening VoicePilot, the user needed to choose the menu item "Vocabulary" and then choose "New Vocabulary." Once in this dialogue the user need only choose the target application and check the radio button for adding the new vocabulary by automatic extraction (Figure 12). VoicePilot then extracts the vocabulary from the menu items of the target application and offers to let the user train the new vocabulary of commands. The new vocabulary will be opened automatically by VoicePilot any time that the associated application is started while VoicePilot is active.

Adding individual voice commands is a different series of operations. To do this, the user chooses "New Voice Command" from the "Vocabulary" menu. VoicePilot then opens the "Add New Command" dialogue window (Figure 18). The user then selects the application with which the new command is to be associated, the name of the new command, and the keystrokes that the command is to replace. This is a very easy way of creating a new command, though being able to record mouse movements and then substitute the voice for them would probably be much easier. Not every user is going to be familiar enough with every application to know exactly which keystrokes perform which function. Most functions are easily accomplished by pressing a button on a toolbar with the mouse. The user must then train the new command in order for it to be recognized by VoicePilot.

Figure 18. Add New Command window
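
The information entered in this dialogue amounts to a small record per command: the application, the spoken name, and the keystrokes to send. The following sketch models that record in Python purely for illustration. The class, the field names, and the keystroke notation are assumptions and do not describe how VoicePilot stores its commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceCommand:
    """Hypothetical record for one user-defined voice command."""
    application: str       # application the command is associated with
    name: str              # phrase the user speaks
    keystrokes: str        # keystrokes the phrase replaces (notation assumed)
    trained: bool = False  # a command must be trained before it is recognized


def find_keystrokes(commands, application, spoken):
    """Return the keystrokes for a trained command, or None if none matches."""
    for command in commands:
        if (command.trained
                and command.application == application
                and command.name.lower() == spoken.lower()):
            return command.keystrokes
    return None


if __name__ == "__main__":
    commands = [
        VoiceCommand("MS Word", "Save File", "Ctrl+S", trained=True),
        VoiceCommand("MS Word", "Print File", "Ctrl+P"),  # not yet trained
    ]
    print(find_keystrokes(commands, "MS Word", "save file"))
    print(find_keystrokes(commands, "MS Word", "print file"))
```

The untrained command is deliberately ignored in the lookup, mirroring the requirement that a new command be trained before it can be recognized.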

b. Ease of Use

VoicePilot's interfaces made the program extremely "user-friendly"; that is, the program was not hard for even a novice computer user to operate. The many "wizards" included with the program made training and adding new vocabularies even simpler. The "User Preferences" wizard enabled the program to optimize its settings just by asking the user to say nine phrases (standard phrases that were the same each time the wizard was used) into the microphone/headset. The user never had to worry about manually adjusting sound card settings or voice input levels. Though there is a manual setting choice, it was never used. The software alerts the user if the input level cannot be set automatically and then instructs the user to set the device input level manually.

2. IN³ Voice Command

IN³ Voice Command by Command Corp. works on all audio-equipped SPARCstations using the following operating systems [Ref. 15: p. 2]:

OpenWindows 3.x
Solaris 2.x (SunOS 5.x)
Solaris 1.x (SunOS 4.1.2 or 4.1.3)
SunOS 4.1.1 - disregard warning messages from ld.so that libc.so.1.6 has an older revision than expected

IN³ speech recognition technology uses voice templates created for each command and stores them in a lexicon. When in recognition mode, the program compares the templates and matches them to the input data coming from the microphone [Ref. 15: p. 6]. The software performs these comparisons continuously and in real time. For this reason, it is important to create these templates in a quiet environment with a strong voice signal. Such templates will normally be well-matched and correctly recognized in an environment with typical office noise.
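
The general idea of matching an input against stored templates can be sketched as a nearest-template search. The Python example below uses dynamic time warping (DTW) as the distance measure, which is an assumption made for illustration: the source does not state which comparison method IN³ uses. The feature sequences, function names, and rejection threshold are likewise hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, lexicon, reject_above):
    """Return the best-matching command name, or None if no template is close."""
    best_name, best_dist = None, inf
    for name, template in lexicon.items():
        dist = dtw_distance(utterance, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= reject_above else None


if __name__ == "__main__":
    # Hypothetical one-dimensional feature sequences standing in for templates.
    lexicon = {"open": [1, 2, 3, 2, 1], "close": [5, 5, 4, 5, 5]}
    print(recognize([1, 2, 2, 3, 2, 1], lexicon, reject_above=2.0))
```

The rejection threshold illustrates why clean templates matter: a noisy template raises the distance to every genuine utterance, so either valid commands are rejected or the threshold must be loosened, which invites wrong matches.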

IN³ Voice Command performed very well under the same environmental conditions as VoicePilot. IN³ Voice Command was installed and operated on a SPARCstation using SunOS 4.1.3 and OpenWindows version 3. An Audio-Technica MT858 microphone was used as an input device. The microphone was very sensitive and could pick up the low-pitched whine of the CPU cooling fan inside the SPARCstation. The user was able to position the microphone up to two feet away and still have a good input signal for the operation of IN³.

Figure 19. IN³ Accuracy over time

A set of 114 commands using the vocabulary listed in Appendix B was used to evaluate IN³. The accuracy of IN³ was very poor during the initial use of the application. With continued correction of errors and refinement of the voice commands, the accuracy of IN³ improved to 90.91%. Figure 19 shows the progressive improvement of accuracy with each use of IN³. Most users would feel very comfortable using IN³ at 90% or better. With increased use and refinement, the accuracy of IN³ should improve to well over 90%.

The errors made by IN³ were categorized into three types:

The command was not heard or recognized by IN³,
The command was recognized but no action was performed by the software, or
The command was recognized but the wrong action was taken by the software.

As depicted in the chart in Figure 20, even when accuracy was high the number of errors that resulted in some unwanted action was high. Though most of the unwanted actions were benign and were easily corrected by resetting the movements or modifying the command to perform the correct actions, the consequences of such unwanted actions could potentially be disastrous.

Figure 20. Error Type percentages committed by IN³

a. Adding New Vocabularies And Commands

Adding new vocabularies in IN³ was as simple as opening the "File" menu selection and choosing the "Add lexicon," "Add starter lexicon," or "Include lexicon" selections. The "Add lexicon" selection adds a template located in the user's directory. This template could be one of several that the user may have created or modified from the lexicons included with the program. The "Load starter lexicon" selection allows the user to select and load any one of the nine included lexicons. The difference between these lexicons and those that are added by the "Add lexicon" selection is that the starter lexicons are not yet trained, whereas those loaded using the "Add lexicon" selection may or may not be trained. The "Include lexicon" selection allows the user to add vocabulary commands from different lexicons into one large lexicon, creating one large vocabulary file. The advantage of doing this is that the user will not have to switch templates when different applications are started or selected for use.
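
The effect of combining lexicons can be pictured as merging dictionaries of commands. The sketch below is a hypothetical Python model in which a lexicon maps each command name to its trained template, or to None when the command is untrained. How IN³ itself resolves a command name that appears in both lexicons is not stated in the source; keeping the existing entry is an assumption made here.

```python
def include_lexicon(base, extra):
    """Return a new lexicon holding the commands of both inputs.

    On a name collision the entry from `base` is kept (an assumed policy).
    """
    merged = dict(base)
    for command, template in extra.items():
        merged.setdefault(command, template)
    return merged


def untrained(lexicon):
    """List the commands that still have no trained template."""
    return sorted(name for name, template in lexicon.items() if template is None)


if __name__ == "__main__":
    # Hypothetical lexicons; None marks a command that has not been trained.
    editor = {"save": [0.1, 0.4], "print": None}
    mail = {"send": [0.3, 0.2], "print": [0.5, 0.5]}
    combined = include_lexicon(editor, mail)
    print(sorted(combined))
    print(untrained(combined))
```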

Adding individual commands is done using the "Edit Command" dialogue as previously described. Learning how to use embedded commands, capture keystrokes, and enable commands to operate within specific applications is the tricky part. Learning the use of embedded commands is almost like learning a new programming language. The examples given in the User's Guide are not very clear, and the User's Guide itself reads more like a technical manual than a guide. It is extremely helpful if the user has some basic knowledge of UNIX or OpenWindows. Several calls were made to Command Corp. for technical help on how to program some of the commands, especially commands dealing with applications using multiple windows. The result of the technical help was the use of the "Front" command previously described. This technique is described in the IN Cube Voice Command for SPARCstation version 2.2.2 Release Notes that are installed in the usr/lib/in3/info/ directory in the file "relnotes.ps". This document contains notes, changes, and corrections to the documentation included in the package with the software.

C. Summary

In this chapter we have looked at the two navigational software packages evaluated in this study, VoicePilot and IN³ Voice Command. We have seen that both were produced to perform the same type of operations, that is, to navigate between applications in a windowing environment. As a navigational input device for a windowing operating system environment, VoicePilot was found to be less than desirable due to its low accuracy. In contrast, IN³ performed well as a navigational device, reaching a 90.91% accuracy rate after continued use. Both IN³ and VoicePilot showed a propensity to perform unwanted actions when an error is made in the recognition of a command. This is not an attribute that any user would want. In this study the unwanted actions were benign, but the consequences of such errors in other situations could be disastrous.