September 1998 · Vol. 23, No. 2, pp. 12-13

Computer voice recognition software an aid to forensic psychiatrists

Stephen G. Noffsinger MD
AAPL Computers and Forensic Psychiatry Committee

Voice recognition computer software has been in development since the 1950s, but only in the past three or four years has it been available and affordable for use by the general public. The overall principle of voice recognition software is to let the user dictate into a microphone words that are transformed into either text for word processing or commands for navigation, without the use of a computer keyboard. This is helpful for physically challenged users who are unable to use a keyboard, and is also a time-saver for those of us with less-than-optimal typing skills. Because our profession requires many reports, forensic psychiatrists may have an interest in voice recognition software as a time-saving device. For clinicians who dictate reports and pay for transcription services, voice recognition software may eliminate transcription costs.

Previous incarnations of voice recognition software required the user to dictate in "discrete speech", in which each word was clearly enunciated in a staccato fashion, separated by a brief pause from surrounding words. This allowed the software program to recognize each word with a higher accuracy rate, but required the user to speak in an artificial pattern. Newer versions of voice recognition software now allow the user to dictate using "continuous speech", which is essentially a normal speaking pattern.

Voice recognition software requires the user to invest several hours adapting the program to recognize his or her individual voice and speaking patterns. Once installed on the computer, voice recognition software programs take the user through an hour-long setup process in which the user dictates a series of basic words and sounds into the computer, to familiarize the program with the user's speaking patterns. Continued use of the program requires the user to correct recognition errors made by the program. This continued cycle of correction and learning results in higher word recognition accuracy rates: the more you use the program, the better trained it is to recognize your speaking patterns. Newer voice recognition software also allows for multiple users, although each user must train the computer to recognize his or her individual speech patterns.

Available products

There are three popular voice recognition software programs on the market at present, with additional products being added every month. Dragon NaturallySpeaking is produced by Dragon Systems, Inc. This program allows the user to dictate directly into a word processor program, such as Microsoft Word 97 or Corel WordPerfect 8, using continuous speech. PC World reports Dragon NaturallySpeaking's word recognition accuracy rate at 98%. (I would view word recognition statistics with a pessimistic eye, as the novice user's recognition rate will be much lower, probably in the 60-70% accuracy range.) NaturallySpeaking is able to recognize words at a rate of up to 160 words/minute. It comes equipped with a vocabulary of 230,000 words, with space for the user to add another 25,000 of their own words. Macros (multi-word phrases) are supported. A useful feature allows the computer to read the dictation out loud for proofreading. NaturallySpeaking requires Windows 95 or Windows NT 4.0, a 133 MHz or faster Pentium processor for IBM-compatible PCs, 32 MB of RAM (48 MB of RAM to dictate directly into Microsoft Word), and 60 MB of hard disk space. Dragon Systems, Inc. has been developing voice recognition software since 1982, and has been responsible for many of the innovations in this area. Three versions of NaturallySpeaking are on the market, with prices starting at $79. For more information go to their website at www.naturallyspeaking.com.

VoicePad and VoicePro are two voice recognition programs offered by Kurzweil Applied Intelligence, Inc. VoicePad is the entry-level voice recognition software program, and is a bit unwieldy because the user dictates into a clipboard, and then transfers the dictation into the word processor. The upgrade, VoicePro, allows direct dictation into the word processor. However, both programs use discrete speech, not continuous speech, which is a drawback. VoicePad is equipped with a 60,000-word vocabulary, and allows you to enter several thousand of your own words. VoicePro has the ability to handle macros. VoicePro is also highly accurate, boasting a 97% accuracy rate. VoicePro retails for approximately $70, and requirements include a Pentium 90 MHz processor, Microsoft Windows 95 or Windows NT 4.0, 16 MB of RAM, 35 MB of hard drive space and a CD-ROM drive. For more information, go to www.lhs.com/kurzweil/pcapps/voicepluspro/description.htm on the web.

IBM's ViaVoice is another voice recognition software program, available for purchase at $75. This is also a continuous speech program. ViaVoice contains a vocabulary of 22,000 words, and you can add as many as 42,000 of your own words. ViaVoice also has a good word recognition accuracy rate (88%), although not quite as good as Dragon NaturallySpeaking or VoicePro. ViaVoice includes the handy text-to-speech feature, which will read your dictation out loud. An upgrade to ViaVoiceGold is available for $120. Visit www.compu-media.com/ibm.htm for more information.
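For readers who want to compare their own machine against the minimum requirements quoted above, the comparison can be sketched in a few lines of Python. This is only an illustration: the `Requirements` class, the `meets` function and the example machine are hypothetical names and values, not anything supplied by the vendors. The figures are the minimums cited in this article, and ViaVoice is left out because no requirements are listed for it here. Operating system and CD-ROM requirements are not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hardware figures in the units used in the article."""
    cpu_mhz: int
    ram_mb: int
    disk_mb: int


# Minimum requirements as quoted in the article.
MINIMUMS = {
    "Dragon NaturallySpeaking": Requirements(cpu_mhz=133, ram_mb=32, disk_mb=60),
    "Kurzweil VoicePro": Requirements(cpu_mhz=90, ram_mb=16, disk_mb=35),
}


def meets(machine: Requirements, minimum: Requirements) -> bool:
    """Return True if the machine equals or exceeds every listed minimum."""
    return (
        machine.cpu_mhz >= minimum.cpu_mhz
        and machine.ram_mb >= minimum.ram_mb
        and machine.disk_mb >= minimum.disk_mb
    )


# Hypothetical example machine (assumed values, chosen for illustration only).
my_pc = Requirements(cpu_mhz=120, ram_mb=32, disk_mb=500)

for product, minimum in MINIMUMS.items():
    verdict = "meets" if meets(my_pc, minimum) else "does not meet"
    print(f"This machine {verdict} the minimum for {product}.")
```

Remember that dictating directly into Microsoft Word with NaturallySpeaking calls for 48 MB of RAM rather than 32 MB, so the check above reflects only the base requirement.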

I chose to purchase Dragon NaturallySpeaking because I had used an earlier version of this product. My experience was mixed. Installation was easy, but before I could run the program I received an error message that one of my Windows 95 files was obsolete, and would have to be updated from the Microsoft web page. I did this without difficulty, and was off to the races with the program. Or so I thought. I next received an error message that the compatibility between my computer's sound card and the NaturallySpeaking microphone was unacceptable. I proceeded anyway, and found that the program was able to recognize my voice adequately for dictation purposes. I then spent an hour training the computer to recognize my voice and speech patterns. This went off without a hitch, and I was ready to dictate. At first, NaturallySpeaking accurately recognized my dictation about 70% of the time. The accuracy rate increased as I continually corrected the recognition errors and trained the computer. Presently, NaturallySpeaking is recognizing 80-85% of my words, which is short of the promised 98% accuracy rate, but certainly acceptable for me.
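To put these accuracy rates in practical terms, it may help to estimate how many words would need correcting in a typical report. The short Python sketch below does that arithmetic. The 2,000-word report length is my assumption for illustration, not a figure from any vendor, and `expected_errors` is simply a hypothetical helper name.

```python
def expected_errors(word_count: int, accuracy_percent: int) -> int:
    """Estimate the number of misrecognized words, rounded to the nearest word."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return round(word_count * (100 - accuracy_percent) / 100)


REPORT_WORDS = 2000  # assumed report length, for illustration only

# Accuracy rates taken from the experience described above.
scenarios = [
    ("initial use", 70),
    ("after training, low end", 80),
    ("after training, high end", 85),
    ("advertised rate", 98),
]

for label, accuracy in scenarios:
    errors = expected_errors(REPORT_WORDS, accuracy)
    print(f"{label} ({accuracy}%): about {errors} words to correct")
```

Under these assumptions, the difference between 70% and 85% accuracy is the difference between correcting roughly 600 words and roughly 300 words in a single report, which is why the training time is worth the investment.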

Bill Gates has identified voice recognition software as a key technological advance. You can be sure that voice recognition software will continue to evolve, becoming cheaper, easier to use, more accurate and more ubiquitous in our society. With the advent of continuous speech recognition, now may be the time to enter the voice recognition software market, as long as you are willing to invest sufficient time to train your software to recognize your individual speech patterns.