The Australian National University
Division of Registrar & Student Services
Disability Services Unit

Progress Report 3 on Tertiary Education Applications of Voice Recognition Software for Students with Disabilities

12 November 1999

Introduction

The project has been operating at a less intensive level over the past few months due to several changes in the circumstances of the Project Team. I have completed my contract as Northern NSW RDLO, taken up the position of DLO at The Australian National University in Canberra, and I am recovering from a serious fracture of the right arm after a motor vehicle accident. Trevor Wilks has continued in his role as manager of the Adtech Centre at the University of Newcastle, but regular contact and interaction between the team members has decreased as a result of these changes.

The changes have also brought some positive results. First, they have given me a direct personal insight into the benefits and value of Voice Recognition technology, since I was forced to rely on the software to complete work with a broken arm. Once I was using Voice Recognition out of necessity and not choice, my whole attitude to the now relatively few misrecognitions changed remarkably. Before, it was a matter of "I said it correctly, why didn't you write it correctly?" After the broken arm, I viewed the misrecognitions as similar to typing errors in my admittedly limited typing style. They were a necessary minor annoyance in the normal drafting and composition process of writing. They became less a source of frustration with the limitations of the program, and more a reminder of the capability of the software to enable work that would otherwise have been extremely difficult, time-consuming and frustrating.

Another benefit of the move to Canberra has been the direct involvement in the provision of support and training for a different group of students using voice recognition. ANU has a significant group of students with RSI, Learning Disabilities and other disabilities using Voice Recognition, and this different perspective and experience has been invaluable in developing a more effective understanding of the contribution of this software in the practicalities of university study. When combined with the experience and involvement of Neil Rice, the Information Technology Adviser at ANU, the availability of a new Pentium III computer, and the opportunity to follow the experiences of students more directly, this has been a very rich source of information and insight.

Interest in the software is still high. I have had people tracking me down at my new position to seek information and advice about the use, applications and hardware requirements of Voice Recognition software. Students and staff at ANU have been very interested in the developments, and my experience with this project has been invaluable in my role as a DLO. I have also been able to provide information and advice to staff at the University of Canberra and the Canberra Institute of Technology.

Since we have decided to standardise our use and evaluation procedures on Dragon NaturallySpeaking, the amount of work, new programs, research and information processing has been significantly reduced. We are not ignoring developments in the other programs, but we are concentrating our efforts on the Dragon products. If we become aware of significant developments in other areas, we will incorporate that information in future reports.

Trevor Wilks provided a very effective Workshop on the technology, including an impressive demonstration of the abilities of the Maths programs, at the Northern NSW Disability Forum held in Coffs Harbour in June.

Recent Developments

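There have been a number of significant developments recently which offer the promise of major improvements in the use of this technology. The major areas covered in this report are the Pentium III chip, Dragon NaturallySpeaking Mobile, Dragon NaturallySpeaking 4.0, and a recent PC Magazine article.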

The Pentium III Chip

The most significant development for Voice Recognition technology with the introduction of the Pentium III Chip has been the incorporation of the basic Voice Algorithms into the chip itself. Because they are hardware based rather than software based, these algorithms work faster and more accurately, freeing memory and software operation for more and different features. Even though we have not yet had the opportunity to test the latest programs, which are optimised for the PIII, the older software runs noticeably faster and more accurately on the PIII, and is easier to train and use. Information and advice coming from overseas, where Dragon NaturallySpeaking Version 4.0 has recently been released, describes much faster initial training (about 10 to 15 minutes), improved accuracy, and faster operation.

One very significant factor to note is that future versions of the program will be designed to operate only on a PIII or its equivalent, so to take advantage of future software developments, a PIII or its equivalent will be necessary. PIII laptops have only very recently become available in the USA, and will be premium-priced and not very widespread in the near future.

At ANU, we have been using a Pentium III 500 MHz computer with 256 MB of RAM for the past 3½ months, with very pleasing results. Both undergraduate and postgraduate students have been using the equipment, generally successfully. One postgraduate student, after doing the initial training and running the Vocabulary Builder on several documents (about 2 to 3 hours in total), was able to begin dictating her thesis with about 1 misrecognition per paragraph. This level of accuracy has improved with further use.

Dragon NaturallySpeaking Mobile

Dragon NaturallySpeaking Mobile incorporates NaturallySpeaking Preferred Edition, a Digital Recorder, a Serial Link Cable and Voice-It Link, a program to transfer files between the Recorder and the computer. The idea is that a person can record material onto the digital recorder, transfer it to the computer, then transcribe it into text using NaturallySpeaking.

To use NatSpeak Mobile, a new Voice File must be trained specifically for use with the mobile recorder. During the General Training, there is an option to dictate directly into the computer, or into a recorder. Selecting the option "Into a Recorder" allows you to view the training texts, print them out and record them on the recorder. I found it far better to print out the chosen text and record from the printout than to read off the computer screen. Reading from the screen means you have to keep scrolling down to follow the text while also operating the recorder, which becomes awkward. It is much more straightforward to print out the training text, then read it directly into the recorder.

Once the text is in the recorder, it is simply a matter of plugging the supplied cable into the recorder and the serial port of the computer, then using the Voice-It Link program to transfer the file to a folder on the computer. The files are in the .wav format, and the standard recorder can hold about 40 minutes of material. It is possible to purchase extra memory cards to increase recording capacity.
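As a rough illustration of working with the transferred files, the following Python sketch adds up the playing time of the .wav files in a transfer folder and compares it with the recorder's approximate 40 minute capacity. It is not part of NaturallySpeaking or Voice-It Link. It assumes the files are uncompressed PCM .wav files, which is the only kind Python's standard wave module can read; the files produced by Voice-It Link may not be in that form. The function names and the folder are hypothetical.

```python
import wave
from pathlib import Path

# Approximate capacity of the standard recorder, as quoted in this report.
RECORDER_CAPACITY_MINUTES = 40


def wav_duration_seconds(path):
    """Return the playing time of one uncompressed PCM .wav file, in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_minutes(folder):
    """Sum the duration of every .wav file directly inside folder, in minutes."""
    seconds = sum(wav_duration_seconds(p) for p in sorted(Path(folder).glob("*.wav")))
    return seconds / 60


def capacity_used(folder, capacity_minutes=RECORDER_CAPACITY_MINUTES):
    """Return the folder's total duration as a fraction of the recorder capacity."""
    return total_minutes(folder) / capacity_minutes
```

For example, calling capacity_used on the folder chosen in Voice-It Link would return 0.5 if the transferred recordings add up to about 20 minutes.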

Once the file is in the computer, it is simply a matter of continuing the training process using this file as the source material for the development of the Mobile Voice File. Once this voice file is established, material can be recorded on the digital recorder, transferred to the computer via the Voice-It Link, then transcribed by simply selecting the file, and asking NatSpeak to "Transcribe".

The process works quite effectively, with the recording, transferring and transcription of material all operating as intended. One major problem is that the accuracy of transcription is substantially reduced when the digital recorder is used as a hand-held recorder with its inbuilt microphone. The level of accuracy is well below the standard achieved by using NaturallySpeaking in the normal mode. However, when the headset microphone is used both to run the training and to record later material, accuracy improves substantially and approaches normal levels. The headset microphone simply plugs into the recorder, and the user dictates normally.

If there is a need to use NatSpeak Mobile with multiple users, it is not necessary to purchase multiple copies of the program to obtain extra Digital Recorders. The same recorder is available from Dick Smith Electronics for $249 (Voice It VR 5000, Catalogue Number A 0052). Additional 50 minute voice chips are also available for $99 (Catalogue Number A 0053).

It would be possible to use this system for recording and transcribing lectures, but again it would require the Lecturer to create a voice file for the mobile recorder, and the issue of developing an appropriate subject-specific vocabulary is still there. However, the Mobile version of NaturallySpeaking has enormous potential.

Dragon NaturallySpeaking 4.0

Dragon NaturallySpeaking Version 4 has recently been released, and we have an upgrade on order. Upgrades are available from Auscript for $199, and prices for V4 products are roughly the same as V3.52. At the time of writing there was no V4 Professional upgrade available. There are a number of exciting developments with Version 4, which have been discussed on the Voice Recognition Users Group List Server. The main advances seem to be substantially reduced training time, improved accuracy and better integration with other programs. We will have a much better capacity to evaluate the developments when we obtain our upgrade.

Recent PC Magazine Article

A recent article from PC Magazine can be found at: http://www.zdnet.com/pcmag/stories/reviews/0,6755,2388289,00.html

Conclusion

Although the recent changes have necessitated alterations in the way the project has operated, including limited opportunities to travel to other universities to demonstrate the technology, the project continues and is still producing valuable information, skills and resources for the Tertiary Education community. Interest in the technology is still high and increasing, and the availability of independent advice and information is much appreciated. There have been a number of examples of salespeople giving misinformation, and of people being unaware of hardware requirements, software limitations and ways of using the technology effectively. The demand for information is still very strong.

Now that things have begun to settle down again after the major changes of the last 6 months, and the team has had a chance to adjust its working to the changed circumstances, we have plans for the continuation of the project into the future. The skills, knowledge and understanding gained from this project, and the exciting developments in the technology are too important to ignore. There is a high demand from staff and students, significant developments are occurring, and this project is making an effective contribution to the appropriate use of this technology for students with a disability in tertiary education.

Trevor Allan
Disability Liaison Officer, ANU
Project Co-ordinator. Voice Recognition Report 3