Medical Speech-to-Text

Voice recognition software allows physicians to convert medical speech to text and even to give commands to prompt specific actions on a computer. The vast improvements in the ability of computer software to recognize speech have led to increased use in education and in the assistance of those with disabilities, allowing for more independence and better quality of life. Voice recognition software also helps those who struggle to commit thoughts to paper and allows for improved multitasking, often being used to document progress during a task.

Not all applications are equal, however, and the complexities of medical vernacular necessitate the use of specialized programs to improve accuracy. Physicians and surgeons will likely need to spend some time getting to know their chosen software and ensuring that the software also gets to know them.

Luckily, the clumsy, error-riddled voice recognition software of the nineties has been vastly improved, and a number of surgeons are finding a place for these programs even during operations. The need to enunciate each word separately and pause between every word is long gone, and continuous speech recognition software has largely replaced discrete speech software. Many of the new programs can recognize speech at a rate of 160 words a minute. By comparison, the average speech rate is estimated at around 110-150 words per minute when someone is speaking in their native language during a friendly and relaxed conversation.

As well as providing a useful notation service during procedures or consultations, medical speech-to-text applications let physicians speak commands to open specific computer programs or files for handy reference and even to capture images during surgery or diagnostic procedures, given the right equipment setup. Users of medical speech-to-text software go through an initial exercise to train the application to recognize their pattern of speech. This user-specific information is added to the general patterns of speech included in the program in order to provide a best guess for every word.
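The idea of combining a general speech model with user-specific training can be illustrated with a minimal Python sketch. It is not how any particular product works: the function names (adapt_vocabulary, best_guess), the probability values and the user_weight parameter are all hypothetical, and real systems adapt acoustic models as well as vocabulary.

```python
from collections import Counter


def adapt_vocabulary(general_probs, user_transcripts, user_weight=0.3):
    """Blend a general word-probability table with one user's word frequencies.

    All names and numbers here are illustrative assumptions, not a vendor API.
    """
    counts = Counter(
        word for line in user_transcripts for word in line.lower().split()
    )
    total = sum(counts.values())
    blended = {}
    for word in set(general_probs) | set(counts):
        user_p = counts[word] / total if total else 0.0
        blended[word] = (
            (1 - user_weight) * general_probs.get(word, 0.0) + user_weight * user_p
        )
    return blended


def best_guess(candidates, probs):
    """Pick the candidate word with the highest probability."""
    return max(candidates, key=lambda w: probs.get(w, 0.0))


# Hypothetical general model: the two sound-alike terms are both rare.
general = {"ilium": 0.002, "ileum": 0.001, "patient": 0.05}

# Hypothetical training dictation from a gastroenterologist.
training = [
    "the terminal ileum appears inflamed",
    "biopsy taken from the ileum",
]

adapted = adapt_vocabulary(general, training)
print(best_guess(["ilium", "ileum"], general))
print(best_guess(["ilium", "ileum"], adapted))
```

In this toy example the general table slightly favors "ilium", while the adapted table favors "ileum" because the user's own dictation contains it.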

Good medical speech-to-text software also takes account of typical grammar and the relationships between words to improve results. Some users of voice recognition software do encounter issues when they have a complex or unusual speaking style. Many types of speech-to-text program include a dictation service so that the transcript can be compared to a recording in order to clarify the script at a later date.
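One common way to model relationships between words is an n-gram language model, which scores each candidate transcript by how likely each word is to follow the one before it. The sketch below uses a tiny bigram model with add-one smoothing; the corpus, the candidate sentences and the function names are invented for illustration, and commercial products may use quite different techniques.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in a small text corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def score(sentence, unigrams, bigrams):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    words = ["<s>"] + sentence.lower().split()
    vocab_size = len(unigrams)
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Hypothetical corpus of previously transcribed notes.
corpus = [
    "the patient has a fractured ilium",
    "inflammation of the terminal ileum",
    "the terminal ileum appears normal",
]
unigrams, bigrams = train_bigrams(corpus)

# Two candidate transcripts that sound identical when spoken.
candidates = ["biopsy of the terminal ilium", "biopsy of the terminal ileum"]
best = max(candidates, key=lambda c: score(c, unigrams, bigrams))
print(best)
```

Because "terminal" is followed by "ileum" in the sample corpus and never by "ilium", the second candidate scores higher even though the two sound the same.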

Those looking into using medical speech-to-text software will want to consider the compatibility of any such program with their existing computer and platform, the languages included in the software, how easy it is to train the application and the potential for integration with other software, such as Word or Excel. Other things to think about before choosing speech-to-text software include:

  • The ability to use wireless dictation, e.g. with a Bluetooth headset
  • The ability to transcribe from a digital recording
  • Whether existing word lists and user profiles can be imported or exported.

Specific applications available for converting medical speech to text include Nuance (MacSpeech Dictate Medical), Trigram Technology, and M*Modal from the DrChrono platform. Open source voice recognition software for Linux includes Gnome Voice Control, Open Mind Speech and Perlbox.

The M*Modal Speech Understanding technology offered through DrChrono is part of a package for the iPad that collates electronic health records (EHRs). Physicians can use the software to access EHRs on the iPad, or through a web browser or Android device. A Bluetooth headset can be used to dictate medical notes for immediate addition to patients’ medical records and the physician can also record notes in the clinical practice setting. Many US physicians are already using an iPad to augment their work and many more are considering the use of such a device.

A survey by Manhattan Research found some 30% of physicians already had an iPad for work and another 28% were thinking of purchasing one soon. The ability to access patients’ records through the handheld device was seen as particularly attractive amongst survey respondents and so many developers of software to convert medical speech to text are ensuring iPad compatibility for their programs.

Minimizing the energy expended on manually updating patients’ medical files, as well as reducing the risk of typing errors, can benefit both patient and physician in terms of time and money. Digitizing such records also provides better accountability than current paper methods or localized record-keeping.

There are some potential downsides to the voice recognition software currently available, however, such as poor performance with ambient noise, lengthy load times, difficulty discriminating between some common words and problems using a microphone, keyboard and mouse during procedures. Many programs are overcoming these issues, but the problems tend to recur when the software is ported to other devices, such as the iPad, and many medical speech-to-text devices and applications remain prohibitively expensive for most general practitioners.