Please can you give a brief introduction to the different causes of speech impairment?
There are many different causes of speech impairment, but the one we are particularly interested in, dysarthria, is also the most common.
In dysarthria, people have difficulty controlling speech articulators, such as the tongue and lips. This may be caused by stroke, by congenital conditions such as cerebral palsy or by progressive neurological disorders such as multiple sclerosis.
When people lose control of these articulators, the speech can take a number of forms but typically, the speech is slurred and often extremely difficult to understand, especially in cases of severe dysarthria.
What is VIVOCA (voice input voice output communication aid) and how was it developed?
The idea of VIVOCA really comes from work I was carrying out for the NHS, which involved providing assistive technology to people with severe physical disabilities.
Many physical disabilities are associated with speech impairment: people with motor disorders have difficulty controlling their speech articulators for the same reasons they have difficulty controlling other parts of their body.
While trying to assist these people with the use of technological equipment, I became very dissatisfied with how limited the assistance was, especially for those at the more severe end of the scale.
For example, helping a person produce speech through a normal communication aid can be very difficult if they only have command of a single movement, such as lifting one eyebrow or moving one finger. Composing messages with input that limited is extremely time consuming and labour intensive.
At the same time, a lot of these people do have some speech and may have much better control of their speech than they do of their movement. However, their speech is still very difficult to understand.
This led to my idea of using the speech they do have to interface with the communication aid in a way that programs the aid to translate the speech into clear, synthesized messages that can be easily heard. In effect, the technology is used to turn impaired speech into clear speech.
How does VIVOCA interpret disordered speech?
VIVOCA has to be programmed to interpret each individual’s speech. We use speech recognition techniques that are fairly standard but also advanced. We ask the person to say each of a number of words several times, which we record and then use to program the aid to recognize what the person is saying. We then map the input words against the output words the person wants to produce and get the machine to translate those inputs into the outputs.
One of the easiest ways of doing this is for the inputs to be the letters of the alphabet so that people can spell what they want to say, but the technology can be a lot more sophisticated than that, interpreting whole words into clear output phrases.
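As a rough illustration only, the enrol-then-map process described above might be sketched as follows. This is not VIVOCA's actual software: the class name `PersonalRecognizer`, the two-number feature vectors and the nearest-template matching are all assumptions made for the example, standing in for the real recognition techniques, which are not detailed in this interview.

```python
from math import dist
from statistics import fmean


class PersonalRecognizer:
    """Toy speaker-specific recognizer (hypothetical, for illustration only)."""

    def __init__(self):
        self.templates = {}  # input word -> averaged feature vector
        self.outputs = {}    # input word -> clear phrase to be synthesized

    def enroll(self, word, recordings, output_phrase):
        # Average several recordings of the same word into one template.
        self.templates[word] = [fmean(column) for column in zip(*recordings)]
        self.outputs[word] = output_phrase

    def interpret(self, features, max_distance=1.0):
        # Find the closest enrolled word; give up if nothing is near enough.
        word = min(self.templates, key=lambda w: dist(features, self.templates[w]))
        if dist(features, self.templates[word]) > max_distance:
            return None
        return self.outputs[word]


aid = PersonalRecognizer()
# Each inner list is a made-up feature vector for one recording of the word.
aid.enroll("drink", [[0.9, 0.1], [1.1, 0.2], [1.0, 0.0]], "I would like a drink, please.")
aid.enroll("tv", [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1]], "Please turn the television on.")

print(aid.interpret([1.05, 0.15]))  # a slightly different "drink"
print(aid.interpret([5.0, 5.0]))    # nothing like any enrolled word
```

The point of the sketch is that the person only needs to be reasonably consistent: a new utterance does not have to match any recording exactly, only to fall closer to one enrolled word than to the others. Spelling by letters would use the same structure, with one entry per letter.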
How long does that process take – to get VIVOCA to calibrate with an individual’s voice?
For some individuals, this can take a fairly long time. Personally, I could produce enough data to achieve a reasonable degree of calibration within half an hour, but the process can take much longer for someone with a speech impairment because speaking can be quite tiring for them.
Persisting with speech for long enough to produce all of the data in one go can be difficult and a few sessions may be needed for someone to record all the words that they want to be able to use. The time taken to do this also depends on how many outputs they want to produce and are able to produce as well as how many the machine is able to recognise.
Does VIVOCA work for all people with impaired speech?
I would say yes and certainly for people with dysarthria because the aid is programmed to recognize and interpret speech in a personalized way. So long as people are reasonably consistent in how they say things, the aid should be useful to them.
People don't have to be absolutely consistent in their speech, as the issue of speech variability was addressed during the technical development of VIVOCA. The system may still fail to recognize speech in cases of severe impairment and significant variability, but it is very rare that we cannot find a way to recognize the speech.
What stage of development is VIVOCA currently at and what still needs to be done in order to bring it to market?
As far as I can see, VIVOCA is ready for marketing. We have actually produced the software in a way that is compatible with the industry standards for software. We have been working with a commercial organization that produced the hardware for the device.
In order to bring it to market, we just need a company that can see the potential of this device and actually make it and sell it.
What excites you most about your work on VIVOCA?
What excites me the most is seeing people actually benefit from using VIVOCA: seeing the faces of people who usually have difficulty getting their message across using it and then finding they can communicate in a way that they couldn’t before.
Sometimes, people are delighted just to realise that the machine can recognize what they are saying. Often they have had a lifetime of their speech not being recognized, so suddenly having a machine that can do so is quite a lift for them in itself. That’s before they even see the next effect: their input being translated into clear output speech.
I am excited by the fact that it may be the right way for some people with speech impairments to communicate.
What are your plans for the future?
We are in the middle of evaluating the device for a number of people who will be using it, in their own homes and in their everyday lives. If that works out well, then we are hoping to have discussions with some companies to see whether they would be interested in taking this forward to the market.
Hopefully, one of them will take this up and be able to sell it so it will be available worldwide to the people who might need it.
What do you think the future holds for people with impaired speech?
I think that in terms of these types of speech recognition technologies, we are making great progress. Usually, speech recognition technology is only targeted at those with standard speech. The commercial products that are becoming more widespread, such as speech recognition systems on mobile phones, are not suitable for people with very impaired speech.
The sorts of advances that we’re making here in Sheffield, and that people are making in several centres around the world, will mean that speech recognition becomes a reality for people with impaired speech in the future. We will be able to recognize their speech without needing much training because speech recognition aids will simply adapt to people’s speech, which will be more convenient for them.
Where can readers find more information?
http://www.catch.org.uk/
About Professor Mark Hawley
Mark Hawley is Professor of Health Services Research at the University of Sheffield, UK, where he leads the Rehabilitation and Assistive Technology Research Group. He is also Honorary Consultant Clinical Scientist at Barnsley Hospital, where his Assistive Technology Team provides specialist electronic assistive technology services within Yorkshire. Over the last 20 years, he has worked as a clinician and researcher – providing, researching, developing and evaluating assistive technology, telehealth and telecare products and services for disabled people, older people and people with long-term conditions.
Mark is Director of the Centre for Assistive Technology and Connected Healthcare (CATCH) at the university. He leads a number of projects funded by the National Institute for Health Research and Technology Strategy Board, and leads the Assistive Technology theme of the Devices for Dignity Healthcare Technology Cooperative. In 2007, he was awarded the Honorary Fellowship of The Royal College of Speech and Language Therapists for his service to speech therapy research.