Your weekly dose of information that keeps you up to date on the latest developments in the field of technology designed to assist people with disabilities and special needs.

Show Notes:
Dr. Tim Bunnell, Nemours Alfred I. duPont Hospital for Children, Director, Center for Pediatric Auditory and Speech Sciences | www.modeltalker.com
Cherokee language now available in Braille http://buff.ly/1rtSBHb
Social media and tech sites must be accessible to everyone – TechRepublic http://buff.ly/1rtRsPU
IBM names West accessibility chief | ZDNet http://buff.ly/1pbrV9w
App: Khan Academy www.BridgingApps.org
——————————
Listen 24/7 at www.AssistiveTechnologyRadio.com
If you have an AT question, leave us a voice mail at: 317-721-7124 or email tech@eastersealscrossroads.org
Check out our web site: https://www.eastersealstech.com
Follow us on Twitter: @INDATAproject
Like us on Facebook: www.Facebook.com/INDATA
—–transcript follows—–

TIM BUNNELL: This is Tim Bunnell, and I’m the director of the Center for Pediatric Auditory and Speech Sciences at the Alfred I. duPont Hospital for Children, and this is your Assistive Technology Update.

WADE WINGLER: Hi, this is Wade Wingler with the INDATA Project at Easter Seals Crossroads in Indiana with your Assistive Technology Update, a weekly dose of information that keeps you up to date on the latest developments in the field of technology designed to assist people with disabilities and special needs. Welcome to episode number 165 of Assistive Technology Update. It’s scheduled to be released on July 25, 2014. My guest today is Dr. Tim Bunnell, who is with the Alfred I. duPont Hospital for Children and ModelTalker, and he’s going to talk today about voice banking for people who use augmentative and alternative communication devices, specifically those who have ALS.
In other news, we talk about the fact that the Cherokee language is now available in braille, what the FCC is saying about social media sites and accessibility, and the fact that IBM has named a new Chief Accessibility Officer. We hope you’ll check out our website at www.eastersealstech.com, give us a call on our listener line at 317-721-7124, or shoot us a note on Twitter @INDATAProject.

I am not a Native American, but there is a press release from the Cherokee Nation talking about the fact that their written language, the Cherokee language, is now available in braille. The Cherokee language has been available in a written format since 1821. It’s been translated by Apple, Microsoft, and Google for use in their products. It was originally encoded into Unicode in the year 2000, but now for the first time there is a braille representation of the Cherokee language. I’m looking at a website, Cherokee.org, that has a news release about that and also includes a graphic that shows the visual aspects of the Cherokee language and how it translates over to braille. To me, it looks like a fairly standard braille cell with six dots, with some letters being represented by more than one braille cell to capture the characters and the phonemes that are unique to the Cherokee Nation’s language. Interesting stuff here. I didn’t really realize that was an issue, but now I’m more informed and can help you be more informed. I’m going to pop a link in the show notes over to Cherokee.org, and you can check out this article about the Cherokee language that is now available in braille.

From TechRepublic, there’s an article entitled “Social media and tech sites must be accessible to everyone.” They talk about the fact that the FCC recently held an event on accessibility and social media in Washington DC. Among the social media giants, LinkedIn was the only one that showed up.
Apparently there was a hearing room full of advocates, nonprofit workers, government staffers, and other concerned citizens talking about the importance of accessibility related to social media. This article from TechRepublic is fascinating and goes into quite a bit of depth about some of the things that were discussed, including the fact that the FCC is the first federal agency to set up a consumer support line in American Sign Language, and that the Department of Justice still has not completed a rulemaking on the modernization of the ADA, or the Americans with Disabilities Act, and how that might impact websites, social media, and other more modern technologies. I’m going to encourage you to check out this article by Alex Howard, and I’ll pop a link in the show notes over to TechRepublic where you can learn more about how social media can be made accessible to everybody.

It’s all over the web that IBM has named Frances West as their first Chief Accessibility Officer. She’s not new to IBM. She’s currently the director of IBM Research’s Human Ability and Accessibility Center, and she’s getting ready to expand those duties as their first Chief Accessibility Officer. They note that there are over 1 billion people with disabilities and that technology and mobile devices can increase accessibility. Right now it’s unclear if her role will encourage other tech vendors to increase accessibility in their products or focus more internally on IBM. She’s a trustee at the National Braille Press and an advisor to the National Business and Disability Council, and she has been on other boards of groups that deal with disability and technology. More to come. We’re excited to note that IBM is taking the bull by the horns and has created a new position, Chief Accessibility Officer. I’ll pop a link in the show notes over to ZDNet where you can learn more about this development.

And now we’ve got a question from our listener line.

>> Yes. I have a question.
I just listened to the podcast where the gentleman from National Braille Press was talking about the G2G. Years ago, I used a device called an Optacon, and I still have one. It doesn’t work anymore. I was wondering if any place is doing something to revitalize that or remake one. It was really handy for me. I hope maybe you can find somewhere I can find out about this. Thank you.

WADE WINGLER: Thank you so much for your question. I don’t have a great answer for that. I happen to have an Optacon myself that doesn’t work either. I’m familiar with the technology and how it works. For listeners who don’t know, it was a device made many years ago that would allow you to hold a probe over anything (text, a photograph, or even your computer screen) and get a raised-line representation of whatever was being visually presented. I have lots of blind friends and colleagues who have relied on that technology for years. I’m not aware of anything like that being made or even developed at this point. I am aware that the folks at the National Braille Press and MIT and other places are very interested in the use of haptic interfaces, where a piece of glass on a tablet would be able to re-create raised lines and other kinds of images. I would guess that there might be some opportunities there with haptic interfaces to create something kind of like an Optacon. Because I don’t know, I’m going to throw this out to the audience. If anybody in the audience is familiar with the Optacon and anything new that might be doing something similar, please give us a call. You can call our listener line at 317-721-7124, or you can drop us an email at tech@eastersealscrossroads.org, and let us know what you know about the Optacon and anything new that’s like it.

Each week, one of our partners tells us what’s happening in the ever-changing world of apps, so here’s an app worth mentioning.

>> This is Amy Barry with BridgingApps, and this is an app worth mentioning.
Today I’m going to share the Khan Academy app. This comprehensive iOS and Android app allows all students to learn almost anything for free. The app has a growing library of over 4,200 educational videos and articles. Topics include kindergarten through 12th grade math, science, humanities, history, civics, and finance. Khan Academy provides access to all the videos, articles to read, the ability to track progress, downloadable videos to watch offline, and subtitles.

Along with the Khan Academy app, we highly suggest users create an online account and engage with the knowledge map. When students begin the math program online, they are asked to take a pretest to assess their skills. The assessment helps identify their math level, what concepts they are knowledgeable in, and what concepts they need to master. When concepts are mastered, students earn badges. This game aspect keeps students engaged and motivated to learn more.

Khan Academy is great for students with learning disabilities because it gives them the opportunity to learn at their own pace. They can watch and rewatch video tutorials on concepts. Then they can work through the practice lessons with Khan’s adaptive assessment environment. Students can start at 1+1 and work all the way into calculus, or jump into whatever topic they need help in. Khan covers every math curriculum standard in an easy-to-learn format. We have used Khan Academy with users of all ages and abilities. The success rate is phenomenal. Again, one of the best features of Khan Academy is having the ability to learn at your own pace. We recommend Khan Academy and consider it one of the best online app learning programs available right now. Khan is excellent for teachers, homeschool parents, and parents like myself looking for additional resources to support public school curriculum. We have even seen college students use Khan Academy to get through some highly advanced math courses.
The Khan Academy app is free at the iTunes and Google Play stores. This app can be used on iOS and Android devices. For more information on this app and others like it, visit BridgingApps.org.

WADE WINGLER: I, of all people, am probably very interested in verbal communication and the importance of the spoken voice. Recently I was made aware of a concept called voice banking. If you google that term, you’re going to find a lot of things from the finance industry, but then you’re also going to find some information about people recording their voices for future use. I reached out to my friend and colleague Alisa Brownlee from the ALS Association, and she introduced me to Dr. Tim Bunnell, who is with the Alfred I. duPont Hospital for Children. He is the director of the Center for Pediatric Auditory and Speech Sciences, and she said, “You really need to talk to him because he knows about this topic.” I believe I have him on the phone. Dr. Bunnell, are you there?

TIM BUNNELL: I’m here.

WADE WINGLER: Thank you so much for taking some time out of your day to talk with me. I am excited to learn about voice banking, how you got interested in it, and what the state of the art is. Would you mind talking a little bit, to start off with, about how you became interested in technology related to speech, and more specifically technology for folks with disabilities?

TIM BUNNELL: Sure. As a graduate student, I was in an experimental psychology program, and my interest was in studying human speech perception. So that got me going on speech as a stimulus, if you will, as a signal. That in turn led me to look at the kind of technology we needed to use to analyze that signal. So I started using computers to do speech processing and speech analysis and other things. I did a dissertation on that. And then as a new PhD, I wanted to try to take the sort of theoretical things that we worried about in graduate school and actually find applications for them.
For pretty much my entire career, I have been doing things that involve technology and speech and assistive devices, including looking at ways that we can enhance speech for hearing aids. Then I got very seriously into speech synthesis, partly because speech synthesis is really needed when you’re studying speech perception, because it’s so easy to manipulate the speech signal that way. But then I realized that speech synthesis was also a really good way to provide people who are non-vocal with an assistive technology. I started looking at that, and I came to the duPont Hospital about 25 years ago now and started working in the speech lab, following up on designing the speech synthesis system that we now call ModelTalker, which is used by a lot of people to do voice banking.

WADE WINGLER: So, great segue and transition. Tell me a little bit about voice banking. What is it?

TIM BUNNELL: As the name sort of suggests, it’s recording and saving your voice so that later on you can use that voice even when you are no longer able to speak. It’s an ideal technology for people who, for instance, have been diagnosed with a disease like ALS, where they know there’s a very good likelihood that they’ll lose the ability to speak, but they are still able to speak fluently at the time of diagnosis. They can record their speech and later on have that recorded speech to use, speech that still represents their own voice and their own identity. In the early days of voice banking, that simply meant recording a lot of stock phrases so that you would have a phrase for everything you thought you might need a phrase for. Now, with the use of speech synthesis technology, what we’re able to do is take those recorded phrases and actually convert them into a synthetic voice that still sounds like the person who recorded the speech originally, but is able to say anything, including things that the person didn’t originally record.
WADE WINGLER: I’ve become fascinated with the technical aspects of that. Before I jump into my technical question, is this the kind of technology that was fairly well publicized with Roger Ebert a few years ago?

TIM BUNNELL: It’s exactly the kind of technology that was used by Robert — what’s his name?

WADE WINGLER: Roger Ebert.

TIM BUNNELL: Roger Ebert. That was a company in the UK who used the same kind of technology that ModelTalker uses to do that.

WADE WINGLER: So I’m going to jump around here a little bit. In terms of the technical details, tell me about the recording and maybe some things about file formats. My end question is going to be: does this then do text-to-speech with these recorded pieces, or is it patched-together phrases or words? I want you to take me to school on this a little bit.

TIM BUNNELL: We do start by just making a standard digital audio recording of many sentences. For ModelTalker, we typically ask people to record about 1600 sentences. They are stored on the computer disk as .wav files, so you could play them with iTunes or whatever audio player you use. But then, instead of simply saving those recorded sentences as sentences, we use some speech recognition technology to locate all of the phonemes, the speech sounds, vowels, consonants, and even sub-phonetic units such as the onset of a vowel, or just the middle of the vowel, or just the end of the vowel. We actually cut up the speech into those tiny little pieces and index each one, depending on where it came from and how it was used. Then, when we want to do synthesis, what we do is basically a database lookup, where we go into our database and we find all of the bits and pieces that will most smoothly fit together, to sound like they were originally recorded together even though they probably weren’t recorded at the same time.

WADE WINGLER: And the end result is something that sounds fairly natural, right?

TIM BUNNELL: It does. Some people describe it as me if I was a robot.
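The database-lookup synthesis Dr. Bunnell describes is known as unit selection, and the core idea can be sketched in a few lines of code. This is an illustrative simplification, not ModelTalker's actual implementation: the unit fields, cost function, and greedy search below are invented for illustration, and real unit-selection systems score many candidates with target and join costs using dynamic programming rather than a greedy pass.

```python
# Sketch of unit-selection concatenative synthesis: recorded speech is cut
# into indexed phoneme-sized units, and synthesis looks up, for each target
# phoneme, the stored unit that joins most smoothly with its neighbor.
from dataclasses import dataclass

@dataclass
class Unit:
    phoneme: str    # which speech sound this snippet contains
    prev_ctx: str   # phoneme that preceded it in the original recording
    next_ctx: str   # phoneme that followed it in the original recording
    samples: tuple  # the raw audio samples for this snippet

def join_cost(left: Unit, right: Unit) -> int:
    """Lower cost when the units' original contexts match, so the splice
    sounds as if the two snippets were recorded in one breath."""
    cost = 0
    if left.next_ctx != right.phoneme:
        cost += 1  # left unit was not originally followed by this sound
    if right.prev_ctx != left.phoneme:
        cost += 1  # right unit was not originally preceded by this sound
    return cost

def synthesize(targets: list, database: dict) -> list:
    """Greedy unit selection: for each target phoneme, pick the candidate
    unit with the cheapest join to the previously chosen unit, then
    concatenate the selected units' samples."""
    chosen = []
    for ph in targets:
        candidates = database[ph]
        if not chosen:
            best = candidates[0]
        else:
            best = min(candidates, key=lambda u: join_cost(chosen[-1], u))
        chosen.append(best)
    return [s for u in chosen for s in u.samples]
```

Given a database with two candidate recordings of "ae", asking for the phoneme sequence "k", "ae", "t" would select the "ae" unit that was originally recorded between a "k" and a "t", which is why a larger recorded inventory (the roughly 1600 sentences mentioned above) yields smoother-sounding output.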
WADE WINGLER: Okay.

TIM BUNNELL: Because it still does sound somewhat synthetic.

WADE WINGLER: Sure. I’m tempted to make jokes about Cher and some of her popular songs of the last few years, but I’m not going to go that direction. So tell me a little bit about — you mentioned ALS. Is that pretty much the extent of the population who is utilizing this technology? Who really does benefit from this?

TIM BUNNELL: They are certainly the largest group of people who are using the technology. We do have a number of people who are blind who have made voices, because they are kind of into the different sounds of voices and they want to be able to share a voice with other people, so they can send their voice to somebody else. We’ve also prepared voices for people who are preparing to undergo surgery, for instance for laryngeal cancer, because they know that they’re not going to be able to speak, at least with the same voice, after the surgery.

WADE WINGLER: Dr. Bunnell, do you happen to have an example that we could share with our listeners right now?

TIM BUNNELL: I can have our synthesizer read a standard passage that we use a lot, called the grandfather passage.

WADE WINGLER: Great.

>> My grandfather? You wish to know all about my grandfather? Well, he is nearly 93 years old. He dresses himself in an ancient black frock coat, usually minus several buttons. He actually still thinks as swiftly as ever. A long, flowing beard clings to his chin, giving those who observe him a pronounced feeling of the utmost respect. When he speaks, his voice is just a bit cracked and quivers a trifle. Twice each day he plays skillfully and with zest upon our small organ, except in the winter when the snow and ice prevents it. He slowly takes a short walk in the open air each day. We often urged him to walk more and smoke less, but he always says no. Grandfather likes to be modern in his language.

WADE WINGLER: Wow. That is amazingly good sounding. It really is clear.
Thank you for sharing that with us.

TIM BUNNELL: Sure. I should tell you that that’s a voice talent, if you will, someone who was hired to do that and recorded under more studio-like conditions. So that’s about as good as it can sound with an hour or so of speech. We have compared what ModelTalker sounds like with some of the best cutting-edge synthesizers. We are really able to do about as good a job as most of them do when we use the amount of speech that most commercial synthesizers use. So, for instance, if you buy a voice from Nuance, 16 to 20 hours of speech go into constructing that voice. It’s not 16 to 20 times better than the speech that we create with one hour of speech, but it is unquestionably better.

WADE WINGLER: That makes a lot of sense. So in terms of prevalence, is this common? Are a lot of folks doing it? And then how do people know when they need to consider doing voice banking?

TIM BUNNELL: In the case of ALS, as soon as a diagnosis has been made, people should do the voice banking. You want to do that while you’re still able to speak fluently and don’t have any of the dysarthria that often starts to develop quickly after diagnosis. In fact, it’s often the dysarthria that leads to the diagnosis. So the sooner that people with ALS do their recording, the more likely it is that it’s going to capture their original and more natural voice quality, and not something that sounds a little dysarthric.

WADE WINGLER: And where do folks have that done? Is that a process you record at home? Do you go to a studio? What does that process look like?

TIM BUNNELL: Of course, under really ideal circumstances, if there were a studio that someone could go to to do the recording, the audio quality would be somewhat better. But the overwhelming majority of people who do ModelTalker voices record them at home. We allow them to download some software and give them some instructions on what sort of microphone to use and a little bit of coaching on how to do the recording.
We typically ask them to record about 10 screening sentences. We listen to them and make sure that they sound okay, that the audio quality is going to be acceptable, and that their approach to doing the recording is going to work well for generating a synthetic voice. People often think that they need to speak very expressively. Actually, that’s counterproductive when you’re trying to build a synthetic voice, because in order to be able to find the bits and pieces that paste together smoothly, you want a fair amount of uniformity in the recordings. So it’s actually better for people not to be very expressive in the way that they do the recording.

WADE WINGLER: For my listeners who might be familiar with voice input technology like Dragon NaturallySpeaking and those kinds of products, is it that same sort of speech that you’re looking for?

TIM BUNNELL: Probably. You want to be regular and predictable in the way that you say everything.

WADE WINGLER: Good. So after those 10 sentences or so to check that the audio is okay, how much reading and how much does a person speak and record after that?

TIM BUNNELL: Right now we ask people to record about 1600 sentences. That comes down to being close to an hour’s worth of speech if you pasted it all together.

WADE WINGLER: And I’m fascinated by the fact that from that you can create a large percentage of the speech that they might want to make in the future, right?

TIM BUNNELL: Right.

WADE WINGLER: That’s amazing. So what are some of the challenges with voice banking and the way it works now?

TIM BUNNELL: Of course the single biggest challenge is the amount of speech that somebody has to record. Although it’s only an hour of running speech, it typically takes many hours for people to do those recordings.
With each sentence, we have the person record the sentence, then we measure the features of the sentence and make sure it’s loud enough but not too loud, that their pitch is about right, and that the pronunciation appears to be correct. If any of those things look problematic, we ask them to rerecord the sentence, or the software does. Recording a single sentence can take two or three times as long as it would take to just say the sentence. In general, people will probably spend eight hours total in the recording process. For someone who has ALS and maybe already has a little bit of dysarthria, this can be quite a challenge.

WADE WINGLER: Yeah.

TIM BUNNELL: So one of the big things for us is to try to find ways that we can reduce the amount of speech that somebody has to record and still get a good-sounding voice for them.

WADE WINGLER: I have to assume that when folks are doing that, they probably inherently understand the benefit, and I’m going to guess that people have probably expressed that it’s worth the time invested.

TIM BUNNELL: I think for most people they feel that it is. It is still experimental technology, though. Not every voice that we have made for people has turned out very well. There’s a lot of variability. Of course it depends on the quality of the speech to start with. Some people’s voices seem to be very amenable to being processed for synthetic speech, and some people’s voices give us rather a hard time.

WADE WINGLER: So considering that, what do you see as the future for voice banking? What does this look like a few years from now?

TIM BUNNELL: Well, I think within not too many years we’re going to have things down to the point where people probably need to spend about a half hour doing the recording. I think that we’ll have speech that sounds a little bit smoother and is perhaps a little bit more forgiving of some of the pronunciation variations that people have.
We’ll be shifting our speech synthesis technology from this purely concatenative synthesis, where you’re concatenating little bits of waveforms together, to something called parametric synthesis, where we actually calculate some parameters of the speech acoustics from the recordings and then use those to create the speech from scratch.

WADE WINGLER: That sounds interesting. Dr. Bunnell, we’re just about out of time, but I want to make sure that before we finish up here, if people are interested in learning more about ModelTalker, about voice banking, or interested in connecting with you, what do you recommend? How would they reach out to you?

TIM BUNNELL: We have a website that I would encourage people to go to. It’s www.modeltalker.com. There are many things there that people can look at, including a section of frequently asked questions. There’s also a working demonstration there, as well as examples of voices that people have made in the past, so that they can hear what the synthetic speech actually sounds like when it’s been created. There’s an email contact for us there as well.

WADE WINGLER: Excellent. Dr. Tim Bunnell is the director of the Center for Pediatric Auditory and Speech Sciences at the Alfred I. duPont Hospital for Children. Dr. Bunnell, thank you so much for being with us today.

TIM BUNNELL: You’re very welcome. Thank you very much.

WADE WINGLER: Do you have a question about assistive technology? Do you have a suggestion for someone we should interview on Assistive Technology Update? Call our listener line at 317-721-7124. Looking for show notes from today’s show? Head on over to EasterSealstech.com. Shoot us a note on Twitter @INDATAProject, or check us out on Facebook. That was your Assistive Technology Update. I’m Wade Wingler with the INDATA Project at Easter Seals Crossroads in Indiana.