Klatt’s Last Tapes: A History of Speech Synthesisers Video


Speech Synthesisers in Use

Stephen Hawking and his Speech Synthesiser

Speech synthesisers, and the technology involved in giving a voice to those who cannot use their own, have an interesting and enthralling history. It’s an area of technology and science that has fascinated scientists and therapists from many fields but is rarely discussed in the mainstream. World-renowned physicist and cosmologist Stephen Hawking has made the presence of this technology more widely known.

Klatt’s Last Tapes was a one-off programme on BBC Radio 4 which looked into the work of Dennis Klatt, the American pioneer of text-to-speech machines. Klatt’s work is explored by Lucy Hawking, the daughter of Stephen, who during this video goes on a journey back through the history of speech machines. It shows the ingenuity and creativity of the inventors and the quirky history of the predecessors of the machines that help her father communicate.

 

 

In the Beginning

Speech synthesisers have been developed for over 200 years, beginning mechanically with Wolfgang von Kempelen’s speaking machine, which he built in 1769. Lucy Hawking visits Saarland University to see and try out a working replica of this primitive machine and learns more about von Kempelen’s dedication to finding a mechanical solution for people who were unable to speak. Von Kempelen found that the main problem with his machine was the lack of a tongue, an element of the speech system that was beyond his abilities to recreate mechanically.

Replica of von Kempelen’s speaking machine: a wooden box with a mouthpiece and a bellows

Mechanics to Electronics

The experts in the show suggest there may have been no smooth transition between mechanical and electrical speech synthesisers. The first known electrical system was the Voder, developed in the 1930s by Homer Dudley and demonstrated to the public at the 1939 World’s Fair in New York. It was played much like an organ, and its operator remarked that it took about a year of constant practice to master the controls.

Problems in Speech Synthesis

Through speaking to experts in the field, Lucy Hawking explores some of the main problems that have been battled against since the first speech synthesisers were developed. Initially it was possible to create plausible male voices, but creating a female voice proved, and still proves, difficult. Simply raising the pitch of a male voice sounds artificial, because the articulation of a female voice is also different, and this is something even the most advanced computer systems have struggled with. As Hawking remarks in the show, having to use a synthesised male voice would mean a huge loss of identity for women.
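To give a rough feel for why raising the pitch alone is not enough, here is a small sketch. It is not taken from the show: it assumes the textbook simplification of the vocal tract as a uniform tube closed at one end, and the tract lengths and speed of sound are illustrative values we have chosen. The function name `tube_resonances` is our own.

```python
# Illustrative sketch only: a uniform tube closed at one end (the glottis) and
# open at the other (the lips) is a textbook simplification of the vocal tract.
SPEED_OF_SOUND_M_PER_S = 350.0  # assumed value for warm, moist air


def tube_resonances(tract_length_m, count=3):
    """Return the first `count` resonance frequencies (Hz) of the tube.

    Note that the pitch of the voice (F0) is not an input: in this model the
    resonances depend only on the length of the tube.
    """
    return [
        (2 * n - 1) * SPEED_OF_SOUND_M_PER_S / (4.0 * tract_length_m)
        for n in range(1, count + 1)
    ]


# Assumed, illustrative tract lengths; real speakers vary.
longer_tract = tube_resonances(0.175)
shorter_tract = tube_resonances(0.145)

for label, resonances in (("17.5 cm tract", longer_tract),
                          ("14.5 cm tract", shorter_tract)):
    print(label, [round(f) for f in resonances])
```

In this simplified model the shorter tube’s resonances all sit about a fifth higher than the longer tube’s, whatever the pitch. Turning up the pitch of a voice built around a longer tract leaves those resonances where they were, which is one way to picture why the result can sound artificial and why the articulation has to change too.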

Similarly, adult speech synthesisers have proved problematic for children. Speaking with an adult synthesised voice makes socialisation harder for children whose peers may find it harder to relate to them with an adult voice. The long term aim is to create personalised speech synthesis machines which grow with their user.

Dennis Klatt – The Father of Computerised Speech Synthesis

Dennis Klatt was the man who made a difference to speech synthesis. He was a pioneer of text-to-speech machines from a technological perspective and created an interface which, for the first time, allowed non-expert users to type in text and get speech out. Before Klatt’s work, non-verbal individuals would need specialist support to input what they wanted to say.

Lucy Hawking discusses Klatt’s work with his daughter, Dr Laura Fine, during the show. Klatt created DECtalk, the system which takes text typed by the user and turns it into speech. Klatt also produced a definitive history of speech devices, which includes a collection of recordings of the devices developed throughout the 20th century. It’s a hugely valuable resource for development as well as for posterity.

Klatt was dedicated to producing a speech synthesis system that was natural and intelligible. As Dr Fine explains, he combined engineering and speech production research with perception data to create the end product. How listeners perceive and interpret speech is key to how successful a speech synthesiser is for regular conversation and socialisation.

Klatt created a range of different voices, entertainingly labelled the DECtalk Gang, which gave DECtalk users a choice. Choices included Beautiful Betty, Kit the Kid and Perfect Paul. Stephen Hawking’s voice is very similar to Perfect Paul.

Eye Gaze Speech Synthesisers

The show tells us that over 1 million people in America are unable to speak for a range of reasons. Lucy Hawking then goes on to talk to Michael Cubis, who lost his voice after a stroke. He controls his speech synthesiser through gaze control, which is increasingly where text-to-speech technology is heading.

Eye gaze technology uses movement of the eyes to generate text. Speaking to Mick Donegan, a specialist in the field, Hawking discusses how the technology works and how it has developed. The technology itself has been around for two or three decades, but the systems have developed a lot in the 21st century. Newer systems can cope with significant involuntary movement, such as muscle spasms or shakes, so people living with conditions such as cerebral palsy and multiple sclerosis are now able to access gaze-controlled text-to-speech machines as well as games and leisure pursuits.

When Donegan first tried eye gaze technology with Michael Cubis, the system offered only lower-case letters, with no capitals or punctuation. Donegan tells Hawking that Cubis was unhappy with this: he was insistent that proper writing, with the proper markers, is key to his identity and to expressing himself as a fully literate, intelligent person.

The Future

Mick Donegan goes on to discuss the future of speech synthesisers. Recent research is even looking into how brain-computer interfaces can provide speech to people living with locked-in syndrome.

The best signal for such an interface comes from an implant, which is obviously an area that needs more research. Donegan says that the next best option at present is a cap worn on the head, and that dry caps with a reasonable signal are being developed.

Speech Synthesisers and Identity

Hawking looks a little at how a speech synthesiser gives or takes away someone’s identity by talking to Irish film-maker Simon Fitzmaurice. Fitzmaurice lost his voice to motor neurone disease but was provided with a new one through his speech synthesiser, a generic American voice.

For Fitzmaurice’s family, the American voice of the synthesiser has become synonymous with him, and his children are unnerved when it changes on other computer systems and programmes. Despite this, Fitzmaurice has been working with CereProc, a leading synthetic speech company, to build him a new voice.

CereProc have used recordings of Fitzmaurice’s voice, and even data from his father’s voice, to produce a synthetic voice which mimics how he used to sound. This is fascinating technology, and the show suggests that if you live with a disease where you may lose your voice, there is now scope to make recordings in advance to try to save that part of your identity in the long run.

We thought we’d end this piece with a bit of friendly advice from Michael Cubis. When asked how to talk to someone who uses a speech machine, he replied:

“I would ask them not to ask long questions and be patient because it can take a long time to answer. Also please bear in mind that it can be very tiring for those using speech output devices.”

 


 

Klatt’s Last Tapes Radio Show Transcript:

00:01 Speaker 1: We’ve comedy in half an hour when Richie Webb and Nick Walker star as the Hobby Bobbies. Before that, here on BBC Radio 4, Lucy Hawking traces the development of speech synthesis in Klatt’s Last Tapes.

00:16 Speaker 2: You are listening to the voice of a machine.

00:20 Speaker 3: Mama, mama.

00:24 Speaker 4: A, B, C, D, E, F, G…

00:29 Speaker 5: Once upon a time, there lived a king and queen who had no children.

00:34 Speaker 6: Do I sound like a boy or a girl?

00:37 Speaker 7: How are you? I love you.

00:40 S2: I do not understand what the words mean when I read them.

00:45 Speaker 8: Ha-ha-ha.

00:47 Speaker 9: I can serve as an authority figure.

00:50 Speaker 10: What did you say before that?

00:53 Speaker 11: Can you understand me even though I am whispering?

00:56 Speaker 12: To be or not to be, that is the question.

01:01 Lucy Hawking: My name is Lucy Hawking and I have been regularly chatting to a user of speech technology, my father Stephen, for the past 28 years. I write adventure stories for primary aged children about astronomy, astrophysics and cosmology. When I go to schools, I always talk about my father’s use of speech technology and I tell the kids that even though my father may sound robotic, when I play them a clip of him talking, I ask them to remember that actually it’s a real man talking to them. And it’s a man who’s using a computer to give himself back the voice that his illness has taken away from him.

01:42 Speaker 14: Development of speech synthesizers. One, The Voder of Homer Dudley, 1939.

01:50 Speaker 15: Will you please make the Voder say for our Eastern listeners, “Good evening radio audience.”?

01:55 Speaker 16: Good evening radio audience.

01:59 LH: To find out where speech technology started, I went to Saarland University in Germany, where two researchers had built a model of the first ever voice machine. It was originally created in the 18th Century by inventor, scientist, and impresario Wolfgang Von Kempelen.

[background noises]

02:24 LH: Hello.

02:24 Speaker 17: Hello.

02:25 LH: Good morning.

02:26 S1: Please come in.

02:26 LH: Thank you so much.

02:27 S1: I’m very pleased to meet you.

02:28 S1: Hello.

[background conversation]

02:30 Jürgen Trouvain: My name is Jürgen Trouvain. I’m a lecturer and researcher here at the Department of Computational Linguistics and Phonetics at Saarland University and I’m also interested in the history of speech communication devices, like the one of von Kempelen, for example. Kempelen was both a good showman and a very good scientist, but he was really like, sort of a genius, a real engineer, because he was interested in building things which can function and can help also people.

03:03 Fabian Brackhane: My name is Fabian Brackhane.

03:04 LH: What do you think the relationship was between von Kempelen’s original inspiration and the organ?

03:11 FB: It’s a very curious thing, because there is a stop in the pipe organ called “vox humana.”

[music]

03:24 FB: When this stop was invented in the 17th century, it should be a representation of the human voice playing the organ.

03:39 LH: So, they wanted to take the vox humana from a musical note, something you’d find in compositions at the time, to actually be able to produce human speech.

03:53 FB: Exactly. Yes. But Kempelen knew very well that this stuff couldn’t be the solution to get a speech synthesis.

[background music]

04:07 S1: Three, PAT the Parametric Artificial Talker of Walter Lawrence, 1953.

04:14 S1: What did you say before that?

04:18 LH: And so, we’re looking at von Kempelen’s speech machine. [chuckle] The door of which has just fallen off. It looks like a small bird house. Yeah. So, we’re taking the lid off the box, which houses the speech machine. And so, Fabian is putting one hand through one hole with his elbow on the bellows, which represent the lungs and his other hand is coming underneath the rubber cone. Which, what does the rubber cone represent?

04:53 FB: The mouth.

04:54 LH: The mouth. So, it’s hand under the mouth piece.

04:59 S3: Mama Mama.

05:03 S1: Ooh, it’s creepy. Sorry.

[laughter]

05:05 S3: Papa Papa.

05:10 FB: So, it’s… These are the both best words he/she could say it.

05:17 S3: Mama.

05:19 FB: So, you have the nose to be opened.

05:23 S3: Papa.

05:25 LH: So, Fabian is moving his hand rapidly over the mouthpiece and using two fingers over the nostrils effectively, while pressing down with his elbow on the lungs. Fabian is actually mouthing the words “mama” and “papa” while the machine is saying them.

[music]

05:45 S1: Four, The “OVE” cascade formant synthesizer of Gunnar Fant, 1953.

05:51 S7: How are you? I love you.

05:59 Bernd Möbius: I might be able to find out whether Lucy is able to…

06:02 LH: Should we see… Should we see, perhaps like in…

06:03 FB: So, there’s your instructor.

06:05 LH: Right.

06:06 FB: If you want to say “em,” you have to close the mouth and the nostrils have to be opened.

06:12 LH: The nostrils are open, front [06:12] ____.

06:13 FB: And if you want to say “ah,” you have to move the hand backwards. So, just mah, mah, while I’m pressing them…

06:22 LH: While pressing…

06:23 S3: Mm… Mama… Mam…

[chuckle]

06:23 LH: I did that with three syllables. [chuckle] I’ll try with two this time.

06:34 S3: Mama…

06:37 LH: Right and what about papa? How would I do papa?

06:39 FB: The same way but you have to close the nostrils. Well…

06:44 LH: Okay. So…

06:44 S3: Pa-pa-paaaaa.

[laughter]

06:50 LH: Let’s see if I can just do it with two syllables this time.

06:53 S3: Pa-paa…

06:56 LH: Can I get her to say anything else or will I be… Would I be able to make it say any other words?

07:03 FB: If you don’t cover the mouth, it’s an A.

07:07 S3: Ah…

07:09 S1: And the more you cover the mouth, the vowel quality changes.

07:13 S3: Ahh… A… B… Mm…

[music]

07:28 FB: He knew that the missing of the tongue was very important thing and in his book, he wrote to his readers, to invent this machine forward, but nobody could invent it with the tongue, with teeth, so that, it could speak more than this few, very few things.

[music]

07:57 LH: It seems to me that his aim was actually to give a voice to people who couldn’t speak. And so, he must have hoped for further development of his machine ’cause he can’t have imagined that, it would just be mama and papa or those short sentences. He must have had in mind, this idea that people would be able to speak freely, mechanically.

08:15 JT: And there was a plea in that book Fabian mentioned, please read out that means, researchers and the later generations, please, go on with the development of that machine. So, we’re still trying to do that here.

[music]

08:32 S1: 16, Output from the first computer-based phonemic-synthesis-by-rule program, created by John Kelly and Louis Gerstman, 1961.

08:44 S1: To be or not to be, that is the question.

08:49 LH: It would be really nice to get a sense of the progression from a mechanical to electrical to computer solutions to providing a voice for people who can’t speak.

09:01 BM: I’m not sure whether that was actually a smooth transition from mechanical systems like [09:09] ____ to the first electrical ones. I only know that, all of a sudden, that’s how it looks. My name is Bernd Möbius. I am the Professor of Phonetics and Phonology at Saarland University. In the 1930s, there was an electrical system around, the so-called Voder, did by Homer Dudley, that was demonstrated at the World Fair in New York, I believe in 1937.

09:35 S1: For example, Helen, will you have the Voder say, “She saw me”?

09:41 Speaker 21: She saw me.

09:42 S1: That sounded awfully flat, how about a little expression? Say the sentence in answer to these questions. “Who saw you?”

09:49 S2: She saw me.

09:51 S1: Whom did she see?

09:52 S2: She saw me.

09:55 S1: What did she, see you or hear you?

09:57 S2: She saw me.

09:59 BM: During the demonstration at the World Fair, there was a female operator of the system who played the device a little bit like a church organ.

10:09 S1: About how long did it take you to become an expert in operating the Voder?

10:12 Speaker 22: It took me about a year of constant practice. This is about the average time required in most cases.

[music]

10:23 S2: She saw me. Who saw me? She saw me. She saw me. Who saw me? She saw me.

10:37 JT: We have to go back to the or is the floor next to the top, the top floor.

10:42 LH: I’m now just getting into an elevator, which probably I can talk to. So, does it speak English?

10:47 JT: Hopefully, yes.

10:51 S2: Okay. Hello, elevator. It doesn’t say hello back.

10:58 JT: You must be patient with that. It’s a machine. Maybe with German.

11:01 S?: Hello [German]

11:01 Speaker 23: Hi there, where can I take you?

11:08 LH: The third floor. Third floor.

11:14 S2: Okay, I’m bringing you to the third floor. Bye, bye.

11:18 LH: Bye now.

11:19 S1: 19. Rules to control a low-dimensionality articulatory model, by Cecil Coker, 1968.

11:28 S2: [11:28] ____. You are listening to the voice of a machine.

11:39 Speaker 24: I’m Eva Lizotte [11:39] ____, and I’m a PhD student and working in articulatory synthesis. The actual situation right now, is that, it’s very hard to simulate women’s voices ’cause they have slightly different characteristics and if you just tune up the F0, the fundamental frequency or the pitch of the voice, it starts sounding really artificial and what you actually have to do, you have, also to alter the articulation. So when “ah”, when I or when we speak an “ah,” it’s different from a male long vocal tract “ah.” So, you have… You cannot easily interpolate the articulation.

12:19 LH: Because of course it’d be awful for women not only to be using a speech synthesizer, but then, to be coming out with a man’s voice.

12:25 S2: Yeah.

[laughter]

12:26 LH: I mean, that would constitute… That would be a real loss of identity.

12:29 S2: Yeah. Exactly.

12:31 Speaker 25: This is result of trying to imitate a female voice by increasing the pitch.

[music]

12:37 S1: 24, the first full text-to-speech system, done in Japan by Noriko Umeda et al., 1968.

12:47 S5: Once upon a time, there lived a king and queen who had no children.

12:55 S1: But I think it’s also important to think of children for example, growing up and of course at the beginning to speak with an adult’s voice, even the sex would be the same, would be awful I think…

13:08 LH: Definitely very important just for making friends. It’s gonna be very hard for a child speaking with an adult’s voice to actually communicate with kids of their own age.

13:17 S2: Yeah.

13:18 JT: But at the moment we don’t know very much about the speaking voice of children becoming adults, for example. What’s really happening during the maturation of the vocal folds.

13:29 LH: So, the aim is to create speech machines which can grow up with somebody.

13:32 JT: That would be really nice. Then you would have shown real knowledge about what’s going on in your voice during life span, at least, of a first say, 20 years or so.

[music]

13:47 S1: 21, sentence-level phonology incorporated in rules by Dennis Klatt, 1976.

13:55 Speaker 26: It was the night before Christmas, when all through the house, not a creature was stirring. Not even a mouse.

14:04 LH: Can you see that people who don’t maybe know, who Dennis Klatt is, could you put him in context?

14:09 JT: Yeah, he’s definitely one of the pioneers of speech synthesis, in the technological sense, but also in providing an interface for non-experts who could basically type in text and get synthetic speech out of the system, which wasn’t possible before I think.

14:27 S2: Before Klatt, you would actually have to be a specialist in order to be able to input what you wanted to say.

14:33 JT: Exactly.

14:33 LH: Okay. Laura can you hear me?

14:36 S2: I can hear you. Can you hear me?

14:37 LH: Yes. I’ve got you. That was fantastic. This is Dr Laura Fine, the daughter of Dennis Klatt. Dennis Klatt is really the father of the modern speech machine. He created DECtalk, the system which takes text, inputted by the user and turns it into speech. Dennis Klatt also produced the definitive history of speech devices which includes a collection of recordings of each device throughout the 20th century.

15:01 S2: He really was interested in making a natural and intelligible system. So, the most important qualities of a speech synthesis system are really the naturalness and the intelligibility. And he was very much interested in making those of high quality. One of the unique contributions was that, he used not only his understanding from an engineering standpoint and a speech production standpoint, but he also asked for analysis with perception data. How do people interpret speech and what is it in the listener that helps them determine, is this a child, is this a female, is this a male? What cues are important? And that really helped him to make an intelligible system that incorporated different age speakers and different genders.

[music]

15:47 S6: Do I sound like a boy or a girl?

15:51 S2: My mother came across this drawing that my father made of the different speakers. In the center, we have Perfect Paul. This is a picture of my father.

16:01 Speaker 27: I am Perfect Paul, the standard male voice.

16:04 S2: And then, this is beautiful Betty which is the standard female voice. And that is a picture that he drew of my mother.

16:13 Speaker 28: I am beautiful Betty, the standard female voice. Some people think I sound a bit like a man.

[laughter]

16:22 S2: This is Kit the kid, who’s a 10-year old child. So, this is a picture of me.

16:27 Speaker 29: My name is Kit the kid and I am about 10-years old.

16:31 S2: With my nice short hair cut, as a child.

16:33 LH: Oh, is that you?

16:34 S2: I was a lab rat. As a child, I spent a lot of time at MIT. My father had a candy drawer. I spent hours with him at MIT, in his laboratory and he took snippets of my voice and that helped to develop the child’s voice.

16:51 LH: I love that they’re called the DECtalk gang.

16:54 S2: The DECtalk gang.

16:55 LH: That is a great… That is a great title.

16:57 S2: So, there was my father in later years and underneath the caption says, Huge Harry. Kind of older gentleman’s voice.

17:04 S9: I am Huge Harry, a very large person with a deep voice. I can serve as an authority figure.

17:12 LH: Laura, I have to tell you something, Perfect Paul, sounds just like my dad.

17:17 S2: I mean, I think that’s amazing.

17:18 LH: Is Perfect Paul based on your father’s voice?

17:21 S2: Yes.

17:22 LH: Which therefore means that, my father is actually speaking with your father’s voice.

17:27 S2: It’s amazing, he would be so, so thrilled.

17:30 LH: I think, one of the things that strikes me about your father is his humanity and that he was obviously an amazing scientist, who managed to do something that has had a very profound impact on people’s day-to-day lives. And but also that he had quite a sense of humour.

17:45 S2: He did.

[chuckle]

17:47 LH: Is it true that he gave his synthesizer the ability to sing, “Happy birthday to you”?

17:53 S2: He did.

17:54 S2: Happy birthday to you. Happy birthday to you. Happy birthday dear…

18:03 S2: One of the ironies is, as a 40-year old man, he began to be somewhat hoarse, because he had thyroid cancer. And, he had had a thyroidectomy, but his vocal cords were affected by the disease. And so, he spoke in later years with a raspy voice. And I think he understood all too well your father’s challenges in terms of communication.

18:29 LH: So, he had a real sense himself of what it would actually be like to find that you had no voice.

18:36 S2: Yes, my father unfortunately passed away at age 50, way too young. And he knew that he had a terminal illness really, when I was quite young. He knew that he would not be around perhaps to see me graduate from college. But he was always so optimistic. I think it’s been such an amazing experience for me to talk to you about how your father’s life has been transformed by my father’s research. And I had never really thought before that my father’s voice lives on.

[music]

19:11 S1: 33, The Klattalk system by Dennis Klatt of MIT which formed the basis for Digital Equipment Corporation’s DECtalk system, 1983.

19:24 S2: According to the American Speech and Hearing Association, there are over one million people in the United States who are unable to speak for one reason or another.

19:37 Speaker 30: I will show you the way that you can write using my eyes.

19:41 Speaker 31: At first, when people meet me as someone who is unable to speak, they’d seem to assume that you have some form of mental deficiency.

19:49 S3: I will show you the way that you can write using my eyes.

19:52 LH: This is [19:53] ____ Michael Cubis. And Michael lost his voice from a stroke some years ago.

19:56 Speaker 32: Some people will talk to me as if I have a learning disability. I find this quite funny as some of them [20:02] ____ the most ridiculous way. Some of them catch on fairly fast and realize that I’m perfectly sane. Others continue to act this way though, which is funny and completely bizarre.

[music]

20:20 S3: People are quite anxious about how to approach someone with a disability. And that’s what Michael does, he puts people at their ease. So, it is easy to communicate with him.

20:30 LH: Mick Donegan’s speciality is eye gaze technology, and that means, using the movements of the eye in order to generate text, which can then be turned into speech. Could you explain a bit more to us about gaze control, about the kind of technology that we have just had a conversation with Michael [20:49] ____?

20:50 S3: It’s a system, it’s based on a very powerful camera system combined with low level infra-red lights. The actual technology has been around probably two or three decades, but the significant change that’s happened this century, is that systems began to cope with significant involuntary movement. That means that the significant numbers of people with cerebral palsy, for example, who have involuntary movement, suddenly that group of people were able to use the system. People with MS who have involuntary movement.

[music]

21:23 S1: 11, The DAVO articulatory synthesizer developed by George Rosen at MIT, 1958.

21:31 S4: A, B, C, D, E, F, G, H, I, J, K…

21:36 S3: When I first tried Michael with eye gaze technology, we used just a lower case system and Michael was very unhappy about that. He was insistent that I put capital letters, full stops, commas, semicolons, because it’s really important for him to show everyone that he’s a fully literate guy who is able to speak independently and in the highest literacy level.

21:56 S4: When we know our A, B, C…

22:02 LH: Mick, I wonder if you could tell us a bit about how you see the future of this technology developing?

22:07 S3: I’ve just finished being an advisor for a European project on brain-computer interface and disability. And for me, that’s a technology that excites me because for those people who are completely locked in, who can’t even move their eyes, then there is no other way to go, other than to use a brain computer interface. At the moment, you know it’s kind of inconvenient, because for the best signal… Well, in fact, for the best signal, you need an implant. But the second best signal [chuckle] is to actually wear a cap and for that [22:31] ____ gel on it, etcetera. But there are various dry caps being developed that have a reasonable signal as I understand it.

22:39 LH: I’m always asked how to talk to my father, and it would be great to know what advice you would give to people who are not familiar with speech machines, but who would like to have a conversation with you?

22:49 Speaker 33: I would ask them not to ask long questions and be patient because it can take a long time to answer. Also, please bear in mind that it can be very tiring for those using speech output devices.

[music]

23:06 Speaker 34: The question of whether I would change my voice given the opportunity is a difficult one. And I suddenly have an opportunity.

23:14 LH: This is acclaimed film-maker, Simon Fitzmaurice, who has lost his voice through MND.

23:20 S3: This voice, my voice is a generic one that came with the computer, turning an Irish man into an American overnight. But it has become my voice.

23:33 S?: Yeah. This is actually something that we have in mind as a real application for people who know that there’s a chance that they will lose their voice to record themselves. Such that the experts will be able to build a speech synthesiser that has that person’s voice.

23:51 S3: There are two key issues, and the question of changing my voice. What I think about my voice, and what those closest to me think and feel about my voice? And I can tell you what my children feel straightaway. They find the idea of me changing my voice completely abhorrent. Just recently, I was testing out another computer, when I glimpsed out of the corner of my eye, my two little boys standing outside the door, their heads close together whispering… They are four and six years of age. They are whispering and looking in my direction. It turns out they are discussing the strange voice coming out of this different computer. Later, back on my own computer, it’s bedtime and right my six-year old comes to give me a kiss, I type up “Goodnight” on my screen. “No. Say it.” I say it, “Goodnight.” He turns to his brother at the door, “You see, I told you. It’s the same.” Someone’s voice is part of their identity, integral to their perceived makeup, it’s funny though, I feel less protective of my computer voice than others, probably because my voice inside my head is what is familiar to me, my thoughts, not the voice that expresses them.

25:20 S3: Recently, I came across a video on YouTube, of a doctor in Sweden with motor neuron disease and there it was, my voice out of someone else’s computer, identical. It was a little unnerving. So, I decided to see if I could get some semblance of my old spoken voice back, uniquely mine. I’ve been working with a company in Edinburgh, CereProc, the world leaders in synthetic speech who have built a synthetic voice out of old recordings of my spoken voice. I was lucky enough to have a recording of me reading some of my poetry and other recordings. However, because of the lack of data in comparison to someone who would deliberately bank their voice, my synthetic voice is limited by the amount of original material. As a solution, CereProc are now in the process of using my father’s voice as a similar source from which to fill in the missing DNA and to build a harmonious rounded voice.

26:23 Speaker 35: Harmonious rounded voice. I await the results.

26:27 S3: I await the results.

26:27 S3: So, the question remain…

26:29 S3: The question remains…

26:30 S3: Will I change my voice?

26:31 S3: Will I change my voice. And more importantly…

26:34 S3: Will my children allow it?

26:36 S3: Will my children allow it?

[music]

26:40 S1: 30, The MIT MITalk system by Jonathan Allen, Sheri Hunnicutt, and Dennis Klatt, 1979.

26:49 Speaker 36: Speech is so familiar, a feature of daily life that we rarely pause to define it.

26:56 S1: End of the demonstration. These recordings were made by Dennis Klatt, on November 22nd 1986.

27:04 LH: Amazingly, we’ve progressed from Von Kempelen’s 18th century machine which had a limited vocabulary to being able to recreate the exact voice that was lost and give it expression, meaning and modulation in a way that mimics the naturally produced voice. Soon, speech technology users will be able to make their voices smile.

27:26 S1: Klatt’s Last Tapes was presented by Lucy Hawking.

27:29 S6: Do I sound like a boy or a girl?

27:31 S?: The recordings were made available by the Acoustical Society of America.

27:35 S4: A, B, C, D, E, F…

27:37 S?: The sound design was by Nick Romero.

27:40 S7: How are you? I love you.

27:43 S?: It was produced by Julian Mayers.

27:45 S8: Ha-ha-ha.

27:46 S?: It was a Sweet Talk production for BBC Radio 4.

27:51 S2: Thank you for listening and good luck on all your cosmic journeys.

28:01 S1: I’m a bit concerned about that last bit, but while I’ve still got a job, I’ll introduce Peter White to tell us about You and Yours in half an hour. Peter.

28:07 Speaker 37: Yeah. We’re pretty concerned up here too. It’s claimed over 200,000 people who lost money when the life assurance company, Equitable Life, collapsed 10 years ago, could end up with no compensation at all. The Public Accounts Committee has blamed the Treasury for not getting a grip on the scheme. We’ll be looking at what can be done before the current deadline runs out, next spring. Wales, has cut its use of carrier bags by a massive three-quarters by imposing a charge. England still says, “It’s not ready… ”

 

Photo Credit: Attribution Some rights reserved by lwpkommunikacio
