Rewriting Speech Technology with Computational Linguist Vasundhara Gautam

What happens when we harness the powers of both AI and linguistics?

Computational linguist Vasundhara Gautam shares why, for some voices, speech-recognition technologies "tend to fall flat on their faces," and why and how we should fix them.

Vasundhara is a computer science PhD student at Saarland University in Germany. Before this, xe worked for more than two years in industry as a Speech Recognition Engineer at Dialpad in Canada. Xe has a BSc in Computing Science and Linguistics from Simon Fraser University, where xe also worked at an experimental phonology lab and a discourse processing lab.

Celebrate LGBTQ+ STEM Day with us!

From computational linguists creating more accessible technology, to public health researchers supporting queer youth through COVID-19, #LGBTQSTEMDAY is an opportunity to highlight scientific and artistic voices making amazing contributions to STEAM (science, technology, engineering, art & design and math).

Vasundhara Gautam (VG): In Grade 11 and 12, I took French. And I had this French teacher who was like, “Okay, you have this homework and do not use Google Translate to do it.” And of course, all of us just went around and used Google Translate anyway. 

Science World (SW): That’s Vasundhara Gautam, a computational linguist. Vasundhara traces xyr career path back to this moment when xe started finding mistakes in the software’s translation.  

VG: I sort of very conceitedly thought to myself, “Wow, if I, a measly sort of person with like two months of French education can find mistakes that this really big expensive software that's run by this fancy company can't really pick up on, then surely that must mean that the software is just really bad and I should make it my life goal to fix it.”  

SW: In pursuit of this life goal to fix language-based software, Vasundhara attended courses at Simon Fraser University towards a joint major of computer science and linguistics.  

VG: And of course, the process of that degree was very much just figuring out that no, I was very wrong, it's actually not simple at all. And that's why computers find it hard to deal with language and we humans are just pretty good at it. 

SW: Vasundhara found xemself in awe of how much humans use language without really thinking about it. We communicate complex, sophisticated ideas and connect with one another using language, but we rarely stop to think: how does it actually work? When we throw computer technology into the mix, the questions become even more interesting.

VG: I have noticed that speech recognition tends not to really know what to do with me, because I have features of all of the different places that I've lived and grown up in my accent. And it tends to sometimes mess speech recognition up because I don't really correspond to a lot of the training data that it has. So, we bucket people and we label people with a small set of labels, and typically the labels are based on regions, so based on countries, and sometimes based on continents. But unfortunately, this just erases a lot of difference and a lot of variation that just exists in the world.  

SW: Vasundhara says that when it comes to speech recognition software, people tend to specialize either in computer science, such as machine learning and AI, or in linguistics, the science of language. This siloing of specialties leads to a lot of problems.

VG: I have a non-rhotic accent, so I say “car park,” which is essentially like not saying the “R”s. Whereas someone, my roommate Mady, for example, who grew up here, she would say “car park.” And so rather than bucketing us into, “This is a North American English accent, this is a UK English accent”--which is again not even true for me--it would make a lot more sense to just kind of have a feature-based approach where you say, “This person has a rhotic accent; this person has a non-rhotic accent,” and then go on to the next feature and look at it there. And so, when you put a bunch of people into a box, you kind of erase a bunch of the variation and you don't really get at, within this broad-strokes bucket that you have, who the people are that are being underserved. A lot of African American English speakers in the US are very underserved by these models, and speech recognition that claims to do well on US English voices just doesn't really do well on African American English because it has some slightly different rules.
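As a rough illustration of the feature-based approach Vasundhara describes, here is a minimal Python sketch. The speaker records, the field names and the `group_by` helper are all hypothetical, made up for this example; it is not code from xyr research or from any real speech-recognition system. It only shows how two speakers who share a regional label can still differ on a feature such as rhoticity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Speaker:
    """A hypothetical speaker record; every field here is illustrative."""
    name: str
    region_label: str  # the coarse regional bucket
    rhotic: bool       # True if the speaker pronounces the "r" in "car park"


def group_by(speakers, key):
    """Group speaker names by the value of one attribute."""
    groups = defaultdict(list)
    for speaker in speakers:
        groups[getattr(speaker, key)].append(speaker.name)
    return dict(groups)


# Invented speakers, not real people or real data.
speakers = [
    Speaker("speaker_a", "North American English", rhotic=True),
    Speaker("speaker_b", "North American English", rhotic=False),
    Speaker("speaker_c", "UK English", rhotic=False),
]

# Bucketing by region puts speaker_a and speaker_b in the same box,
# which hides the fact that they differ in rhoticity.
by_region = group_by(speakers, "region_label")

# Grouping by the feature itself puts speaker_b with speaker_c instead.
by_rhotic = group_by(speakers, "rhotic")

print(by_region)
print(by_rhotic)
```

In this toy data, the regional label hides the difference between speaker_a and speaker_b, while the rhoticity feature surfaces it. A fuller description would add more features, one at a time, as the quote suggests.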

SW: In speech recognition technology, these examples of underserved voices and accents are currently treated as edge cases or outsiders. The entire premise of Vasundhara’s work is to take these underserved voices from the edge and elevate them to a level of primary importance.  

VG: What I mean is, how do we get it to be more equitable? How do we get it to really do well on these edge cases like my voice, on edge cases like people who have a speech impediment or who are second language English speakers? People who, when they're talking to another human, it's really easy for the other human to figure out what they're saying, but computers just completely fall flat on their faces. And then what ends up happening is when you have a transcript of a conversation where someone's talking, you just get a lot more nonsense in the transcript that you can't really do a lot with. I've also seen people who these systems are kind of designed for who have almost perfect transcripts at the end of it, and that's kind of what I want all of us to be able to move towards and for all of us to be able to get.

SW: And LGBTQ+ STEM Day is an opportunity for Vasundhara to further spread this message.

VG: I think that we do just as a society have this mental image of what a scientist looks like. And unfortunately, it really doesn't look like me at all. Like, it looks almost as far away from me as you can possibly imagine. And so, I really like all of these different STEM days that kind of allow for us to really celebrate the diversity in scientists that actually exist in the world. You know, my queerness is a very central part of my identity. And it's just a great opportunity to go on the Internet and yell about it (laughs).