Language is a fundamental part of the human experience. It’s what separates humans from other animals. But since the 1950s, computer scientists have been trying to make computers capable of understanding language as well. A series of breakthroughs using machine learning, many over the last decade, have helped computers leap forward in the dissection of human text and speech. Now, a major question for technology researchers is the extent to which they can teach machines to truly understand language as well as humans – and take action.
Florida State University computer science doctoral candidate Daniel Bis is among those imbuing computers with the power to recognize, process, and respond to human speech and text. Bis, who also has an internship working on Amazon’s virtual assistant, Alexa, said it takes teams of people with a variety of backgrounds, working in a field called natural language processing, or NLP, to make that happen.
“The field is very broad but, in general, it’s about making computers more adept at human language,” Bis said. “There’s a lot of people with backgrounds in linguistics who focus on grammar and formal structures of language whereas we, as computer scientists, are focused on more data-driven approaches.”
Bis is largely concerned with a core component of NLP known as language representation, which involves developing mathematical models that help computers predict the meaning of one word using context gleaned from other information. Google searches, suggestions to auto-complete sentences in emails, and dictation software that turns speech into text are common applications of language models.
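To make the idea of predicting a word from its context concrete, here is a minimal sketch of a toy auto-complete model in Python. It is not the method Bis or Amazon uses; it is a simple bigram counter, and the function names and the three-sentence training text are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical training text standing in for a large collection of emails.
corpus = [
    "thank you for your email",
    "thank you for your help",
    "thank you for the update",
]
model = train_bigrams(corpus)
print(suggest_next(model, "thank"))  # you
print(suggest_next(model, "for"))    # your
```

Modern language models replace these raw counts with learned numerical representations of words, but the task is the same: use the surrounding words to guess what comes next.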
NLP, of course, applies to his work on Amazon’s Alexa, which is considered to have the broadest range of capabilities among the virtual assistants at the top of the market. With more than 100,000 skills, Alexa has the capacity to control lights, cameras, speakers, garage doors, smart locks, security systems, and home appliances and is compatible with about 7,400 household brands.
Bis and his teammates work to help “disambiguate” user responses. For example, if two people were to say, “play the song ‘Hello,’” that could mean either the song by Lionel Richie or the one by Adele.
“What we do is work under the hood to try to figure out which artist you meant,” Bis said.
The team also works on fixing errors in the speech recognition system. Maybe there was background noise, and the word wasn’t transcribed correctly. But based on the context of the interaction, the system could conceivably adjust, without needing the user to repeat the request.
Bis said he’s applying principles he learned as an undergraduate, and is now learning as a graduate student, at Florida State to the Alexa project. His interest in the field began when he discovered IDEA Grants from the Center for Undergraduate Research and Academic Engagement, or CRE, which fund student research and creative projects during the summer.
“IDEA Grants allow students to be the drivers of their research experience, crafting their own questions, managing their timeline, and forming their own conclusions,” said Latika Young, CRE director. “Of course, as undergraduate researchers, these students are still guided by faculty mentors, but the IDEA Grants enable students to transition from acting as a research assistant to a proper researcher in their own right.”
Bis asked Xiuwen Liu, chair of the Department of Computer Science and now his doctoral adviser, to help oversee the IDEA Grant. Liu steered him toward language processing as a growing field with interesting problems for the proposal. Bis won the 2018 Nancy Casper Hillis and Mark Hillis Undergraduate Research Award and received $4,000 to conduct his research.
“I worked on a text summarization tool, which would take a longer news article and automatically generate a summary. Our approach belonged to a family of abstractive summarizers, which are designed to generate novel sentences. It worked out pretty well, but had some shortcomings, such as occasional factual inconsistency, a widely studied issue in the field currently,” he said.
More importantly, Liu said, the grant that supported Bis’ work on summarization laid the foundation for his focus on helping computers find meaning through context.
Bis started working on helping computers disambiguate, or correctly define a word with multiple meanings, while he wrote two papers as an undergraduate. For those papers, he focused on biomedical information, in collaboration with the FSU College of Communication and Information’s School of Information, which offers training in health informatics.
As applied to health informatics, disambiguation could make searching research documents easier for health professionals, as the word “cold” may mean just a common cold or be the acronym for chronic obstructive lung disease.
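The “cold” example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the approach taken in Bis’ papers: it picks whichever meaning shares the most cue words with the sentence, and the cue-word lists and the function name are hypothetical.

```python
# Hypothetical cue words for each sense; a real system would learn these from data.
SENSES = {
    "common cold": {"virus", "sneezing", "cough", "runny", "nose", "sore", "throat"},
    "chronic obstructive lung disease": {
        "chronic", "airflow", "smoking", "spirometry", "emphysema", "lung",
    },
}

def disambiguate_cold(sentence):
    """Pick the sense of "cold" whose cue words overlap most with the sentence."""
    # Crude tokenization: lowercase, drop commas and periods, split on spaces.
    cleaned = sentence.lower().replace(",", " ").replace(".", " ")
    words = set(cleaned.split())
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    # With no overlapping cue words there is no evidence either way.
    return best if scores[best] > 0 else None

print(disambiguate_cold("The patient with COLD had reduced airflow on spirometry."))
print(disambiguate_cold("She caught a cold and has a cough and a runny nose."))
```

The first sentence matches the lung-disease cues and the second matches the common-cold cues. A sentence with no cue words at all, such as “It is cold outside,” returns no answer.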
“Most of human knowledge exists only in text form. Whether it’s the law or medical research, it’s in documents, the trove of which is only growing as more is published,” Liu said. “It’s not possible for a single human to read it all, let alone actually understand it. One way to overcome that problem is to use language models to help us.”
Bis said he likes having the ability to directly impact customers, being involved with engineering, and solving practical problems. He’s leaning toward taking research positions in the technology industry or in applied science after completing his doctorate at FSU.