Crowdsourcing.org The Industry Website

The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources

Summary
Augmentative and alternative communication (AAC) devices enable users with certain communication disabilities to participate in everyday conversations. Such devices often rely on statistical language models to improve text entry by offering word predictions. These predictions improve when the language model is trained on data that closely reflects the style of the users' intended communications. Unfortunately, there is no large dataset consisting of genuine AAC messages.

Description
In this paper, the researchers demonstrate how to crowdsource the creation of a large set of fictional AAC messages. They show that these messages model conversational AAC better than the currently used datasets based on telephone conversations or newswire text. They then leverage the crowdsourced messages to intelligently select sentences from much larger sets of Twitter, blog, and Usenet data. Compared to a model trained only on telephone transcripts, their best-performing model reduces perplexity on three test sets of AAC-like communications by 60–82% relative. This translates to potential keystroke savings of 5–11% in a predictive keyboard interface.
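
For readers unfamiliar with the metric, the idea of a relative perplexity reduction can be illustrated with a minimal sketch. It uses a toy add-one-smoothed unigram model and invented sentences, not the researchers' models or data; the function names and all sample sentences are assumptions made purely for illustration.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count word frequencies (a toy stand-in for a real n-gram model)."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    return counts, sum(counts.values())


def perplexity(model, sentences):
    """Perplexity of an add-one-smoothed unigram model on test sentences."""
    counts, total = model
    vocab_size = len(counts) + 1  # one extra slot for unseen words
    log_prob = 0.0
    num_words = 0
    for s in sentences:
        for word in s.lower().split():
            prob = (counts[word] + 1) / (total + vocab_size)
            log_prob += math.log2(prob)
            num_words += 1
    return 2 ** (-log_prob / num_words)


def relative_reduction(baseline, improved):
    """Relative perplexity reduction, in percent, against a baseline."""
    return 100.0 * (baseline - improved) / baseline


# Invented sentences (assumptions, not data from the paper).
out_of_domain = ["the market closed higher on friday",
                 "officials announced the new budget"]
in_domain = ["i am thirsty can i have water",
             "can you help me please"]
test_set = ["can i have help please"]

ppl_out = perplexity(train_unigram(out_of_domain), test_set)
ppl_in = perplexity(train_unigram(in_domain), test_set)
print(f"out-of-domain perplexity: {ppl_out:.2f}")
print(f"in-domain perplexity:     {ppl_in:.2f}")
print(f"relative reduction:       {relative_reduction(ppl_out, ppl_in):.1f}%")
```

Lower perplexity means the model finds the test text less surprising, which is why training data that matches the style of the target messages tends to yield better word predictions.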
