Real-Time Understanding of Complex Discriminative Scene Descriptions

Ramesh Manuvinakurike, Casey Kennington, David DeVault and David Schlangen

Real-world scenes typically have complex structure, and utterances about them consequently do as well. We devise and evaluate a model that processes descriptions of complex configurations of geometric shapes and can identify the described scenes among a set of candidates, including similar distractors. The model works with raw images of scenes, and by design can work word-by-word incrementally. Hence, it can be used in highly responsive interactive and situated settings. Using a corpus of descriptions from game-play between human subjects (who found this to be a challenging task), we show that reconstruction of description structure in our system contributes to task success and supports the performance of the word-based model of grounded semantics that we use.

Supporting Spoken Assistant Systems with a Graphical User Interface that Signals Incremental Understanding and Prediction State

Casey Kennington and David Schlangen

Toward incremental dialogue act segmentation in fast-paced interactive dialogue systems

Ramesh Manuvinakurike, Maike Paetzel, Cheng Qu, David Schlangen and David DeVault

SIGdial 2016

17th Annual SIGdial Meeting on Discourse and Dialogue