Reference resolution in the wild: How addressees circumscribe referential domains in real time comprehension during interactive problem-solving

Sarah Brown-Schmidt, M. Ellen Campana & Michael K. Tanenhaus
University of Rochester

sschmidt@bcs.rochester.edu

 

Although the generation and interpretation of definite reference have played a central theoretical and empirical role in sentence processing research, little is known about how addressees interpret referential expressions in natural interactive conversation.  To examine this process, we monitored eye movements as pairs of participants, separated by a curtain, worked together to arrange blocks in matching configurations and to confirm those configurations.  We varied the color, size, and orientation of the blocks to encourage the use of complex NPs and grounding constructions.  Boards were partially covered, creating five distinct sub-areas.  Initially, sub-areas were dotted with stickers that represented blocks.  The task was to replace each sticker with a matching block.  While subjects' boards were identical with respect to the sub-areas, subjects' stickers differed: every place that one subject had a sticker, the other subject had an empty spot, and vice versa.  Pairs were instructed to tell each other where to put blocks so that, in the end, their boards would match.  The entire study lasted approximately 2.5 hours; one person was eye-tracked, and the other had the primary microphone.

Participants typically took turns instructing one another about block placement, with conversation focusing primarily on the task.  All of the pairs who have participated thus far showed a similar approach to the task: pairs tended to work on one sub-area at a time, and moved to a new sub-area only when all the blocks in the previous sub-area had been placed.  Conversational grounding devices (and other pragmatic constraints) strongly restricted reference resolution to the current sub-area of conversation.  Of the 155 definite references analyzed, only 48% were specific enough to disambiguate the target referent with respect to the other blocks in the relevant sub-area.  For this subset of utterances, eye movements were closely time-locked to the point of disambiguation (POD) in the utterance.  The proportion of looks to the target was significantly higher in the 300 ms after the POD than in the 300 ms before it, replicating previous results with scripted instructions [1].  Most remarkably, ambiguous utterances elicited significantly more looks to the target than unambiguous utterances.  Moreover, fixations were primarily restricted to the referent shortly after onset of the definite reference.  These results suggest that (1) speakers systematically use less specific utterances when the referential domain has been pragmatically constrained; (2) the attentional states of speakers and addressees become closely tuned; and (3) utterances are interpreted with respect to referential domains circumscribed by pragmatic constraints.
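The comparison of looks to the target before and after the POD can be illustrated with a minimal sketch.  This is not the authors' analysis code: the Fixation record, the function names, and the choice to weight each fixation by its duration inside the window are all assumptions made for illustration, and the fixation times in the example are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fixation:
    """Hypothetical fixation record; times are in ms from utterance onset."""
    start_ms: float
    end_ms: float
    on_target: bool  # True if the fixated block is the target referent


def proportion_target(fixations: List[Fixation],
                      window_start: float,
                      window_end: float) -> Optional[float]:
    """Share of fixated time inside [window_start, window_end) spent on the target.

    Returns None when no fixation overlaps the window.
    """
    total = 0.0
    target = 0.0
    for f in fixations:
        # Duration of this fixation that falls inside the window.
        overlap = min(f.end_ms, window_end) - max(f.start_ms, window_start)
        if overlap <= 0:
            continue
        total += overlap
        if f.on_target:
            target += overlap
    return target / total if total > 0 else None


def pod_windows(fixations: List[Fixation],
                pod_ms: float,
                width_ms: float = 300.0) -> Tuple[Optional[float], Optional[float]]:
    """Target proportions in the windows just before and just after the POD."""
    before = proportion_target(fixations, pod_ms - width_ms, pod_ms)
    after = proportion_target(fixations, pod_ms, pod_ms + width_ms)
    return before, after


# Invented example: the addressee shifts to the target 100 ms before the POD.
example = [Fixation(0, 400, False), Fixation(400, 900, True)]
before, after = pod_windows(example, pod_ms=500)
print(before, after)
```

A statistical test across utterances would then compare the per-utterance before and after values; that step is omitted here because the abstract does not specify which test was used.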

 

Reference

[1] Eberhard, K.M., Spivey-Knowlton, M.J., Sedivy, J.C. & Tanenhaus, M.K. (1995).  Eye-movements as a window into spoken language comprehension in natural contexts.  Journal of Psycholinguistic Research, 24, 409-436.