There's more to this than meets the eye

Gerry T. M. Altmann & Yuki Kamide
University of York

g.altmann@psych.york.ac.uk

 

Prior research has demonstrated that eye movements around a visual scene are closely time-locked to aspects of the mental processes implicated in sentence processing (e.g., Tanenhaus et al., 1995; Allopenna et al., 1998).  A range of data suggest that "listeners establish reference incrementally, rapidly integrating the information in the visual model with the speech" (Tanenhaus et al., 1996, p. 474).  In this paper, we describe data demonstrating that sentences are not mapped onto static representations of the concurrent visual input; rather, they are mapped onto interpreted, and dynamically changeable, representations of that input.  Participants were shown a visual scene containing an open box, an open briefcase, some books, and various other objects.  They were told simply to listen to each sentence and observe each picture.  We contrasted four conditions:

a. The boy will close the box.  Then, he will drop the books into the briefcase.

b. The boy will move the box closer.  Then, he will drop the books into the box.

c. The boy will close the briefcase.  Then, he will drop the books into the box.

d. The boy will move the briefcase closer.  Then, he will drop the books into the briefcase.

In each case, the concurrent scene (displayed throughout the two sentences) was identical, as was the fragment 'Then, he will drop the books'.  During the second sentence, we found that up until verb offset there was a very strong bias to look towards whichever container had been mentioned in the prior sentence.  In the 'closer' contexts, this bias was maintained during 'the books' (when the processor is anticipating a subsequent expression that will potentially refer to the Goal of the action denoted by the verb); there were more looks towards the 'closer' (previously mentioned) container than towards the other one.  Crucially, this bias was eliminated in the 'closed' contexts: the bias to look towards the previously mentioned (but 'closed') container was now accompanied by an equivalent tendency to look towards the container that had not been 'closed'.  Prior sentential context thus interacted with the concurrent visual input to allow the rapid integration of the target sentence with a mental representation of the concurrent visual scene that took account of the 'changes' to that scene introduced linguistically.  The target sentence was not, therefore, mapped onto the concurrent visual scene, but was mapped instead onto a linguistically mediated representation of that scene.  Our data demonstrate that the 'world' in the visual-world paradigm is not a visual world at all, but rather a mental world, potentially mediated by linguistic context, and potentially somewhat different from the actual world that forms the concurrent visual input.  Thus, 'what you see' is not 'what you get' when mapping language onto the 'visual' world.

 

References

Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998).  Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models.  Journal of Memory and Language, 38(4), 419-439.

Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995).  Integration of visual and linguistic information in spoken language comprehension.  Science, 268(5217), 1632-1634.
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1996).  Using eye movements to study spoken language comprehension: Evidence for visually mediated incremental interpretation.  In T. Inui & J. L. McClelland (Eds.), Attention and Performance XVI: Information integration in perception and communication.  Cambridge, MA: MIT Press/Bradford Books.