Methods of assessing argument structure preferences: Sentence completion versus argument structure estimation

Ann Bunger & Michael Walsh Dickey
Northwestern University

The frequency with which particular verbs are used in different syntactic contexts has been shown to be an important factor in modelling on-line processing performance (Trueswell, 1996; Garnsey et al., 1997, among others). The standard methods for assessing such frequencies have been either sentence-completion studies or counts collected from large natural-language corpora. This paper presents a new methodology, argument structure estimation, for assessing the frequency with which verbs are used in differing syntactic frames, and compares it with a standard methodology, sentence completion.

Two groups of participants drawn from the same population were asked to assess the transitivity bias of optionally transitive verbs like "type" and "phone". The two groups performed different tasks. The first group were presented with sentence fragments containing the verbs and asked to complete them in a standard sentence completion task:

When Alex phoned ...

The second group of participants were presented with a list of verbs and asked to estimate "how many phrases" each verb occurred with in its most frequent use. Relevant phrases were defined as those describing "actors and entities that play a role in the event the verb describes." Participants circled the number of phrases on a scale from 1 to 4. In both tasks, the target verbs were interspersed among a number of fillers with varying argument structures.
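As an illustration of how per-verb norms might be derived from the two tasks, the following Python sketch computes a transitivity proportion from hand-coded completions and a mean phrase count from the 1-to-4 ratings. The scoring scheme (proportion of completions containing a direct object), the function names, and all data values are assumptions made for this example; the abstract does not specify how responses were coded.

```python
from collections import defaultdict
from statistics import mean


def completion_bias(responses):
    """Proportion of transitive completions per verb.

    `responses` is an iterable of (verb, is_transitive) pairs, where
    is_transitive is a hand-coded judgment (an assumed coding scheme).
    """
    by_verb = defaultdict(list)
    for verb, is_transitive in responses:
        by_verb[verb].append(1.0 if is_transitive else 0.0)
    return {verb: mean(codes) for verb, codes in by_verb.items()}


def estimation_mean(ratings):
    """Mean circled phrase count per verb on the 1-to-4 scale."""
    by_verb = defaultdict(list)
    for verb, count in ratings:
        if not 1 <= count <= 4:
            raise ValueError(f"rating out of range for {verb!r}: {count}")
        by_verb[verb].append(float(count))
    return {verb: mean(counts) for verb, counts in by_verb.items()}


# Invented responses, for illustration only.
completions = [
    ("phone", True), ("phone", False), ("phone", True), ("phone", True),
    ("type", False), ("type", True),
]
ratings = [
    ("phone", 2), ("phone", 2), ("phone", 1), ("phone", 3),
    ("type", 1), ("type", 2),
]

bias = completion_bias(completions)
estimates = estimation_mean(ratings)
```

Each verb then has one value per method, which is the form needed for a verb-by-verb comparison of the two norms.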

Correlation analyses performed on the resulting means showed that the two measures were strongly correlated for verbs with mid to high token frequency in the CELEX database: r² = 0.751 for verbs with past-tense token frequencies between 1000 and 2000, r² = 0.608 for verbs with a token frequency of 2000 or more. However, the two measures were relatively poorly correlated for verbs with low token frequencies: r² = 0.302 for verbs with frequencies of 200 or less. Regression analyses revealed that variation in the sentence completion norms was the source of this discrepancy. Token frequency was a reliable predictor of sentence completion values (p<.02), but not argument structure estimates (p<.5).
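A banded correlation analysis of this kind can be sketched in Python as follows, assuming the reported values are squared Pearson correlations. The band boundaries follow the frequency ranges given above, with the mid band treated as 1000 up to but not including 2000 (an assumption, since the stated ranges meet at 2000). The field names and the toy rows are invented for illustration and are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def r_squared_by_band(rows, bands):
    """Squared correlation between the two norms within each frequency band.

    `rows` holds dicts with keys "freq", "completion", "estimate".
    `bands` maps a label to (low, high); low is inclusive, high is
    exclusive, and high=None means no upper limit.
    """
    result = {}
    for label, (low, high) in bands.items():
        subset = [r for r in rows
                  if r["freq"] >= low and (high is None or r["freq"] < high)]
        r = pearson_r([s["completion"] for s in subset],
                      [s["estimate"] for s in subset])
        result[label] = r * r
    return result


# Bands modeled on the ranges in the text; the exact edges are an assumption.
BANDS = {"low": (0, 201), "mid": (1000, 2000), "high": (2000, None)}

# Invented rows, for illustration only.
toy_rows = [
    {"freq": 50, "completion": 0.2, "estimate": 1.4},
    {"freq": 120, "completion": 0.5, "estimate": 1.6},
    {"freq": 180, "completion": 0.3, "estimate": 2.0},
    {"freq": 1200, "completion": 0.4, "estimate": 1.6},
    {"freq": 1500, "completion": 0.6, "estimate": 1.9},
    {"freq": 1800, "completion": 0.8, "estimate": 2.2},
    {"freq": 2500, "completion": 0.3, "estimate": 1.5},
    {"freq": 4000, "completion": 0.7, "estimate": 2.1},
    {"freq": 9000, "completion": 0.5, "estimate": 1.8},
]

r2 = r_squared_by_band(toy_rows, BANDS)
```

Splitting by band before correlating makes it possible to see whether agreement between the two methods depends on token frequency, which is the pattern the analysis above reports.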

This comparison indicates that while norms drawn from the two methodologies are highly correlated, sentence completion norms may be more strongly influenced by the token frequency of the verbs being rated than argument structure estimation norms, particularly at low frequencies. This in turn suggests that argument structure estimation may provide more stable estimates of syntactic frame preferences, especially when looking at verbs with very low token frequencies.