A Learnability Analysis of Argument and Modifier Structure
Leon Bergen, Edward Gibson, Timothy O'Donnell
June 2015

We present a computational learnability analysis of the argument-modifier distinction, asking whether information present in the distribution of constituents in natural language supports the distinction and its learnability. We first develop general models of those aspects of argument structure and the argument-modifier distinction which have effects on the distribution of constituents in sentences—abstracting away from many of the implementational details of specific theoretical proposals. Combining these models with a theory of learning based on succinctness, we define two systems: the argument-only (PTSG) model and the argument-modifier (PSAG) model.

We first show that the argument-modifier (PSAG) model is able to recover the argument-modifier status of many individual constituents when evaluated against a gold standard. This provides evidence in favor of our general account of argument-modifier structure, and establishes a lower bound on the amount of information that natural language input can provide to appropriately equipped learners attempting to recover the argument-modifier status of individual constituents.

We then present a series of analyses investigating how and why the argument-modifier (PSAG) model is able to recover the argument-modifier status of some constituents. In particular, we show that the argument-modifier (PSAG) model is able to provide a more succinct description of the input corpus than the argument-only (PTSG) model, both in terms of lexicon size and in terms of the complexity of individual derivations—on the training data as well as on a novel heldout dataset. Intuitively, the argument-modifier (PSAG) model is able to learn a more compact lexicon with more generalizable argument structures because it is able to “prune away” spurious modifier structure. These analyses further support our general account of argument-modifier structure and its learnability from naturalistic input.
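The succinctness intuition can be sketched as a toy minimum-description-length comparison. The sketch below is purely illustrative: the fragments, the derivations, and the uniform coding scheme are assumptions made up for this example, not the paper's actual PTSG/PSAG models or their priors. It shows only the bookkeeping idea that factoring modifiers out of lexical fragments can shrink the lexicon by more than it lengthens the derivations.

```python
import math

def description_length(lexicon, derivations):
    """Total cost in bits under a deliberately simplistic code:
    each lexical entry is encoded symbol-by-symbol with a uniform
    code over the symbol alphabet, and each derivation is encoded
    as a sequence of uniform choices from the lexicon."""
    symbols = {s for entry in lexicon for s in entry}
    sym_cost = math.log2(len(symbols)) if len(symbols) > 1 else 1.0
    lexicon_cost = sum(len(entry) * sym_cost for entry in lexicon)
    choice_cost = math.log2(len(lexicon)) if len(lexicon) > 1 else 1.0
    derivation_cost = sum(len(d) * choice_cost for d in derivations)
    return lexicon_cost + derivation_cost

# Hypothetical toy corpus of three verb phrases. The "argument-only"
# style lexicon must store whole fragments with modifiers baked in;
# the "argument-modifier" style lexicon factors the modifiers out
# and reuses one small core fragment.
arg_only_lexicon = [("eat", "NP"),
                    ("eat", "NP", "quickly"),
                    ("eat", "NP", "at-noon")]
arg_only_derivs = [[0], [1], [2]]        # one big fragment per sentence

arg_mod_lexicon = [("eat", "NP"), ("quickly",), ("at-noon",)]
arg_mod_derivs = [[0], [0, 1], [0, 2]]   # core fragment reused, modifiers attached

print(description_length(arg_only_lexicon, arg_only_derivs))  # larger total
print(description_length(arg_mod_lexicon, arg_mod_derivs))    # smaller total
```

In this toy setting the factored lexicon pays a small extra derivation cost (attaching a modifier is an extra choice) but saves more on lexicon cost, so its total description length is lower — the same trade-off the analyses above measure on real corpora.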

We conclude with a discussion of the generality of our approach and the role of such computational learnability analyses in the study of grammar.
Format: [ pdf ]
Reference: lingbuzz/002502
(please use that when you cite this article)
Published in: Under revision
keywords: argument structure; language learning; grammatical representation; probabilistic models; Bayesian nonparametrics; syntax
previous versions: v2 [May 2015]
v1 [May 2015]
