Phonological Knowledge and Perceptual Epenthesis
Bert Vaux, Bridget Samuels, Korlin Bruhn
February 2022

In a famous series of studies, Dupoux and his associates assert that Japanese speakers use epenthesis to break up illicit consonant clusters and resyllabify coda consonants in loanwords (e.g., Dupoux et al. 1999). The authors draw an explicit connection between perception and loanword adaptation, suggesting that adaptation happens at the perceptual level: that is, when Japanese listeners are confronted with, e.g., [sfɪŋks] 'sphinx', they perceive it as [sɯ̥.φiŋ.kɯ̥.sɯ̥]. In their analysis, phonotactic knowledge influences listeners so strongly that it creates a perceptual illusion: Japanese listeners judge there to be a speech segment (/u/) despite the absence of acoustic correlates in the signal, simply because their L1 phonology insists it should be there.

Peperkamp and Dupoux (2003) and Peperkamp (2005) refine this theory into what we refer to as the Phonetic Decoder model. In this model, a phonetic decoding module maps the speech signal, one word at a time, into a discrete phonetic representation that conforms to the L1 phonology; a phonological decoding module then maps this surface form onto an underlying representation. Accordingly, a sequence that is phonotactically illicit in the L1 cannot be mapped accurately: the phonetic decoder for L1 Japanese speakers cannot accommodate two adjacent consonants, so it places an empty vowel slot between them. This slot is carried over to the phonological mapping, where it is interpreted as the closest phonetic match.
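The two-stage mapping can be illustrated with a toy sketch. This is not an implementation from Peperkamp and Dupoux; it is a deliberately simplified illustration under several assumptions of our own: segments are single orthographic letters, every consonant must be followed by a vowel (ignoring the moraic nasal and geminates that Japanese permits), and the empty slot is always filled with "u". The names `phonetic_decode`, `phonological_decode`, and `perceive` are hypothetical.

```python
VOWELS = set("aeiou")
EMPTY = "_"  # placeholder for an empty vowel slot (illustrative notation)


def phonetic_decode(word):
    """Stage 1 (toy): force the input into a consonant-vowel template.

    Assumption: any consonant not followed by a vowel receives an empty
    vowel slot. Real Japanese phonotactics are richer than this.
    """
    out = []
    for i, seg in enumerate(word):
        out.append(seg)
        if seg in VOWELS:
            continue
        nxt = word[i + 1] if i + 1 < len(word) else None
        if nxt is None or nxt not in VOWELS:
            out.append(EMPTY)
    return out


def phonological_decode(segments, default_vowel="u"):
    """Stage 2 (toy): interpret each empty slot as a vowel.

    Assumption: the closest phonetic match is always `default_vowel`.
    """
    return "".join(default_vowel if s == EMPTY else s for s in segments)


def perceive(word):
    """Compose the two stages for a single word."""
    return phonological_decode(phonetic_decode(word))


if __name__ == "__main__":
    for stimulus in ("ebzo", "ebuzo"):
        print(stimulus, "->", perceive(stimulus))
```

Under these assumptions, a stimulus with a cluster and its counterpart with a full vowel receive the same output, which is the sense in which the model predicts that the two are not distinguished by the listener.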

In this chapter we investigate whether perceptual epenthesis can be overcome by Japanese listeners, adopting Davidson et al.’s (2007) methodology. We also replicate Dupoux et al.’s (1999) ABX task, which allows for a more reliable measure of improvement while at the same time allowing a direct comparison to Dupoux et al.’s study. We demonstrate that Japanese speakers are able to learn the ebzo-ebuzo contrast (and more generally can access phonetic details necessary to learn a non-native contrast in spite of their native phonology), and provide an explanation of these results that does not depend on the Phonetic Decoder model.

