Infant Sound Environment Project (ISEP)

The Infant Sound Environment Project (ISEP) is a longitudinal study of the sound inputs to infants and the relationship of these inputs to the sound production of these children as they emerge from infancy. Follow-on research will address aspects of perceptual equivalence, to better understand this relationship. While previous studies have addressed the acquisition of words and grammar (how meaning and form emerge in the human mind), the present study will address a different aspect of this experience, namely the melodic and rhythmic elements of human vocal sounds, which play a major part in the expression and comprehension of emotions and attitudes. [1] These aspects of social and communicative behaviours, in language and music, carry a fundamental layer of meaning that has heretofore gone largely unexplored. Foregrounding them in this study will permit us to explore patterning, imitation, and creativity without unduly prejudicing our assumptions regarding the nature of these vocal sounds.

It is often posited that language is unique to our species, and that what it contributes to our being is nothing less than defining of our nature. [2] The intent of this study is not to challenge this notion directly, but rather to question what fundamentally characterizes language in this regard. If language is the defining element of humanity, what is language? The answer to this question underlies the present research programme. It is quite possible that the “what” that will emerge is not exclusive to the domain of language, but rather more generally applicable to human social and communicative interactions, in the human capacity for pattern recognition within our natural environment. While there are clearly aspects of language that lie outside the domain of sound production and perception (visual cues and gesture, as well as sign languages, which are entirely exclusive of sound), it is my contention that the systems of pattern recognition and imitation that will be in evidence through this study are likely generalizable and comparable to other behaviours, rather than of a different nature. [3]

ISEP will study the origins of sound comprehension and sound production in the individual, with particular attention paid to the rhythmic and melodic features involved, retaining an interest in what this ontogenesis may have to say about the phylogenesis of these behaviours and capacities in the human species. Existing theories on the origins of human language, and the uniqueness of the human animal with regard to language, have mostly focused on the higher-level domain of words and grammar. [4] Note, for instance, that even these aspects of cognitive development are underserved by existing research. [5] By focusing, in melody and rhythm, on the lower-level perceptual elements of sound rather than on higher-order conceptual elements, we can better define the stage at which comprehension and production occur in infancy and point toward a more nuanced view of our species’ uniqueness. [6] Later studies will involve comparison with the sound behaviours of other species, in the hope of identifying distinctions from, or similarities to, the production and discrimination behaviours of, for instance, chimpanzees and bonobos. [7]

The present effort begins with extensive data gathering from the native sound environment to which human infants are exposed. This is a painstaking effort to gather raw data that can be analyzed and dissected to yield fruitful information. From the sound files produced, perceptual experiments will be devised seeking to establish and quantify rules for perceptual equivalence. One means of identifying the most salient features of sound, for later manipulation and testing in laboratory experiments, is to observe the relationship of early and continuing sound production by infants to similar patterns of sound that they have heard. If children’s behaviour reflects an internalisation of the patterns in their world, it is necessary to establish what they are hearing, and how this is reflected in their production. Imitation is known to play a major role in the acquisition of language; however, the study of prosodic imitation is still in its infancy. [8]

Prosodic manipulations are part and parcel of linguistic corrections, for example, with special prominence given to the word or morpheme that is the intended target of such correction. “I WENT to the store, honey, not I GOED.” Surely the child is to gather from this prominence the meaning of the caregiver’s intention. Yet how does the child come to acquire the meaning of a speech prominence itself? At what stage in the child’s cognitive development does such comprehension become apparent? How do children gain an understanding of intentionality as reflected in speech prosody? To what extent, if at all, is such comprehension innate, rather than developmentally obtained?

Many studies, beginning with Charles Darwin’s, have argued that facial expressions are well-nigh universal. One might suppose the same is true of human vocal sounds. Yet aspects of the organisation of timing and pitch in both language and music vary widely across cultures. It is the contention of this author that most or all of these aspects are acquired through experience, and that the native mechanisms of the infant’s brain may predispose certain features (such as the timbre of the human voice) to warrant special attention. However, the specific means by which sound is manipulated, how prominences for instance are achieved, are culture-specific, and thus incapable of being universally innate. [9] This is a testable hypothesis, one that the present study will provide evidence to support or contradict.

While studies have been done both of the prenatal environment and of the capacities for sound perception of neonates and young infants, no longitudinal study has yet been completed to correlate the sound production of preschoolers with the sounds to which they were exposed earlier in life. [10] Previous studies have focused on the vocal output of children, or have considered the linguistic utterances of parents and caregivers, but little work has been done to compare these facets with each other; what has been done often presupposes an ill-defined distinction between linguistic and nonlinguistic behaviours. As noted by Sandra Trehub and colleagues, distinguishing language from other sounds is a non-trivial task for the infant. [11]

ISEP will collect data directly from the child’s natural environment, with the intention of gathering data across a swath of cultural and linguistic diversity. Daylong digital audio recordings will be made of children within their home, play, and school contexts. The length of the recordings will ensure a sufficient quantity of usable data, and will help mitigate the self-consciousness of caregivers and others. Recordings will be captured at intervals of about once per month for the first two to three years of life. Similar long-recording techniques have been used with success by corpus linguists, and promise to be fruitful in adaptation by psychologists. Through in-depth analysis of the sound recordings, we will seek to identify the sound patterns to which the child is exposed, and to clarify which of those patterns are perceptually most salient, as demonstrated in the child’s own imitative behaviour.

The intention is then to return to the laboratory, using stimuli gleaned from these initial efforts for studies regarding perceptual equivalence. Scalar judgments of similarity between stimuli will be measured in order to obtain readings on the most salient aspects of variation in the sound. Such questions have been studied in terms of categorical perception for phonemes or pitches in isolation. Some work has been done regarding the perception of musical and prosodic contours, and brief melodic and rhythmic patterns. However, little work has attempted to establish what sorts of variance in sound production and perception are ordinarily forgiven, or considered equivalent, in day-to-day interactions. The data gathered in this programme will provide a wealth of natural stimuli of both speech prosody and spontaneous song, both child-directed and child-produced, through various stages of development, to address these questions.

NOTES

[1] There is good evidence to sustain the uniqueness in perception of the human voice as a carrier of salient information. Limiting the present study to human vocal sounds is therefore not a matter of arbitrary choice, but rather reflects human cognitive predispositions. Cf. Belin, P., et al. (2000), “Voice-selective areas in human auditory cortex,” Nature 403: 309-312.

[2] Jackendoff, R. (1994), Patterns in the Mind, New York: Basic Books; Bickerton, D. (1995), Language and Human Behavior, Seattle: University of Washington Press; Lieberman, P. (1991), Uniquely Human, Cambridge, MA: Harvard University Press.

[3] Sherman Wilcox, for example, has focused a great deal of research over the past several years on the issue of prosody in sign languages. He has argued that sign languages, like speech, produce an affective layer separate from the lexical and syntactic foundation, which may well correspond to melodic, rhythmic and timbral aspects of spoken language.

[4] Aitchison, J. (1996), The Seeds of Speech, Cambridge, UK: Cambridge University Press; Armstrong, D., et al. (1995), Gesture and the Nature of Language, Cambridge, UK: Cambridge University Press; Hurford, J., et al., Eds. (1998), Approaches to the Evolution of Language, Cambridge, UK: Cambridge University Press. Cf. comparisons with theories on the evolutionary origin of music capacities in Wallin, N., et al., Eds. (2000), The Origins of Music, Cambridge, MA: MIT Press.

[5] Laurel Trainor (personal communication) has noted the utter lack of studies or literature addressing even so simple and pervasive a supposition as the efficacy of child-directed speech for aiding the acquisition of words and syntax.

[6] Don Hodges has presented a video of a captive and socialized chimpanzee purportedly engaged in musical improvisation at a keyboard. Is it possible that lower-level manipulations of melody and rhythm are shared behaviours with our primate cousins?

[7] De Waal, F. (2001), The Ape and the Sushi Master, New York: Basic Books; Hauser, M. (1996), The Evolution of Communication, Cambridge, MA: MIT Press; Anderson, S. (2004), Doctor Dolittle’s Delusion, New Haven, CT: Yale University Press; Williams, L. (1967), The Dancing Chimpanzee, New York: W. W. Norton & Co.

[8] Clark, E. (2002), First Language Acquisition, Cambridge, UK: Cambridge University Press; ’t Hart, J., et al. (1990), A Perceptual Study of Intonation, Cambridge, UK: Cambridge University Press; Chun, D. (2002), Discourse Intonation in L2, Amsterdam: John Benjamins Publishing Company; Ladd, D. (1996), Intonational Phonology, Cambridge, UK: Cambridge University Press; Bolinger, D., Ed. (1972), Intonation, Baltimore, MD: Penguin Books; Lehiste, I. (1970), Suprasegmentals, Cambridge, MA: MIT Press; Cruttenden, A. (1997), Intonation, Cambridge, UK: Cambridge University Press.

[9] Elman, J., et al. (1996), Rethinking Innateness, Cambridge, MA: MIT Press.

[10] Boysson-Bardies, B. (1999), How Language Comes to Children, Cambridge, MA: MIT Press; Deliège, I. & J. Sloboda, Eds. (1996), Musical Beginnings, Oxford: Oxford University Press; Jusczyk, P. (1997), The Discovery of Spoken Language, Cambridge, MA: MIT Press; Mehler, J. & E. Dupoux (1993), What Infants Know, Cambridge, MA: Blackwell.

[11] Trehub, S., et al. (1993), “Music and speech processing in the first year of life,” Advances in Child Development and Behavior 24.
