Archive for Analysis

Speech Prosody & Music (UCSB Linguistics Colloquium, May 2006)

Speech Prosody & Music:
Transcription, Perception, and Meaning

Jonathan G. Secora Pearl
Linguistics Colloquium, May 18, 2006
University of California—Santa Barbara

Abstract
Music and language are twin aspects of civilization, found in all known human cultures, across time and place, embracing us from our earliest days until the ends of our lives. Speaking and singing are found everywhere and everywhen. Wherein lies the distinction?

The greatest difficulty in answering this foundational question is that we are often deceived by written forms of music and language into believing our object dwells within them, rather than in the sounds that inspire them. On the page, these materials appear far more distinct than they do in sound. Text without context is a world without air; yet context alone remains the unanalyzable chaos of everyday experience. The trick is to find the balance between too much detail, and too little.

“Varieties of Czech Prosody, a century ago and today,” DGfS 2007

To be presented at the Annual Meeting of the German Society of Linguistics (DGfS)
(28 Feb 2007 – 2 Mar 2007)

Jonathan G. Secora Pearl
Department of Linguistics
University of California, Santa Barbara
jpearl@linguistics.ucsb.edu

Abstract: “Varieties of Czech Prosody, a century ago and today”.

In 1897, the Czech composer and pedagogue Leoš Janáček (1854-1928) began a journey into the field of speech prosody, eventually spanning the final three decades of his life (Pearl 2006). His procedure involved detailed observation of natural phenomena, and a good deal of confidence in his own intuitions. For more than 30 years he eavesdropped, often clandestinely, on the conversations of those around him, recording their utterances in the system most familiar to him, musical notation. These efforts are remarkable for a variety of reasons: because they captured a great many nuances of natural speech, some of which are lost by contemporary transcription techniques; because they have been all but ignored over the ensuing century; and because they represent some of the earliest and most extensive attempts to describe concurrently the melody and rhythm of speech prosody, making it possible to explore matters of diachronic change.

Janáček worked mostly with the Czech language, but also Russian, Slovak, Croatian, German, English, Italian, and others. His transcriptions provide us a glimpse into the everyday speech of a century ago, which otherwise might be lost, due to a dearth of audio recordings, and the poor quality of what does exist. I will begin my talk with a presentation of some of Janáček’s transcriptions of Czech, including discussion of the benefits and pitfalls of a music-based transcriptional system, then follow with presentation of modern-day recordings of spontaneous spoken Czech, along with the author’s prosodic transcriptions of these selections, which build on Janáček’s procedure, yet exploit today’s technologies and software, to describe melodic and rhythmic features of contemporary spoken Czech.

Pearl, J. (2006), “Eavesdropping with a Master: Leoš Janáček and the Music of Speech,” Empirical Musicology Review Vol. 1, No. 3.

Infant Sound Environment Project (ISEP)

The Infant Sound Environment Project (ISEP) is a longitudinal study of the sound inputs to infants and the relationship of these inputs to the sound production of these children as they emerge from infancy. Follow-on research will address aspects of perceptual equivalence, to better understand this relationship. While previous studies have addressed the acquisition of words and grammar—how meaning and form emerge in the human mind—the present study will address a different aspect of this experience, namely the melodic and rhythmic elements of human vocal sounds, which play a major part in the expression and comprehension of emotions and attitudes. [1] These aspects of social and communicative behaviours, in language and music, carry a fundamental layer of meaning that has heretofore gone largely unexplored. Their foregrounding in this study will permit us to explore patterning, imitation, and creativity, without unduly prejudicing our assumptions regarding the nature of these vocal sounds.

It is often posited that language is unique to our species, and that what it contributes to our being is nothing less than defining of our nature. [2] The intent of this study is not directly to challenge this notion, but rather to put question to what fundamentally characterizes language in this regard. If language is the defining element of humanity, what is language? The answer to this question underlies the present research programme. It is quite possible that the “what” that will emerge is not exclusive to the domain of language, but rather more generally applicable to human social and communicative interactions, in the human capacity for pattern recognition within our natural environment. While clearly there are aspects of language that are outside the domain of sound production and perception (visual cues and gesture, as well as sign languages which are entirely exclusive of sound), it is my contention that the systems of pattern recognition and imitation that will be in evidence through this study are likely generalizable and comparable to other behaviours, rather than of a different nature. [3]

The Competition Model and its Relevance for Speech/Song Research

Jonathan G. Secora Pearl
Department of Linguistics
University of California, Santa Barbara

Corresponding address:

Jonathan Pearl
Music & Language Studies
7220 N. Rosemead Blvd., Suite 202-10
San Gabriel, CA 91775

email: jonathan@musiclanguage.net

ABSTRACT

The emerging field of music and language studies draws on the traditions and techniques of linguistics and musicology, with an empirical and cognitive bent. The present paper examines the relevance of the Competition Model from psycholinguistics to research that straddles the territories of speech prosody and music, in particular addressing the production and perception of the musical aspects (pitch, timing, amplitude, and timbre) of human vocal sounds.

INTRODUCTION

The Competition Model is an emergentist model for human language. It assumes that human brains develop according to a genetically-specified though plastic plan, which includes certain preferences in computing style arising in particular regions or pathways of the brain, as a result of native architectural and timing mechanisms. This is in contrast with nativist theories that implicitly presume innate representations, of grammar for instance, at the cortical level. According to proponents of the Competition Model, evidence for domain-specific language modules is grossly exaggerated, and most localization of language processing that does exist is domain-general in nature and likely emerges as a result of the interaction between the sensory environment and the brain’s uneven computational playing field, rather than being specified in the genes.

It is argued that although grammar is not given in the world, neither is it provided for in the human genome. This approach in particular explains why brain damage in infants and children does not result in the long-term deficits that follow analogous damage to adult brains. Adults have a life-long history of experience neurologically calcified, as a result of Hebbian learning. Children on the other hand have less experience from which to have solidified brain connectivity through stimulus/response-styled strengthening and weakening; in addition, continuing neurogenesis and synaptogenesis permit greater flexibility in attending to novel experiences, even if the resultant pathways may be computationally less efficient than in normals. For these reasons, maturation and learning are considered two aspects of the same events.

The Competition Model presumes that languages differ in the means by which linguistic information is encoded, and further that such differences are as likely quantitative as qualitative. Not only do they differ in their use of specific linguistic features (e.g., lexical tone, morphological inflections) but also in the degree to which various items bear relevant information for listeners. This is shown in cross-linguistic differences in relevance weightings and costs to processing for particular features in conflict with one another (for example: word order, animacy, subject-verb agreement, and gender and number markings used in decisions regarding transitivity). In support of the theory, it appears that the most cost-efficient of these features—which can differ significantly from language to language—in terms of processing load and relevance (dubbed cue costs and cue validity), are the least susceptible to disturbance under brain damage, meaning they are most likely to be encoded redundantly in the brain. Aphasic syndromes differ cross-linguistically in the specific deficits they engender, and these differences reflect the inherent qualitative and quantitative variety among languages; this is taken as evidence that grammar is not innately and universally encoded, but rather based in the brain’s experience of the world.
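To make the notion of cue validity concrete, here is a minimal sketch. It assumes the formulation common in the Competition Model literature, in which validity is the product of a cue's availability (how often it is present) and its reliability (how often, when present, it points to the correct interpretation). The class name, function name, and all counts below are hypothetical and purely illustrative; they are not data from any study cited here.

```python
from dataclasses import dataclass


@dataclass
class CueCounts:
    """Hypothetical tallies for one cue over a sample of sentences."""
    present: int  # sentences in which the cue is available at all
    correct: int  # of those, sentences where the cue indicates the right reading
    total: int    # all sentences in the sample


def cue_validity(c: CueCounts) -> float:
    """Validity = availability * reliability (assumed formulation)."""
    if c.total == 0 or c.present == 0:
        return 0.0
    availability = c.present / c.total
    reliability = c.correct / c.present
    return availability * reliability


# Invented counts: a frequent but imperfect cue versus a rarer, more reliable one.
word_order = CueCounts(present=90, correct=81, total=100)
agreement = CueCounts(present=40, correct=38, total=100)

for name, counts in [("word order", word_order), ("agreement", agreement)]:
    print(f"{name}: validity = {cue_validity(counts):.2f}")
```

Under this formulation a highly reliable cue can still have low overall validity if it is rarely available, which is one way the relative weight of cues could differ from language to language.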

RELEVANCE TO SPEECH/SONG COMPARISONS
It appears that much of the research involving aphasias has been grossly flawed by preconceived notions regarding the nature of these deficits, as well as over-reliance on generative theories of language. In the literature on prosodic and musical deficits, strikingly these studies are largely based on presumptions of evidence from the more abundant literature on aphasias. If those are flawed then a great deal of the latticework upon which studies regarding neurologically-based deficits in linguistic prosody and the various amusias may collapse.

From the stance that any questions regarding the nature of language and music must be empirically tested, how would research regarding speech prosody and song fit into the scheme of the Competition Model? The literature is littered with hasty conclusions and crass simplifications of the nature of music. Music however, no less than language, appears to be a uniquely human attribute. It is ubiquitous across cultures, and throughout known history, and perhaps more primitive phylogenetically. [1] Just as no chimpanzee has spontaneously begun a dialogue on the nature of altruism, no bonobo has ever played so much as a hollow log or a blade of grass. Fruitless analogies between human song and whale or bird song aside, any continuity between human music and the behaviors of other animals is likely to be found in those aspects of human behavior that are common to both music and language. In particular, I would argue that it is in finding the commonalities between speaking and singing that we are likely to find a large part of the gulf that divides humanity from the rest of nature. And in those features, we will understand the cognitive roots that evolutionarily gave rise to both language and culture.

If adaptations that are claimed for language are not domain-specific, we are likely to find further evidence for this in attempting to define the difference between speech and song. Both are human vocal behaviors. Both leave an acoustic signature, and provide imperfect data to the perceptual apparatus of listeners. In each case, the behavior is most often directed towards or for the benefit of other humans, with an intent to express or communicate ideas or emotions. Further, there are cultural differences regarding which cues carry the most relevant information (e.g., rhythm, melody, divisions of the octave, timbre) that can be analyzed and reliably perceived (though in different ways cross-culturally). Each has aspects of grammar and syntax that are more or less clearly definable. Just as the local choice of phoneme sets varies in arbitrary ways, so too aspects of musical vocabulary vary according to seemingly arbitrary choices. Which features of the acoustic signal mark categorical boundaries varies as much for music as it does for language.

However, there are distinct contrasts between these two domains of human behavior. For instance, language contains a lexicon of semantically-grounded words, whereas music can be, and often is, entirely devoid of propositional meaning. The music in song is apart from the meaning of the words, sometimes independent, at times reinforcing, often contradicting. The musical contribution to song serves in a way to replace the natural prosody of speech. But prosodic aspects of speech contain and convey a great deal of information that is outside the grammar and lexicon of language.

In addition, there is some evidence in the literature for a dissociation between spoken prosody (both lexical and affective) and singing. These studies have used a variety of methodologies (experimental and clinical), and have implicated a multitude of brain regions, from left frontal lobe for lexical prosody (Monrad-Krohn 1947; and Buchanan et al 2000), to right temporoparietal regions (Ross & Mesulam 1979; Ross 1981) for affective prosody, to cerebellum and bilateral motor cortex/posterior inferior frontal gyri for dissociations between speaking and nonverbal singing of melody and rhythm (Riecker et al 2000). Clearly a great deal of study remains to be done.

POINTS FOR FUTURE RESEARCH
How is meaning altered when speech is sung? How do the musical aspects of song figure into the calculations of a listener? Can cue validity and cue cost be separately defined in musical terms? Might this provide further evidence for the case that language processing is in large part domain-general? Why is it that some aphasics, unable to utter a word of speech, can sing? Is it merely a matter of defining in finer detail the subtle aspects of these deficits? Is there any evidence to sustain dissociations between speaking and singing in comprehension? If there is, I have not yet found any in the literature. If not, it would be rather strange that the production of song, but not its reception, would dissociate from speech.

Likely the anecdotal evidence is skewed by flawed assumptions. Primarily, the issue is confounded by the fact that no one has sufficiently defined the subject matter under investigation. What does it mean to speak, that is different from what it means to sing? If anecdotal evidence supports the claim that brain damaged individuals are able to engage in one but not another of two similar activities, both including the expression of words by the voice, encoded by means of manipulating pitch, duration, amplitude, and timbre, then we need to understand better how these two behaviors differ. Are they two ends of a continuum, or is there a disjunction that divides up the otherwise shared behavior space? How can these matters be tested empirically?

Difficulty arises even in the simplest stages of such research. For instance, there is the nativist argument that brain structures have evolved solely for speech. However, nowhere in the literature is there a clear definition of speech as a solitary act. In fact, speech, like many human behaviors, is a complex of many parts. Without better definitions of the matter under investigation, claims one way or the other are unfalsifiable. Although the necessary distinction between production and perception is normally stipulated, even accounting for this distinction, the remaining behaviors are not simple acts. The perception of speech for instance involves acoustic input to the ears, sent to the primary auditory cortex. A great deal of calculating must go on, however, before the brain will recognize the auditory input as a meaningful signal. Interestingly, there is evidence that the brain early on recognizes human vocal sounds as special (Belin et al 2000), yet this only serves further to link speaking and singing in their uniqueness as stimuli, rather than to distinguish them from each other.

Here is a hypothetical, if entirely speculative, sequence of events: First there is the segmentation of the signal by sources (the “cocktail party effect”). The signal may likely include not only other voices, but environmental sounds as well, which must be filtered out as irrelevant. Next, the signal is parsed into phonemic units, which are further recalibrated based on context (e.g., coarticulation effects, nasalization). Allowances must be made for dialectal and idiolectal variation, for proper categorization of these sounds. In parallel, there will be processing of pitch, intensity and timing. Calculations will go on to determine which aspects of the pitch are local, some relevant for phonemic categorization and others for lexical prominence, and which are more global, and therefore relevant for affective determinations of attitude or judgments on the encoded meanings. Some allowances must be made for individual differences of voice quality, perhaps based on style of speaking or physiological issues such as hoarseness, or lack of muscular control (dysarthria) due to aging or disease. It becomes quickly clear that to speak of a speech act is a polite fiction, if the implication is that such an utterance can be easily qualified and quantified.

For this reason, many of the deficits that appear to affect specific grammatical or lexical processing may in fact be the result of problems higher along one or another of the secondary processing pathways. As Bates et al (1998) note: “If we experience two stimuli in exactly the same way, then (by definition) we do not know that they are different.” (p. 599) It follows then that what can be distinguished in normals, or dissociated in pathologies, is somehow different in terms of brain processing. Surely, there are many distinctions that the brain is incapable of noticing (or disinclined to notice). For instance, sharp boundaries do exist in perception for graded acoustic events, such as the categorical boundary for the phonemes /b/ and /p/; and as noted in Bates (in press, p. 8), this appears not to be a species-specific phenomenon. The same is likely true for categorical perception of colors.

The point is: graded phenomena in the world can be perceived as disjunct by living brains. Where brains fail to make a distinction, the phenomena are for our purposes categorically the same. It is by identifying and quantifying the features used by brains that we will come to understand how seemingly equivalent behaviors do in fact differ, likewise how apparently different behaviors may utilize shared processes in the brain. Therefore the task of specifying dissociations is largely a matter of determining the level of processing at which each dissociation occurs. If these levels are consistent across subjects, they can be viewed as universal brain mechanisms (without regard at this point for whether they are innate or emergent). Where they differ, it is likely the result of individual differences (perhaps based in experience or native abilities) or failure to specify the stimuli with sufficient detail. In many cases, the technology for such fine-grained distinctions may not yet exist.

FOOTNOTES

[1] This is a contentious point. Some have argued that music is not universally understood and appreciated by individuals across cultures. Others have noted that not all cultures have a native music. For example Southern Popaluca has been cited in this regard. Southern Popalucan music is all borrowed from Spanish and popular Mexican traditions. On the one hand, such cases may be the exceptions that prove the rule. More deeply indicative, however, is the question of what features distinguish music from language. Inherent in all spoken languages are manipulations of timing, intonation, and timbre, which are features shared in common between musical and linguistic phenomena. Arguably, even signed languages, while lacking sound, contain similar and analogous features, as has been argued by Sherman Wilcox among others.

REFERENCES

BATES, E. (in press). “On the nature and nurture of language.” In R. Levi-Montalcini, D. Baltimore, R. Dulbecco, & F. Jacob (Series Eds.) & E. Bizzi, P. Calissano, & V. Volterra (Vol. Eds.), Frontiere della biologia [Frontiers of biology]. The brain of homo sapiens. Rome: Giovanni Treccani. [Prepublication version].

BATES, E., DEVESCOVI, A., & WULFECK, B. (2001). Psycholinguistics: a cross-language perspective. Annual Review of Psychology. Chippewa Falls, WI: Annual Reviews.

BATES, E., et al (1998). “Innateness and emergentism.” In W. Bechtel & G. Graham (Eds.), A Companion to Cognitive Science (pp. 590-601). Malden, MA and Oxford: Blackwell Publishers.

BELIN, P., et al. (2000). “Voice-selective areas in human auditory cortex.” Nature 403 (20 January 2000), 309-312.

BUCHANAN, T. W., et al. (2000). “Recognition of emotional prosody and verbal components of spoken language: an fMRI study.” Cognitive Brain Research 9, 227-238.

MONRAD-KROHN, G. H. (1947). “Dysprosody or altered ‘melody of language’.” Brain 70, 405-415.

RIECKER, A., et al. (2000). “Opposite hemispheric lateralization effects during speaking and singing at motor cortex, insula and cerebellum.” NeuroReport 11 (9), 1997-2000.

ROSS, E. D. (1981, Sep). “The aprosodias: Functional-anatomic organization of the affective components of language in the right hemisphere.” Archives of Neurology 38, 561-569.

ROSS, E. D. & MESULAM, M.-M. (1979). “Dominant language functions of the right hemisphere? Prosody and emotional gesturing.” Archives of Neurology 36, 144-148.

Denoting the Voice: Text and Context in Music and Language

Jonathan G. Secora Pearl
Fellowship proposal, submitted to the NEH

The Problem

Charles Darwin was wrong, at least about music. In “The Descent of Man,” he wrote: “As neither the enjoyment nor the capacity of producing musical notes are faculties of the least use to man in reference to his daily habits of life, they must be ranked amongst the most mysterious with which he is endowed.” (Darwin, C. 2004 [1879]: 636) One might have expected more, knowing his wife Emma was a fine pianist, who in her youth had studied in Paris with Frédéric Chopin. Generations of scholars, from outside the field of music, have compared it to other human behaviors, and found it lacking, a mere artifice, insubstantial, ornamental, irrelevant. Some have dismissed it as a byproduct of something ostensibly more useful to the species, like language. (Pinker, S. 1997: 528) To hold that music is useless, but that language is not, one must understand how they differ. It is a simple thing to claim they are not alike, but far harder in practice to define the ways. Music and language remain twin aspects of civilization, found in all known human cultures, across time and place, embracing us from our earliest days until the ends of our lives. Speaking and singing are found everywhere and everywhen. Wherein lies the distinction?

The greatest difficulty in answering this foundational question is that we are often deceived by written forms of music and language into believing our object dwells within them, rather than in the sounds that inspire them. On the page, they appear far more distinct than they do in sound. Text without context is a world without air; yet context alone remains the unanalyzable chaos of everyday experience. The trick is to find the balance between too much detail, and too little. Most important is a self-reflective understanding of the specifics regarding what each system captures and what it leaves out. Standard Western music notation gives preference to pitch classes and length, dealing more with intention than with execution. Written language may highlight phonetic details and word order at the expense of intonation and timing. Comparing music and language in these forms is speaking at cross-purposes.

Software page updated

Links have been added to the Software page, including links to the tools EXMARaLDA and PRAAT.

Not awaiting Dolly (abstract for PCS-AMS, Berkeley, CA, May 6-7, 2006)

Not Awaiting Dolly: Louis Armstrong’s speaking style
Louis Armstrong’s various renditions of the song “Hello Dolly” present a concrete basis for examining the issue of swing as reflected in vocal entrances. This paper will present and discuss the means and extent to which he uses salient entrances to push the beat ahead, and the ends of phrases or less salient entrances to return to a baseline tempo. I will briefly show how his particular choices in this regard differ from those of other performances of this same song.

These aspects of performance practice however are not limited to music alone. Rather they can be observed in spontaneous conversation as well. Linguists discuss features of speaking styles, including the anticipation or delay of beats in turn taking. The focus of this paper is to compare and contrast how the same effects can be observed in music and language, and what that reveals about commonalities and differences between them.

The methodology used here is one of acoustic analysis from digitized recordings of performances of the song “Hello Dolly”, as well as digitized recordings of spontaneous and naturally spoken discourse (a term used in Linguistics to describe language interaction in its broadest sense). Rather than examining scores of music or textual transcripts of language, the present effort seeks to highlight performed durations, and to a lesser extent pitches, through examination in detail of the sounds themselves. Transcriptions and calculations are rendered from the performances. This technique brings to the fore issues of perceptual salience, how for instance we perceive widely divergent durations as nonetheless bearing the same note value in context. Why and how these issues arise will be addressed.
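As an illustration of the kind of calculation such an analysis might involve, the following sketch measures how far each performed onset falls ahead of or behind a steady beat. It is not the procedure used in the paper, which is not spelled out in this abstract. The function name, the assumption of a fixed tempo, and the onset times are all hypothetical.

```python
def beat_deviations(onsets, tempo_bpm, first_beat=0.0):
    """Compare onset times (in seconds) with a fixed-tempo beat grid.

    Returns a list of (beat_index, deviation_seconds) pairs. A negative
    deviation means the onset arrives early (pushing the beat ahead);
    a positive one means it arrives late.
    """
    period = 60.0 / tempo_bpm  # seconds per beat
    results = []
    for t in onsets:
        k = round((t - first_beat) / period)  # index of the nearest beat
        results.append((k, t - (first_beat + k * period)))
    return results


# Invented onset times for four vocal entrances at a nominal 120 beats per minute.
onsets = [0.00, 0.45, 1.02, 1.50]
for beat, dev in beat_deviations(onsets, tempo_bpm=120):
    label = "early" if dev < 0 else "late" if dev > 0 else "on the beat"
    print(f"beat {beat}: {dev * 1000:+.0f} ms ({label})")
```

A fixed grid is a simplification: in a real performance the baseline tempo drifts, so the reference beat would itself have to be estimated from the recording before deviations could be interpreted.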

Finally, in addition to comments regarding the interactions between musical and linguistic materials, and our perceptions of them, I will be presenting issues of transcription and notation, that reveal shortcuts and pitfalls to writing systems for both music and language, and point toward a greater awareness of the similarities between research in these two fields.

Infant Sound Environment Project (ISEP)

Draft

The Infant Sound Environment Project (ISEP) is a longitudinal study of the sound inputs to infants and the relationship of these inputs to the sound production of these same children as they emerge from infancy. While previous studies have often addressed the acquisition of words and grammar—how meaning and form emerge in the human mind—the present study will address a different aspect of this experience. The focus will be on the melodic and rhythmic elements of sound—both those encountered by the child, and those which the child eventually produces. These aspects in communicative behaviors such as language and music carry a fundamental level of emotion that has heretofore gone largely unexplored. Further their foregrounding in this study will permit us to explore patterning, imitation, and creativity, without prejudicing the linguistic context.

It is often posited that language is unique to our species, and that what it contributes to our being is nothing less than defining of our nature. The intent of this study is not directly to challenge this notion, but rather to put question to what fundamentally characterizes language in this vein. If language is the defining element of humanity, what is language? The answer to this question underlies all of this research. It is quite possible that the “what” that will emerge is not exclusive to the domain of language.

We will study the origins of sound comprehension and sound production in the individual, with particular attention paid to the rhythmic and melodic features of this sound. We will retain an interest in what this ontogenesis may have to say about the phylogenesis of these same qualities in the human species. Existing theories of the origins of human language, and of the uniqueness of the human animal with regard to language, have mostly focused on the higher-level domain of words and grammar. By focusing on these lower-level elements of sound, we can begin to test the point at which our species’ uniqueness arises [1].

The present attempt will begin with the data, rather than any presupposed theoretical framework, with a broad interest in sounds of all variety, without prejudicing language or any other behavior a priori. One key feature of the current project is its methodology. This is a painstaking effort to gather raw data that can be analyzed and dissected to yield fruitful information. No hypothesis is being tested. Rather the evidence is being gathered first, with an eye toward constructing a theory from this evidence. The basic research in this domain is yet to be done. This effort is an attempt to redress this failing.

While studies have been done both of the prenatal environment, and of the capacities for sound perception of neonates and young infants, no longitudinal study has yet been done to correlate the sound production of preschoolers with the sounds they have been exposed to earlier in life. It seems a reasonable assumption for example that young children will model the prosodic patterns that they have encountered. Yet so far this remains mostly an assumption. What can be empirically established regarding the relationship between the sound environment of infants and their own later sound productions? This study is intended to establish these relationships.

It is clear that infants (as the meaning of the word connotes) are born without speech. Their earliest indulgences with sound are described first as cries, then also as babbling. At what point does this babbling become intentional experimentation? What is the nature of this experimentation? If imitative, what is being imitated? These are some of the questions that we wish to elucidate. Previous studies have focused exclusively on the vocal output of the infants and children, or have considered the primarily linguistic utterances of parents and caregivers, described nowadays as infant-directed speech, but little work has been done to compare these facets, and what has been done presupposes an often ill-defined distinction between linguistic and nonlinguistic behaviors.

However, as noted by Trehub et al [2], distinguishing language from other sounds is a non-trivial task for the infant. Put another way, until language has been acquired, the infant is tasked with making sense of a mass of meaningless sounds. Only by stipulating a hard-wired language module in the infant brain (for which at the moment there is scant empirical evidence) are we able to view their task as language acquisition. Yet language, to the infant, shares many features in common with nonlinguistic sounds in their environment.

Even to an adult, there remains overlap between musical and linguistic sound production. What are the features that distinguish language from other sound-producing behaviors? How does the infant come to understand these differences? To what extent are these features universal or culture-specific? Rather than beginning with the assumption that an infant’s task is to acquire language, perhaps we should seek to understand the child’s behavior in more general terms. If patterns emerge in the sound environment of infants, how do infants go about recognizing, understanding, and reproducing these patterns? Only a study that seeks to examine sound without presupposing the prominence of language will be capable of answering these questions.

The methodology we will adopt is based in a naturalistic approach. Rather than a laboratory setting that trades off naturalism for control in the environment, ISEP will collect data directly from the natural environment of the child. Daylong digital audio recordings will be made of the children within their natural home, play, and school contexts. The length of recordings will ensure a sufficient quantity of usable data, and will help mitigate the self-consciousness of caregivers and others. These recordings will be captured at intervals of about once a month for the first two to three years of life.

Every effort will be made to maintain the confidentiality and privacy of the participants. However, the data collected will have much broader application than the focus of this study, and thus we hope to make this data available to other researchers involved in complementary work. Similarly, it is likely that data collected for other projects, especially from diverse cultural contexts, will be useful for the present purpose. Thus efforts will be made to find outside collaborators for this project.

The question, which by some is supposed to have been settled in 1959 by the suppositions and arguments of Noam Chomsky regarding the unlearnability of many aspects of the linguistic system3, nonetheless remains: to what extent is language acquired from the environment; and to what extent is it innate? More broadly, what influence on a child’s cognitive development is directly played by the inputs to that child during maturation? Since the focus here is on the rhythmic and melodic aspects of sound, it is now possible to gather data which can be used to answer this question. Until recently, before the advent of long-term recordings of sound, inquiry into this question was constrained by the unreliability of report, or by extrapolations from limited data gathered in laboratories and elsewhere.

It is well known that all normal children acquire the language(s) to which they are exposed, and no other. Little is known, however, about the acquisition of aspects of language and interactive behaviors beyond the domains of words and grammar, such as intonation, timing, and stress. It has generally been supposed that language socialization (both socialization to learn language, and further socialization through language) is the dominant, even all-encompassing task of the prelinguistic infant. Additionally, it has been assumed that the principal or sole function of these behaviors is communicative, while their possible role as part of the child’s mental construction of the world and oneself has mostly remained unconsidered.

The current study will dispense with any such prejudice, and suppose merely that sound exists in the environment in which an infant dwells; that this sound assuredly contains remarkable features in terms of its melodic and rhythmic content, and likely in other aspects such as timbre; that the infant’s brain is in some way predisposed by nature and biology to selectively attend to that sound4; and further that through some process, which is yet to be clearly understood, the child as it emerges from infancy begins to contribute its own sounds to that environment, which likely bear some resemblance to those it has experienced. The process by which the child’s own sound manipulations reflect natural dispositions or exhibit cultural or idiosyncratic features is little understood. The intent of the present study is to provide a means for measuring the impact of the sound environment of early life on the child’s own later productions of sound.

Surely, the patterning through time that rhythm presents can be found in a soundless environment as well. Other features, such as melody and timbre, may also be found to have silent correlates5. Later research should allow us to compare the normal development of hearing children to the cognitive, linguistic, and social development of the congenitally deaf in ways that have so far eluded comparison. Through the establishment of a baseline for development at the lower levels that will be exposed through this study, we will be able to identify more readily, and at an earlier stage, certain deficits which may affect a child’s cognitive development and language acquisition.

Further, this study will permit us to uncover other elements in the maturation and socialization of the child through sound, which are extralinguistic, such as music acquisition. More broadly, by focusing not on the lexicon and syntax, but rather on melody, rhythm, and timbre, we will be able to discuss in finer-grained detail the child’s sound perception and production than has previously been the norm. In this way we will be able to see what features are shared among a variety of behaviors, without presuming their function and significance as linguistic or otherwise.

1 Don Hodges has presented a video of a captive and socialized chimpanzee purportedly engaged in musical improvisation at a keyboard. Is it possible that lower-level manipulations of melody and rhythm are shared behaviors with our primate cousins?

2 Trehub, et al. “Music and speech processing in the first year of life,” Advances in Child Development and Behavior 24 (1993).

3 For example, Boysson-Bardies (1999) writes: “By the middle of the twentieth century, after a brief interval in which Anglo-American psychologists held the acquisition of language to be the exclusive result of learning and imitation, it came to be recognized that language development could not be reduced to a mechanism of elementary linkages between images or sensations and sounds. In 1959, Noam Chomsky demonstrated the impossibility of acquiring language with approaches of this type. … Only a powerful innate system could allow the child to extract a model of language from adult speech … This endowment consists of a universal system that belongs to the human brain, which he called a universal grammar. This grammar is the basic schema that grounds the grammars of all human languages. A mental circuitry, inscribed in the biological constraints governing the development of the brain, underlies this schema and permits it to select the sounds, signs, and sign combinations of the mature language.”

4 Belin, et al. (2000) have identified what they term “voice selective areas” of the brain, neurons that respond specifically to the vocal sounds of conspecifics.

5 Sherman Wilcox has argued that sign languages exhibit features which should properly be interpreted as prosodic, an affective layering on top of the lexical and syntactic foundation, which might likely correspond to melodic and timbral aspects of spoken language.

Comments

In progress pages added

Added the pages Melodic ambiguity in song and speech and Anticipating Dolly, describing two new projects, under the heading In Progress.

Comments

Berkner & Armstrong analyses (tba)

Need to add some analyses of Laurie Berkner (“the only thing…”) from “I really love to dance”, and Louis Armstrong’s rendition of “Hello Dolly” as illustrations of melodic and rhythmic ambiguity in music. Possibly add others to this roster. Use screenshots and sound clips to illustrate these concepts.

For Berkner analysis, also compare to a few speech examples of the word “only”. Review and describe the literature on pitch change as a result of physiology, as in closing to the nasal “n”. Discuss the issues for a transcriber (or perceiver) in determining the relevance and salience of such transient pitch change. Should we notate every detail?

Discuss issues of level of detail, broad versus narrow transcription. Use metaphor of vision: examination with the naked eye, magnifying glass, microscope.

Comments
