Archive for March, 2006

Not awaiting Dolly (abstract for PCS-AMS, Berkeley, CA, May 6-7, 2006)

Not Awaiting Dolly: Louis Armstrong’s speaking style
Louis Armstrong’s various renditions of the song “Hello Dolly” present a concrete basis for examining the issue of swing as reflected in vocal entrances. This paper will present and discuss the means and extent to which he uses salient entrances to push the beat ahead, and the ends of phrases or less salient entrances to return to a baseline tempo. I will briefly show how his particular choices in this regard differ from those of other performances of this same song.

These aspects of performance practice, however, are not limited to music alone. Rather, they can be observed in spontaneous conversation as well. Linguists discuss features of speaking styles, including the anticipation or delay of beats in turn taking. The focus of this paper is to compare and contrast how the same effects can be observed in music and language, and what that reveals about commonalities and differences between them.

The methodology used here is one of acoustic analysis from digitized recordings of performances of the song “Hello Dolly”, as well as digitized recordings of spontaneous and naturally spoken discourse (a term used in Linguistics to describe language interaction in its broadest sense). Rather than examining scores of music or textual transcripts of language, the present effort seeks to highlight performed durations, and to a lesser extent pitches, through examination in detail of the sounds themselves. Transcriptions and calculations are rendered from the performances. This technique brings to the fore issues of perceptual salience, how for instance we perceive widely divergent durations as nonetheless bearing the same note value in context. Why and how these issues arise will be addressed.

Finally, in addition to comments regarding the interactions between musical and linguistic materials, and our perceptions of them, I will present issues of transcription and notation that reveal shortcuts and pitfalls in writing systems for both music and language, and that point toward a greater awareness of the similarities between research in these two fields.


Infant Sound Environment Project (ISEP)

Draft

The Infant Sound Environment Project (ISEP) is a longitudinal study of the sound inputs to infants and the relationship of these inputs to the sound production of these same children as they emerge from infancy. While previous studies have often addressed the acquisition of words and grammar (how meaning and form emerge in the human mind), the present study will address a different aspect of this experience. The focus will be on the melodic and rhythmic elements of sound, both those encountered by the child and those which the child eventually produces. These aspects of communicative behaviors such as language and music carry a fundamental level of emotion that has heretofore gone largely unexplored. Further, foregrounding them in this study will permit us to explore patterning, imitation, and creativity without prejudicing the linguistic context.

It is often posited that language is unique to our species, and that what it contributes to our being is nothing less than defining of our nature. The intent of this study is not directly to challenge this notion, but rather to put question to what fundamentally characterizes language in this vein. If language is the defining element of humanity, what is language? The answer to this question underlies all of this research. It is quite possible that the “what” that will emerge is not exclusive to the domain of language.

We will study the origins of sound comprehension and sound production in the individual, with particular attention paid to the rhythmic and melodic features of this sound. We will retain an interest in what this ontogenesis may have to say about the phylogenesis of these same qualities in the human species. Existing theories of the origins of human language, and of the uniqueness of the human animal with regard to language, have mostly focused on the higher-level domain of words and grammar. By focusing on these lower-level elements of sound, we can begin to test the point at which our species’ uniqueness arises. [1]

The present attempt will begin with the data, rather than any presupposed theoretical framework, with a broad interest in sounds of all variety, without prejudicing language or any other behavior a priori. One key feature of the current project is its methodology. This is a painstaking effort to gather raw data that can be analyzed and dissected to yield fruitful information. No hypothesis is being tested. Rather the evidence is being gathered first, with an eye toward constructing a theory from this evidence. The basic research in this domain is yet to be done. This effort is an attempt to redress this failing.

While studies have been done both of the prenatal environment, and of the capacities for sound perception of neonates and young infants, no longitudinal study has yet been done to correlate the sound production of preschoolers with the sounds they have been exposed to earlier in life. It seems a reasonable assumption for example that young children will model the prosodic patterns that they have encountered. Yet so far this remains mostly an assumption. What can be empirically established regarding the relationship between the sound environment of infants and their own later sound productions? This study is intended to establish these relationships.

It is clear that infants (as the meaning of the word connotes) are born without speech. Their earliest indulgences with sound are described first as cries, then also as babbling. At what point does this babbling become intentional experimentation? What is the nature of this experimentation? If imitative, what is being imitated? These are some of the questions that we wish to elucidate. Previous studies have focused exclusively on the vocal output of the infants and children, or have considered the primarily linguistic utterances of parents and caregivers, described nowadays as infant-directed speech, but little work has been done to compare these facets, and what has been done presupposes an often ill-defined distinction between linguistic and nonlinguistic behaviors.

However, as noted by Trehub et al., [2] distinguishing language from other sounds is a non-trivial task for the infant. Put another way, until language has been acquired, the infant is tasked with making sense of a mass of meaningless sounds. Only by stipulating a hard-wired language module in the infant brain (for which at the moment there is scant empirical evidence) are we able to view their task as language acquisition. Yet language, to the infant, shares many features in common with nonlinguistic sounds in their environment.

Even to an adult, there remains overlap between musical and linguistic sound production. What are the features that distinguish language from other sound-producing behaviors? How does the infant come to understand these differences? To what extent are these features universal or culture-specific? Rather than beginning with the assumption that an infant’s task is to acquire language, perhaps we should seek to understand the child’s behavior in more general terms. If patterns emerge in the sound environment of infants, how do infants go about recognizing, understanding, and reproducing these patterns? Only a study that seeks to examine sound without presupposing the prominence of language will be capable of answering these questions.

The methodology we will adopt is based in a naturalistic approach. Rather than a laboratory setting that trades off naturalism for control in the environment, ISEP will collect data directly from the natural environment of the child. Daylong digital audio recordings will be made of the children within their natural home, play, and school contexts. The length of recordings will ensure a sufficient quantity of usable data, and will help mitigate the self-consciousness of caregivers and others. These recordings will be captured at intervals of about once a month for the first two to three years of life.

Every effort will be made to maintain the confidentiality and privacy of the participants. However the data collected will have much broader application than the focus of this study, and thus we hope to make this data available to other researchers involved in complementary work. Similarly, it is likely that data collected for other projects, especially from diverse cultural contexts, will be useful for the present purpose. Thus efforts will be made to find outside collaborators for this project.

The question, which by some is supposed to have been settled in 1959 by the suppositions and arguments of Noam Chomsky regarding the unlearnability of many aspects of the linguistic system, [3] nonetheless remains: to what extent is language acquired from the environment, and to what extent is it innate? More broadly, what influence on a child’s cognitive development is directly played by the inputs to that child during maturation? Since the focus here is on the rhythmic and melodic aspects of sound, it is now possible to gather data which can be used to answer this question. Until the recent advent of long-term sound recordings, this question was constrained by the unreliability of report, or by extrapolations from limited data gathered in laboratories and elsewhere.

It is well known that all normal children acquire the language(s) to which they are exposed, and no other. Little is known, however, about the acquisition of aspects of language and interactive behaviors beyond the domains of words and grammar, such as intonation, timing, and stress. It has generally been supposed that language socialization (both socialization to learn language, and further socialization through language) is the dominant, even all-encompassing task of the prelinguistic infant. Additionally, it has been assumed that the principal or sole function of these behaviors is communicative, while their possible role as part of the child’s mental construction of the world and oneself has mostly remained unconsidered.

The current study will dispense with any such prejudice, and suppose merely that sound exists in the environment in which an infant dwells; that this sound assuredly contains remarkable features in terms of its melodic and rhythmic content, and likely in other aspects such as timbre; that the infant’s brain is in some way predisposed by nature and biology to selectively attend to that sound; [4] and further that through some process, which is yet to be clearly understood, the child as it emerges from infancy begins to contribute its own sounds to that environment, which likely bear some resemblance to those it has experienced. The process by which the child’s own sound manipulations reflect natural dispositions or exhibit cultural or idiosyncratic features is little understood. The intent of the present study is to provide a means for measuring the impact of the sound environment of early life on the child’s own later productions of sound.

Surely, the patterning through time that rhythm presents can be found in a soundless environment as well. Other features, such as melody and timbre, may also be found to have silent correlates. [5] Later research should allow us to compare the normal development of hearing children to the cognitive, linguistic, and social development of the congenitally deaf in ways that have so far eluded comparison. Through the establishment of a baseline for development at the lower levels that will be exposed through this study, we will be able to identify more readily and earlier certain deficits which may affect a child’s cognitive development and language acquisition.

Further, this study will permit us to uncover other elements in the maturation and socialization of the child through sound, which are extralinguistic, such as music acquisition. More broadly, by focusing not on the lexicon and syntax, but rather on melody, rhythm, and timbre, we will be able to discuss in finer-grained detail the child’s sound perception and production than has previously been the norm. In this way we will be able to see what features are shared among a variety of behaviors, without presuming their function and significance as linguistic or otherwise.

[1] Don Hodges has presented a video of a captive and socialized chimpanzee purportedly engaged in musical improvisation at a keyboard. Is it possible that lower-level manipulations of melody and rhythm are shared behaviors with our primate cousins?

[2] Trehub, et al. “Music and speech processing in the first year of life,” Advances in Child Development and Behavior 24 (1993).

[3] For example, Boysson-Bardies (1999) writes: “By the middle of the twentieth century, after a brief interval in which Anglo-American psychologists held the acquisition of language to be the exclusive result of learning and imitation, it came to be recognized that language development could not be reduced to a mechanism of elementary linkages between images or sensations and sounds. In 1959, Noam Chomsky demonstrated the impossibility of acquiring language with approaches of this type. … Only a powerful innate system could allow the child to extract a model of language from adult speech … This endowment consists of a universal system that belongs to the human brain, which he called a universal grammar. This grammar is the basic schema that grounds the grammars of all human languages. A mental circuitry, inscribed in the biological constraints governing the development of the brain, underlies this schema and permits it to select the sounds, signs, and sign combinations of the mature language.”

[4] Belin, et al. (2000) have identified what they term “voice selective areas” of the brain, neurons that respond specifically to the vocal sounds of conspecifics.

[5] Sherman Wilcox has argued that sign languages exhibit features which should properly be interpreted as prosodic, an affective layering on top of the lexical and syntactic foundation, which might likely correspond to melodic and timbral aspects of spoken language.


Murry, Hoit-Dalgaard, and Gracco (1983)

Murry, Thomas, Jeannette Hoit-Dalgaard, and Vincent L. Gracco. “Infant Vocalization: A Longitudinal Study of Acoustic and Temporal Parameters,” Folia Phoniatrica 35: 245-253 (1983).

The present investigation is a longitudinal study of cry and noncry vocalizations obtained from 1 infant at biweekly intervals from 2 to 12 weeks of life. (245)

The authors sought:

to determine if longitudinal trends existed in cry and nondistress vocalizations. (245)

Recordings were made about every two weeks for the first three months of life. A distinction was drawn between distress and non-distress vocalizations, principally on the basis of known context. The authors note that non-distress vocalizations first emerge during the eighth week of life. The cry samples were analyzed acoustically

to obtain F0, the duration of the vocalic segments, the amount of periodicity, aperiodicity and silence in the sample, and classification of melody contours. (246)

The researchers plotted a stylization of the intonation contours (termed melodygrams) abstracted from 0.1 s segment readings of periodic signals. Segments in excess of 0.4 s were classified from these stylizations as one of 7 melody types:

rising (R), falling (F), rising-falling (RF), falling-rising (FR), flat (FL), rising-falling-rising (RFR), or falling-rising-falling (FRF). (246)

They observe:

The falling contours are assumed to represent the natural physiologic state; that is, the fall in F0 results from a decrease in subglottic air pressure over the expiratory cycle. The rising or flat terminations, however, reflect phonatory modification of the airstream; that is, the natural falling contour is altered. This finding suggests that cry and nondistress vocalizations may be separated on the basis of final termination markers. (251)

As a result of these investigations, they conclude:

Examination of the similarities among the vocalization data revealed evidence of two developmental trends. A general increase in the mean duration of vocalic segments was noted in all three phonatory conditions. Secondly, the cry data, especially the discomfort cry, showed a higher proportion of periodic phonation relative to silence as the infant matured. These trends appear to reflect increased respiratory and phonatory control which accompanies the normal developmental process. (251)


Song, Speech, and Brain updated

Commentary on Belin, et al. (2000) has been added to the Song, Speech, and Brain bibliography.


Belin, et al. (2000)

Belin, Pascal, et al. 2000. “Voice-selective areas in human auditory cortex.” Nature 403 (20 January 2000): 309-312.

This team of researchers reports on what they term voice selective areas of the brain, which selectively respond to the sounds of a human voice regardless of the sorts of behavior in which it is involved. This flies in the face of previous work on auditory perception that presupposed a distinction between linguistic and non-linguistic sounds. Much of the work on dichotic listening from the 1960s for instance (and even more so, later articles that uncritically cited this earlier work) presupposed as uncontentious the a priori distinction between linguistic and non-linguistic inputs. What is implied by this report is that a more valid distinction lies between vocal and non-vocal sounds.

They write:

Here we show, using functional magnetic resonance imaging in human volunteers, that voice-selective regions can be found bilaterally along the upper bank of the superior temporal sulcus (STS). These regions showed greater neuronal activity when subjects listened passively to vocal sounds, whether speech or non-speech, than to non-vocal environmental sounds. (309)

The authors conducted three experiments. The first consisted in passive listening by 8 right-handed subjects to two categories of stimuli:

(1) vocal sounds produced by several speakers of different gender and age, either speech (for example, isolated words, connected speech in several languages) or non-speech (such as laughs, sighs and coughs); and (2) energy-matched, non-vocal sounds (for example, natural sounds, animal cries, mechanical sounds) from a variety of environmental sources. (309)

Experiment 2 further supported a supposition that

the voice-sensitive response was not entirely due to the presence of speech in the vocal stimuli. (310)

and further that

frequency structure plays a more prominent role in voice-sensitive activation than does amplitude envelope. (310)

The third experiment was conducted with a different group of subjects. In this case, a vocal/non-vocal decision task and a speaker gender-identification task were added after scanning.

They explain:

In all three experiments, peaks of voice selectivity could be found in most subjects along the upper bank of the STS, a deep, long sulcus (>8 cm) running along the whole temporal lobe that is also found in many non-human primates. (310)

They observe that

voice-selective regions were found in the STS on both sides, but voice selectivity was stronger in the right hemisphere in the first two experiments… However, the same pattern was not found in expt 3, suggesting that the neural substrate of voice perception might be less clearly lateralized than in the case of speech perception. (310-311)

They conclude:

these experiments provide strong evidence that the human brain contains regions that are not only sensitive to, but also strongly selective to, human voices. (311)

In outlining the significance of these findings, among other points, they suggest that

it could lead to new comparisons between species, by suggesting that areas sensitive to species-typical vocalizations could be found in the homologous regions in other primates. Indeed, language is probably unique to humans, and its possible evolutionary precursors are hard to define and study in other animals. In contrast, we share the ability to reliably extract affective- and identity-related cues from the species-specific vocalizations with many other species, at least of primates. (311)

Indeed, it would be fascinating to attempt such studies with non-human subjects (though the logistics of scanning an ape might be more than can be handled at the moment). I would suggest also that similar experiments should be conducted on deaf subjects, using passive viewing of sign and gestures. In the place of environmental sounds, environmental motions could be used. The point would be to ascertain 1) whether these voice-selective areas are indeed exclusively a part of auditory processing, or whether (as in the case of congenital deafness in particular) they might be shown to be part of a general human pattern-recognition process, coopted by the mind, in the absence of auditory input; and 2) even if these regions are sustained as exclusively (or primarily) part of our sound-processing mechanism, is it possible to uncover homologous regions in the brain that are selective, rather than to speech or vocal sounds, to (in the words of Petitto 2000) “aspects of the patterning of language …its temporal and distributional regularities”?


In progress pages added

Added the pages Melodic ambiguity in song and speech and Anticipating Dolly, describing two new projects, under the heading In Progress.


Plantinga and Trainor (2005)

Plantinga, Judy & Laurel J. Trainor. “Memory for melody: infants use a relative pitch code,” Cognition 98 (2005): 1-11.

Plantinga & Trainor begin by noting that while most other animals rely upon absolute pitch data, humans tend to prefer relative pitch. While absolute pitch has often been considered a coveted ability, the authors observe that

focusing on absolute pitch information may be a musical hindrance. (2)

This matter goes beyond music however, as relative pitch reflects a more general ability of humans to approximate, to identify similarities from a mass of real-world instances, in ways that likely go beyond the abilities of other animals. As they put it:

the ability to encode relative pitch and perceive melodic invariance across pitch transposition is a more sophisticated ability than remembering absolute pitch. (2)
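
To make the contrast concrete, here is a small illustrative sketch (not drawn from the article) of what a relative pitch code preserves that an absolute one does not. It assumes pitches are written as MIDI note numbers, where adding 1 raises a note by a semitone; the function name and the example melody are hypothetical.

```python
def relative_code(midi_notes):
    """Encode a melody as the semitone steps between successive notes."""
    return [later - earlier for earlier, later in zip(midi_notes, midi_notes[1:])]


# A hypothetical four-note melody and the same melody moved up a fourth.
melody = [60, 62, 64, 60]
transposed = [note + 5 for note in melody]

# The absolute codes (the note numbers themselves) differ ...
print(melody == transposed)
# ... but the relative codes are identical.
print(relative_code(melody) == relative_code(transposed))
```

Under the relative code the two versions count as the same tune, which is the melodic invariance across transposition that the authors describe.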

They point out that pitch is tonotopically organized at the physiological level, and indicate that relative pitch processing is therefore a cognitive task that follows initial processing of the acoustic information. They cite Heaton, Hermelin, & Pring (1998) and Brown et al. (2003) in discussing correlations between absolute pitch processing and autism

suggesting that absolute pitch processing is associated with a particular cognitive style. (3)

They indicate the suggestion by others

that early in life all infants rely mainly on absolute pitch, but that with increasing age and experience most shift to processing relative pitch (Sergeant & Roche, 1973; Takeuchi & Hulse, 1993). (3)

Yet, they challenge this notion, stating:

there are little data to suggest a transition from absolute to relative pitch processing. (3)

The remainder of the article sets out details of two experiments conducted to test this, concluding:

The results of this study suggest that by 6 months of age infants, like adults, store melodic information primarily according to a relative and not an absolute pitch code in long-term memory. … The possibility that infants also remember absolute pitch of a familiar melody cannot be ruled out, but the present results argue against robust absolute pitch memory. (8)

Yet, they concede:

It is still possible that there is a developmental shift from predominantly absolute pitch processing to predominantly relative pitch processing that takes place before 6 months of age. (9)

References

Brown, W. A., et al. (2003). “Autism-related language, personality, and cognition in people with absolute pitch: Results of a preliminary study,” Journal of Autism and Developmental Disorders 33, 163-167.

Heaton, P., B. Hermelin, and L. Pring. (1998). “Autism and pitch processing: A precursor for savant musical ability,” Music Perception 15, 291-305.

Sergeant, D. and S. Roche. (1973). “Perceptual shifts in the auditory information processing of young children,” Psychology of Music 1, 39-48.

Takeuchi, A. H. and S. H. Hulse. (1993). “Absolute pitch,” Psychological Bulletin 113, 345-361.


Conference on Interdisciplinary Musicology ‘07

The Conferences page has been updated to reflect the new URL for CIM07 (which should be up and running soon).


Petitto (2000)

Petitto, Laura Ann. “On the biological foundations of human language,” in Emmorey et al, Eds., The Signs of Language Revisited. [CITY?]: Lawrence Erlbaum. 2000, 449-473.

The author, who as an undergraduate at Columbia University had been part of an ape-language experiment with the West African chimpanzee, Nim Chimpsky, sets down the task:

Our question concerned whether aspects of human language were species specific, or whether human language was wholly learnable from environmental input. (450)

She adds:

All chimpanzees fail to master key aspects of human language structure, even when you give them a way to bypass their inability to speak—for example, by exposing them to other types of linguistic input such as natural signed languages. This fact raised the hypothesis to me that humans possessed something at birth in addition to the mechanisms for producing and perceiving speech sounds that aided them in acquiring natural language.

In this article, she seeks in particular to challenge the notion that

evolution has rendered the human brain neurologically “hardwired” for speech (Liberman & Mattingly, 1985, 1989; Lieberman, 1984). (452)

She contends that

If, as has been argued, very early human language acquisition is under the exclusive control of the maturation of the mechanisms for speech production and speech perception (Locke, 1983; Van der Stelt & Koopmans-van Bienum, 1986), then spoken and signed languages should be acquired in very different ways. (452)

She proceeds to outline a great many experiments involving deaf children of deaf and hearing parents, hearing children of deaf parents, and four languages: English, French, ASL (American Sign Language), and LSQ (Langue des Signes Québécoise).

In describing one part of the study, involving hearing children exposed only to sign languages, she notes the surprising result that

These babies achieve all linguistic milestones on a normal maturational time table. If early human language acquisition were wholly determined neurologically by the mechanisms for speech production and reception, then these hearing babies raised without systematic spoken language stimulation should show atypical patterns of language acquisition. Instead, all of these groups of hearing babies produced manual babbling, first signs, first two-signs, and other milestones, at the same time as is seen in all other children, be they hearing acquiring speech or Deaf acquiring sign. (456)

She asks:

Might the occurrence and developmental timing of this behavior in all infants suggest something about the “ready-state” nature of the human body to express language from multiple pathways? (462)

She then proceeds to outline her theory that

there is a biological “equipotentiality” of the spoken and signed modalities to receive and produce natural language. (462)

At the end of this thoroughly enjoyable, greatly detailed, and forcefully argued piece, she concludes:

the present findings have led me to propose a new way to construe human language ontogeny. Rather than being exclusively hard-wired for speech or sound, the young of our species are initially hardwired to detect aspects of the patterning of language. I suggested here that this initial sensitivity is to aspects of its temporal and distributional regularities initially corresponding to the syllabic and prosodic levels of natural language organization. (470)

References

Liberman, Alvin M., and Ignatius G. Mattingly. (1985) “The Motor Theory of Speech Perception Revised.” Cognition 21: 1-36.

–. (1989) “A specialization for speech perception.” Science 243 (4890): 489-494.

Lieberman, P. (1984) The biology and evolution of language. Cambridge, MA: Harvard University Press.

Locke, J. L. (1983). Phonological acquisition and change. New York: Academic Press.

Van der Stelt, J. M. & Koopmans-van Bienum, F. J. (1986). “The onset of babbling related to gross motor development.” In Lindblom & Zetterstrom (Eds.), Precursors of early speech. New York: Stockton Press, 163-173.


Saffran, Aslin, and Newport (1996)

Saffran, Jenny R., Richard N. Aslin & Elissa L. Newport. “Statistical Learning by 8-Month-Old Infants,” Science 274 (13 December 1996).

A short but sophisticated and critical look at well-established assumptions regarding first language acquisition. In particular, the authors sought to challenge the notion of the poverty of the stimulus, as articulated by Noam Chomsky and others. [1] They summarized the standard view:

few theorists have entertained the hypothesis that learning plays a primary role in the acquisition of more complicated aspects of language, favoring instead experience-independent mechanisms. Young humans are generally viewed as poor learners, suggesting that innate factors are primarily responsible for the acquisition of language. (1926)

They went on:

In particular, we ask whether infants are in fact better learners than has previously been assumed, thus potentially reducing the extent to which experience-independent structures must be posited. (1927)

The focus of this study was the acquisition of word boundaries from speech stimuli. Moving from the established principle that “measurable statistical regularities” can serve as cues to word boundaries, [2] they tested whether 8-month-old infants were able to abstract such cues from a brief exposure to artificial wordlike stimuli.

They observed:

Our results raise the intriguing possibility that infants possess experience-dependent mechanisms that may be powerful enough to support not only word segmentation but also the acquisition of other aspects of language.

and concluded:

the massive amount of experience gathered by infants during the first postnatal year may play a far greater role in development than has previously been recognized.

[1] Cited by the authors in this regard: Chomsky, N. (1965), Aspects of the Theory of Syntax, Cambridge, MA: MIT Press; Crain, S. Behavioral and Brain Sciences 14 (1991), 597.

[2] Cited by the authors: Harris, Z. (1955), Language 31, 190; Hayes, J. and H. Clark (1970), in Cognition and the Development of Language, J. Hayes (ed.); Brent, M. and T. Cartwright (1996), Cognition 61, 93.


Evolution

I will be creating a bibliography on the phylogenetic development of skills and capacities both within the hominid line and comparatively within other species.


Nathani, Oller, and Cobo-Lewis (2003)

Nathani, Suneeti, D. Kimbrough Oller, and Alan B. Cobo-Lewis. “Final Syllable Lengthening (FSL) in infant vocalizations,” Journal of Child Language 30 (2003): 3-25.
