In agreement with preceding studies, the analysis of melody patterns was based on a distinction between simple (single-arc) and complex melodies (Fig. 2a, b). Complex melodies were defined as exhibiting two or more arcs and/or intra-melodic breaks between arcs, produced by laryngeal constrictions that generate rhythmical variations [11, 31]. Preceding studies defined a melody arc as being longer than 150 ms and as exhibiting a frequency modulation amplitude (FM amplitude) of at least three (cry) or two (non-cry) semitones [12]. However, we applied a stricter time criterion and adjusted the FM amplitude criteria for arc definition to vocalization type: an arc was required to last more than 300 ms and to show a minimum FM amplitude of two semitones (cry), one and a half semitones (non-cry), or one semitone (marginal and canonical babbling). This distinction was justified because Wermke and Mende [12] characterized melody arcs using a mathematical modeling approach and reported a decrease in FM amplitude from crying toward babbling of six to eleven percent, i.e., one to two semitones. The criteria defined here took this into account.
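The arc criteria stated above can be illustrated with a minimal Python sketch. It is not the analysis software used in the study. It assumes that the FM amplitude of an arc is the interval, in semitones, between the lowest and highest fundamental frequency within that arc, computed as 12·log2(f_max/f_min); the text does not specify the measurement procedure. The function names, the dictionary keys, and the strict "more than 300 ms" comparison are choices made for this illustration.

```python
import math

# Minimum FM amplitude in semitones per vocalization type, as given in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
MIN_FM_SEMITONES = {
    "cry": 2.0,
    "non-cry": 1.5,
    "babbling": 1.0,  # marginal and canonical babbling
}

# Duration criterion from the text: an arc must last more than 300 ms.
MIN_ARC_DURATION_MS = 300.0


def fm_amplitude_semitones(f0_min_hz: float, f0_max_hz: float) -> float:
    """Interval between two frequencies in semitones (assumed FM amplitude measure)."""
    if f0_min_hz <= 0 or f0_max_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_max_hz / f0_min_hz)


def is_melody_arc(duration_ms: float, f0_min_hz: float, f0_max_hz: float,
                  vocalization_type: str) -> bool:
    """Check the duration and FM amplitude criteria for one candidate arc."""
    fm = fm_amplitude_semitones(f0_min_hz, f0_max_hz)
    return (duration_ms > MIN_ARC_DURATION_MS
            and fm >= MIN_FM_SEMITONES[vocalization_type])


if __name__ == "__main__":
    # A 350 ms candidate arc spanning 400 to 440 Hz (about 1.65 semitones).
    for voc_type in MIN_FM_SEMITONES:
        print(voc_type, is_melody_arc(350, 400, 440, voc_type))
```

In this example the same candidate contour fails the cry criterion but satisfies the non-cry and babbling criteria, which is the effect of adjusting the FM amplitude threshold to vocalization type.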
a, b Melody/intensity diagrams. Melody (black) and intensity (gray) curve of a single-arc melody (a) and complex, triple-arc melody (b).
Based on these arc criteria, all vocalization melodies were objectively subdivided into those with only a simple (single-arc, SA) melody and those with a complex (multiple-arc, MA) melody (cf. Fig. 2a, b). SA contained a single ascending-then-descending melody arc with a minimum duration of 300 ms. Complex melodies consisted of two (DA), three (TA), or more (MA) melody arcs, each arc having a duration of at least 300 ms. Following earlier work [11, 29], we also identified rhythmically segmented contours (SEG), which contain inner-melodic pauses caused by laryngeal constrictions [31]. Across the first year of life, complex melodies of pre-speech utterances regularly exhibit rhythmical variations generated by short “segmentation pauses” between adjacent melody arcs (Fig. 3).
Time waveform and narrow-band (45 Hz) spectrogram displaying a cry utterance consisting of a complex, segmented melody with the following structure: single arc (symmetric shape) − laryngeal constriction (segmentation pause S) − single arc (falling shape) − laryngeal constriction (incomplete segmentation S) − double arc (combination falling-symmetric arc). The subsequent inspiratory noise (ingressive phonation IP) is also visible.
Melodies containing at least one complete or incomplete laryngeal constriction [11, 31] were assigned to “segmented” complex contours (SEG) independent of the number of arcs.
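The category assignment described above can be summarized as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the function `classify_melody` and its parameters are hypothetical, the number of arcs is taken to be the count of arcs that already satisfy the duration and FM amplitude criteria, and the MA label is used here for melodies with more than three arcs.

```python
def classify_melody(n_arcs: int, has_laryngeal_constriction: bool) -> str:
    """Assign a melody to SA, DA, TA, MA, or SEG (hypothetical helper).

    n_arcs: number of arcs meeting the arc criteria (assumed to be counted beforehand).
    has_laryngeal_constriction: True if at least one complete or incomplete
        laryngeal constriction occurs within the melody.
    """
    if n_arcs < 1:
        raise ValueError("a melody needs at least one qualifying arc")
    # Segmented contours are assigned independent of the number of arcs.
    if has_laryngeal_constriction:
        return "SEG"
    if n_arcs == 1:
        return "SA"  # simple, single-arc melody
    if n_arcs == 2:
        return "DA"  # double-arc melody
    if n_arcs == 3:
        return "TA"  # triple-arc melody
    return "MA"      # more than three arcs


if __name__ == "__main__":
    # Four arcs with segmentation pauses, loosely following the cry utterance in Fig. 3.
    print(classify_melody(4, True))
    # Three arcs without constriction, as in the triple-arc example of Fig. 2b.
    print(classify_melody(3, False))
```

Checking for a constriction before counting arcs reflects the statement that SEG assignment does not depend on the number of arcs.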