Preprint
Article

This version is not peer-reviewed.

Music Notes Are Neither Letters nor Words: A Closer Look at the Nature of Musical Stimuli in Eye Movement Research Design

Submitted: 06 August 2025
Posted: 13 August 2025


Abstract
This article examines the nature of musical stimuli used in eye-movement research on music reading, with a focus on syntactic elements essential for fluent reading: melody, rhythm, and harmony. Drawing parallels between language and music as syntactic systems, the study critiques the widespread use of stimuli that lack coherent musical structure, such as random pitch sequences or rhythmically ambiguous patterns. Eight peer-reviewed studies were analyzed based on their use of stimuli specifically composed for research purposes. The findings reveal that most stimuli do not reflect authentic musical syntax, limiting the validity of conclusions about music reading processes. The article also explores how researchers interpret the concept of “complexity” in musical stimuli, noting inconsistencies and a lack of standardized criteria. Additionally, it highlights the importance of considering motor planning and instrument-specific challenges, which are often overlooked in experimental design. The study calls for more deliberate and informed stimulus design in future research, emphasizing the need for syntactically meaningful musical excerpts and standardized definitions of complexity. Such improvements are essential for advancing the understanding of visual processing in music reading and ensuring methodological consistency across studies.
Subject: Arts and Humanities – Music

Introduction

Reading is a necessary skill for anyone who wants to participate in today’s society, and the reading process has therefore been of great interest to researchers for several decades [41]. New techniques, such as examining eye movements and activity in different parts of the brain, have made a relatively accurate description of the reading process possible. By comparison, research into music reading has received considerably less attention [15,19,30,39].
Both language and music are syntactic systems governed by the hierarchical organization of elements into meaningful structures [33]. Experiments investigating eye movements during reading often focus on understanding how comprehension is affected when various elements of the text are altered or manipulated [7]. Each word in a text serves a relatively unambiguous function that can be identified and described. Grammatical and orthographic rules guide the construction of letters into words, words into sentences, and sentences into longer paragraphs. As a result, language texts are relatively easy to process.
Similarly, the notes in a music score are organized into meaningful units that serve different functions within the read or performed musical text. However, these units are not as clearly visually defined as words and sentences. Despite this, researchers frequently adopt experimental designs from language reading studies to investigate music reading. For decades, experiments examining eye movements have used language-based paradigms, substituting linguistic stimuli with musical notation [13,30,36,39].
Most eye-movement research in reading focuses on meaningful units composed of letters—namely, words and word groups—their lexical processing, and the reader’s perceptual span [40,41]. Numerous studies over the past decades have shown that skilled readers fixate on nearly every word in a text, and that the ability to extract visual information quickly and efficiently is fundamental to proficient reading [6]. Eye movements during reading are largely, if not entirely, driven by lexical linguistic processing [8]. Useful graphemic information is typically extracted from the first three letters of the word to the right of the fixation point [41]. Overall, the word appears to be the primary unit of reference in eye-movement analysis within reading research [6].
Music is perceived as sounds organized in relation to one another, rather than as a stream of isolated elements [44]. One of the fundamental aspects of this organization is the similarity between different elements, which may manifest through repetition of phrases, rhythms, or harmonic patterns. Consequently, the ability to quickly organize written musical symbols into larger, meaningful patterns—and to retrieve them automatically—is essential for fluent music reading [14,25,26,32]. This ability to segment written music, known as chunking [31], stems from a substantial inventory of meaningful units stored in the musician’s long-term memory.
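The notion of chunking described above can be illustrated with a minimal sketch. The melody and the trigram window below are invented for illustration and do not come from any cited study; the idea is simply that repeated figures in a note sequence are the kind of unit a fluent reader could retrieve from long-term memory as a single chunk.

```python
from collections import Counter

# Hypothetical melody as note names; repeated figures are candidate "chunks".
melody = ["C", "E", "G", "C", "E", "G", "D", "F", "A"]

# Count every three-note figure (trigram) in the sequence.
trigrams = Counter(tuple(melody[i:i + 3]) for i in range(len(melody) - 2))

# Figures occurring more than once are repeated patterns a fluent
# reader could recognize and process as single units of information.
repeated = {t: n for t, n in trigrams.items() if n > 1}
print(repeated)  # the broken C major triad recurs
```

The choice of a three-note window is arbitrary here; real chunk boundaries follow musical syntax (beat groupings, phrase structure), not a fixed length.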
A music score comprises meaningful melodic, rhythmic, and harmonic patterns that fluent music readers automatically chunk into larger units of information [15,25]. In terms of visual processing, these units can be considered analogous to words in a language text, providing both understanding and the ability to anticipate what comes next. However, this aspect of musical stimuli has not been sufficiently considered in research design. Musical stimuli composed for the experiment purposes often consist of a few isolated pitches [46,47], or sequences of random pitches and/or note values that cannot be chunked into meaningful units [3,29]. As a result, conclusions drawn about aspects of “music reading” may be questionable, since participants are not truly reading music—just as eye-movement studies using random letters cannot yield meaningful insights into language reading. A sequence of random letters does not constitute written language, and a sequence of random notes does not constitute written music.
The research question explored in this article is: How can the nature of a music score as a visual stimulus influence the interpretation of music reading research results, in light of language reading experiment design? The article focuses on the reading process using the alphabetical notation system in language, the Western staff notation system, and the reading of Western tonal music. Western music notation, as a globally recognized system, transcends language barriers. Moreover, the ability to read music is valuable for professionals, amateurs, and their instructors alike. Therefore, eye-movement research in music holds broad potential for both theoretical insight and practical application [39].

Text and Music Reading

Language reading and music reading as cognitive processes have much in common and have frequently been compared to each other [2,18,33,34,42,44,45]. People process linguistic and musical stimuli as syntactic structures [33]. It is much easier to remember sequences that follow grammatical rules than those that do not, regardless of whether the sequence is meaningful at the semantic level. In music, it is easier to remember sequences that follow the conventional rules of tonality and typical harmonic relations. This effect increases with knowledge of the musical idiom being performed [44].
The basic level of the musical structure is related to the scale and its pitches. All tones in a piece based on Western harmony are perceived in relation to the seven notes known as scale degrees, with a stable tonal center [33]. The durations of events are typically structured in simple integer ratios—most often, each event lasts one, two, three, or four times the length of the shortest event [9]. Together, these pitch and duration structures create the “horizontal” organization of music. Musical syntax allows for the simultaneous use of multiple tones to form chords and create harmony. Chord syntax refers to the “vertical” organization of tones in music. A key element of musical syntax is the tension-relaxation relationship between chords, which is crucial in the structure of a musical piece. Harmonic variation involves contrasts between tension and relaxation. As in language syntax, the order of elements in harmonic structures is essential [33]. Scales and their chords are fundamental components of a key. The selection of keys used in a piece of music (such as in modulation) is always deliberate.
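The integer-ratio structure of durations can be made concrete with a short sketch. The note values below are invented for illustration; the point is only that expressing each event as a multiple of the shortest event yields the simple whole-number ratios typical of tonal rhythm [9].

```python
from fractions import Fraction

# Hypothetical durations in whole-note units: quarter, quarter,
# eighth, eighth, half (values invented for illustration).
durations = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 8),
             Fraction(1, 8), Fraction(1, 2)]

# Express each event as a multiple of the shortest event.
shortest = min(durations)
ratios = [d / shortest for d in durations]
print(ratios)  # simple integer ratios: 2, 2, 1, 1, 4
```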
In language, syntax helps express the meaning of “who did what to whom,” essentially forming the conceptual structure of reference and predication in sentences. Similarly, in music, syntax supports the meaning through the pattern of tension and resolution that listeners experience as the music progresses over time [33].
Working memory also seems to play an important role for efficient sight-reading [17,25]. In a performance situation, skilled music readers look further ahead in the score, i.e. they have greater eye-hand span than amateur music readers [28,30,43]. This is because they are able to store a larger amount of information in their working memory, compared to less skilled readers, in a process called chunking [31,37].
One consequence of knowing the syntactic rules, both in language and music, is the ability to use context to predict that certain items such as words, chords, or other meaningful units will occur. This ability is crucial for fluent reading of both language and music [10,25]. There are several syntactic strategies related to the grouping of words into meaningful units, segmenting sentences and building new meaningful structures. Koelsch et al. [24] compared processing of semantic meaning in language and music, investigating the semantic priming effect. They found that both music and language can prime the meaning of a word, or a meaningful unit of music. The effect did not differ between language and music with respect to time course, strength or neural generators.

Eye Movements

When we read, our eyes do not move smoothly across the text. Instead, we continually make rapid eye movements called saccades. Between saccades, our eyes remain relatively still during fixations [20,41]. We obtain no new information during a saccade, because the eyes are moving too quickly; we process the information available within our perceptual span during fixations [7]. About 10–15% of saccades are regressions, returning to previously read words or lines [41].
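These measures are straightforward to compute from gaze data. The sketch below uses an invented sequence of fixated word positions (not data from any cited study) and operationalizes a regression as a saccade that lands on an earlier word than the previous fixation.

```python
# Hypothetical sequence of fixated word indices during reading
# of a single line (invented data for illustration).
fixated_words = [0, 1, 2, 3, 2, 4, 5, 7, 6, 8, 9]

# Each consecutive pair of fixations implies one saccade;
# a regression is a saccade landing on an earlier word.
saccades = list(zip(fixated_words, fixated_words[1:]))
regressions = [(a, b) for a, b in saccades if b < a]

rate = len(regressions) / len(saccades)
print(f"{rate:.0%} of saccades are regressions")
```

In this toy sequence the rate comes out higher than the roughly 10–15% reported for normal reading, simply because the example is short.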
How long readers look at a word is influenced by how frequent the word is in the language, how familiar it is, and whether it is predictable from the preceding sentence context. Frequent, familiar, and expected words are processed with fewer and shorter fixations than less frequent or unpredictable words, regardless of word length. Readers also tend to skip highly predictable words more often than less predictable ones [7,41].
Eye movements are also influenced by textual and typographical variables. Fixation duration increases, saccade length decreases, and the frequency of regressions increases while reading a conceptually difficult text. Factors such as the quality of the print, line length, font size and letter spacing influence eye movements, as well [41]. The effective visual field is called perceptual span in reading research, and it has been explored extensively [41].
Like language readers, musicians use saccadic eye movements when reading a music score [23]. Research shows that, compared to less skilled music readers, professional music readers make fewer and shorter fixations (though still longer than corresponding fixations in language reading) and longer saccades [30]. A less complex score leads to relatively fewer and shorter fixations and longer saccades in both groups, and with repeated attempts at the same score, the number of fixations decreases.
Puurtinen [39] observed a notable limitation of previous eye-movement research in music reading: its emphasis on general expertise has diverted attention from the content being read, that is, the musical stimuli themselves. Madell and Hébert [30] likewise highlighted the need for more research on how stimulus features influence eye movements. Consequently, the field lacks methodological consistency and has yet to establish a standard approach.
Eye-movement measures are often described as particularly appropriate for studying music reading [13,30]. At the same time, recent meta-analyses summarizing eye-tracking research on differences between expert and non-expert musicians point out contradictory results across studies examining similar factors [36,39]. These contradictions may arise from inconsistently reported participant characteristics (age, level of education, years of musical practice), variations in experimental set-up, differences in eye-movement metrics, and the chosen stimuli [13,36,39].
In her review of eye-movement research, Puurtinen [39] focused on four methodological aspects, one of which concerned the performed music, or ‘what is being read’. Her analysis highlighted challenges related to variability in the musical stimuli and a lack of consistency in how they were created, especially in studies involving authentic music. Puurtinen [39] concludes:
Understanding the effects of the most basic features of music notation on the targeting and timing of eye movements seems essential before combining these observations with the effects of expertise, added visual elements, violation of musical expectations in complex settings, or even the distribution of attention between two staves.
The present study aims to take a closer look at the visual stimuli used in the experiments and examine how their application may influence the results. A detailed analysis of the musical stimuli, considered in light of common music reading strategies across different proficiency levels, may serve as a valuable starting point for more deliberate and informed selection of musical excerpts in future research. A deeper understanding of the nature of musical stimuli can enhance the validity of eye-movement studies by enabling the use of comparable stimuli, facilitating result comparisons, and supporting the replication of experimental conditions.

Method

The article presents an analysis of musical stimuli in relation to the core elements that constitute musical syntax essential for fluent reading: melodic, rhythmic, and harmonic patterns [25,34]. Musical stimuli used in eye-movement research typically fall into one of three categories: excerpts from authentic musical pieces, simplified or modified versions of such excerpts, or musical examples composed specifically for research purposes [36,39].
The primary focus of the present study is on musical stimuli that have been composed or arranged explicitly for research. The selection of musical examples aims to illustrate the challenges and potential pitfalls associated with the use of certain stimuli in eye-movement research on music reading over the past few decades. The author does not intend to provide a comprehensive review of recent research in the field, as existing meta-analyses and reviews [30,36,39] already offer an extensive overview of relevant publications.
The publications included in the analysis were selected based on the following criteria: (1) peer-reviewed scientific journal articles, (2) reporting on eye movements during music reading, (3) using musical stimuli specifically composed for research purposes, (4) identifying at least one research aim related to the syntactic processing of musical notation, (5) published in English within the last thirty years (post-1994), and (6) containing figures that depict the musical stimuli used in the study. Criteria related to the mode of stimulus presentation were not applied; studies involving sight reading, rehearsed reading, and silent reading were all considered equally relevant.
Music comprises three fundamental structural elements: rhythm, melody, and harmony [33]. Melody refers to the succession of pitch heights. Rhythm is the execution of sounds and silences in a fixed and precise temporal relationship, organized into patterns and structured by bar lines. Rhythmic notation preserves the perception of meter, clarifying the placement of beats. The time signature at the beginning of a score determines the stressed and unstressed beats within each bar. Rhythmic notation should never obscure the meter, particularly when notating syncopation [12].
Harmony pertains to the relationships among pitches and their connection to the scale or key, indicated by the key signature. It creates contrasts between tension and relaxation. Researchers often describe harmony as groups of simultaneously performed notes [34,39], though this is not an exhaustive representation of the harmonic dimension. Even a tonal piece consisting solely of a single melodic line possesses harmonic structure, as the succession of pitches serves to create tension and resolution. Thus, harmony functions as a structural scaffold and a crucial syntactic element in any tonal composition, not only in those featuring multiple simultaneous pitches. This broader understanding of harmony is adopted in the present analysis.

Analysis

The preliminary analysis included all studies referenced in the reviews by Madell and Hébert [30], Puurtinen [39], and Perra et al. [36]. From this pool, eight publications that met the methodological criteria outlined above were selected for in-depth analysis. Detailed annotations of these studies revealed several overarching themes. Table 1 summarizes the aims, participants, procedures, selected stimuli, and results of the studies.
The analysis of musical stimuli was conducted using key concepts related to the primary elements of musical syntax that facilitate fluent music reading: the use of melodic and rhythmic patterns forming meaningful units, harmonic progression, and the presence of musical motives or repetitions. An additional aspect considered was the layout of the stimuli, which in some cases did not adhere to conventional rules of musical notation. Given the close relationship between melody and harmony—and the fact that only one of the examples involved simultaneously played pitches—the analysis was organized into four categories: (1) melody and harmony, (2) rhythm, (3) phrases and/or repetitions, and (4) layout. Another theme that emerged during the analysis was the researchers’ interpretation of the term “complexity” in relation to musical stimuli.

Study 1 (Kinsler & Carpenter, 1995)

Figure 1 presents the musical stimuli as printed in the publication by the researchers. Red and green markings are added by the author of this paper.
Melody and harmony: not present
Rhythm: The time signature is not provided. The beamed eighth notes marked in red are presumably triplets, but the triplet symbol (the number 3 above or below the notes, often with a bracket) is missing. In the last example, each of the bars marked in green has a different number of beats or implied time signature: bar one 3/4, bar two 3/4 but incorrectly notated, bar three unidentifiable (3.5 beats), and bar four 6/8. The authors describe the stimuli as follows: “The arrangement and grouping of the notes followed normal musical conventions, with barlines and beams for multiple quavers, except where indicated” [23] (p. 1448).
Phrases and/or repetitions: recurring rhythmic patterns can be spotted in some cases. Layout: only partly following conventional notation.

Study 2 (Polanka, 1995)

Figure 2 presents the musical stimuli as printed in the publication by the researcher.
Melody and harmony: The stimuli are not written in any particular key, although the absence of a key signature and the use of C major triads in some of the melodies suggest C major. The researcher describes the stimuli as follows: “The three-note pattern melodies were designed such that one melody was composed of low complexity patterns (stepwise patterns), one was composed of medium complexity patterns (third-skip patterns and root position triads), and one was composed of high complexity patterns (triads in inversion and broken triads)” [38] (p. 179). The patterns are not visually highlighted in any way, and conventional notation rules are not followed: the stimuli are presented as a succession of dots indicating pitch height. This way of writing music appears to be the researcher’s own invention and does not exist in the music literature. No harmonic structures other than the C major triad can be identified.
Rhythm: Neither a time signature, bar lines, nor note values are provided. No patterns can be recognized.
Phrases and/or repetitions: not present. Layout: not following conventional notation.

Study 3 (Waters et al., 1997), experiment 2

Figure 3 presents the musical stimuli as printed in the publication by the researchers. Red marking is added by the author of this paper.
Melody and harmony: the absence of a key signature or accidentals suggests the key of C major in all the examples. However, the pitches seem to be chosen randomly and do not match the key; they do not constitute any syntactically logical melodic or harmonic units of information.
Rhythm: The 4/4 time signature and the bar lines indicate strong and weak beats and the expected groupings of notes. All note values shorter than quarter notes are notated using flags. Contemporary rhythm notation does not use flags in this way [12]: notating eighth and sixteenth notes with flags instead of beams is common only in older vocal music (never instrumental) and is considered archaic. The bar marked in red in Figure 3 violates the syntactic rules in several ways: a half beat is added, and the notation is incorrect.
Phrases and/or repetitions: the successions of pitches and rhythms, as presented, might be considered musical phrases. Repetitions are not present. Layout: for the most part following conventional notation.

Study 4 (Waters & Underwood, 1998)

Figure 4 presents the musical stimuli as printed in the publication by the researchers. Red markings are added by the author of this paper.
Melody and harmony: The stimuli consist of a four-note arpeggio, either tonal (major) or using random pitches while preserving the visual contour of an arpeggio (no stepwise motion). There is no deliberate use of melody as an element of the stimuli. Visual complexity is varied through the direction of successive pitches (ascending only, descending only, or both). Tonal complexity is created by repositioning accidentals or shifting one or two notes so that they no longer fit within a single diatonic scale. The pitches in the “tonally simple” examples outline a harmonic structure, a major chord. It is interesting to note that the researchers chose to use the accidental (#) for the note B (Figure 4, marked in red), which is relatively unusual, as B# is enharmonically equivalent to C.
Rhythm: No time signature or bar lines are provided; all the examples consist of four quarter notes. Rhythm is not an element of the stimuli.
Phrases and/or repetitions: not present. Layout: partly following conventional notation.

Study 5 (Penttinen & Huovinen, 2011)

Figure 5 presents musical stimuli as printed in the publication by the researchers.
Melody and harmony: The stimuli are written in C major, using the first five tones of the scale. The succession of pitches forms a melody, despite the lack of any harmonic variation. The absence of harmonic progression makes it impossible to recognize any syntactic units. The finger numbers provided are placed below the notes rather than above, which is unusual for right-hand notation.
Rhythm: There is no time signature; the bar lines indicate a 4/4 meter. There are no rhythmic patterns, as the stimuli consist of quarter notes only (except in the last bar).
Phrases and/or repetitions: not present. Layout: following conventional notation rules.

Study 6 (Ahken et al., 2012)

Figure 6 presents the musical stimuli as printed in the publication by the researchers. Red markings are added by the author of this paper.
Melody and harmony: The examples are written in different keys, established either by a key signature or by accidentals. All the examples consist of syntactic patterns that follow musical rules, both melodic and harmonic. The chosen incongruities (marked in red in Figure 6) are not random pitches played simultaneously, but harmonically unexpected chords (B minor in example c, and Gb major in example d). The lack of fingering in example d makes it much harder to sight-read than the other stimuli.
Rhythm: The stimuli follow all notational conventions: they are provided with a time signature and bar lines, and rhythmic patterns are present in all the examples. However, the length of the stimuli (five, six, or seven bars) does not follow the expected conventions of two, four, or eight bars [9].
Phrases and/or repetitions: both melodic and rhythmic motives and repetitions appear in all the stimuli, used logically and following common conventions. Layout: following conventional notation rules.

Study 7 (Arthur et al., 2016)

Figure 7 presents the musical stimuli as printed in the publication by the researchers.
Melody and harmony: The lack of a key signature suggests the key of C major. However, the succession of pitches seems random and is syntactically incongruent with the key. There is no harmonic progression, and melodic or harmonic patterns are not present.
Rhythm: The examples use a combination of varied note values. Example a) serves as the congruent stimulus, whereas example b) is manipulated so that “notational structure is unexpectedly changed”. The time signature and bar lines in example a) establish a 4/4 meter and create expectations of strong and weak beats, as well as note groupings; the notation does not violate conventions. However, the chosen note values do not form any patterns and provide no logical context beyond one beat at a time. They seem randomly assembled and do not resemble authentic tonal music.
Phrases and/or repetitions: not present. Layout: following conventional notation rules.

Study 8 (Huovinen et al., 2018), experiment 2

Figure 8 presents the musical stimuli as printed in the publication by the researchers.
Melody and harmony: The example is notated in the key of G major; however, the succession of pitches seems random and does not constitute any patterns or elements of musical syntax. There is no harmonic progression, and the melody does not create any expectations of upcoming events.
Rhythm: All the examples consist solely of quarter notes. No time signature is provided; the bar lines suggest a 4/4 meter. Rhythmic patterns are not present.
Phrases and/or repetitions: not present. Layout: following conventional notation rules.

Complexity of Musical Stimuli

In several of the reviewed publications, the term complexity is frequently used. However, the level of complexity in sheet music appears to be a subjective concept that is difficult to define and verify.
Polanka [38] conducted an experiment using musical examples with varying levels of complexity. Scalar patterns were classified as low-complexity, root position triadic patterns as medium-complexity, and inverted triadic patterns as high-complexity (Figure 2). Waters and Underwood [47] also investigated how musical complexity influences eye movements, using short monophonic examples consisting of four notes (Figure 4). The researchers described their stimuli as follows (p. 49):
Twenty “Tonally Simple, Visually Simple” stimuli were composed with four notes preceded by a treble clef, forming simple scale or arpeggio structures. All notes fit within a single major diatonic scale, each containing two or fewer accidentals and one or no contour changes. Twenty “Tonally Simple, Visually Complex” stimuli retained the same musical structures but were arranged to include two contour changes. Twenty “Tonally Complex, Visually Simple” stimuli were created by altering one or two notes so they no longer fit within a single diatonic scale, or by repositioning accidentals. Finally, twenty “Tonally Complex, Visually Complex” stimuli combined these tonal alterations with two contour changes.
This description illustrates how both visual and tonal complexity in a four-note pattern can be manipulated by modifying individual notes or accidentals. Kinsler and Carpenter [23] approached complexity through rhythmic patterns, defining complex stimuli as those that violate conventional music notation rules (see Figure 1).
Huovinen and colleagues [22] also addressed the concept of complexity, interpreting it as expected processing load. They proposed that larger melodic intervals pose greater cognitive challenges than smaller, stepwise ones, which may be more easily decoded as directional commands (up/down) within a scale. Accordingly, stimuli featuring larger intervals or accidentals can be considered more complex than those using stepwise motion and diatonic pitches.
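This interval-size interpretation of complexity lends itself to a simple operationalization. The sketch below is illustrative only (the melodies and the mean-interval metric are invented, not taken from Huovinen and colleagues): it scores a melody by its mean absolute melodic interval in scale steps, so stepwise lines score lower than lines with large leaps.

```python
# Hypothetical melodies as diatonic scale degrees (C major: C=0, D=1, ...).
melody_stepwise = [0, 1, 2, 1, 0]   # stepwise motion only
melody_leaps = [0, 4, 0, 4, 0]      # repeated large leaps

def interval_complexity(degrees):
    """Mean absolute melodic interval in scale steps (a crude proxy
    for the expected processing load of decoding pitch changes)."""
    intervals = [abs(b - a) for a, b in zip(degrees, degrees[1:])]
    return sum(intervals) / len(intervals)

print(interval_complexity(melody_stepwise))  # lower expected load
print(interval_complexity(melody_leaps))     # higher expected load
```

A metric of this kind could make complexity claims comparable across studies, though it ignores rhythm, harmony, and accidentals, all of which the reviewed studies also treated as sources of complexity.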

Unexpected Findings

An analysis of the results and discussions in the reviewed publications revealed that several findings were described by the researchers as difficult to explain or unexpected.
Kinsler and Carpenter [23] (p. 1450) observed:
Sometimes a subject may fixate each of a pair of quavers individually, sometimes only one of them, or neither. (…) Comparison of the number of saccades made when performing musically identical bars with quavers notated either as isolated or beamed showed, for two subjects, a significant (P = 0.05) decrease with beaming, but for the other subject (RHSC), an equally significant increase—demonstrating again the idiosyncratic nature of the responses.
Polanka [38] (p. 182) noted: “It had been assumed that stepwise patterns were the least complex musically and therefore would be processed in the largest units. This was not found to be the case.” Waters et al. [48] found that musician groups made more errors on the duration-difference trials than nonmusicians, and vice versa for the pitch-difference trials. The authors explained this as follows (p. 486): “The violation of the space-duration relationship in generating duration differences probably results in the tendency of the musician subjects to overlook the duration ‘misprints’.” Waters and Underwood [47] (p. 58) also reported an unexpected result:
However, on the first stimulus presentation, it is interesting that there was no evidence that the tonal complexity of the material had any effect on the fixation durations of the experts, as might be predicted from Kinsler and Carpenter’s model. This is a curious finding since we would have expected the experts’ drop in accuracy on the more difficult material to have been due to encoding difficulties on the first stimulus presentation. Furthermore, there was no suggestion from the spatial data that the experts used larger saccade sizes for the tonally simple material. In other words, there was no evidence for any differences in eye movement behaviour between tonally simple and tonally complex material for the expert group.
Arthur et al. [3] similarly reported: “Score disruption had no significant effect on the Total Time within either group. Saccadic latency was the only other measure to reach significance, and this was for experts only when encountering disrupted score—the latency increased significantly.”
Interpretation of these findings in the context of syntactic processing will follow in the next section.

Results and Discussion

In all of the analyzed studies, one of the stated aims was to investigate visual processing in music reading (see Table 1) by examining eye movements. Western tonal music is syntactic, and all authentic (tonal) music scores are composed of melodic, rhythmic, and harmonic patterns that skilled readers instinctively group into larger, meaningful units [15,25]. While all of the studies referenced such elements—musical patterns or structures—in their methodology, the results indicate that only a few researchers actually incorporated syntactic patterns into the design of their musical stimuli. Table 2 summarizes the use of the most common syntactic elements across the analyzed studies.
Almost none of the musical stimuli analyzed in the reviewed studies consisted of fundamental units of information essential for structuring single notes into meaningful sequences—an ability crucial for efficient sight-reading, defined here as the unrehearsed performance of notated music. Nevertheless, sight-reading was one of the primary data collection methods, employed in five of the eight experiments.
In contrast, experiments investigating eye movements during language reading often focus on the reader’s comprehension when various textual elements are manipulated [7]. Reading a sentence out of context typically does not hinder understanding, as the syntactic structure remains intact. However, reading a sequence of music notes that lacks harmonic, melodic, or rhythmic coherence fails to provide meaningful connotations for a musician. Such connotations are defined as logical and expected successions of events governed by syntactic rules [25].
The syntactic properties of written music appear to have been insufficiently considered in the design of musical stimuli in most of the analyzed studies. The following section discusses the degree to which these stimuli align with the syntactic conventions of Western tonal music. This discussion is further illustrated with fictitious examples of linguistic stimuli that exhibit a comparable level of syntactic disconnection.

Melodic and Harmonic Patterns

Knowledge of the key in which a piece or excerpt is written provides a set of expectations for an experienced sight-reader. For example, an authentic four-bar melody typically follows a harmonic progression that alternates between tension and resolution. Notes on strong beats are expected to be chord tones, complemented by passing tones on weak beats. Musicians also develop motor expectations—particularly pianists, who often play multiple tones simultaneously. A key signature with one sharp (♯) will prompt a pianist or violinist to position their hands according to the G major triad and anticipate D major tones in subsequent bars.
Of the studies analyzed, only one—by Ahken et al. [1], in which the stimuli were composed by a graduate student in music composition—incorporated any form of harmonic structure in the musical stimuli. This is despite the fact that several studies declared a key and included a key signature in their experimental descriptions. It appears to be common practice to equate a key signature merely with the use of seven specific notes from a diatonic scale [22,35].
Some studies did not specify a key at all, yet still examined how participants organized notes into patterns [38,48], or stated that the stimuli used “white notes only” to investigate visual expectations in sight-reading [3]. As a result, although some stimuli may visually resemble Western tonal melodies, their melodic and harmonic content often fails to follow fundamental syntactic rules. Consequently, common melodic or harmonic patterns are absent. Even when using the same pitches and intervals, a slight reordering—if aligned with syntactic rules—can significantly alter visual processing. Stepwise motion and intervallic skips become meaningful only when they can be “chunked” as part of a scale or harmonic progression expected in a given key.
A skilled sight-reader typically assumes that the absence of a key signature indicates C major (or A minor). If the tonal material contradicts this assumption (see, e.g., Figure 3 and Figure 6), it may disrupt cognitive processing due to perceived incongruity. This is analogous to how a sequence of letters is chunked into words based on syntactic meaning, not merely visual similarity.
Finally, consider the note C4—commonly referred to as “middle C” on the piano. Because it is written with a ledger line, it stands out visually and is often recognized more quickly by beginners than other notes [11]. Therefore, using C as a “target note” (i.e., the second note in an interval) may yield different results than using other notes (see, e.g., Figure 5).

Rhythm Patterns

Note values were an active component of the stimuli in four of the studies: Kinsler and Carpenter [23], Waters et al. [48], Arthur et al. [3], and Ahken et al. [1]. Of these, only the stimuli used by Ahken et al. [1] consisted of common rhythmic patterns that could plausibly appear in authentic musical compositions. The other three studies either violated conventional rules of music notation or used a variety of note values placed relatively randomly within each bar. None of these stimuli resembled authentic rhythmic patterns. Moreover, some studies did not include a time signature at all [22,35,38]. Yet the time signature is an essential element in rhythm reading, as rhythmic units can only be interpreted meaningfully in relation to meter and beat.
The term rhythmic pattern appears to require clearer operationalization. A pattern can be defined as a predictable, repeating arrangement of elements. Repetition is a fundamental aspect of rhythmic structure in tonal music [27]. Meaningful rhythmic units often span two to four beats and are repeated in a predictable manner [21]. The analysis of rhythmic content in the reviewed studies reveals that most researchers did not employ familiar rhythmic patterns or incorporate any form of repetition in the succession of rhythmic events. As a result, the syntactic processing of these musical stimuli may differ substantially from that involved in reading authentic rhythmic sequences.
Several stimuli lacked differentiated rhythmic information altogether. In some cases, they consisted only of dots indicating pitch [38] or used the same note value throughout the entire excerpt [35,47]. This design choice was presumably made to isolate pitch reading and minimize the influence of other musical elements. However, a melody line without varied note values—and in one case, even without bar lines (see Figure 2)—is visually unnatural and rare (if not entirely absent) in authentic musical scores. Beyond their role in conveying meter and accentuation, bar lines also serve a referential and visual function [35].
Chunking information into larger units appears to be particularly relevant to rhythmic processing [16,49]. A useful analogy can be drawn from language: reading sentences composed entirely of capital letters significantly slows processing compared to sentences that follow standard capitalization rules [4]. Similarly, reading pitches without rhythmic differentiation may reduce the efficiency of visual processing in music reading.

Complexity

The term complexity was interpreted in various ways across the reviewed studies. Waters and Underwood [47] classified accidentals as a visual element of the score rather than a tonal one, manipulating visual complexity by increasing the number of accidentals. In contrast, Huovinen et al. [22] treated the use of accidentals as a form of cognitive complexity. Polanka [38] considered scalar and triadic patterns as components of tonal complexity, as opposed to visual complexity. To enable meaningful comparisons across studies and draw broader conclusions, it is essential to understand the rationale behind how researchers categorized these elements of musical notation.
These differing interpretations raise several fundamental questions:
  • Does the use of accidentals contribute to tonal complexity, visual complexity, or both? In a musical context, accidentals increase harmonic complexity, but they also make the score visually more intricate.
  • Do scales and triads affect tonal or visual complexity? These are basic and high-frequency musical structures.
  • Can we meaningfully discuss musical structure, tonal progression, and complexity when the stimulus consists of only four notes, as in Waters and Underwood [47]?
These examples suggest that clearer operationalization and standardization of the term complexity is needed. Several aspects could be investigated in this context:
  • How do different note values and rhythmic patterns influence the complexity of a score, both generally and in relation to the instrument used in the experiment?
  • Do accidentals increase complexity visually, tonally, or both?
  • Does the use of two treble clefs or two bass clefs on a piano staff make the score more complex than the more common treble and bass clef combination?
  • How do basic elements like scales and triads—and their density relative to tonally unrelated notes—affect perceived complexity?
  • Is piano music with fingering more or less complex to read than music without fingering?
Finally, it may be beneficial to reverse the perspective: rather than defining complexity a priori, researchers could determine the complexity level of musical notation empirically by measuring fixation duration, saccade length, and latency. The parameters and criteria that modulate the effect of complexity may vary across studies, particularly in the absence of a standardized conceptual framework for musical stimulus complexity.

Music Instrument and Motor Planning

The difficulty of reading and performing the same musical pattern can vary significantly depending on the instrument. It can also be assumed that musicians playing different instruments focus on different aspects of the score to optimize performance. For example, singers may naturally focus on intervals, while pianists are more likely to attend to harmonic progressions. However, none of the reviewed studies discussed their results in light of instrument choice—most often the piano—or considered the broader motor aspects that may influence eye movements during sight-reading.
Two of the studies [22,35] investigated the processing of intervals using the piano. The authors compared visual processing of stepwise motion with that of intervallic skips. The piano is one of the few instruments where the spatial distance between notes on the staff corresponds directly to the number of keys skipped on the instrument. For example, the interval of a fifth (C–G) can be challenging for novice pianists, both due to the physical span and the use of the fifth finger (pinky), which is often less developed in beginners. In contrast, the same interval poses no motor challenge for a novice trumpet player, who can play both notes without changing finger positions—unlike stepwise motion, which may require more complex motor planning. Beginning brass players may not rely on scale-based visual strategies in the same way pianists do, due to the differing properties of their instruments.
Another example is the stepwise interval C5–D5, which is relatively easy to play on many instruments but can be challenging for beginner flutists due to the motor complexity involved. Huovinen et al. [22] define complexity as expected processing load and suggest that larger melodic intervals involve greater cognitive difficulty than smaller, stepwise ones. While this may hold true for pianists, excluding the motor characteristics of the instrument in more general discussions of visual processing during sight-reading may compromise the validity of the findings.
An additional factor related to piano performance is fingering. Fingering is an integral part of piano notation and may be crucial for successful sight-reading [49]. Two of the analyzed studies [22,35] included finger numbers at the beginning of five-finger stimuli played by novices. However, the same authors did not include fingering in their most complex musical stimuli (see Figure 8). Similarly, fingering was not provided in the stimuli used by Ahken et al. [1]. This suggests that the researchers may not have fully considered the physiological demands of piano playing when designing their stimuli.
For instance, the stimulus shown in Figure 8 could likely be sight-read with relative ease by a trumpeter, but would be more challenging for a pianist, who must plan awkward fingerings in real time. It is possible that presenting the same notes in a different order would have yielded different results. Some of the musical stimuli that were presumably considered easy to play by the researchers may, in fact, be quite demanding—precisely because they do not account for the motor aspects of performance.

Interpretation of the Unexpected Findings

Some of the results reported in the reviewed studies appeared difficult for the researchers to explain. Kinsler and Carpenter [23] discussed why participants sight-read the same stimuli in contrasting ways. These differences in processing may be attributed to the syntactic context and the presence or absence of a priming effect (see Figure 1). Although no participant data were provided, it is reasonable to assume that novice readers may have lacked experience with reading eighth notes notated with flags instead of beams—a style rarely used in instrumental music. Conversely, such atypical notation might have been perceived as incongruent by experienced musicians, while novices may not have recognized it as unusual.
Polanka [38] reported that stepwise patterns were not processed in larger units than patterns involving wider intervals. A closer examination of the stimuli (Figure 2) suggests that the visual layout of the stepwise motion may have introduced challenges not typically encountered in conventionally notated music, where bar lines and rhythmic differentiation aid in chunking. Although the researcher used 3- and 4-note patterns, there were no visual cues in the stimuli to support the reader in recognizing these groupings.
Waters et al. [48] found that musician groups made more errors on duration-different trials than nonmusicians, and vice versa for pitch-difference trials. The authors attributed this to musicians overlooking the duration “misprints.” An alternative explanation may lie in the violation of notational conventions: the extensive incongruities in the duration-different trials may have disrupted chunking for musicians, whereas nonmusicians—lacking such expectations—were unaffected. Altering a single pitch in a bar is a relatively minor change, but adding half a beat (as in Figure 3) disrupts the entire metrical structure.
Waters and Underwood [47] reported no significant differences in eye movement behavior between tonally simple and tonally complex material among expert participants. This may be due to the shortness of the stimuli—only four single notes (see Figure 4). Such short excerpts, lacking established tonality, may prompt a note-by-note reading strategy regardless of expertise. Chunking typically requires harmonic, melodic, or rhythmic context. Moreover, chunking enables the simultaneous processing of approximately seven (plus or minus two) elements in working memory [31]. A task involving only four elements can be completed accurately without chunking or relying on strategies unique to expert readers.
Arthur et al. [3] found no effect of score disruption on total task time for either expert or nonexpert participants. Figure 6 shows that the “non-disrupted” musical stimulus lacked any common melodic, rhythmic, or harmonic patterns. It is therefore plausible that expert readers processed the stimulus similarly to nonexperts—a phenomenon comparable to findings in chess research, where experts perform no better than novices when presented with random, non-meaningful configurations [5].

Comparison to Linguistic Stimuli

Reading a succession of pitches without any musical context—such as key signature, harmony, or phrasing—likely provides a level of understanding comparable to reading a series of unrelated words or non-words. This type of visual stimulus removes much of the essential information that skilled music readers rely on to decode a score efficiently, as they would in everyday performance situations. As a result, such stimuli may significantly affect research outcomes, particularly in studies comparing novice and expert readers, since beginners do not utilize contextual musical knowledge to the same extent as experienced musicians [15,44].
The process of reading single, unrelated letters differs fundamentally from reading words or meaningful word groups. In language reading research, eye movement studies typically use visual stimuli such as words and syntactically coherent sentence elements. It is therefore unclear why some music reading studies treat single notes as meaningful units, rather than focusing on melodic, rhythmic, or harmonic patterns.
Table 3 illustrates this issue by presenting the musical stimuli used in the analyzed studies alongside equivalent fictitious linguistic stimuli, offering a comparative perspective on chunking and comprehension potential.

Conclusion

This study highlights a critical gap in the design of musical stimuli used in eye-movement research on music reading. While linguistic reading research consistently employs syntactically meaningful units such as words and sentences, many music reading studies rely on stimuli that lack equivalent syntactic coherence—melodic, rhythmic, and harmonic patterns essential for fluent music reading. The analysis revealed that only one of the eight reviewed studies incorporated musical stimuli that could realistically be part of an authentic tonal composition.
The absence of syntactic structure in most stimuli compromises the validity of findings, particularly when comparing expert and novice readers. In addition, inconsistent interpretations of complexity and insufficient consideration of motor planning and instrument-specific challenges further obscure results. To advance the field, future research must adopt standardized definitions of musical complexity, ensure that stimuli reflect authentic syntactic structures, and account for the motor demands of different instruments. Doing so will enhance the reliability and comparability of findings and support the development of a robust framework for understanding visual processing in music reading.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Statements and Declarations

There are no competing interests that are related to this work.

References

  1. Ahken, S., Comeau, G., Hébert, S., & Balasubramaniam, R. (2012). Eye movement patterns during the processing of musical and linguistic syntactic incongruities. Psychomusicology: Music, Mind, and Brain, 22(1), 18-25. [CrossRef]
  2. Aiello, R. (1994). Music and Language: Parallels and Contrasts. In R. Aiello & J. A. Sloboda (Eds.), (pp. 40-63). Oxford University Press.
  3. Arthur, P., Khuu, S., & Blom, D. (2016). Music sight-reading expertise, visually disrupted score and eye movements. Journal of Eye Movement Research, 9(7), 35.
  4. Babayigit, Ö. (2019). The Reading Speed of Elementary School Students on the All Text Written with Capital and Lowercase Letters. Universal Journal of Educational Research, 7(2), 371-380.
  5. Chase, W. G., & Simon, H. A. (1975). The mind’s eye in chess. In W. G. Chase (Ed.), Visual Information Processing (pp. 215-281). Academic Press.
  6. Clifton, C., Ferreira, F., Henderson, J. M., Inhoff, A. W., Liversedge, S. P., Reichle, E. D., & Schotter, E. R. (2016). Eye movements in reading and information processing: Keith Rayner’s 40-year legacy. Journal of Memory and Language, 86, 1-19. [CrossRef]
  7. Clifton, C., Staub, A., & Rayner, K. (2007). Eye movements in reading words and sentences. In R. P. G. Van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements (pp. 341-371). Elsevier. [CrossRef]
  8. Dambacher, M., Slattery, T. J., Yang, J., Kliegl, R., & Rayner, K. (2013). Evidence for direct control of eye movements during reading. Journal of Experimental Psychology: Human Perception and Performance, 39(5), 1468-1484. [CrossRef]
  9. Drake, C., & Palmer, C. (2000). Skill acquisition in music performance: relations between planning and temporal control. Cognition, 74(1), 1-32. [CrossRef]
  10. Ehri, L. C. (2005). Learning to Read Words: Theory, Findings, and Issues. Scientific Studies of Reading, 9(2), 167-188. [CrossRef]
  11. Emond, B., & Comeau, G. (2013). Cognitive modelling of early music reading skill acquisition for piano: A comparison of the Middle-C and Intervallic methods. Cognitive Systems Research, 24, 26-34. [CrossRef]
  12. Feist, J. (2017). Berklee contemporary music notation. Hal Leonard Corporation.
  13. Fink, L. K., Lange, E. B., & Groner, R. (2019). The application of eye-tracking in music research. Journal of Eye Movement Research, 11(2). [CrossRef]
  14. Goolsby, T. W. (1994). Profiles of Processing: Eye Movements During Sightreading. Music Perception, 12, 97-123. [CrossRef]
  15. Gudmundsdottir, H. R. (2010). Advances in music-reading research. Music Education Research, 12(4), 331-338. [CrossRef]
  16. Halsband, U., Binkofski, F., & Camp, M. (1994). The role of the perception of rhythmic grouping in musical performance: Evidence from motor-skill development in piano playing. Music Perception, 11(3), 265-288.
  17. Hambrick, D. Z., Oswald, F. L., Altmann, E. M., Meinz, E. J., Gobet, F., & Campitelli, G. (2014). Deliberate practice: Is that all it takes to become an expert? Intelligence, 45, 34-45. [CrossRef]
  18. Hansen, D., Bernstorf, E., & Stuber, G. M. (2014). The music and literacy connection. Rowman & Littlefield.
  19. Hodges, D. A., & Nolker, D. B. (1992). The acquisition of music reading skills. In R. Colwell (Ed.), Handbook of research on music teaching and learning (pp. 466-471). Schirmer Books.
  20. Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & Van de Weijer, J. (2011). Eye tracking: A comprehensive guide to methods and measures. Oxford University Press.
  21. Honing, H. (2013). Structure and interpretation of rhythm in music. In D. Deutsch (Ed.), The psychology of music (Vol. 3, pp. 369-404). Academic Press.
  22. Huovinen, E., Ylitalo, A. K., & Puurtinen, M. (2018). Early Attraction in Temporally Controlled Sight Reading of Music. J Eye Mov Res, 11(2). [CrossRef]
  23. Kinsler, V., & Carpenter, R. H. S. (1995). Saccadic eye movements while reading music. Vision Research, 35(10), 1447-1458. [CrossRef]
  24. Koelsch, S., Kasper, E., Sammler, D., Schulze, K., Gunter, T., & Friederici, A. D. (2004). Music, language and meaning: brain signatures of semantic processing. Nature Neuroscience, 7(3), 302-307.
  25. Kopiez, R., & Lee, J. I. (2008). Towards a General Model of Skills Involved in Sight Reading Music. Music Education Research, 10(1), 41-62. [CrossRef]
  26. Lehmann, A. C., Sloboda, J. A., & Woody, R. H. (2007). Psychology for musicians: understanding and acquiring the skills. Oxford University Press.
  27. Lerdahl, F., & Jackendoff, R. (1996). A generative theory of tonal music. MIT Press.
  28. Lim, Y., Park, J. M., Rhyu, S.-Y., Chung, C. K., Kim, Y., & Yi, S. W. (2019). Eye-hand span is not an indicator of but a strategy for proficient sight-reading in piano performance. Scientific reports, 9(1), 17906. [CrossRef]
  29. Lörch, L. (2021). The association of eye movements and performance accuracy in a novel sight-reading task. Journal of Eye Movement Research, 14(4).
  30. Madell, J., & Hébert, S. (2008). Eye Movements and Music Reading: Where Do We Look Next? Music Perception: An Interdisciplinary Journal, 26(2), 157-170. [CrossRef]
  31. Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63(2), 81. [CrossRef]
  32. Mishra, J. (2014). Improving sightreading accuracy: A meta-analysis. Psychology of Music, 42(2), 131-156. [CrossRef]
  33. Patel, A. D. (2008). Music, Language, and the Brain. Oxford University Press. http://site.ebrary.com/lib/hisbib/docDetail.action?docID=10211997.
  34. Patel, A. D. (2012). Advancing the comparative study of linguistic and musical syntactic processing. In P. Rebuschat, M. Rohrmeier, J. A. Hawkins, & I. Cross (Eds.), Language and music as cognitive systems (pp. 248-253). Oxford University Press.
  35. Penttinen, M., & Huovinen, E. (2011). The Early Development of Sight-Reading Skills in Adulthood: A Study of Eye Movements. Journal of Research in Music Education, 59(2), 196-220. [CrossRef]
  36. Perra, J., Latimier, A., Poulin-Charronnat, B., Baccino, T., & Drai-Zerbib, V. (2022). A Meta-analysis on the Effect of Expertise on Eye Movements during Music Reading. Journal of Eye Movement Research, 15(4). [CrossRef]
  37. Pike, P. D., & Carter, R. (2010). Employing Cognitive Chunking Techniques to Enhance Sight-Reading Performance of Undergraduate Group-Piano Students. International Journal of Music Education, 28(3), 231-246. [CrossRef]
  38. Polanka, M. (1995). Research Note: Factors Affecting Eye Movements During the Reading of Short Melodies. Psychology of Music, 23(2), 177-183. [CrossRef]
  39. Puurtinen, M. (2018). Eye on Music Reading: A Methodological Review of Studies from 1994 to 2017. Journal of Eye Movement Research, 11(2). [CrossRef]
  40. Radach, R., & Kennedy, A. (2013). Eye movements in reading: Some theoretical context. Quarterly Journal of Experimental Psychology, 66(3), 429-452. [CrossRef]
  41. Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372. [CrossRef]
  42. Rebuschat, P., Rohrmeier, M., Hawkins, J. A., & Cross, I. (Eds.). (2012). Language and music as cognitive systems. Oxford University Press.
  43. Rosemann, S., Altenmüller, E., & Fahle, M. (2016). The art of sight-reading: Influence of practice, playing tempo, complexity and cognitive skills on the eye–hand span in pianists. Psychology of Music, 44(4), 658-673. [CrossRef]
  44. Sloboda, J. A. (2005). Exploring the musical mind: cognition, emotion, ability, function. Oxford University Press.
  45. Waller, D. (2010). Language literacy and music literacy: a pedagogical asymmetry. Philosophy of Music Education Review, 18(1), 26-44. [CrossRef]
  46. Waters, A., Townsend, E., & Underwood, G. (1998). Expertise in musical sight reading: A study of pianists. British Journal of Psychology, 89, 123-149.
  47. Waters, A. J., & Underwood, G. (1998). Eye Movements in a Simple Music Reading Task: A Study of Expert and Novice Musicians. Psychology of Music, 26(1), 46-60. [CrossRef]
  48. Waters, A. J., Underwood, G., & Findlay, J. M. (1997). Studying expertise in music reading: Use of a pattern-matching paradigm. Perception & Psychophysics, 59(4), 477-488. [CrossRef]
  49. Zhukov, K., & McPherson, G. E. (2022). Sight-reading. In G. E. McPherson (Ed.), The Oxford Handbook of Music Performance, Volume 1. Oxford University Press.
Figure 1. Examples of musical stimuli in Kinsler and Carpenter [23].
Figure 2. Musical stimuli in Polanka [38]: low (L), medium (M) and high (H) complexity (C) patterns consisting of three (3) or four (4) notes.
Figure 3. Examples of musical stimuli in the experiment by Waters et al. [48].
Figure 4. Examples of musical stimuli in Waters & Underwood [47].
Figure 5. Examples of musical stimuli in Penttinen & Huovinen [35].
Figure 6. Examples of musical stimuli in Ahken et al. [1].
Figure 7. Examples of musical stimuli in Arthur et al. [3], using normal spacing (a) and disrupted spacing (b).
Figure 8. Examples of musical stimuli in Huovinen et al. [22].
Table 1. Overview of eight empirical studies on eye movements in music reading, including research aims related to syntactic processing, participant characteristics, types of musical stimuli, experimental procedures, and key findings.
Study Aim (related to syntactic processing) Participants Stimuli Procedure Conclusion
1 (Kinsler & Carpenter, 1995) To examine saccades while reading music: to be able to dissociate the contributions of factors associated with the input (the printed music) and with the output (the rate of execution). No data. Short rhythmic phrases presented as a single line of notes with bar lines. Participants were suddenly presented with a line of notes on a computer screen and asked to tap the corresponding rhythm on a microphone. At slow speeds with complex sequences there may be considerably more saccades than notes; with a fast speed and simple pattern---as here more notes than eye movements.
2 (Polanka, 1995) To determine whether musicians read in higher order structures (patterns) or note by note by monitoring their eye movements as they sight-read. 18 undergraduate music majors (11 females, 7 males), divided into three skill groups based on a sight-singing pretest. Six melodies—three composed of three-note pitch patterns and three of four-note patterns, each with varying complexity (low, medium, high). Subjects read each melody twice, once silently and once humming and their vocal responses were recorded on audio tape. Better readers tended to process larger units than poorer readers. Pattern size influenced eye movement behavior. Stepwise patterns were processed in smaller units than triadic patterns.
3 (Waters et al., 1997)
exp.2
To test whether skilled sight-reading is associated with rapid processing of note groups and to examine the relationship between expertise and eye-movement parameters (e.g., fixation duration). Three groups of 8 subjects each: two “expert” groups (full time music students playing a monophonic instrument) and a novice group (familiar with the names of the notes). Sixty 10-note melodies in 3/4 or 4/4 time, each consisting of two bars of five notes each. No key signature or accidentals. Each melody had a randomized counterpart. Silent reading, matching pairs of stimuli as same or different by pressing a button as quickly as possible. Experienced musicians used larger units and processed them with fewer and shorter fixations. Musicians made more errors on duration-different trials, while nonmusicians made more errors on pitch-different trials.
4 (Waters & Underwood, 1998) To determine the effect of the tonal complexity of the stimuli on task performance and eye movement behaviour. To determine whether there was any difference in task performance and eye movement behaviour for expert and novice musicians. Twenty-two subjects divided into two groups: “expert” group, experienced musicians playing at least one musical instrument associated with the treble clef register. The “novice” group, familiar with musical notation. Twenty “Tonally Simple, Visually Simple” stimuli: four notes encompassed within one major diatonic scale, preceded by the treble clef, consisting of simple scale or arpeggio structures. Other stimuli with various complexity level were created by shifting some of the notes. Silent reading. Each subject made a “same” response with their preferred hand, and a “different” response with their non-preferred hand. Experts outperformed novices in speed and accuracy. Experts showed reduced performance on tonally complex material, while novices showed no difference. There was no evidence for any differences in eye movement behavior between tonally simple and tonally complex material for the expert group.
5 (Penttinen & Huovinen, 2011) To elucidate the early stages of learning to read music in adulthood by examining the various measures of fixation time in elementary sight-reading tasks, and compare novices with experienced music amateurs. 49 second-year teacher education students in Finland, all enrolled in a year-long compulsory music course.
Twelve five-bar melodies in C major, using quarter notes and a whole note in the final bar. Melodic range: C4–G4. Fingering marked for the first note. The melodic movement in each melody was primarily stepwise, with the exception of two larger intervals placed at a temporal distance from one another. Participants sight-read four melodies on piano with a metronome (60 bpm) at three time points: the start, mid-point (16 weeks), and end of the course. Sight-reading skills improved significantly. Fixation times decreased for central notes in large intervals, but not for surrounding notes.
6 (Ahken et al., 2012) To investigate the eye movements of readers during the visual processing of musical and linguistic syntactic incongruities, and to examine the role of key signatures and accidentals in establishing tonality. Eighteen experienced pianists. Sixteen short musical phrases (5–7 bars), grouped in fours. Half were syntactically congruent; the rest ended with a non-tonic chord or note. Participants were instructed to play each musical sequence at the piano with hands together and no preview time, at any tempo they liked. Incongruent stimuli elicited more fixations, longer fixation durations, and longer trial durations. Effects were less pronounced for stimuli with accidentals than for those with key signatures.
7 (Arthur et al., 2016) To explore how visual expectations influence sight-reading expertise, focusing on working memory, cross-modal integration, and visual crowding. The study examined eye movement responses to unexpected changes in notation in expert and non-expert music sight-readers. Twenty participants: 9 assigned to the expert sight-reader group and 13 to the non-expert group; no data about their main instrument. Ten four-bar melodies in treble clef, right-hand only, using white notes. Notational features were altered (e.g., bar line removal, stem direction, spacing). Participants sight-read the specifically composed four-bar excerpts on the piano. Score disruption had no effect on total task time. Saccadic latency increased significantly for experts only when encountering disrupted notation.
8 (Huovinen et al., 2018), exp 2 To examine the hypothesis that local increases in the music-structural complexity (and thus visual salience) of the score may bring about local, stimulus-driven lengthening of the ETS [eye-time span]. Fourteen professional piano students from three Finnish universities. Eight mostly stepwise melodies in 4/4 time, each six bars long and composed entirely of quarter notes. The melodies were divided into two sets in the keys of G, C, F, and B♭. In each melody, one larger intervallic skip (a minor sixth) was inserted in one of bars 3–5. The participants were instructed to sight-read the melodies on the piano in time with a metronome. Experienced musicians appeared to react sensitively to upcoming deviant elements. Target notes triggered longer-than-average eye-time spans for notes occurring several beats before the target itself. Sight-readers often responded to the target element as early as six beats in advance.
Table 2. The use of syntactic elements in musical stimuli of the analyzed studies.
| Study | Melodic* | Rhythmic* | Harmonic* | Phrases/Repetitions* | Could be a part of an authentic piece |
|---|---|---|---|---|---|
| 1 (Kinsler & Carpenter, 1995) | Not applicable | limited | Not applicable | limited | no |
| 2 (Polanka, 1995) | limited | no | no | no | no |
| 3 (Waters et al., 1997), exp 2 | no | no | no | limited | no |
| 4 (Waters & Underwood, 1998) | no | no | limited | no | limited |
| 5 (Penttinen & Huovinen, 2011) | no | no | no | no | no |
| 6 (Ahken et al., 2012) | yes | yes | yes | yes | yes |
| 7 (Arthur et al., 2016) | no | no | no | no | no |
| 8 (Huovinen et al., 2018), exp 2 | no | no | no | no | no |

*Syntactic units of information possible to "chunk".
Table 3. Comparison of musical stimuli used in the analyzed studies with fictitious linguistic equivalents, illustrating the degree of syntactic structure and potential for chunking and comprehension in each domain.
| Study | Description of the musical stimuli in the context of musical syntactic processing | Fictitious equivalent of linguistic stimuli with the same degree of syntactic structure |
|---|---|---|
| 1 (Kinsler & Carpenter, 1995) | Rhythmic stimuli without a time signature; complexity is introduced by violating notational conventions. | Unrelated words placed within a sentence in simple tasks, and non-words used in complex tasks. |
| 2 (Polanka, 1995) | A set of equally spaced dots indicating pitch on a staff with a treble clef, designed as short patterns. | Short, unrelated words written in an unusual font, presented as single capital letters with equal spacing in a continuous row. |
| 3 (Waters et al., 1997), exp 2 | Two-bar excerpts with a time signature but no key signature; pitches do not conform to C major tonality; rhythmic rules are intentionally violated. | Short sentences composed of non-words. |
| 4 (Waters & Underwood, 1998) | Four-note excerpts on a staff, using quarter notes with or without accidentals, forming either major triads or random pitch sequences. | Single short words and similar non-words. |
| 5 (Penttinen & Huovinen, 2011) | Five-bar excerpts using a random succession of notes from C4 to G4, with no rhythmic differentiation, time signature, or key signature. | Unrelated short words and non-words composed of the same five letters, placed within a sentence. |
| 6 (Ahken et al., 2012) | Five- to seven-bar excerpts in piano score format, with time and key signatures, harmonic and rhythmic patterns, and a syntactically incongruent final bar. | Meaningful sentences in which the final word does not match the context. |
| 7 (Arthur et al., 2016) | Four-bar excerpts with a time signature but no key signature; pitches and note values are randomly ordered and lack syntactic structure; spacing is manipulated. | Four complex non-words resembling real words, placed in a sentence with irregular spacing between letters, making word boundaries visually ambiguous. |
| 8 (Huovinen et al., 2018), exp 2 | 24-bar excerpts with a key signature but no time signature or rhythmic variation (quarter notes only); no harmonic progression; complexity introduced via accidentals. | Unrelated short words and non-words, some with difficult spelling, placed in an unnaturally long sentence without punctuation. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.