Preprint Article, Version 2. Preserved in Portico. This version is not peer-reviewed.

Design of Semantic Understanding System for Optical Staff Symbols

Version 1 : Received: 13 July 2023 / Approved: 13 July 2023 / Online: 13 July 2023 (05:07:42 CEST)
Version 2 : Received: 13 July 2023 / Approved: 13 July 2023 / Online: 13 July 2023 (08:50:49 CEST)

A peer-reviewed article of this Preprint also exists.

Lou, F.; Lu, Y.; Wang, G. Design of a Semantic Understanding System for Optical Staff Symbols. Appl. Sci. 2023, 13, 12627.

Abstract

Symbolic semantic understanding of staff images is an important part of music information retrieval. Because staff symbols have a complex composition and there is strong semantic correlation between symbol spaces, it is difficult to determine the pitch and duration with which each note is performed. In this paper, we design a semantic understanding system for optical staff symbols. The system uses YOLOv5 to implement the low-level semantic understanding stage, which recognizes the pitch and duration of notes in the natural scale together with the other symbols that affect pitch and duration. The proposed note encoding reconstruction algorithm implements the high-level semantic understanding stage. This algorithm resolves the logical, spatial, and temporal relationships between natural-scale notes and other symbols based on music theory, and outputs digital codes for the pitch and duration of the main notes as performed. The model is trained on a self-constructed SUSN dataset. In the experiments, YOLOv5 achieves a precision of 0.989 and a recall of 0.972. For the system as a whole, the error rate is 0.031 and the omission rate is 0.021. The paper concludes by analysing the causes of semantic understanding errors and offers recommendations for further research. The results provide a method for multimodal music artificial intelligence applications such as notation recognition through listening, intelligent score flipping, and automatic performance.
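To make the two-stage idea concrete, the following is a minimal sketch of how a high-level stage might combine a detected natural-scale note with a modifying symbol to produce a pitch and duration code. It is not the authors' note encoding reconstruction algorithm, whose details are not given in this abstract. The `DetectedNote` fields, the `encode` function, the use of MIDI note numbers as the pitch code, and the handling of only accidentals and augmentation dots are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Semitone offsets of the natural scale within one octave (standard music theory).
NATURAL_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Accidentals shift the natural pitch by whole semitones.
ACCIDENTAL_SHIFT = {None: 0, "natural": 0, "sharp": 1, "flat": -1}


@dataclass
class DetectedNote:
    """Hypothetical low-level detection result (field names are assumptions)."""
    step: str                         # natural-scale letter, e.g. "C"
    octave: int                       # scientific pitch octave, e.g. 4 for middle C
    beats: float                      # written duration in beats, e.g. 1.0 for a quarter note
    accidental: Optional[str] = None  # "sharp", "flat", "natural", or None
    dotted: bool = False              # augmentation dot present


def encode(note: DetectedNote) -> Tuple[int, float]:
    """Return (pitch code, duration in beats) for the note as performed.

    The pitch code is a MIDI note number (an assumed output format).
    """
    pitch = (
        12 * (note.octave + 1)
        + NATURAL_SEMITONES[note.step]
        + ACCIDENTAL_SHIFT[note.accidental]
    )
    # An augmentation dot lengthens the note by half of its written value.
    duration = note.beats * (1.5 if note.dotted else 1.0)
    return pitch, duration
```

For example, `encode(DetectedNote("F", 4, 1.0, accidental="sharp", dotted=True))` combines three separately detected symbols (note head, sharp, dot) into a single code. A real system would also need to account for key signatures, clefs, ties, and accidentals that persist through a measure, which this sketch omits.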

Keywords

semantic understanding; neural networks; optical music recognition; YOLOv5; digital code

Subject

Computer Science and Mathematics, Signal Processing

Comments (1)

Comment 1
Received: 13 July 2023
Commenter: Lu yaling
Commenter's Conflict of Interests: Author
Comment: We revised Figure 8 to meet the Rights & Permissions suggestions.