

S3: A Symbolic Music Dataset for Computational Music Analysis of Symphonies

Research | 2024 (Poster accepted)

Featured at ISMIR Late-breaking Demo 2024.

Led a team to compile an expansive dataset annotated for harmony, musical form, and texture, encompassing over 11,000 measures.
This dataset is a foundational tool for advancing computational music analysis in the study of symphonies.


Abstract

The scarcity of symbolic music datasets has long been a challenge in the field of music information retrieval. Many studies have emphasized the need for high-quality, manually annotated datasets that include multifaceted labels or focus on underrepresented periods such as the Romantic period. In this paper, we present S3, the Symbolic Symphony Set, a comprehensive collection featuring four symphonies, totalling 16 movements, by Mozart, Beethoven, Dvořák, and Tchaikovsky. The dataset includes XML files and detailed CSV annotations for notes and for musical structure along both horizontal and vertical aspects, commonly known as form-related and texture-related information. The note annotations are semi-automatically generated. Form-related information includes form analysis, cadence, and harmony, while orchestral texture information includes the role (melody, rhythm, harmony, or mixed) of each instrument. All annotations have been converted into CSV format to facilitate further analysis and modeling. Additionally, manually annotated PDF files are included in the dataset for reference. Our dataset is available at https://github.com/iis-mctl/mctl-symphony-dataset.
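
As a rough illustration of how CSV texture annotations of this kind might be consumed, the sketch below counts how often each role label occurs. The column names (`measure`, `instrument`, `role`) and the sample rows are assumptions made for this example only; they are not taken from the dataset, whose actual schema should be checked in the repository.

```python
import csv
import io
from collections import Counter

# Hypothetical texture annotation rows. The column names and values are
# assumptions for illustration, not the dataset's real schema.
SAMPLE_CSV = """measure,instrument,role
1,Violin I,melody
1,Cello,harmony
1,Timpani,rhythm
2,Violin I,melody
2,Cello,mixed
"""


def count_roles(csv_text):
    """Count how often each texture role appears across all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["role"] for row in reader)


role_counts = count_roles(SAMPLE_CSV)
print(role_counts.most_common())
```

With a real annotation file, the same function could be fed the file's text after adjusting the `role` key to whatever column name the dataset actually uses.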


Authors: Zih-Syuan Lin*, Yu-Chia Kuo*, Tzu-Yun Hung, Wei-Yang Lin, Ya-Hsuan Chu, Ting-Kang Wang, Jing-Heng Huang, Chien Chang, Christofer Julio, Gloria Hsieh, and Li Su

Reference Format:
Zih-Syuan Lin*, Yu-Chia Kuo*, Tzu-Yun Hung, Wei-Yang Lin, Ya-Hsuan Chu, Ting-Kang Wang, Jing-Heng Huang, et al. 2024. “S3: A Symbolic Music Dataset for Computational Music Analysis of Symphonies.” In Extended Abstracts for the Late Breaking Demo Session of the 25th International Society for Music Information Retrieval Conference. San Francisco, United States.

>> Paper
>> GitHub