TY - GEN
T1 - Comparing Three Data Representations for Music with a Sequence-to-Sequence Model
AU - Li, Sichao
AU - Martin, Charles Patrick
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - The choices of neural network model and data representation, a mapping between musical notation and input signals for a neural network, have emerged as a major challenge in creating convincing models for melody generation. Music generation can inspire creativity in artists and the general public, but choosing a proper data representation is complicated because the same musical piece can be presented in a range of expressive ways. In this paper, we compare three different data representations on the task of generating melodies with a sequence-to-sequence model, which generates melodies with flexible length, to explore how they affect the performance of generated music. These three representations are: a monophonic representation, playing one note at a time; a polyphonic representation, indicating simultaneous notes; and a complex polyphonic representation, expanding the polyphonic representation with dynamics. The influences of the three data representations on the generated performance are compared and evaluated by mathematical analysis and human-centred evaluation. The results show that different data representations fed into the same model endow the generated music with various features: the monophonic representation makes the music sound more melodious to humans’ ears, the polyphonic representation provides expressiveness, and the complex polyphonic representation guarantees the complexity of the generated music.
AB - The choices of neural network model and data representation, a mapping between musical notation and input signals for a neural network, have emerged as a major challenge in creating convincing models for melody generation. Music generation can inspire creativity in artists and the general public, but choosing a proper data representation is complicated because the same musical piece can be presented in a range of expressive ways. In this paper, we compare three different data representations on the task of generating melodies with a sequence-to-sequence model, which generates melodies with flexible length, to explore how they affect the performance of generated music. These three representations are: a monophonic representation, playing one note at a time; a polyphonic representation, indicating simultaneous notes; and a complex polyphonic representation, expanding the polyphonic representation with dynamics. The influences of the three data representations on the generated performance are compared and evaluated by mathematical analysis and human-centred evaluation. The results show that different data representations fed into the same model endow the generated music with various features: the monophonic representation makes the music sound more melodious to humans’ ears, the polyphonic representation provides expressiveness, and the complex polyphonic representation guarantees the complexity of the generated music.
KW - Data representation
KW - Machine learning
KW - Music generation
KW - Sequence-to-sequence model
UR - http://www.scopus.com/inward/record.url?scp=85097647359&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-64984-5_2
DO - 10.1007/978-3-030-64984-5_2
M3 - Conference contribution
SN - 9783030649838
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 16
EP - 28
BT - AI 2020
A2 - Gallagher, Marcus
A2 - Moustafa, Nour
A2 - Lakshika, Erandi
PB - Springer Science and Business Media Deutschland GmbH
T2 - 33rd Australasian Joint Conference on Artificial Intelligence, AI 2020
Y2 - 29 November 2020 through 30 November 2020
ER -