Dynamic programming algorithms applied to musical counterpoint in process composition: an example using Henri Pousseur’s Scambi

The Needleman-Wunsch algorithm is a classic tool in bioinformatics: a dynamic programming algorithm that performs a pairwise alignment of two input biological sequences, either protein or nucleic acid. A distance matrix between the tokens used in the sequences is also required as input. The distance matrix is used to generate a positional pairwise similarity matrix between the input sequences, which is in turn used to generate a dynamic programming matrix. The best path through the dynamic programming matrix is found using a traceback procedure that maximises similarity, inserting gaps as necessary. Needleman-Wunsch can align either nucleic acids or proteins, which use alphabets of 4 and 20 tokens respectively. It can also be applied to any other kind of sequence for which a distance matrix can be specified. Here, we apply it to chains of Pousseur’s Scambi electronic music fragments, of which there are 32, and which Pousseur categorised by their sonic properties, thus permitting the consecutive construction of distance, similarity and dynamic programming matrices. Traceback through the dynamic programming matrix thus produces contrapuntal duet compositions in which two Scambi chains are played in the maximally euphonious manner, while also illustrating the principles of biological sequence alignment in sound.


Introduction
Henri Pousseur's Scambi was composed in 1957 (Pousseur 1959). Pousseur's recording of one of his own realizations of Scambi is available on YouTube 1 and is frequently included in anthologies of early electronic music. Other recorded versions, by Pousseur himself and by Luciano Berio, survive but are not publicly available. Marc Wilkinson's version may have been lost (Wilkinson 1958). More recently, realizations of Scambi have been created by Andre Castro and Rudy Ceccato, as well as a jointly realized version by Robin Fencott and Simon Harris. These are available from the website of the AHRC-funded Scambi project 2.
Several versions of Scambi exist, and many more may be created, because Scambi is an open form process piece (Dack 2009). However, it is not aleatory, nor does it permit improvisation. The raw material for Scambi is a set of 32 fragments ("sequences") of electronically generated sound, each either 30 or 42 seconds in length, together with a set of imposed rules about how those sequences may be put together 3. Scambi is thus in some respects a kind of musical game. The rules that Pousseur defined for assembling the sequences concern the most euphonious connection of the end of one sequence with the start of the next, with the goal of creating a smooth composition from fragmentary parts.
The rules of Scambi also allow for sequences to be played simultaneously, producing polyphonic compositions (Dack 2009). Pousseur's sequences were recorded in mono, so for a polyphonic piece they may be distributed around a stereo (or higher dimension) system. In a previous work, a 5-channel Scambi realization was produced 4 to create alternating periods of thicker and sparser sonic texture, a stripy effect inspired by the expression pattern of the fruit fly Drosophila gene fushi tarazu (Gatherer 2020).
As professional biologists interested in electronic music, we thus endeavour to bring the spirit of the Bio-Art movement 5 to musical composition. In the present paper, we describe a further Bio-Music project, this time investigating the use of dynamic programming algorithms to generate binaural counterpoint between two previously composed Scambi monophonic chains of sequences. Dynamic programming has been applied in bioinformatics to align both protein sequences and nucleic acid sequences. We provide software written in Python (Scambi Kit) that automates both the creation of Scambi chains and their contrapuntal alignment.

1 https://www.youtube.com/watch?v=E6vlOFApLnQ
2 http://scambi.mdx.ac.uk/
3 It should be noted that a "sequence" in Scambi is the individual token from which a compositional chain is created. In biology, a "sequence" is not the individual element, but the whole chain, and the individual tokens are either "bases" (for nucleic acids) or "residues" (for proteins).
4 https://www.youtube.com/watch?v=X6qDwhmZ01k
5 As exemplified by our colleague at Lancaster, Dr Rod Dillon: https://roddillon.com/

Sound Files and Software
Pousseur's original magnetic tape sequences have been kindly provided by Dr John Dack (Middlesex University) in AIFF format. These were converted into 16-bit WAV format in Audacity 6. Scambi Kit (available at https://github.com/ljcochrane/Scambi_Kit) was written in Python version 3.7.2, with the Python library PyDub used for audio manipulation and processing. Figure 1 shows the opening menu of Scambi Kit.

Figure 1: Screenshot of Scambi Kit main menu
As in the previous paper (Gatherer 2020), a reversed version is created for each Scambi sequence. Table 1 shows the sequence classification. Pousseur's start and end binary digit codes refer to the musical characteristics of the first and second half of each sequence. From left to right, the digits encode pitch (low '0' to high '1'), tempo (slow '0' to fast '1'), sound quality (dry '0' to reverberated '1') and continuity (inclusion of pauses '0' to continuous sound '1'). So, in Table 1, sequence family A starts 0110 (low, fast, reverberated and with pauses) and ends 1100 (high, fast, dry and with pauses). We render the binary strings in denary for ease of comparison.
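The coding scheme above can be illustrated with a short Python sketch. This is not taken from Scambi Kit; the function names `describe` and `to_denary` are hypothetical and only the attribute order and the family A codes come from the text.

```python
# Attribute order follows the text: pitch, tempo, sound quality, continuity.
# Each tuple holds the meaning of digit '0' followed by the meaning of digit '1'.
ATTRIBUTES = (
    ("low", "high"),
    ("slow", "fast"),
    ("dry", "reverberated"),
    ("with pauses", "continuous"),
)


def describe(code):
    """Translate a 4-digit binary code such as '0110' into its four descriptors."""
    if len(code) != 4 or set(code) - {"0", "1"}:
        raise ValueError("expected four binary digits, got %r" % code)
    return [labels[int(bit)] for labels, bit in zip(ATTRIBUTES, code)]


def to_denary(code):
    """Render a binary code in denary, e.g. '0110' -> 6."""
    return int(code, 2)


# Family A, as quoted in the text: starts 0110 and ends 1100.
family_a = ("0110", "1100")
for half in family_a:
    print(half, to_denary(half), ", ".join(describe(half)))
```

In denary, family A is therefore 6-12, and its reversed form would be 12-6.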

Implementation
Basic chain generation
Menu option 1 of Scambi Kit generates a chain of Scambi sequences following the rules described by Dack (2009) and the previous paper (Gatherer 2020). The user can specify the length of the chain and can either start on a given sequence or with a randomly selected sequence. Figure 2 shows a typical output. The chain generated in Figure 2 is: 2r, 7r, 3r, 4, 16r, 26, 26r, 31r, 18, 4. As Table 2 shows, the start and end codes of these sequences, in denary notation, are: 12-6, 6-15, 15-5, 5-15, 15-11, 11-1, 1-11, 11-2, 2-5, 5-15. This illustrates how the end of each sequence matches the start of the succeeding sequence.
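The end-to-start matching rule can be sketched in Python as follows. This is a minimal illustration, not the Scambi Kit implementation: the catalogue is partial (only the sequences whose codes are quoted in the text), and the names `build_catalogue`, `is_valid_chain` and `generate_chain` are hypothetical. It assumes that the only constraint is an exact match between the end code of one sequence and the start code of the next.

```python
import random

# (start, end) codes in denary for the forward sequences named in the text.
# This is only a partial catalogue; the full set of 32 is given in the paper's tables.
FORWARD = {
    "2": (6, 12), "3": (5, 15), "4": (5, 15), "7": (15, 6),
    "16": (11, 15), "18": (2, 5), "26": (11, 1), "31": (2, 11),
}


def build_catalogue(forward):
    """Add a reversed entry ('r' suffix) with start and end codes swapped."""
    catalogue = dict(forward)
    for name, (start, end) in forward.items():
        catalogue[name + "r"] = (end, start)
    return catalogue


CATALOGUE = build_catalogue(FORWARD)


def is_valid_chain(chain, catalogue=CATALOGUE):
    """True if every sequence ends on the code with which the next one starts."""
    return all(catalogue[a][1] == catalogue[b][0] for a, b in zip(chain, chain[1:]))


def generate_chain(length, start=None, catalogue=CATALOGUE, rng=random):
    """Random walk: repeatedly pick any sequence whose start matches the current end."""
    current = start if start is not None else rng.choice(sorted(catalogue))
    chain = [current]
    while len(chain) < length:
        end_code = catalogue[current][1]
        candidates = sorted(n for n, (s, _) in catalogue.items() if s == end_code)
        if not candidates:  # guard against a dead end in an incomplete catalogue
            break
        current = rng.choice(candidates)
        chain.append(current)
    return chain


figure_2 = ["2r", "7r", "3r", "4", "16r", "26", "26r", "31r", "18", "4"]
print(is_valid_chain(figure_2))
print(generate_chain(10, start="2r", rng=random.Random(0)))
```

Passing a seeded `random.Random` instance makes a generated chain reproducible, which may be convenient when the same chain is to be reused in later steps.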

Dual chain generation
Option 2 performs the same process, but generates two chains, as shown in Figure 3. The dual channel output begins with a single sequence that bifurcates into two simultaneously running chains. The reverse occurs at the end, with both chains ending in a single sequence. In Figure 3, the random starting sequence is 4 (5-15), which then serves to generate one chain continuing with 9 (15-8) and another with 7 (15-6). The two chains then evolve according to the rules, until the 9th sequence pair is reached. In Figure 3 these are 21 (5-1) and 29r (8-1), which both lead into the final sequence, 22r (1-5).
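The bifurcation and convergence constraints can be expressed as a small validity check. The sketch below is illustrative only: `is_valid_dual_chain` is a hypothetical helper, and the middle of each branch is invented for the demonstration (Figure 3 itself is not reproduced here). The opening, the first pair, the last pair and the closing sequence use the codes quoted in the text.

```python
# Denary (start, end) codes quoted in the text; a reversed form swaps the pair.
CODES = {
    "4": (5, 15), "9": (15, 8), "7": (15, 6), "21": (5, 1),
    "29r": (8, 1), "22r": (1, 5),
    # The entries below come from the Figure 2 example and are used here only
    # to pad the hypothetical middle section of the demonstration.
    "7r": (6, 15), "3r": (15, 5), "26r": (1, 11), "26": (11, 1),
}


def joins(a, b):
    """True if sequence a ends on the code with which sequence b starts."""
    return CODES[a][1] == CODES[b][0]


def is_valid_dual_chain(opening, upper, lower, closing):
    """Both branches must be equally long, follow on from the shared opening
    sequence, and lead into the shared closing sequence."""
    if len(upper) != len(lower):
        return False
    for branch in (upper, lower):
        path = [opening] + list(branch) + [closing]
        if not all(joins(a, b) for a, b in zip(path, path[1:])):
            return False
    return True


# Hypothetical, shortened duet: only the first and last pairs mirror Figure 3.
upper = ["9", "29r", "26r", "26"]
lower = ["7", "7r", "3r", "21"]
print(is_valid_dual_chain("4", upper, lower, "22r"))
```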
The duet generated by the dual chain function has no provision for counterpoint. The two generated lines are of the same length and are played simultaneously. Since each may be following a very different set of start and end values (until these converge in the penultimate sequence), there may be sonic clashes. One might even say that there is a danger of "dissonance", if such a concept may be used in the context of this kind of electronic music.

Contrapuntal generation
To generate a contrapuntal composition, menu option 3 is first applied to generate a chain. The top chain from the duet composition in Figure 3 is entered manually.

Figure 4: Screenshot of Scambi Kit menu 3 output
Menu option 5 is then used to convert this into a chain in Family notation (see Table 1).

Figure 5: Screenshot of Scambi Kit menu 5 output
This is then repeated for the lower chain. We now have two chains in Family notation (Table 3).

Table 3: Two chains parsed into Family notation by menu option 5.
Menu option 6 is now used to generate a pairwise scoring matrix between these two chains (Figure 6). The scores are derived from the binary notation of each family in Table 1. Family B has binary notation of 0101 1111. Compared to itself, there is a Hamming Distance of zero. This is designated with a top score of 4. Family E has binary notation of 1111 1000. The binary string of family B has a Hamming Distance of 5 to family E. Therefore the score of B compared to E is the top score minus the Hamming Distance, i.e. 4 - 5 = -1. Similarly, Family N has binary notation of 1000 0010, with a Hamming Distance to Family B of 6. The score for this combination is therefore 4 - 6 = -2. The maximum score of 4 is arbitrary, and was chosen by trial and error to generate the most amenable output for the next stage. Pairwise scoring matrices are standard in bioinformatics in the pairwise comparison of both proteins and nucleic acids.
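The scoring rule just described (top score minus Hamming Distance) can be sketched as follows. The three family codes are those quoted in the text; the function names are hypothetical and the code is an illustration rather than the Scambi Kit source.

```python
# 8-digit binary notation (start half followed by end half) for the three
# families quoted in the text. Other families would be added from Table 1.
FAMILIES = {"B": "01011111", "E": "11111000", "N": "10000010"}
TOP_SCORE = 4  # arbitrary maximum, as stated in the text


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must be of equal length")
    return sum(x != y for x, y in zip(a, b))


def score(family_1, family_2):
    """Similarity score: top score minus the Hamming Distance between the codes."""
    return TOP_SCORE - hamming(FAMILIES[family_1], FAMILIES[family_2])


def scoring_matrix(upper, lower):
    """Positional pairwise scores: one row per upper-chain item, one column per lower-chain item."""
    return [[score(u, l) for l in lower] for u in upper]


print(score("B", "B"), score("B", "E"), score("B", "N"))  # 4 -1 -2
for row in scoring_matrix(["B", "E", "N"], ["B", "E"]):
    print(row)
```

Because the codes are 8 digits long, scores under this rule range from 4 (identical families) down to -4 (every digit different).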
Menu option 7 is now used to generate a dynamic programming matrix (Figure 7). A gap score must be entered. The more negative the gap score, the less likely it is that a gap will be generated in the final alignment of the two chains. The dynamic programming matrix is created using the similarity matrix and the gap score. For those wishing to look more deeply into how this matrix is generated, the Wikipedia page https://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm provides the easiest introduction. A deeper introduction can be found in Lesk (2014 pp. 182-196), and a more formal treatment of the closely related local alignment algorithm in Smith and Waterman (1981). The exact values in the dynamic programming matrix will vary according to the gap score adopted (in this case -2), and of course will be different for different input chains.
The dynamic programming matrix is converted into an alignment by running the traceback algorithm using menu option 8, generating the final alignment (Figure 8). Option 9 runs the entire process. In the case shown, the chains play in duet for the first two sequences, then the lower chain rests while the upper chain plays sequence N. They then duet again for a further two sequences, followed by another rest in the lower chain. Three more duet sequences are followed by a rest of two sequences in the upper chain.
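For readers who prefer code to prose, the matrix fill and traceback can be sketched as below. This is a textbook Needleman-Wunsch with a linear gap score, written for illustration and not copied from Scambi Kit. The two input chains are short hypothetical examples (not those of Table 3), and ties in the traceback are broken in the arbitrary order diagonal, up, left.

```python
FAMILIES = {"B": "01011111", "E": "11111000", "N": "10000010"}
TOP_SCORE = 4
GAP = "-"


def score(f1, f2):
    """Top score minus Hamming Distance between two family codes."""
    return TOP_SCORE - sum(x != y for x, y in zip(FAMILIES[f1], FAMILIES[f2]))


def needleman_wunsch(upper, lower, gap_score=-2):
    """Return (matrix, aligned_upper, aligned_lower) for a global alignment."""
    rows, cols = len(upper) + 1, len(lower) + 1
    matrix = [[0] * cols for _ in range(rows)]
    # First row and column: cumulative cost of leading gaps.
    for i in range(1, rows):
        matrix[i][0] = i * gap_score
    for j in range(1, cols):
        matrix[0][j] = j * gap_score
    # Each cell is the best of: match/mismatch, gap in lower, gap in upper.
    for i in range(1, rows):
        for j in range(1, cols):
            matrix[i][j] = max(
                matrix[i - 1][j - 1] + score(upper[i - 1], lower[j - 1]),
                matrix[i - 1][j] + gap_score,
                matrix[i][j - 1] + gap_score,
            )
    # Traceback from the bottom-right cell to the top-left cell.
    top, bottom = [], []
    i, j = len(upper), len(lower)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                matrix[i][j] == matrix[i - 1][j - 1] + score(upper[i - 1], lower[j - 1])):
            top.append(upper[i - 1])
            bottom.append(lower[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and matrix[i][j] == matrix[i - 1][j] + gap_score:
            top.append(upper[i - 1])  # upper plays while lower rests
            bottom.append(GAP)
            i -= 1
        else:
            top.append(GAP)  # lower plays while upper rests
            bottom.append(lower[j - 1])
            j -= 1
    return matrix, top[::-1], bottom[::-1]


matrix, top, bottom = needleman_wunsch(["B", "E", "N", "B"], ["B", "E", "B"])
print(" ".join(top))     # B E N B
print(" ".join(bottom))  # B E - B
print(matrix[-1][-1])    # 10
```

In this toy case the lower chain rests while the upper chain plays N, which is the same kind of event described for the alignment in Figure 8.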
Audio output can be generated using option 10 (Figure 9). Either a single mono sound file can be produced, or one mono file corresponding to each chain. The latter is preferable, as each file can be imported into Audacity and a stereo effect added (Figure 10).
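How an alignment might map onto a per-chain timeline can be sketched without any audio library. The sketch assumes that every sequence occupies a uniform 42-second slot and that a gap is rendered as a rest of the same length; the function `schedule` is hypothetical and does not describe how Scambi Kit builds its files.

```python
SEQUENCE_SECONDS = 42  # assumed uniform slot length
GAP = "-"


def schedule(aligned_chain, slot_seconds=SEQUENCE_SECONDS):
    """Return (start_time_in_seconds, label) pairs; gaps become 'rest' slots."""
    return [
        (index * slot_seconds, "rest" if item == GAP else item)
        for index, item in enumerate(aligned_chain)
    ]


def total_seconds(aligned_chain, slot_seconds=SEQUENCE_SECONDS):
    """Overall duration of one aligned chain, rests included."""
    return len(aligned_chain) * slot_seconds


# Hypothetical aligned pair in family notation, one list per output file.
upper = ["B", "E", "N", "B"]
lower = ["B", "E", "-", "B"]
for name, chain in (("upper", upper), ("lower", lower)):
    print(name, schedule(chain), total_seconds(chain))
```

Because both aligned chains have the same number of columns, the two resulting files would have equal duration and could be placed on separate stereo channels.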

Fully Automated Contrapuntal generation
To simplify the process of creating a contrapuntal composition, option 0 (Figure 1) sequentially runs the processes seen above. The two initial chains can either be entered manually or generated automatically (Figure 11). In this mode, sequence notation, rather than family notation, is used throughout. This eliminates the need for a stochastic choice to be made by the program when converting back to sequence notation in defining which sequences will be used to build the final composition. In this mode, different sequences of the same family are displayed as matches when viewing the alignment (Figure 12).

Remaining Issues
Scambi Kit uses a set of Scambi files converted to a uniform length of 42 seconds. The 30 second sequences are therefore decelerated in tempo. This is necessary when two sequences of different lengths are to be aligned. This issue was discussed in the previous paper (Gatherer 2020).