The protein set includes amino-acid sequences from the well-known
Structural Classification of Proteins (SCOP) database.
We used all sequences in release 1.63 of SCOP, after eliminating
For the English text we chose the well-known `Calgary Corpus', which
is traditionally used for benchmarking lossless
The music set was assembled from MIDI files of well-known pieces
in three styles: classical, jazz, and rock/pop. All the pieces we consider are
polyphonic (played with several instruments simultaneously).