Abstract

Spectral Reanimation is a member of the family of cross-synthesis techniques, in which aspects of one sound are transferred to another. Unlike traditional cross-synthesis methods, notably convolution (Moore 1990) and LPC filtering (Cann 1985), Spectral Reanimation does not impart time-varying aspects of the spectral source materials. Rather, the source is "texture-mapped" onto a target sound (called the driver below), so that the rhythmic behavior of the target is articulated through the spectral content of the source.

The Idea

Spectral Reanimation is a technique for creating new sounds using the spectral content from one sound and the temporal behavior from another. In this paper we will refer to the sound providing spectral content as the “source” and the sound providing temporal behavior as the “driver”. The central idea is to replace all spectra in the driver sound with spectra from the source sound. We select each source spectrum to be in some sense maximally related to the driver spectrum it replaces. As a result, the synthesized sound will be as spectrally close to the driver sound as the spectra of the source permit. For example, spectra from a hi-hat sound in the driver might be replaced with spectra from tape hiss in the source sound. The choice of source spectrum is automated. If the source lacks any spectra that are close to a particular region of target spectra in the driver, that region will exhibit marked spectral deviation in the reanimated sound compared to the driver sound.

The Implementation

In the first part of the procedure, the source sound is analyzed into a series of short-time Fourier transform (STFT) frames, or spectra. Each spectrum is normalized such that its component amplitudes sum to unity. In the second part of the procedure, the driver sound is similarly analyzed. For each spectral frame of the driver, the source frame closest to it is selected, where the “closest” spectrum is the one with the smallest sum of absolute amplitude differences across the spectrum. This source frame then replaces the original driver frame in the reanimated sound. Thus in the resultant sound, all spectral information derives from the source sound, but the sequencing and relative amplitudes of the spectra derive from the driver sound.
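
The matching step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original implementation: it takes magnitude spectra that have already been computed (the STFT analysis and resynthesis are omitted), the function names normalize, l1_distance and reanimate are hypothetical, and rescaling the chosen source frame by the driver frame's total amplitude is one assumed way of letting relative amplitudes derive from the driver.

```python
def normalize(frame):
    """Scale a magnitude spectrum so its amplitudes sum to 1.

    An all-zero (silent) frame is returned unchanged to avoid dividing by zero.
    """
    total = sum(frame)
    if total == 0:
        return list(frame)
    return [amp / total for amp in frame]


def l1_distance(a, b):
    """Sum of absolute amplitude differences between two spectra."""
    return sum(abs(x - y) for x, y in zip(a, b))


def reanimate(source_frames, driver_frames):
    """For each driver frame, pick the closest normalized source frame.

    Returns a list of (source_index, output_frame) pairs. The output frame is
    the normalized source spectrum rescaled by the driver frame's total
    amplitude (an assumption; phase handling is not covered here).
    """
    source_norm = [normalize(f) for f in source_frames]
    result = []
    for frame in driver_frames:
        gain = sum(frame)
        target = normalize(frame)
        best = min(range(len(source_norm)),
                   key=lambda i: l1_distance(target, source_norm[i]))
        result.append((best, [gain * amp for amp in source_norm[best]]))
    return result


# Toy 4-bin spectra: a low-heavy, a high-heavy and a flat source frame.
source = [[1, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]]
driver = [[0, 0, 0.1, 2], [3, 3, 3, 3], [4, 0.2, 0, 0]]
result = reanimate(source, driver)
print([index for index, _ in result])  # [1, 2, 0]
```

The search here is exhaustive: every driver frame is compared against every source frame, so the cost grows with the product of the two frame counts.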

Comparison to Other Cross-Synthesis Methods

The division of labor between source and driver is found in traditional cross-synthesis methods such as Linear Predictive Coding (LPC) analysis/synthesis. However, in these traditional methods of cross-synthesis, the temporal behavior of the source is just as important as that of the driver. Temporal scrambling of the source will create a dramatically different result for LPC analysis/synthesis. This is not the case for Spectral Reanimation, which will patiently scan the entire set of source spectra until it finds the spectrum closest to that of the current driver frame, regardless of where it falls in the source.
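
This insensitivity to source ordering can be checked with a small sketch. The spectra below are made-up, already normalized toy values, nearest is a hypothetical helper, and the claim holds as long as no two source spectra are exactly tied for closest.

```python
def nearest(target, frames):
    """Return the frame with the smallest sum of absolute amplitude differences."""
    return min(frames, key=lambda f: sum(abs(x - y) for x, y in zip(target, f)))


# Hypothetical normalized spectra (each sums to 1).
source = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6], [0.4, 0.3, 0.3]]
driver = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]

# "Scramble" the source in time; a simple reversal stands in for any reordering.
scrambled = list(reversed(source))

original_choice = [nearest(d, source) for d in driver]
scrambled_choice = [nearest(d, scrambled) for d in driver]

# The same spectra are selected either way; only their positions in the source differ.
print(original_choice == scrambled_choice)  # True
```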

Vocal Uses

When both source and driver are speech sounds, an interesting effect is obtained. Since all spectra must come from the source, the reanimated sound will contain the text content of the driver but formant aspects of the source. This results in a kind of speech mimicry where the voice of the source appears to be speaking the words of the driver. It should be noted that the reanimated sound contains many artifacts, since successive spectra are generally not contiguous in the source sound, and the level of intelligibility is usually quite low. However, intelligibility is much improved when the driver contains the same words as the source, though not necessarily in the same order. The results of this speech mimicry are often amusing and have proved musically useful. Further refinements of the technique would be required for better speech mimicry, but this would likely defeat the automatic nature of the procedure in its current form.

Musical Uses

There are many possible musical uses of Spectral Reanimation. Since all spectra come from the source sound, one can completely control the spectral content of the reanimated sound, while driving it with any kind of sound imaginable. It is interesting to drive harmonically constrained and spectrally simple sounds with energetic and complex improvised sounds. The reverse is also interesting. Driver sounds with strong rhythmic profiles are often quite successful. Reanimating piano music of Webern with music by the rap group Cypress Hill produces a result that invites the listener to hear both original pieces in a new context. Since the reanimation process is entirely automatic once the analyses are complete, composing with the technique focuses on imaginative selection of source and driver sounds.

Further Enhancements

The method of determining the closeness of two spectra is quite simple but effective. However, incorporating further feature analysis such as variance, cepstrum or formant matching could yield even better results (at the cost of considerably more computation). Spectral Reanimation has a characteristic artifact that could be described as spectral stuttering. Where successive spectra in the driver alternately find very good and poor matches in the source, a stuttering effect is created as the matching analysis scans wildly over the source in search of a better match. This can be ameliorated somewhat by choosing better-matching source sounds. Another possibility is to constrain the matching process to specific regions of the source, depending on the local success at finding good matches for a particular region of the driver sound. These regions could be determined in advance by an analysis of the source sound according to various criteria including gross amplitude features. This could at least keep bad matches bounded within a limited region of the source sound, resulting in less extreme spectral fluctuation in problematic regions.
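
One way the constrained search might look is sketched below. This is a speculative illustration rather than a tested enhancement: instead of regions derived in advance from an analysis of the source, it uses a window around the previous match as a stand-in, and the names constrained_match, threshold and radius are hypothetical. Spectra are assumed to be already normalized.

```python
def l1(a, b):
    """Sum of absolute amplitude differences between two spectra."""
    return sum(abs(x - y) for x, y in zip(a, b))


def constrained_match(driver_frames, source_frames, threshold=0.5, radius=2):
    """Choose a source index for each driver frame.

    A good global match (distance <= threshold) is accepted wherever it lies.
    A poor one is replaced by the best match within `radius` frames of the
    previous choice, so bad matches stay inside a limited source region.
    """
    indices = []
    previous = None
    for frame in driver_frames:
        best = min(range(len(source_frames)),
                   key=lambda i: l1(frame, source_frames[i]))
        if previous is not None and l1(frame, source_frames[best]) > threshold:
            lo = max(0, previous - radius)
            hi = min(len(source_frames), previous + radius + 1)
            best = min(range(lo, hi),
                       key=lambda i: l1(frame, source_frames[i]))
        indices.append(best)
        previous = best
    return indices


# Toy normalized spectra: three low-heavy frames followed by two high-heavy ones.
source = [[1, 0, 0], [0.9, 0.1, 0], [0.8, 0.2, 0], [0, 0.2, 0.8], [0, 0.1, 0.9]]
# The middle driver frame has no close match anywhere in the source.
driver = [[1, 0, 0], [0.3, 0.2, 0.5], [0, 0.1, 0.9]]

print(constrained_match(driver, source, threshold=0.5, radius=1))   # [0, 1, 4]
print(constrained_match(driver, source, threshold=10.0, radius=1))  # [0, 3, 4]
```

With the low threshold, the poorly matched middle frame stays next to the previous choice rather than jumping across the source; with a threshold high enough that every match counts as good, the search behaves like the unconstrained method.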

Conclusion

Spectral Reanimation has proved a useful compositional signal-processing tool. Although related to earlier methods of cross-synthesis, the technique has its own distinctive charms as well as its less charming artifacts. The basic technique is quite simple to implement, and might well be improved by integrating various forms of feature analysis.

References

Cann, Richard. “An Analysis/Synthesis Tutorial.” In Foundations of Computer Music. Eds. Curtis Roads and John Strawn. Cambridge, Massachusetts: MIT Press, 1985. 114-144.

Moore, F. Richard. Elements of Computer Music. Englewood Cliffs, New Jersey: Prentice-Hall, 1990.

Lyon, E. “Spectral Reanimation.” Proceedings of the Ninth Biennial Symposium on Arts and Technology, New London, CT, 27 February - 1 March 2003. New London: Ammerman Center for Arts and Technology, 2003. 103-105.