In its setup and control, convolver is the same as tvfilter. Its processing, however, is different. In tvfilter, filtering is produced by multiplying the magnitudes from the polar form of the two analyses, leaving the phases (or frequencies) of the source intact while modifying the amplitudes of those frequencies. Convolver goes a bit further by multiplying the two analyses in their Cartesian forms. This produces an intersection of the two spectra. Unlike tvfilter, which produces a shadowlike intersection by shadowing the analysis file characteristic onto the input sound file, convolver creates a true spectral intersection, allowing only that which is common to both sounds to be heard. The effect is a somewhat garbled sound, as the output carries the more intermittently common spectral components of the two. The form of the multiplication in convolver does not allow some of the filter transposition controls associated with tvfilter. There is, however, a convolution panpot that offers control of the mix between the convolution and source sounds.
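The difference between the two multiplications can be illustrated for a single FFT bin. This is a minimal sketch rather than the routines' actual source; the function names tvfilter_bin and convolver_bin are hypothetical and exist only for this illustration.

```python
import cmath

def tvfilter_bin(source: complex, analysis: complex) -> complex:
    """Polar-form filtering (illustrative): scale the source magnitude
    by the analysis magnitude and keep the source phase untouched."""
    magnitude = abs(source) * abs(analysis)
    return cmath.rect(magnitude, cmath.phase(source))

def convolver_bin(source: complex, analysis: complex) -> complex:
    """Cartesian-form multiplication (illustrative): a plain complex
    product, so magnitudes multiply and phases add."""
    return source * analysis

# One bin from each analysis, given as (magnitude, phase in radians).
src = cmath.rect(0.5, 0.3)
ana = cmath.rect(0.8, 1.1)

filtered = tvfilter_bin(src, ana)
convolved = convolver_bin(src, ana)

print(abs(filtered), cmath.phase(filtered))    # same magnitude, source phase
print(abs(convolved), cmath.phase(convolved))  # same magnitude, summed phase
```

Both results have the same magnitude; only the complex product carries the phase of the analysis file into the output.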
Amplitude Reports Print Mode
Two flags are provided for controlling the output amplitude statistics: one turns the statistics on or off, and the other sets how often they will be reported. The statistics provide the peak output level in amplitude and decibels. With integer format output files, output values exceeding the normalized peak amplitude of 1. (0 dB) are clipped to a value of 1.0, and the statistics are placed in clip mode. In clip mode, reports are made only for frames where clipping occurs. The peak amplitude, its time, and the number of clipped samples are reported at the end of processing. With floating-point format output files, output values exceeding the normalized peak amplitude of 1. are not clipped, since they will be rescaled in the second pass, and output statistics proceed normally throughout. The levels before and after rescaling are reported at the end of processing.
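The clip-mode bookkeeping for one frame of integer-format output might look like the sketch below. The function name clip_and_report is hypothetical, and the symmetric treatment of negative samples (clipping to -1.0) is an assumption, since the text only mentions the value 1.0.

```python
import math

def clip_and_report(samples):
    """Clip one frame to the normalized range and gather statistics.

    Returns (clipped_samples, peak_amplitude, peak_db, clipped_count).
    Assumption: negative overshoots are clipped to -1.0.
    """
    peak = max(abs(s) for s in samples)
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    clipped_count = sum(1 for s in samples if abs(s) > 1.0)
    # 0 dB corresponds to the normalized peak amplitude of 1.0.
    peak_db = 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")
    return clipped, peak, peak_db, clipped_count

frame = [0.5, -1.5, 2.0]
print(clip_and_report(frame))
```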
0 turns amplitude reports off, 1 turns them on.
Auto Stop
When on, this parameter terminates synthesis if a window boundary is reached. 0 is off, 1 is on.
Convolution Gain in Decibels
Increases or decreases the amplitude of the convolution of the two sound files. A change of +/- 6 dB approximately doubles or halves the amplitude.
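The relation between a decibel gain and a linear amplitude multiplier follows the standard amplitude formula, sketched here. The helper name db_to_gain is hypothetical; the same conversion applies to the other gain parameters expressed in decibels.

```python
def db_to_gain(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)

for db in (-6.0, 0.0, 6.0):
    print(db, db_to_gain(db))
```

A gain of 6 dB gives a multiplier just under 2, and -6 dB just over 0.5.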
Data Access Mode
Determines how the data in the analysis file is read. In rate mode (0), the data is read from the time point parameter at a rate set in the rate multiplier parameter.
In explicit mode, the data starts at the time point parameter and moves according to a user-defined function in the time point parameter.
End Time
The time, in seconds, at which to stop processing the soundfile. 0 or less is equivalent to the duration of the soundfile.
Envelope Modifications
The rate at which amplitude changes are allowed to occur affects how smooth spectral evolutions will be. To control this, many routines contain attack and decay response time controls. Once translated, these controls manipulate the coefficients of the following filter, in which y(n - 1) is the previous output value.
y(n) = (1. - A) * x(n) + A * y(n - 1)
The filter is a lowpass designed to increasingly smooth the sudden changes in a signal as the value of the coefficient, A, is increased. Its control is through the response time parameter, which is the time in seconds it takes a signal, shifting from one state to another, to decay to -60 dB of its former state. Response times are transformed to create the necessary coefficients for the selected frame rate. The response time is separated into attack and decay; this allows separate control of the smoothing of the signal depending upon whether it is increasing or decreasing in amplitude. Short attack/decay response times can be used in places where dynamic processing induces garble or even pops. You can use longer response times to generally smooth or blur the onset/offset of sound components, particularly if the response controls are being applied to a time-varying filter. When applied to amplitudes, longer decay response times do not sound good, for in delaying the decay they end up amplifying the residual noise of a sound.
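A minimal sketch of this smoothing follows, with the feedback term taken from the previous output frame. The translation from response time to coefficient shown here (the coefficient whose repeated application decays by 60 dB over the response time) is an assumption consistent with the description, not a transcription of the routines' code, and the function names are hypothetical.

```python
def response_coefficient(response_time: float, frame_rate: float) -> float:
    """Coefficient A such that A ** (response_time * frame_rate) == 0.001,
    i.e. a decay of 60 dB over response_time seconds (assumed mapping)."""
    if response_time <= 0.0:
        return 0.0  # no smoothing
    return 10.0 ** (-3.0 / (response_time * frame_rate))

def smooth_envelope(values, attack_time, decay_time, frame_rate):
    """One-pole lowpass with separate coefficients for rising and
    falling input, applied frame by frame."""
    a_attack = response_coefficient(attack_time, frame_rate)
    a_decay = response_coefficient(decay_time, frame_rate)
    y = 0.0
    out = []
    for x in values:
        a = a_attack if x > y else a_decay
        y = (1.0 - a) * x + a * y  # y on the right is the previous output
        out.append(y)
    return out

# A short burst followed by silence, at 100 frames per second.
amps = [1.0] * 5 + [0.0] * 5
print(smooth_envelope(amps, attack_time=0.01, decay_time=0.05, frame_rate=100.0))
```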
Envelope Attack Time
Envelope attack time affects the speed at which the amplitude of a sound changes. Large values blur the sound's attack, smaller values sharpen it.
Envelope Release Time
Envelope release time affects the speed at which the amplitude of a sound changes. Large values cause the sound to fade for a longer period, smaller values cause the sound to cut off more suddenly.
Low/High Shelf Equalization
Equalization has been provided at various points in routines to allow for the needed adjustment of spectra. The EQ consists of low and high shelf segments, whose width is adjusted through control of the shelf breakpoint frequency. The region between the shelf segments is represented by a linear decibel gradient between the decibel levels of the two shelves. Some routines implement the EQ before pitch changes, others after. EQ placed before pitch changes (pre-transpose/shift) will be transposed along with the pitch changes, whereas EQ placed afterwards (post-transpose/shift) will stay fixed as shifts and transpositions occur.
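The shape of such an EQ curve can be sketched as a function returning the gain in decibels at a given frequency. The name shelf_gain_db is hypothetical, and interpolating along a linear frequency axis in Hz is an assumption; the routines may well interpolate along a different axis.

```python
def shelf_gain_db(freq, low_freq, low_gain_db, high_freq, high_gain_db):
    """Gain in dB at freq for a low shelf / high shelf pair joined by a
    linear decibel gradient (assumed linear in Hz)."""
    if freq <= low_freq:
        return low_gain_db
    if freq >= high_freq:
        return high_gain_db
    fraction = (freq - low_freq) / (high_freq - low_freq)
    return low_gain_db + fraction * (high_gain_db - low_gain_db)

# Low shelf: -6 dB below 200 Hz. High shelf: +3 dB above 2000 Hz.
for f in (100.0, 200.0, 1100.0, 2000.0, 8000.0):
    print(f, shelf_gain_db(f, 200.0, -6.0, 2000.0, 3.0))
```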
Low Shelf Gain
Determines how the amplitude of sounds below the low shelf frequency will be affected.
High Shelf Gain
Determines how the amplitude of sounds above the high shelf frequency will be affected.
Low Shelf Frequency
Determines the frequency below which the low shelf gain will be used.
High Shelf Frequency
Determines the frequency above which the high shelf gain will be used.
Frequency Shift Factor
With the frequency shift control, a constant or function value is added to all the bin frequencies to produce a nonlinear pitch domain translation of the spectrum. Frequency shift is related to things like ring modulation and their similarly nonlinear shifts of pitch characteristics. Use this to create small distortions of the harmonic integrity of a sound.
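A small illustration of why an additive shift disturbs harmonic relationships: the bin frequencies below are a made-up harmonic series, and shift_bins is a hypothetical helper.

```python
def shift_bins(freqs, shift_hz):
    """Add a constant offset in Hz to every bin frequency."""
    return [f + shift_hz for f in freqs]

harmonics = [100.0, 200.0, 300.0]      # ratios 1 : 2 : 3
shifted = shift_bins(harmonics, 50.0)  # [150.0, 250.0, 350.0]

# After the shift the partials are no longer integer multiples
# of the lowest one, so the spectrum becomes inharmonic.
ratios = [f / shifted[0] for f in shifted]
print(shifted, ratios)
```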
Input Sound Gain in Decibels
Increases or decreases the amplitude of the input sound file. A change of +/- 6 dB approximately doubles or halves the amplitude.
Input Sound Panpot Domain Warp
Many of the routines employ the principle of warping, in which a distribution of values is transformed by a warp function. In these places an exponential function is employed to remap a 0-1 range of values into a new orientation that preserves the minimum (0) and maximum (1) while bringing the distribution closer to either extreme as a result of the curvature of the exponential function selected. The curvature of the exponential function is selected through a warp index. Specifically, warp index w will reorient the input x through the function below (^ = exponentiation).
y = (1. - (e^(x * w))) / (1. - (e^w))
In this function, a warp index of 0 produces (in the limit) a linear function and an untransformed output. Positive warp index values of increasing magnitude produce curves of increasing concavity (increasing slope) that draw values towards the 0-valued minimum and reduce the function integral. Negative values do the opposite, drawing values towards the maximum of 1 and increasing the integral.
The practical use of this mechanism is found in various places. One such place is the reshaping of the frequency response distribution characteristics. In this, positive warp indices cause the peaks of the response to be accentuated while the weaker frequencies are expanded out (i.e. pushed towards 0). Negative values have the opposite effect, as they compress the dynamic range of the response and raise the relative level of the weaker noise components. Another place where warp applies is in the remapping of FFT amplitudes through the spectrum warpshape. In this, the successive FFT frames have their amplitudes remapped by the warp function, similarly expanding or compressing the dynamic range depending upon the warp specified; 0 (linear warp function) leaves the amplitudes unchanged.
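The warp function can be evaluated directly, as in the sketch below. The function name warp and the small-index cutoff are illustrative choices; the formula is undefined at exactly w = 0 (it reduces to 0/0), so the linear case is handled separately here.

```python
import math

def warp(x: float, w: float) -> float:
    """Remap x in [0, 1] by y = (1 - e^(x*w)) / (1 - e^w).

    Written with expm1 for accuracy at small w; the two minus signs of
    the original form cancel. For w near 0 the limit y = x is returned.
    """
    if abs(w) < 1e-9:
        return x
    return math.expm1(x * w) / math.expm1(w)

for w in (-4.0, 0.0, 4.0):
    print(w, [round(warp(x / 4.0, w), 3) for x in range(5)])
```

The endpoints 0 and 1 are preserved for every index; positive indices pull the interior values towards 0 and negative indices pull them towards 1.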
Oscillator Resynthesis Threshold in Decibels
The phase vocoder resynthesizes the signal using one of two methods, depending on the type of changes made to the FFT. If the changes are only to the magnitudes (amplitudes), then the faster overlap/add method is used. If, however, changes in frequency are made, then the FFT integrity is compromised, necessitating use of the oscillator bank method, in which each bin is synthesized as a sine wave changing in frequency and amplitude. This method is slower, although a resynthesis threshold is available that can be used to increase the computation speed by turning off bins whose amplitude falls below the threshold. A threshold of -60 dB is appropriate, although safety warrants using a lower threshold if the spectrum is thin and its decays exposed; use your ear.
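The thresholding idea can be sketched as a selection of the bins that remain worth synthesizing. The helper active_bins is hypothetical, and measuring the threshold against a full-scale amplitude of 1.0 is an assumption; the routines may reference a different level.

```python
def active_bins(magnitudes, threshold_db):
    """Indices of bins whose amplitude is at or above the threshold,
    with the threshold given in dB relative to an amplitude of 1.0."""
    threshold = 10.0 ** (threshold_db / 20.0)
    return [i for i, m in enumerate(magnitudes) if m >= threshold]

frame = [0.5, 0.0005, 0.01]
print(active_bins(frame, -60.0))  # the middle bin is skipped
```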
Output Format
The output sound file is written as a NeXT/Sun format sound file in either 16-bit short or 32-bit floating point format, of one or more channels. The channels are processed one at a time beginning with the first channel. The first pass writes zeros in the channels yet to be processed, replacing them when processing proceeds to those channels.
0 tells PVCX to use the format of the input file, 1 equals integer format, and 2 equals rescaled floats.
Output Gain in Decibels
Increases or decreases the amplitude of the output sound file. A change of +/- 6 dB approximately doubles or halves the amplitude.
Output Pitch Transposition in Semitones
With the pitch transposition control, a constant or function value is multiplied against all bin frequencies. This is classic transposition, here specified in semitones of transposition (12 semitones equals an octave). Conversion is made to produce the appropriate frequency multiplier.
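The conversion from semitones to a frequency multiplier follows the usual equal-tempered relation, sketched here with a hypothetical helper name.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency multiplier for a transposition in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

for st in (-12.0, 0.0, 7.0, 12.0):
    print(st, semitones_to_ratio(st))
```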
Peak Rescale Level
Selection of the floating-point, output-file format invokes an amplitude rescaling feature. Once processing is complete, a second pass through the sound file is made to rescale the values to the decibel level specified. A dB rescale level of 1 causes rescaling to the level of the original input file.
Position Between Convolved Sounds
Determines which sound, the input or the pvanalysis file, is accented, using a range of -1 to 1. -1 fully accents the input sound file, 1 fully accents the pvanalysis file, and 0 strikes an even balance.
Pvanalysis Sound Gain in Decibels
Increases or decreases the amplitude of the pvanalysis input file. A change of +/- 6 dB approximately doubles or halves the amplitude.
Pvanalysis File
If the 'Use existing Pvanalysis' button is on, this field contains the path of the pvanalysis file to use.
Pvanalysis File Output Channel
Similar to the resynthesis channel parameter, this determines the output channel for the pvanalysis sound. 0 equals all channels.
Pvanalysis Sound Panpot Domain Warp
Many of the routines employ the principle of warping, in which a distribution of values is transformed by a warp function. In these places an exponential function is employed to remap a 0-1 range of values into a new orientation that preserves the minimum (0) and maximum (1) while bringing the distribution closer to either extreme as a result of the curvature of the exponential function selected. The curvature of the exponential function is selected through a warp index. Specifically, warp index w will reorient the input x through the function below (^ = exponentiation).
y = (1. - (e^(x * w))) / (1. - (e^w))
In this function, a warp index of 0 produces (in the limit) a linear function and an untransformed output. Positive warp index values of increasing magnitude produce curves of increasing concavity (increasing slope) that draw values towards the 0-valued minimum and reduce the function integral. Negative values do the opposite, drawing values towards the maximum of 1 and increasing the integral.
The practical use of this mechanism is found in various places. One such place is the reshaping of the frequency response distribution characteristics. In this, positive warp indices cause the peaks of the response to be accentuated while the weaker frequencies are expanded out (i.e. pushed towards 0). Negative values have the opposite effect, as they compress the dynamic range of the response and raise the relative level of the weaker noise components. Another place where warp applies is in the remapping of FFT amplitudes through the spectrum warpshape. In this, the successive FFT frames have their amplitudes remapped by the warp function, similarly expanding or compressing the dynamic range depending upon the warp specified; 0 (linear warp function) leaves the amplitudes unchanged.
Rate Multiplier
The rate at which data is read in rate mode. In explicit mode, this parameter is ignored.
Resynthesis Channel
All routines allow both monophonic and multi-channel input files to be processed. With multi-channelled files, you can either select one channel and produce a monophonic output file, or process all the channels. Channels are numbered beginning with 1. Processing of multi-channelled files is done one channel at a time beginning with channel 1, with zeros written to channels which have yet to be processed. Processing one channel at a time requires less memory and allows you to audition the output sooner than if you did all channels at once.
Use 0 to process all channels.
Time Expansion/Contraction Factor
Once the spectral modifications are made to the FFT analysis, an inverse FFT is invoked to produce the samples of a time-domain signal. The classic phase vocoder paradigm controls the number of samples through the interpolation value and its relation to the decimation. The arcane relationship of decimation and interpolation is here translated into the parameter of time expansion/contraction, allowing for the direct scaling of time. Use values greater than 1 to expand time, and values less than 1 to contract it.
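In the classic formulation the time scaling is the ratio of interpolation to decimation, so a time factor can be translated back into an interpolation value roughly as follows. The helper name and the rounding policy are assumptions made for this sketch, not a description of how the routines pick their values.

```python
def interpolation_for(decimation: int, time_factor: float) -> int:
    """Interpolation (synthesis hop, in samples) that yields the requested
    time expansion/contraction for a given decimation (analysis hop)."""
    return max(1, round(decimation * time_factor))

print(interpolation_for(128, 2.0))  # twice as long: 256
print(interpolation_for(128, 0.5))  # half as long: 64
```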
Time Interval Between Reports
Determines the interval in seconds of the soundfile between amplitude reports. See Amplitude Reports Print Mode for a further explanation.
Time Point
Where the data starts to read from, in seconds of the analyzed sound. In rate mode, this should be set to the initial position to read from. In explicit mode, it should be a function.
Time Window: Lower Boundary
Determines the earliest time in seconds of the analyzed sound to read from.
Note that the window is circular; once the end is reached, the beginning will be read from again unless auto-stop is on.
Time Window: Upper Boundary
Determines the latest time in seconds of the analyzed sound to read from. If it is less than 0, this parameter defaults to the file duration.
Note that the window is circular; once the end is reached, the beginning will be read from again unless auto-stop is on.
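The circular behavior of the time window, together with auto-stop, might be sketched as below. The function name, the use of None to signal termination, and the treatment of the upper boundary as exclusive are all assumptions made for illustration.

```python
def next_read_time(current, step, lower, upper, auto_stop):
    """Advance the read position by step seconds inside [lower, upper).

    On crossing a boundary, return None if auto_stop is on (synthesis
    ends); otherwise wrap around circularly. Requires upper > lower.
    """
    nxt = current + step
    if lower <= nxt < upper:
        return nxt
    if auto_stop:
        return None
    return lower + (nxt - lower) % (upper - lower)

print(next_read_time(1.9, 0.2, 1.0, 2.0, auto_stop=False))  # wraps to about 1.1
print(next_read_time(1.9, 0.2, 1.0, 2.0, auto_stop=True))   # None
```

Because the wrap uses a modulo on the distance from the lower boundary, a negative step (reading backwards) wraps from the beginning of the window to its end in the same way.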
Window Size in Samples
The window size is a less opaque parameter; like the FFT, it must be a power of 2. Windows twice the size of the FFT work well. Larger window sizes may resolve frequencies better. Specifying 0 for the window size will automatically set the window to twice the FFT size.
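The rules stated for this parameter can be collected into a small validation helper. The name resolve_window_size and the choice to raise an error on invalid input are illustrative; how the routines themselves react to an invalid size is not specified here.

```python
def resolve_window_size(window_size: int, fft_size: int) -> int:
    """Return the effective window size in samples.

    0 selects the default of twice the FFT size; any other value must
    be a positive power of 2.
    """
    if window_size == 0:
        return 2 * fft_size
    if window_size < 0 or window_size & (window_size - 1) != 0:
        raise ValueError("window size must be 0 or a power of 2")
    return window_size

print(resolve_window_size(0, 1024))     # 2048
print(resolve_window_size(4096, 1024))  # 4096
```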
Window Type
The FFT and inverse FFT are computed using a window. Like the FFT size, the shape of the window used can affect the quality of the analysis and resynthesis. (See F. R. Moore, Stieglitz, or Roads for further explanation.) A variety of windows are available, including Hamming, Rectangular, Blackman, Triangular, and Kaiser (in 8 different forms as related to 8 different alpha values). Blackman (-w2) or Kaiser (-w8) are recommended for most applications. In some unusual cases where transient behavior is being lost, consider using other windows such as the Rectangular, although take care to ensure that it is not producing pops or a buzzy sound.