The Constant Q Transform
Benjamin
Blankertz
[gzipped
ps-file or pdf-file]
Introduction
The constant Q transform as introduced in [Brown
91] is very close related to the Fourier transform. Like the Fourier
transform a constant Q transform is a bank of filters, but in contrast
to the former it has geometrically spaced center frequencies
(k=0,...), where b dictates the number of filters per octave. To
make the filter domains adjectant one chooses the bandwidth of the k-th
filter as
.
This yields a constant ratio of frequency to resolution
.
What makes the constant Q transform so useful is that by an appropriate
choice for
f0 (minimal center frequency) and b
the center frequencies directly correspond to musical notes. For instance
choosing b=12 and f0 as the frequency of midinote
0 makes the k-th cq-bin correspond the midinote number k.
Another nice feature of the constant Q transform is its increasing time
resolution towards higher frequencies. This resembles the situation in
our auditory system. It is not only the digital computer that needs more
time to perceive the frequency of a low tone but also our auditory sense.
This is related to music usually being less agitated in the lower registers.
Deriving Filter Parameters
for the Constant Q Transform
To derive the calculation of the constant Q transform of some sequence
x
we begin with an inspection of the familiar formula of a Fourier filter
at z
| (1) |
![\begin{displaymath}\sum_{n<N} x[n]\;e^{-2\pi i n z / N}\end{displaymath}](img5.gif) |
Each component of the constant Q transform, in the sequel called cq-bin,
will be calculated as such a filter, but suitable values for z and
window length N have to be found in order to match the properties
discussed above. The bandwidth of the filter (1)
is
(fs denotes the sampling rate) independently of z.
Thus the desired bandwidth
can be realized by choosing a window of length
.
The frequency to resolution ratio of the filter in (1)
is
.
To achieve a constant value Q for the frequency to resolution ratio
of each cq-bin one has to select z:= Q. Thus for integer
values Q the k-th cq-bin is the Q-th DFT-bin with
window length
.
Summerizing we get the following recipe for the calculation of a constant
Q transform: First choose minimal frequency f0 and the
number of bins per octave
b according to the requirements of the
application. The maximal frequency
only affects the number of cq-bins to be calculated1):
[Sorry: equation numbers should be (2) to (5)]
To reduce spectral leakage (cf. [Harris
78]), it is advisable to use the filter in conjunction with some window
function:
<w
is some analysis window of length N. We followed Judith Brown by
using Hamming windows.
1)
denotes the least interger greater than or equal to x.
Direct Implementation
For comparison first the usual FT-algorithm is given, not in its compact
form
ft= (x .* win) * exp(-2*pi*i*(0:N-1)'*(0:N-1)/N),
but with a loop.
function ft= slowFT(x, N)
for k=0:N-1;
ft(k+1)= x(1:N) * (hamming(N)
.* exp( -2*pi*i*k*(0:N-1)'/N)) / N;
end
And here is the direct implementation of the constant Q transform.
function cq= slowQ(x, minFreq, maxFreq, bins,
fs)
Q= 1/(2(1/bins)-1);
for k=1:ceil(bins*log2(maxFreq/minFreq));
N= round(Q*fs/(minFreq*2((k-1)/bins)));
cq(k)= x(1:N) * (hamming(N)
.* exp( -2*pi*i*Q*(0:N-1)'/N)) / N;
end
An Efficient Algorithm
Since the calculation of the constant Q transform according to the formula
(5) is very time consuming, an efficient algorithm
is highly desirable. Using matrix multiplication the constant Q transform
of a row vector x of length N (
for all k<K) is
| (6) |
 |
where T* is the complex conjugate of the temporal kernel2)
| (7) |
![\begin{displaymath}T_{nk}:=\begin{cases}\frac{1}{N_k} \, w_{N_k}[n] \; e^{......k}}& \text{if } n<N_k \\0 & \text{otherwise}\end{cases}\end{displaymath}](img19.gif) |
Since the temporal kernel is independent of the input signal x one
can speed up successive constant Q transforms by precalculating T*.
But this is very memory consuming and since there are many non vanishing
elements in T the calculation of the matrix product
still takes quite a while.
Luckily Judith Brown and Miller Puckette came up with a very clever
idea for improving the calculation [Brown and Puckette 92]. The idea is
to carry out the matrix multiplication in the spectral domain. Since the
windowed complex exponentials of the temporal kernel have a DFT that vanishes
almost everywhere except for the immediate vicinity of the corresponding
frequency the spectral kernel
| (8) |
 |
is a sparse matrix (after eliminating components below some threshold value).
This fact can be exploited for the calculation of
owing to the identity
| (9) |
![\begin{displaymath}\sum_{n<N} x[n] \; {y[n]^{*}} \;=\;\frac{1}{N}\sum_{n<N} x^{\text{ft}}[n] \; {y^{\text{ft}}[n]^{*}}\end{displaymath}](img23.gif) |
where x and y are sequences of length N and
,
denote their unnormalized discrete Fourier transform. Applying this identity
to the formula of constant Q transform (5)
using definitions (7) and (8)
yields
![\begin{displaymath}x^{\text{cq}}[k]= \frac{1}{N} \sum_{n<N} x^{\text{ft}}[n] \; {S_{nk}^{*}}\end{displaymath}](img26.gif)
or equivalently in matrix notation
| (10) |
 |
Due to the sparseness of S the calculation of the product
involves essentially less multiplications than
.
The Fourier transforms that arise in the efficient algorithm should
of course be calculated using FFTs. To this end N is chosen as the
lowest power of 2 greater than or equal to N0 (which
is the maximum of all Nk). The calculation of the spectral
kernel is quite expensive, but having done this once all succeeding constant
Q transforms are performed much faster.
2) In (7)
we center the filter domains on the left for the ease of notation. Right-centering
is more appropriate for real-time applications. Middle-centering has the
advantage of making the spectral kernel (8)
real.
Implementation of the Efficient Algorithm
function sparKernel= sparseKernel(minFreq, maxFreq, bins, fs, thresh)
if nargin<5 thresh= 0.0054; end % for Hamming
window
Q= 1/(2^(1/bins)-1);
(3)
K= ceil( bins * log2(maxFreq/minFreq) );
(2)
fftLen= 2^nextpow2( ceil(Q*fs/minFreq) );
tempKernel= zeros(fftLen, 1);
sparKernel= [];
for k= K:-1:1;
len= ceil( Q * fs / (minFreq*2^((k-1)/bins)) );
(4)
tempKernel(1:len)= hamming(len)/len .* exp(2*pi*i*Q*(0:len-1)'/len);
(7)
specKernel= fft(tempKernel);
(8)
specKernel(find(abs(specKernel)<=thresh))= 0;
sparKernel= sparse([specKernel sparKernel]);
end
sparKernel= conj(sparKernel) / fftLen;
(10).1
function cq= constQ(x, sparKernel) % x must be
a row vector
cq= fft(x,size(sparKernel,1)) * sparKernel;
(10).2
A matlab implementation of the efficient algorithm for the constant
Q transform [Brown and Puckette 92] coded by Benjamin Blankertz. The sparseKernel
has only to be calculated once. This might take some seconds. After that,
constant Q transforms of any row vector
x can be done very efficiently
by calling constQ. The function
hamming is included in
the dsp-toolbox. It returns the Hamming window of the given length:
hamming(len)= 0.46-0.54*cos(2*pi*(0:len-1)'/len).
Bibliography
-
[Brown 91] Judith C. Brown, Calculation of a constant Q spectral transform,
J.
Acoust. Soc. Am., 89(1): 425-434, 1991.
-
[Brown and Puckette 92] Judith C. Brown and Miller S. Puckette, An efficient
algorithm for the calculation of a constant Q transform, J. Acoust.
Soc. Am., 92(5): 2698-2701, 1992.
-
[Harris 78] F. J. Harris, On the Use of Windows for Harmonic Analysis with
Discrete Fourier Transform, in: Proc. IEEE, Bd. 66, pp. 51-83, 1978.
mail to: <blanker@math.uni-muenster.de>
go to: dsp-page
go to: home-page