FFT Demo
EE 123 Spring 2016 Discussion Section 03
Jon Tamir

This demo shows off the power of the Fast Fourier Transform (FFT) algorithm. The demo was adapted from a blog post by Jake Vanderplas at the University of Washington. His Python Perambulations blog has wonderful Python demos on a variety of DSP and statistics topics.

The content of the blog post is available to us under the BSD license.

The original Cooley-Tukey FFT paper is available here: Link.

In [1]:
# Import functions and libraries
from __future__ import division
import numpy as np
from numpy import pi
from numpy import exp

DFT Recap

Forward DFT:
$X[k] = \sum_{n=0}^{N-1} x[n] W_N^{kn}$

Recall from class that the DFT can be implemented as a matrix-vector multiplication. We know that the complexity of dense matrix-vector multiplication is $\mathcal{O}(N^2)$.

We can write this as $\mathbf X = \mathbf{Dx}$, where $(\mathbf D)_{kn} = W_N^{kn}$

In [2]:
def W_N(N,k,n):
    return exp(-1j * 2 * np.pi * k * n / N)

def mydft(x):
    """Returns the 1D DFT of x using matrix-vector multiplication"""
    N = x.shape[0]
    n_idx = np.arange(N)
    k_idx = n_idx[:, None]
    return np.dot(W_N(N,k_idx,n_idx), x)

L = 1024
x = np.random.random(L) + 1j * np.random.random(L)

res = np.allclose(mydft(x), np.fft.fft(x))
if res:
    print 'DFT matrix matches built-in FFT'
else:
    print 'Error in DFT matrix multiplication'

%timeit mydft(x)
%timeit np.fft.fft(x)
DFT matrix matches built-in FFT
10 loops, best of 3: 56.8 ms per loop
10000 loops, best of 3: 30.8 µs per loop

Symmetry in the DFT

Recall the periodicity/symmetry of the DFT: $$ \begin{align} X[k + mN] &= \sum_{n=0}^{N-1} x[n] W_N^{(k+mN)n} \\ &= \sum_{n=0}^{N-1} x[n] W_N^{kn} \\ &= X[k], \end{align} $$ where we used the periodicity of $W_N$.

We can use this symmetry to split the DFT into two smaller DFTs: $$ \begin{align} X[k] &= \sum_{n=0}^{N-1} x[n] W_N^{kn} \\ &= \sum_{m=0}^{N/2-1} x[2m] W_N^{k(2m)} + \sum_{m=0}^{N/2-1} x[2m+1] W_N^{k(2m+1)} \\ \end{align} $$

The FFT uses this property to decompose the length-N DFT into two length-N/2 DFTs, ad infinitum. This amounts to a complexity of $\mathcal{O}(N\log N)$

In [3]:
def myfft(x):
    """A recursive implementation of the 1D Cooley-Tukey FFT"""
    N = x.shape[0]
    
    if N % 2 > 0:
        raise ValueError("size of x must be a power of 2")
    elif N <= 32:  # this cutoff should be optimized
        return mydft(x)
    else:
        X_even = myfft(x[::2])
        X_odd = myfft(x[1::2])
        n_idx = np.arange(N)
        factor = W_N(N,n_idx, 1)
        return np.concatenate([X_even + factor[:N // 2] * X_odd,
          
                               X_even + factor[N // 2:] * X_odd])
    
res = np.allclose(myfft(x), np.fft.fft(x))
if res:
    print 'Our FFT matches built-in FFT'
else:
    print 'Error in recursive FFT implementation'
    
    
%timeit mydft(x)
%timeit myfft(x)
%timeit np.fft.fft(x)
Our FFT matches built-in FFT
10 loops, best of 3: 55.7 ms per loop
100 loops, best of 3: 3.14 ms per loop
10000 loops, best of 3: 30.8 µs per loop

Leveraging Numpy

To get the most out of Numpy (and Matlab...), it is best to use vectorized operations instead of for loops or recursion. We can speed up our FFT by doing this.

In [4]:
def myfft_vec(x):
    """A vectorized, non-recursive version of the Cooley-Tukey FFT"""
    N = x.shape[0]

    if N % 2 > 0:
        raise ValueError("size of x must be a power of 2")

    # N_min here is equivalent to the stopping condition above,
    # and should be a power of 2
    N_min = min(N, 32)
    
    # Perform an O[N^2] DFT on all length-N_min sub-problems at once
    n_idx = np.arange(N_min)
    k_idx = n_idx[:, None]
    D = W_N(N_min, n_idx, k_idx)
    X = np.dot(D, x.reshape((N_min, -1)))

    # build-up each level of the recursive calculation all at once
    while X.shape[0] < N:
        X_even = X[:, :X.shape[1] // 2]
        X_odd = X[:, X.shape[1] // 2:]
        factor = W_N(X.shape[0], np.arange(X.shape[0]), 0.5)[:, None]
        X = np.vstack([X_even + factor * X_odd,
                       X_even - factor * X_odd])

    return X.ravel()


res = np.allclose(myfft_vec(x), np.fft.fft(x))
if res:
    print 'Our vectorized FFT matches built-in FFT'
else:
    print 'Error in recursive FFT implementation'
    
%timeit mydft(x)
%timeit myfft(x)
%timeit myfft_vec(x)
%timeit np.fft.fft(x)
Our vectorized FFT matches built-in FFT
10 loops, best of 3: 53.6 ms per loop
100 loops, best of 3: 2.92 ms per loop
1000 loops, best of 3: 400 µs per loop
10000 loops, best of 3: 31.4 µs per loop