{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab3 - Time Frequency Part I\n", "
\n", "Spectrogram part was originally designed by John Pauly, modified and extended to include SDR and FM processing by Michael Lustig, Translated to python by Frank Ong\n", "\n", "This week we will look at the processing and spectrum of time-varying signals. In the first part of the lab we will look at the short-time fourier transform and spectrograms. We will use this to analyze audio signals and broadcast FM." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Import functions and libraries\n", "from __future__ import division\n", "import numpy as np, matplotlib.pyplot as plt\n", "from numpy import *\n", "from numpy.fft import *\n", "import scipy.signal as signal\n", "from matplotlib.pyplot import *\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part I: Spectrograms\n", "In class we used the DFT to estimate the frequency of a segment of a signal. When we do this, we are implicitly assuming that the frequency is constant over that interval. This is a reasonable assumption for some signals, especially for a very short time windows. \n", " \n", "There are many times when we'd like analyze signals whose frequency is changing over time. In fact, most signals aren't interesting unless they do change! There are many examples, including speech, music, and the sounds that surround you in daily life. In this lab we will learn how to process these signals to determine how their spectrum changes with time.\n", "\n", "The basic problem is that we have long segment of a signal $x[n]$, where $n=0, ... ,N-1$. We want to know what its frequency is as a function of time. There are two basic approaches. One is to pass the signal through a bank of bandpass filters, and plot the outputs of the filters as a function of time. \n", "\n", "\n", "The second approach is based on the short-time Fourier transform and is to break the signal up into short segments, and compute the spectrum of each of these separately. This is the approach we will use in this lab.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A sample signal is shown in Figure 1. You've seen this signal during the class. Obviously you can tell that the amplitude is changing as a function of time. When you examine the signal closely, you can also tell that the frequency is changing.\n", "Since the frequency is changing, we want to break the signal into segments over which the frequency is relatively constant. Then we will analyze each segment using the DFT. The result of analyzing the first 256 samples, and another block of 256 samples that is 1024 samples later is shown in Figure 2. As is the convention with the DFT, zero frequency is the first sample, and the Nyquist rate corresponds to the middle sample. The sampling rate in this example was 8192Hz, so the maximum frequency is 4096Hz. Note that the frequency has dropped about 300 Hz between the first and second block." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\"sampsig\"
\n", "
Figure 1: A segment of a sampled bird song.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a long signal, there would be a tremendous number of plots that would have to be examined, and this doesn't convey the information very effectively. Instead, the most common presentation is to display the information in an image format, with frequency on the vertical axis, time on the horizontal axis, and the value of a particular pixel being the magnitude of the spectra at that time and frequency. For the first 4 tasks, we are assuming that the signal is real and the magnitude of the spectrum is symmetric about the origin. We therefore need only display the positive frequencies. Later, in task 5 where we acquire data using the SDR the signal is going to be complex and we will need to display both positive and negative frequencies. This presentation is known as a spectrogram." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\"two_segmentsc\"
\n", "
Figure 2: The spectra from two 256 sample blocks of the signal from Fig. 1
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An example of this type of plot for the signal that Fig. 1 was taken from is shown in Fig. 3:\n", "
\"bird_spectroc\"
\n", "
Figure 3: A segment of a sampled bird song.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A function that takes a two-dimensional array y and makes an image of the log-magnitude is given below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Plot an image of the spectrogram y, with the axis labeled with time tl,\n", "# and frequency fl\n", "#\n", "# t_range -- time axis label, nt samples\n", "# f_range -- frequency axis label, nf samples\n", "# y -- spectrogram, nf by nt array\n", "# dbf -- Dynamic range of the spectrum\n", "\n", "def sg_plot( t_range, f_range, y, dbf = 60, fig = None) :\n", " eps = 10.0**(-dbf/20.0) # minimum signal\n", " \n", " # find maximum\n", " y_max = abs(y).max()\n", " \n", " # compute 20*log magnitude, scaled to the max\n", " y_log = 20.0 * np.log10( (abs( y ) / y_max)*(1-eps) + eps )\n", " \n", " # rescale image intensity to 256\n", " img = 256*(y_log + dbf)/dbf - 1\n", " \n", " \n", " fig=figure(figsize=(16,6))\n", " \n", " plt.imshow( np.flipud( 64.0*(y_log + dbf)/dbf ), extent= t_range + f_range ,cmap=plt.cm.gray, aspect='auto')\n", " plt.xlabel('Time, s')\n", " plt.ylabel('Frequency, Hz')\n", " plt.tight_layout()\n", " \n", " return fig\n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The only parameter of note is dbf which determines the dynamic range in dB that will be presented. The default is 60 dB, or a factor of 1000. This allows you to see small spectral components without seeing all of the background noise, and is reasonably suitable for our purposes here. For other applications with higher or lower noise levels, you may want to change this. For example in Task 5, the signal to noise ratio coming out of the SDR will require adjusting the dynamic range (somewhere between 20-40 was appropriate for my reception)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Several different sound files are available on the class web site to for you to work with. These are\n", "\n", "- `s1` A bird call.\n", "\n", "- `s2` A creaking door.\n", "\n", "- `s3` An orca whale call.\n", "\n", "- `s4` A sound effect.\n", "\n", "- `s5` 24Week Fetal Doppler Ultrasound \n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read each .wav files with the function `data = read_wav( filename )` and play them with the following commands:\n", " \n", " p = pyaudio.PyAudio() # instantiate PyAudio\n", " play_audio( data, p, fs ) # play audio\n", "\n", "Here are the definitions of these functions that read wav files and play them." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import pyaudio\n", "import wave\n", "\n", "# function that reads wav file to array\n", "def read_wav( wavname ):\n", " \n", " wf = wave.open(wavname, 'rb')\n", " \n", " CHUNK = 1024\n", " frames = []\n", " data_str = wf.readframes(CHUNK) #read a chunk\n", " \n", " while data_str != '':\n", " data_int = np.fromstring( data_str, 'int16') # convert from string to int (assumes .wav in int16)\n", " data_flt = data_int.astype( np.float32 ) / 32767.0 # convert from int to float32\n", " frames.append( data_flt ) #append to list\n", " data_str = wf.readframes(CHUNK) #read a chunk\n", "\n", " return np.concatenate( frames )\n", "\n", "\n", "def play_audio( data, p, fs):\n", " # data - audio data array\n", " # p - pyAudio object\n", " # fs - sampling rate\n", " \n", " # open output stream\n", " ostream = p.open(format=pyaudio.paFloat32, channels=1, rate=fs,output=True)\n", " # play audio\n", " ostream.write( data.astype(np.float32).tostring() )\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task:** Play all the wav files from s1 to s5.\n", "\n", "Try to visualize what you would expect the spectrum to look like as a function of time. In each case, the sampling rate is 44100 Hz." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "## Play sounds\n", "fname = \"s1.wav\"\n", "\n", "fs = 44100\n", "\n", "# get sound file\n", "data = read_wav( fname )\n", "\n", "# instantiate PyAudio\n", "p = pyaudio.PyAudio()\n", "# play sound\n", "play_audio( data, p, fs )\n", " \n", "# terminate pyAudio\n", "p.terminate()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 1: Computing a simple spectrogram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\"rect_win\"
\n", "
Figure 4: Extracting a segment of a long signal with a rectangular window
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our first attempt at making a spectrogram, we will simply take the original signal of length N, and split it into blocks of length M. We then compute the DFT of each block. This corresponds to using a rectangular window to extract each block from the signal, as is illustrated in Fig. 4.\n", "\n", "Write a python function that computes the spectrogram for a signal. If the original signal is a vector `x`, It should\n", "\n", " * Break the signal up into `m`-sample blocks, stored in the columns of a 2D matrix `xm`. This may require padding the signal with zeros, so that the length is a multiple of the block size.\n", "\n", " * Apply the fft to the matrix using `xmf = fft(xm,len(xm),axis=0)`. This operation applies FFT along each column of the matrix. It does not do a 2D FFT.\n", "\n", " * Compute ranges for time `t_range` and frequency `freq_range`\n", "\n", " * Call `sg_plot(t_range,f_range,xmf[1:m/2,:])`, where we are only plotting the positive frequencies. Your routine should be invoked with\n", "\n", " myspectrogram(x,m,fs)\n", " \n", " where `x` is the signal, `m` is the block size and `fs` is the sampling rate in Hz.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One trick that can help you here is the reshape command. To create the `xm` matrix, first zero pad `x` to be a multiple of `m` in length:\n", " # pad x up to a multiple of m \n", " lx = len(x);\n", " nt = (lx + m - 1) // m # // enforces integer division\n", " xp = append(x,zeros(-lx+nt*m))\n", " \n", "Use reshape to make it an m by nt matrix in column major fashion. The order 'F' means a Fortran order, which is column first as opposed to the 'C' order, which is C language order and is row first. Matlab uses a default of column order, whereas numpy uses a default of row order:\n", " \n", " xm = reshape( xp, (m,nt), order='F')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each column of `xm` is one block of `x`.\n", "\n", "To compute the time and frequency range, recall that the DFT frequencies go from `0` to the sampling frequency `fs` Hz in steps of `fs/m` Hz. We are only going to plot the positive frequencies, so \n", "\n", " f_range = [0.0, fs / 2.0]\n", " \n", "The time of a particular block is the period of one sample, `1/fs` seconds, multiplied by the number of samples in the block. If there are nt blocks, the time range is\n", "\n", " t_range = [0.0, lx / fs]\n", " \n", "Try your spectrogram routine with the bird call sound file, `s1`, with a block size of `256` samples. Note, that the sampling rate of the data is `44100` Hz (Compact Disc rate). The spectrogram should look vaguely reminiscent of Fig. 3. However, there is noticeable streaking in the spectral dimension. Include a copy of this spectrogram in your notebook." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution for Task 1: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def myspectrogram(x,m,fs, dbf=60):\n", "    # function myspectrogram(x,m,fs)\n", "    # Plot the spectrogram of x.\n", "    # First take the original signal x and split it into blocks of length m.\n", "    # This corresponds to using a rectangular window.\n", "    # Inputs: \n", "    #  x - data\n", "    #  m - window size\n", "    #  fs - sampling rate\n", "    \n", "    # Your code here\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Plot\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 2: A better spectrogram." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we have seen in class, the problem with the spectrogram from Task 1 is that we used a rectangular window to cut out blocks of the signal. In the spectral domain this corresponds to convolving a periodic sinc with the spectrum. The sinc sidelobes are high and fall off slowly. We need to do something a little more gentle with the data.\n", "\n", "To improve our spectrogram, we will extract each block of data with a Hann window. We can do this by multiplying our `xm` matrix by a matrix that has a Hann window along each of its columns,\n", "\n", "    xmw = xm * outer(hanning(m), ones(nt) )\n", "\n", "The `outer(hanning(m), ones(nt) )` term is a matrix that has the same Hann window in each of its `nt` columns. \n", "\n", "Another, more \"pythonic\" way is to use broadcasting:\n", "\n", "    xmw = xm * hanning(m)[:,None]\n", "\n", "Incorporate this into a new `myspectrogram_hann(x,m,fs)` function. \n", "\n", "Try your function on the bird song again. This time the spectrogram should look very similar to Fig. 3. The location and size of the windows are shown in Fig. 5. Note that parts of the signal, at the ends of each window, are effectively not used. This can result in discontinuities in the temporal direction of the spectrogram.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
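To see the difference between the two windows (a sketch, not one of the lab tasks), the cell below plots the DFT magnitude of a rectangular window and of a Hann window of the same length, heavily zero-padded so the sidelobe structure is visible. The Hann window's sidelobes fall off much faster. It relies on the numpy and matplotlib imports from the top of the notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Illustrative sketch (not one of the lab tasks): spectra of a rectangular\n", "# window versus a Hann window of the same length.\n", "m_win = 256\n", "nfft = 4096                                    # heavy zero-padding to resolve the sidelobes\n", "rect_spec = abs(fft(ones(m_win), nfft))\n", "hann_spec = abs(fft(hanning(m_win), nfft))\n", "\n", "plt.figure(figsize=(16,4))\n", "plt.plot(20*np.log10(rect_spec/rect_spec.max() + 1e-10), label='rectangular window')\n", "plt.plot(20*np.log10(hann_spec/hann_spec.max() + 1e-10), label='Hann window')\n", "plt.xlim(0, nfft//2)\n", "plt.ylim(-120, 5)\n", "plt.xlabel('DFT bin')\n", "plt.ylabel('Magnitude, dB')\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "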
\"hann_win\"
\n", "
Figure 5: Extracting segments of a long signal with a Hann window
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution for Task 2:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def myspectrogram_hann(x,m,fs):\n", " # Plot the spectrogram of x.\n", " # First take the original signal x and split it into blocks of length m\n", " # This corresponds to using a Hanning window\n", " # Inputs: \n", " # x - data\n", " # m - window size\n", " # fs - sampling rate\n", " \n", " # Your code here\n", " \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 3: An even better spectrogram!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a final enhancement, we will overlap the Hann windows, as shown in Fig. 6. Each block of the signal, and column of the `xm` matrix overlaps the adjacent blocks by 50%. There are a number of ways to do this in python. One is to replicate each half block in `xp` before the reshape. Another direct approach would be to program it in a `for` loop. Note that you also have to modify the time vector, since you have twice as many time samples. Write a new Incorporate this into your a new `myspectrogram_hann_ovlp(x,m,fs)` function. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try this on the bird song. it should look much smoother than the previous case. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\"ol_hann_win\"
\n", "
Figure 6: Extracting segments of a long signal with a Hann window
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution for Task 3:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def myspectrogram_hann_ovlp(x,m,fs, dbf=60):\n", " # Function breaks the signal into 50%-overlapping windows, multiplies by hann window, computes the DFT of each window\n", " # The function then calls sg_plot to display the result\n", " #\n", " # Inputs: \n", " # x - data\n", " # m - window size\n", " # fs - sampling rate\n", " \n", " # Your Code Here:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Plot\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 4: Time and frequency resolution." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far we've just set the block size to 256 samples, which is about 5.8 ms of data. The frequency resolution is then 44100/256 = 172.27 Hz. Ideally we'd like to have both time and frequency resolution. However, one comes at the expense of the other. For example if we wanted to improve our frequency resolution to 86.13 Hz, it would increase the time between samples to 11.6 ms.\n", "\n", "The optimum tradeoff between the two factors depends on the nature of the spectrum you are analyzing. A good example is the orca whale call, signal s3. This has segments that are very rapidly changing, and segments where the changes are slow. Try different block sizes, from 128, 256, 512, and 1024 samples. Note how the frequency resolution improves, while the temporal resolution degrades." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution for Task 4:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Your Code Here:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# You are now ready to move on to Part II of the lab!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }