Lab1, Part III: Software Defined Radio

The rtl-sdr usb dongles enables you to obtain samples from the electromagnetic spectrum around you. In very general terms, the dongle contains several components:

1. The antenna couples to received electromagnetic fields and tiny currents are produced in it.
2. A tuner IC circuit amplifies the signal, filters it, demodulates it to an intermediate frequency where it is filtered again. The dongles we distributed in class contain either the Refael Micro 820T (Black dongles) tuner or the Elonics E4000 (white dongles).
3. All dongles are equipped with the realtek RTL2832U (hence rtl-sdr). Although the chip is capable for many things, we use it for analog to digital conversion and for the USB interface. The RTL2832U samples the signal that is coming from the tuner and spits out samples to the computer through the USB interface.

The SDR returns samples at a desired rate up to 2.4MS/s of a part of the spectrum around a desired center frequency. For example, setting a center frequency $f_0 = 88.5\cdot 10^6$ and a sampling rate of $Fs=2\cdot 10^6$ will result in a complex valued sequence $x[n]$ whos DTFT corresponds to the physical frequency range of $87.5\cdot 10^6 < f < 89.5\cdot 10^6$. In other words, the digital frequency $\omega=0$ of $X(e^{j\omega})$, the DTFT of $x[n]$, will correspond to the physical frequency $88.5$MHz. The digital frequency $\omega=\pi$ will correspond to $89.5$MHz and $\omega=-\pi$ will correspond to $87.5$MHz.

1. How come the sequence $x[n]$ is complex valued ???

2. Consider the case when there is a transmitter which outputs a pure frequency at 89MHz. We choose a center frequency of 88.5MHz and sampling rate of 2MHz. The spectrum of $x[n]$ will not be symmetric, and has to be complex valued! The received signal would be $x[n] = e^{i2\pi500000/2000000n} = e^{i\pi/2n}$ which will have a single frequency at $\omega=\pi/2$ --> corresponding to 89MHz.

Before you start, you must install the rtl-sdr drivers and the python support. Follow the instructions on the class website.

To learn about what you can do with SDR's, I recommend you watch this youtube video. Most (not all) the stuff shown there can be done using rtl-sdr. At minute 5:00 you will see an example of ADS-B, which you will partly implement in this lab.

In [40]:
from IPython.display import YouTubeVideo
# A video on what you can do with Software defined radio. The B200 is a high-end SDR which is capable to much more than the rtl-sdr.
# however, most of the stuff shown in the video could be done with the rtl-sdr as well.

Out[40]:
In []:
# Import functions and libraries
import numpy as np
from numpy import *
import matplotlib.pyplot as plt
import pyaudio
from numpy import pi
from numpy import sin
from numpy import cos
from numpy import sqrt
from numpy import zeros
from numpy import r_
from scipy import signal

from rtlsdr import RtlSdr
from numpy import mean
from numpy import power
from numpy.fft import fft
from numpy.fft import fftshift
from numpy.fft import ifft
from numpy.fft import ifftshift
%matplotlib inline


Task I: Capture data and compute the spectrum

Let's look at a simple example of acquring data and looking at its spectrum. NOAA weather radio is transmitted from San Francisco at 162.4MHz It is a 5KHz frequency modeulated audio signal that voices a recording of the weather. If you live in the east-bay, you might also be able to pick up the Mt. Diablo transmitter at 162.425MHz. Here's a map of the SF coverage:

It's best if you are outside of the building when collecting samples. Let's collect 4 seconds worth of samples ,sampled with a center frequency of 162MHz and a sampling rate of 1MHz.

To Instantiate the sdr with the following parameters:

sdr = RtlSdr()
sdr.sample_rate = 1000000    # sampling rate
sdr.center_freq = 162e6   # 162MhZ center frequency
sdr.gain = 36
In []:
# Your Code Here:


The python interface to the SDR required that the number of samples be a multiple of 256. To read samples from the SDR, run the following command:

N_samples = 1024000*4 # approximately seconds
y = sdr.read_samples(sdr.sample_rate*t_total)   # get samples

If you are done with the device, you can clear it by: sdr.close()

In []:
# Your Code Here:


We will only use a small portion of it to process and look at the spectrum. Let’s crop the samples and analyze the spectrum of a portion of 8000 samples of it. We will use the function fft to compute the DFT and then fftshift to center the DFT around $\omega=0$. Intead of showing the spectrum in terms of $\omega$, plot it in terms of the physical frequencies. Use a Kaiser window with $\beta=6$. Use the function plt.semilogy to plot in log-scale. Make sure the axis is tight using plt.axis and that the aspect ratio of the figure is wide, so you can see the spectrum better

In []:
# Your Code Below:

# Plot
width, height = figaspect(0.2)
fig=plt.figure(figsize=(width,height))
plt.semilogy( f/1e6, abs(Y_w)), plt.xlabel('frequency [MHz]');
plt.title('spectrum');


You should be able to see some energy in the spectrum around the right frequency. However, what you will notice immediately is that the spectrum is noisy. In addition there might be some spurious peaks and in the white dongles (E4000) there is also a large peak at the DC frequency. The peak at the DC frequency is due to constant bias in the ADC of the device. This translates to an impulse at $\omega = 0$. Spourius peaks come from local oscillator leak and also quantization erros.

To reduce the noise and get a finer look at the spectrum, we will break our entire sequence to smaller sequences. We don't need such a good spectral resolution, so we will break it to chunks sized 800 samples, calculate their magnitude spectrum and average. This is also called average power specrum.

• Reshape the sequence into a matrix with row size of 800. Remember that the ordering of the matrix is 'C' style and hence row-first.
• Multiply the rows by a kaiser window with $\beta=6$
• Compute fft and fftshift along the 2nd dimension.
• Compute the absolute square of the result and average along the 1st dimension
• Plot the result
In []:
# Your Code Below:
# Compute the average power spectrum

In []:
# Plot the average power spectrum:


This plot is called average power spectrum. As you can see, the spectrum looks much nicer now. Can you identify the NOAA weather station? Only the wide spectrum peaks are real. BTW, The very narrow spiky peaks in the spectrum are probably not real signals. They are due to leakage of the local oscilator, non-linearity in the receiver, and poor dynamic range of the ADC of the device. Still... for 12

• What is the spectral resolution in the plot?
• Repeat the the measurement and the average power spectrum for the center frequency 88.3MHz. Can you identify NPR 88.5 Station?

Note: The rectangular bands you see around the main signal are HD radio signals. These are transmitted at the band edges and provide digital radio programming. Unfortunately HD radio in the USA uses proprietery protocol, and we can not decode it at this time.

In []:
# Your code here:


Task II: Capturing, detecting and extracting Mode S ADS-B Packets with RTL-SDR

Mode S is the transponder protocol used in modern commercial aircraft. A Mode S-equipped aircraft replies to radar interrogation by either ground radar (secondary surveillance) or other aircraft ("Traffic Collision Avoidance System", or TCAS).

Automatic Dependent Surveillance-Broadcast (ADS-B) is a communication protocol using the Extended Squitter capability of the Mode S transport layer. The protocol is:

• Automatic: it requires no pilot input
• Dependent: it is dependent on altimeter, GPS, and other aircraft instrumentation for information
• Surveillance: it provides current information about the transmitting aircraft

ADS-B-equipped aircraft broadcast ("squitter") their position, velocity, flight number, and other interesting information to any receiver within range of the aircraft. Position reports are typically generated once per second and flight indentification every five seconds.

Packets are sent at 1090MhZ and use Pulse Position Modulation (PPM) at a rate of 1Mbit per second. The simple modulation scheme is designed to be trancieved by cheap hardware -- which is very much what we are doing in this lab!

Pulse position modulation belongs to the family of amplitude modulated signals and in particular to the type of amplitude-shift-keying (ASK) or on-off-keying (OOK). In OOK, 0, 1 bits are sent by turning the carrier on and off. PPM uses Manchester encoding variant of OOK. The idea in Machester encoding is that in each bit interval there is always a transition between on and off and the transition occurs in the middle of the bit interval. For example, a transition from off to on would represent a 0 and a transition from on to off would represent a 1. Manchester codes have the advantage of being self clocking so the clock can be recovered from the data stream.

A mode S ADS-B packet consists of an $8\mu$s preamble followed by a data block of 56 or 112 $8\mu$s data block. Here's a diagram that illustrates a typical packet:

Now, since the bit-rate is 1Mbit per second, then a sampling rate of 2Msample/sec should be sufficient to capture the bit transitions, detect the preamble and deode packets. In practice, ADS-B backets can be decoded using the rtl-sdr, but the captured data does not look like the ideal packet. Here's an example of a part of a captured packet:

As you can see, there are many distortions: the key-off is not zero and there are variations in the amplitudes of the key-on. These come from various sources including the anti-aliasing filters, interfering signals, noise, gain control and more. However, it is possible to clearly see the preamble and the packet bit content! This is pretty amazing as ADS-B equipment is very expensive.

There are several types and subtypes of "Squitters": Short squitters are 56bits (8bits Control, 24bits address, 24bits parity check), Long squitters are 112bits (8bits control, 24bits address, 56bits ADS message, 24bits parity). Only the long squitters have location information in them.

Acquire samples

When acquiring data is highly recommneded that you be OUTSIDE or close to a BIG window. Electromagnetic radiation at 1090MHz does not penetrate through walls. In addition, the antenna you got with the dongle is not ideal for ADS-B reception. It is also recommended that you do this part during the day and not at night, when the air-traffic in the bay area is low.

• Set the rtl-sdr to a center frequency of 1090MHz, sampling rate of 2MS/s and a gain of 38.6 . You can play with the gain later if the results are not satisfactory.
• Acquire 2,048,000 samples (approx 1 sec) and take their magnitude. We will only work with magnitude data.
• Plot the magnitude of the received signal. Do you see any strong bursts in the amplitude? If you don't, repeat this task untill you do. This will make sure you have data to work with.
In []:
# Your code below:

width, height = figaspect(0.2)
fig = figure(figsize=(width,height))
plot( abs(y ))
xlabel('samples')
sdr.close()

• Save the magnitude of the data to a file. This will help you in the programming and debugging portions, so you don't have to reacquire date and rely on airplanes to be present all the time. Use the feature numpy.save('adsb.npy',y) to save the array. Use numpy.load to load it back.
In []:
#numpy.save('adsb.npy',y)

In [38]:
# Here's an example of packets we (Frank and Miki) got:


Before we start at automatically detecting packet, let's look at one.

• Manually look for a packet in the received signal. Look for ones with high amplitude.
• Display the packet.
In [39]:
# Our example would look like:
packet = abs(y_saved[282579:282579+135])
width, height = figaspect(0.2)
fig = figure(figsize=(width,height))
stem(packet)


Out[39]:
<Container object of 3 artists>


• Is this a long or short squitter?
• Manually decode the 1st 8 bits of the packet. Provide the values with your answer.

• This is a short squitter
• The bits are: 01011101, which is 5D in hex

Preamble Detection

We would like to automatically detect packets. Detecting the preamble is a good way to start, as the preamble is the same for all packets. In general, a good detector should have the minimum number of false negatives and false positives. For ADS-B packets, we care more about false negatives, since this means that we may be missing an important packet. At the same time, a false positive would result in a packet which we can not decode, which is not harmful, except that we spend time in extra computation to perform parity checks.

Noise floor estimation

When looking at detection, it is useful to know what is the noise level compared to the signal level. This can help setting thresholds for detecting signals. When noise has Gaussian statistics, we can calculate the sample mean and sample standard deviation to estimate its distribution. Unfortunately, our received signal is not only contaminated by Gaussian random noise. It is also contaminated by interference as well as bursts of ADS-B packet data. If we compute the sample mean and standard deviation, they could be biased by strong outliers coming from the packet bursts.

Here's where the median can come to our rescue. The median is a measure, which is robust to outliers, which we can use to estimate the mean. In order to robustly estimate the standard deviation we can use the Median Absolute Deviation (MAD) estimator. The MAD is computed in the following way: $\mathrm{MAD}(X) = \mathrm{median}(|X - \mathrm{median}(X)|)$

As it turns out, the standard deviation can be approximated from the MAD by: $\sigma \approx 1.4826 \mathrm{MAD}$

• Compute the median
• Compute the MAD and estimate the standard deviation.
In []:
# Your code below:


Signal Amplitude Detector

Optimal thresholds based on noise are beyond the scope of this lab. However, we need to come up with a threshold for which we can say that a received signal (ANY signal, not just ADS-B) is above it. A save bet would be the mean plus some multiples of the standard deviation. A factor of 3 to 5 is reasonable.

• set the threshold to be thresh = y_median + N*y_std, when N is 3 to 5. (Experiment!)
• find all the indexes for which the received signal is greater than the threshold. We will use these indexes later to detect preambles

Once you estimated the threshold, you can reuse it for subsequenct captures, and do not need to recalculate it. You can refresh the estimates periodically as things may change over time.

In []:
# Your code below:


Normalized Cross Correlation Detection

Detecting that a signal is received does not mean that it is the right signal! We would like to make sure it is the preamble. There are many ways we can go about detecting the preamble. We already saw in part II of the lab that matched filtering can be used for detection. Take a look at the ideal preamble below, and a preamble we (Frank and I) measured when preparing the lab.

We saw that matched filtering is implemented using cross correlation. Here, we will also use cross correlation. However, we need to remember to subtract the mean from each of the signals! (We didn't do it for the chirp, since the chirp has zero mean). The cross correlation is defined as: $R_{xy}[n] = \sum_{k=0}^{15} (x[n+k]- \hat{x}_n) (y[k] - \hat{y} )$ , where $\hat{x}_n$ is the mean of $x$ in a window size 16 around the index $n$.

The issue is that the result would depend on the signal's magnitude. It will be hard to give a "score" of detected or not. Therefore we will use a normalized cross correlation instead, which is defined as $\hat R_{xy}[n] = \frac{\sum_{k=0}^{15} (x[n+k]- \hat{x}_n) (y[k] - \hat{y} )}{||x[n]-\hat{x}_n||\cdot ||y-\hat y||}$ This will guarentee that the result is bounded between -1 and 1.

To make the implementation fast and easy, we will first write a function that calculates the normalized cross correlation of a 16 sample window of the signal. We will then, loop through only the indexes of the signal for which the signal amplitude detection was above the threshold. Using for loops is not an efficient implementation in Python, but it will do for this purpose. You are more than welcome to write your own efficient implementation.

• Write a function detectPreambleXcorr(chunck,corrthresh). The function accepets an array of 16 samples, e.g., $x[n]\cdots x[n+15]$. It evaluates the normalized cross correlation with an ideal preamble, and returns a True if it is greater than corrthresh
In []:
def detectPreambleXcorr(chunck,corrthresh):
# The function accepets an array of 16 samples, e.g., x[n]⋯x[n+15].
# It evaluates the normalized cross correlation with an ideal preamble,
# and returns a True if it is greater than corrthresh

# preamble withour mean
preamble = array([1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]) - 0.25


• Loop over the inputs for which the signal amplitude was above the amplitude threshold to detect packets. Set a correlation threshold of 0.85. Store all the positions of detected packet in an array idx_corr. If you did not detect any packets, reduce the threshold. (Warning: This operation may take more than a minute on some laptops for a low signal threshold)
In []:
# Your code below:

• Create a matrix, in which each row contains a detected packet. Make each row 112*2 long to accomodate the long squitters.
• Display the matrix as an image. Do you see Short and long squitters?
In []:
# Your code below:

In []:
# Display code below:

pwidth, height = figaspect(0.3)
fig = figure(figsize=(width,height))
plt.imshow(msgs, aspect='auto', cmap="gray")
plt.ylabel("Incoming messages")


Preamble Detection Using Logic

Unfortunately, our received preambles have variying total amplitudes (planes are near and far), have bias, and also exhibit amplitude modulation. At the same time, the preamble is relatively short. Instead of matched filtering, we can devise a non-linear descriptor which may be more robust. Here, we will look at detecting the preamble using logic.

Considering again the ideal preamble. Notice, that there are four values which are high and there are 12 values which are low. If we just consider logic, e.g., comparing high's and low's , we can make our detection significantly more robust to the bias and amplitude modulation.

Of course, it is possible to introduce more conditions. For example, in order to avoid detection of noise signal that resembels the logic, we can set a minimum limit on the amplitude of high's we consider a detection. In addition, to be able to separate between high's and low's we need to have a good amplitude margin -- so we can set a threshold for a minimum average margin between them.

• Implement a function detectPreambleLogic(chunk). The function will accept a chunck of 16 samples and will return True if a preamble was detected and False if it was not. The basic logic of the function should be based on comparing high's and low's bits. If you want to add thresholds as well, you are more than welcome!
In []:
def detectPreambleLogic(chunck):
# Function accepts a 16 length array. Returns True if an ads-b preamble is detected


• the inputs for which the signal amplitude was above the amplitude threshold to detect packets. Store all the positions of detected packet in an array idx_logic. Note that depending on your signal threshold, this task may take more than 30 seconds to complete.
In []:
# Your code below:


• Create a matrix, in which each row contains a detected packet. Make each row at least 112*2 long
• Display the matrix as an image. Do you see Short and long squitters?
• Did you get more packets? Less? More false positives? ( Your results are going to depend on the signal detection threshold as well)
In []:
# Your code below:

In []:
# Display code below:


Extract Bits

First, we will need to separate between long and short squitters. There are many ways to do this.

• Write a function long_sq, short_sq = separateSquitter(msgs). The function takes a matrix in which rows are the detected messages. The function returns two matrixes, one for long squitters and one for short.

• Use the function to separate to short and long squitters

In []:
def separateSquitter(msgs):
# detect based on energy in second half compared to first.

return long_sq, short_sq

In []:
# Your code below:


Once packets are detected, it is very easy to extract the bits of the packets. Recall that each bit is encoded in 2 samples and that the transition defines the bit value. Therefore if sample 1 is greater than sample 2 then the bit value will be 1. Otherwise it will be zero.

• Compute the bits of the messages you detected for long and short squitters.
• Long squitters always start with the bits: 10001. You can use that to remove some falsly detected packets.
In []:
# Your code Below:


Bonus: Decode Airplane position:

• Decoding the packets is beyond the scope of this lab. However, if you are interested you can take a look at this link: http://www.lll.lu/~edward/edward/adsb/DecodingADSBposition.html. There's a nice explenation how to decode position.