Eulerian Video Magnification

Revealing subtle temporal changes by leveraging the power of the fast Fourier transform
Tianhui (Lily) Yang and Kamyar Salahi CS194 SPRING 2020

Introduction

When viewing a scene, there is far more than meets the eye. Scenes that appear static often contain minute temporal changes that are invisible to the human eye. However, these changes are still captured in the recorded video and can thus be extracted and magnified to “enhance” otherwise imperceptible phenomena.

To extract these minute temporal changes, we use a band-pass filter that enables amplification of the frequency band corresponding to a particular phenomenon (e.g. breathing rate or the human pulse). We implement the band-pass filter with a fast Fourier transform (FFT) and use a Laplacian pyramid for additional spatial frequency specificity. We follow the approach of the “Eulerian Magnification” paper.

Procedure

How does this work?
(1) We decompose each frame into spatial frequency bands using a Laplacian pyramid.
(2) We then apply an FFT along the time axis to transform each pixel's time series into the frequency domain.
(3) Through element-wise multiplication with a mask over the desired frequency band, we apply a band-pass filter.
(4) We then magnify the band-passed signal according to the desired amplification.

Note: Since we are working on a Laplacian pyramid, we can specify the desired amplification for each layer of the pyramid. Thus, we can effectively achieve spatial amplification specificity by attenuating the amplification for certain spatial frequency bands.

(5) The amplified band-pass signals are added back to the original signals.
(6) We perform an inverse FFT to transform the data back to the time domain.
(7) We then collapse the Laplacian pyramid to obtain the final output.
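
Because the FFT is linear, steps (2) through (6) can be summarized for one pyramid layer in a single expression. The symbols here are illustrative shorthand rather than notation taken from the paper: L_k(x, t) is layer k at pixel x and time t, M(f) is the 0/1 band-pass mask, and alpha_k is the amplification chosen for that layer.

```latex
\tilde{L}_k(x, t) = L_k(x, t)
  + \alpha_k \, \mathcal{F}^{-1}\!\left[ M(f) \, \mathcal{F}[L_k](x, f) \right](t),
\qquad
M(f) =
\begin{cases}
  1, & f_{\text{low}} \le f \le f_{\text{high}} \\
  0, & \text{otherwise}
\end{cases}
```

Adding the amplified band in the frequency domain before inverting, as in steps (5) and (6), gives the same result as adding it after the inverse transform.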

Laplacian Pyramid

To generate the Laplacian pyramid, we first repeatedly blur and subsample the image to half its resolution, creating a Gaussian pyramid. Each Laplacian layer is then the difference between a Gaussian layer and the upsampled version of the next, coarser layer, which divides the image into spatial frequency bands. Here, we used cv2.pyrDown and cv2.resize for layer generation. Because each layer stores exactly what was lost in one downsampling step, collapsing the pyramid (repeatedly upsampling the coarsest layer and adding the next finer difference layer) recovers the original image at full resolution, provided the coarsest Gaussian layer is kept as the final pyramid level.
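
The build-and-collapse round trip can be sketched in a few lines. This is a minimal NumPy-only illustration, not our project code: 2x2 block averaging stands in for cv2.pyrDown, pixel repetition stands in for cv2.resize, and the function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (crude stand-in for cv2.pyrDown)."""
    h, w = img.shape[:2]
    # Pad odd dimensions by edge replication so every pixel sits in a 2x2 block.
    pad = [(0, h % 2), (0, w % 2)] + [(0, 0)] * (img.ndim - 2)
    img = np.pad(img, pad, mode="edge")
    h2, w2 = img.shape[0] // 2, img.shape[1] // 2
    return img.reshape(h2, 2, w2, 2, *img.shape[2:]).mean(axis=(1, 3))

def upsample(img, shape):
    """Double resolution by pixel repetition, cropped to `shape` (stand-in for cv2.resize)."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def build_laplacian_pyramid(img, levels):
    """Return `levels` arrays: levels - 1 difference bands plus the coarse residual."""
    img = img.astype(np.float64)
    pyramid = []
    for _ in range(levels - 1):
        smaller = downsample(img)
        # Each band holds the detail lost by one downsampling step.
        pyramid.append(img - upsample(smaller, img.shape))
        img = smaller
    pyramid.append(img)  # coarsest Gaussian layer, needed for reconstruction
    return pyramid

def collapse_laplacian_pyramid(pyramid):
    """Invert build_laplacian_pyramid by upsampling and adding, coarse to fine."""
    img = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        img = band + upsample(img, band.shape)
    return img
```

The reconstruction is exact for any choice of blur and upsampling, as long as the same upsampling is used when building and when collapsing.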

Temporal Filtering

This is as simple as performing an FFT over the time axis of the frames. Here we used numpy.fft.rfft, numpy.fft.rfftfreq, and numpy.fft.irfft to compute the real-valued FFT over the frames, determine the frequency at each index, and invert the real-valued FFT to return to the time domain. To keep only the desired frequency band, we multiplied the frequency-domain data element-wise by a mask built from the output of numpy.fft.rfftfreq.
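
A minimal sketch of this ideal band-pass filter, assuming the frames (or one pyramid layer across all frames) are stacked in an array with time along axis 0. The function name and its parameters are hypothetical.

```python
import numpy as np

def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Keep only temporal frequencies in [low_hz, high_hz] along axis 0."""
    n = frames.shape[0]
    # Real-valued FFT of every pixel's time series.
    spectrum = np.fft.rfft(frames, axis=0)
    # Frequency in Hz for each index of the spectrum.
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    # Reshape so the 1-D mask broadcasts over the remaining (spatial) axes.
    mask = mask.reshape(-1, *([1] * (frames.ndim - 1)))
    # Passing n restores the original length even when it is odd.
    return np.fft.irfft(spectrum * mask, n=n, axis=0)
```

For example, with a 30 fps video, temporal_bandpass(frames, 30, 1.0, 1.5) would isolate the 1 to 1.5 Hz band used for the face video in the Results section.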

Pixel Magnification

This is as simple as multiplying the band-passed data by an amplification factor, which can be set separately for each Laplacian pyramid layer.
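
One way to turn a per-layer amplification into code is sketched below. It assumes a particular reading of the "Minimum Cut-Off" and "Maximum" parameters listed in the Results: zero gain at or below the minimum layer, full gain at or above the maximum layer, and a linear ramp in between, with layers numbered from 1. That schedule and both function names are assumptions made for illustration.

```python
import numpy as np

def layer_gains(num_layers, min_layer, max_layer, alpha):
    """Gain per pyramid layer (1-based); requires max_layer > min_layer.

    Assumed schedule: 0 up to min_layer, alpha from max_layer on,
    linear ramp between the two.
    """
    layers = np.arange(1, num_layers + 1)
    ramp = np.clip((layers - min_layer) / (max_layer - min_layer), 0.0, 1.0)
    return alpha * ramp

def magnify(bands, filtered_bands, gains):
    """Add the amplified band-passed signal back to each original layer."""
    return [band + g * filt for band, filt, g in zip(bands, filtered_bands, gains)]
```

With six layers, a minimum of 4, a maximum of 6, and an amplification of 120, this schedule gives gains of 0, 0, 0, 0, 60, and 120.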

Image Reconstruction

This is as simple as collapsing the Laplacian pyramid to generate each frame.

Results

Man’s Face

High Frequency: 1.5 Hz
Low Frequency: 1 Hz
Minimum Cut-Off (Floor to 0): 4th Laplacian Layer
Maximum (Ceil to Amplitude): 6th Laplacian Layer
Amplification: 120

Baby in Hospital

High Frequency: 2.7 Hz
Low Frequency: 2.3 Hz
Minimum Cut-Off (Floor to 0): 5th Laplacian Layer
Maximum (Ceil to Amplitude): 6th Laplacian Layer
Amplification: 120

Tea Diffusion

I decided to brew some red tea (hibiscus and rooibos) and record the brewing process. I then fed the video into this algorithm with the following parameters to obtain the result shown below.
High Frequency: 1.5 Hz
Low Frequency: 0.5 Hz
Minimum Cut-Off (Floor to 0): 4th Laplacian Layer
Maximum (Ceil to Amplitude): 5th Laplacian Layer
Amplification: 20

Magnified Diffusion:

For reference, the original video is shown below:

Bells and Whistles

Baby in Crib

Power vs Frequency

Plot of Power vs Frequency for the Middle Pixel of Face at Pyramid Level 4 After Magnification

In-Phase Power vs Frequency:

Luma Power vs Frequency:

Quadrature Power vs Frequency:

Plot of Power vs Frequency for the Middle Pixel of Face at Pyramid Level 4 Before Magnification

In-Phase Power vs Frequency:

Luma Power vs Frequency:

Quadrature Power vs Frequency:
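
A power spectrum like the ones captioned above can be computed from a single pixel's time series. The sketch below assumes the luma, in-phase, and quadrature channels are those of the YIQ color space, uses the commonly quoted NTSC RGB-to-YIQ coefficients, and removes the mean before transforming. The function name is hypothetical.

```python
import numpy as np

# Commonly quoted NTSC RGB -> YIQ coefficients (rounded to three decimals).
RGB_TO_YIQ = np.array([
    [0.299,  0.587,  0.114],   # Y: luma
    [0.596, -0.274, -0.322],   # I: in-phase
    [0.211, -0.523,  0.312],   # Q: quadrature
])

def pixel_power_spectrum(pixel_rgb, fps):
    """pixel_rgb: (T, 3) RGB time series of one pixel.

    Returns (freqs, power) where power[:, c] is the power of channel
    c in (Y, I, Q) at each frequency in freqs (Hz).
    """
    yiq = pixel_rgb @ RGB_TO_YIQ.T
    # Subtract the mean so the constant (0 Hz) term does not dominate the plot.
    yiq = yiq - yiq.mean(axis=0)
    spectrum = np.fft.rfft(yiq, axis=0)
    freqs = np.fft.rfftfreq(pixel_rgb.shape[0], d=1.0 / fps)
    return freqs, np.abs(spectrum) ** 2
```

Running this on the same pixel before and after magnification shows the selected band standing out as a taller peak.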