Final Project. Eulerian Video Magnification

Michael Wan, SID: 3034012128

Overview

This project implements a technique known as Eulerian video magnification, which extracts subtle temporal variations in videos that typically can't be seen by the eye and magnifies them. Each video is filtered temporally to isolate these signals, which are then amplified and added back to the original video to give the magnified result. The technique is powerful enough to even reveal blood flow.

Part 1. Constructing the Laplacian Pyramid

The Laplacian pyramid was constructed by subtracting consecutive levels of the Gaussian pyramid, upsampling the coarser level before each subtraction so the resolutions match. The last level of the Laplacian pyramid is simply the last level of the Gaussian pyramid, to allow for reconstruction. Because each level of the Laplacian pyramid has a different resolution, reconstruction proceeds from the coarsest level, repeatedly upsampling the running sum before adding in the next level's detail.
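A minimal sketch of this construction and reconstruction, using SciPy for the blur/resample steps (the function names and the choice of blur are illustrative, not necessarily what the actual implementation uses):

```python
import numpy as np
from scipy import ndimage

def pyr_down(img):
    """Blur along the spatial axes, then drop every other row/column."""
    blurred = ndimage.gaussian_filter(img, sigma=(1.0, 1.0, 0.0))
    return blurred[::2, ::2]

def pyr_up(img, shape):
    """Resize back up to `shape` with bilinear interpolation."""
    factors = (shape[0] / img.shape[0], shape[1] / img.shape[1], 1.0)
    return ndimage.zoom(img, factors, order=1)

def build_laplacian_pyramid(img, levels):
    """Each level stores the detail lost by one blur/downsample step;
    the final level is the coarsest Gaussian image, so the pyramid
    can be collapsed back into the original exactly."""
    pyramid, current = [], img.astype(np.float64)
    for _ in range(levels - 1):
        down = pyr_down(current)
        pyramid.append(current - pyr_up(down, current.shape))
        current = down
    pyramid.append(current)  # last Gaussian level, not a difference
    return pyramid

def collapse_laplacian_pyramid(pyramid):
    """Upsample the running sum and add each level's detail back in."""
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        current = pyr_up(current, lap.shape) + lap
    return current
```

Because each Laplacian level stores exactly the difference removed during construction, collapsing the pyramid recovers the input image up to floating-point error, regardless of how lossy the downsample/upsample pair is.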

Time-series video of level 0 of "Baby2" Laplacian pyramid

Part 2. Filtering

I implement a simple ideal temporal bandpass filter. Specifically, I convert the RGB video into YIQ frames and compute the Laplacian pyramid of every frame, giving a time series at each pyramid level. Each level is then filtered along the time axis: the signal is transformed into the frequency domain with an FFT, only the frequencies between the cutoff parameters are kept, and an IFFT transforms the result back into the time domain. I then reconstruct the Laplacian pyramid and convert the YIQ frames back into an RGB video.
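The core filtering step can be sketched as follows (the function name and array layout are my own; `frames` stacks one pyramid level over time along axis 0):

```python
import numpy as np

def ideal_temporal_bandpass(frames, fps, low, high):
    """Zero out temporal frequencies outside [low, high] Hz.

    frames: array of shape (T, H, W, C) -- one pyramid level over time.
    """
    fft = np.fft.fft(frames, axis=0)
    freqs = np.fft.fftfreq(frames.shape[0], d=1.0 / fps)
    keep = (np.abs(freqs) >= low) & (np.abs(freqs) <= high)
    fft[~keep] = 0  # ideal filter: hard cutoff outside the band
    return np.real(np.fft.ifft(fft, axis=0))
```

The magnified level is then obtained by adding an amplified copy of the filtered signal back, e.g. `frames + alpha * filtered`, with `alpha` corresponding to the amplification factors listed in the outputs below.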

Part 3. Outputs

It is important to note that the videos were read at a reduced sample rate (keeping only every Nth frame) to conserve memory, due to persistent crashes in Google Colab and on my local machine.
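The subsampling itself is just frame striding; a small sketch (helper names are illustrative). One consequence worth noting: filtering the subsampled sequence means the effective frame rate is the original FPS divided by the sample rate, and the bandpass cutoffs must be interpreted against that effective rate.

```python
def subsample_frames(frames, sample_rate):
    """Keep every `sample_rate`-th frame to bound memory usage."""
    return frames[::sample_rate]

def effective_fps(fps, sample_rate):
    """Frame rate of the subsampled sequence, which the temporal
    filter must use in place of the original FPS."""
    return fps / sample_rate
```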

Magnified "Face" video (0.83 Hz - 1 Hz, 30 FPS, Filter Amplified 30x, Sample Rate = 3) and the filter, respectively


Magnified "Face" video (0.83 Hz - 1 Hz, 30 FPS, Filter Amplified 10x, Sample Rate = 5) and the filter, respectively


Magnified "Baby2" video (2.33 Hz - 2.67 Hz, 30 FPS, Filter Amplified 30x, Sample Rate = 5) and the filter, respectively


Magnified "Andrew" video (2 Hz - 3 Hz, 30 FPS, Filter Amplified 15x, Sample Rate = 3) and the filter, respectively

Part 4. Output Analysis

Overall, although the output videos did show visible pulsating effects, they weren't as crisp as those reproduced in the paper. This is likely due to the limited granularity of the hyperparameter search: any change to the video sample rate, frequency cutoffs, amplification factor, or other filter hyperparameters could produce nontrivial changes in the output video. Additionally, when filming the "Andrew" video, micro-tremors from the cameraman (me) introduced noise into the filtering process and thus into the output video. This problem proved quite challenging to overcome.