CS194 Final Project: HDR and Eulerian Video Magnification

Billy Chau

Part 1: HDR

The idea behind HDR is to recover an image of a scene that cannot be captured in any single camera exposure because of the scene's dynamic range, e.g. a blown-out background. To recover a high dynamic range image, the program first recovers a radiance map from a collection of images of the same scene taken at different exposures. It then converts the radiance map into a displayable image by either global or local tone mapping. For more detail on the math behind the algorithm, please see the website.

Recovering Radiance

In order to recover the radiance, we pick some pixel locations across all images to estimate the scene exposure. Since we know each image's exposure time, we can recover the inverse response function g, which maps a pixel value (0 - 255) to the log of the corresponding radiance-exposure product. In addition, we add constraints that force g to behave according to the priors we know for it, namely that its tails should point toward infinity. After setting up the problem, we solve it as a least squares problem with an L2 smoothness penalty (which folds into the least squares system).
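As a concrete sketch, here is a Debevec & Malik style solver in Python. The function name gsolve, the hat-shaped weighting, and the default lambda are illustrative assumptions, not necessarily the exact settings of my implementation:

```python
import numpy as np

def gsolve(Z, log_t, lam=100.0):
    """Recover the log inverse response curve g (Debevec & Malik style).
    Z: (N, P) int array of pixel values at N sampled locations in P exposures.
    log_t: log exposure time of each of the P images.
    lam: smoothness weight (lambda=100 gives the smoother curves above)."""
    n = 256
    # Hat weighting: trust mid-range pixel values more than the extremes.
    w = np.minimum(np.arange(n), n - 1 - np.arange(n)).astype(float)
    N, P = Z.shape
    A = np.zeros((N * P + n - 1, n + N))
    b = np.zeros(A.shape[0])
    k = 0
    # Data-fitting equations: g(Z_ij) - ln(E_i) = ln(t_j), weighted.
    for i in range(N):
        for j in range(P):
            wij = w[Z[i, j]]
            A[k, Z[i, j]] = wij
            A[k, n + i] = -wij
            b[k] = wij * log_t[j]
            k += 1
    A[k, 128] = 1.0          # fix the curve's scale: g(128) = 0
    k += 1
    # Smoothness equations: lam * g''(z) = 0 for interior pixel values.
    for z in range(1, n - 1):
        A[k, z - 1:z + 2] = lam * w[z] * np.array([1.0, -2.0, 1.0])
        k += 1
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    return x[:n]             # g; x[n:] holds the recovered log radiances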

[Figures: recovered radiance maps (left column) and relative radiance maps (right column) for two scenes]

The left column shows the recovered radiance map, while the right column depicts the relative radiance map. We can explore the recovered g function further below.

[Figures: recovered g functions for the r, g, and b channels, one row per scene]

It is interesting to see that the first row, which is generated from the church scene, does not have the infinite tail on the right; I think this is because the number of bright pixels is much smaller than the number of dim pixels, so the contribution of the truly bright pixels is very small. We can also see the impact of lambda here: with lambda=100 the curve in the top row is very smooth, while with lambda=10 the curve in the bottom row is not as smooth.

Tone Mapping

Once we can recover the radiance map, we transform it into a displayable image via either global or local tone mapping. One simple global tone mapping is gamma correction, in which every pixel of the image is raised to some power. Local tone mapping, on the other hand, adjusts different areas of the image differently. For example, bilateral tone mapping extracts the large-scale intensity from the image so that we can add contrast separately to the detail and structure layers.
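A minimal sketch of this bilateral-filter tone mapping (in the style of Durand & Dorsey) follows; the parameter values (gamma, detail_boost, target_range, and the filter sigmas) are illustrative guesses, not the settings used for the results below:

```python
import numpy as np
import cv2

def bilateral_tone_map(radiance, gamma=0.5, detail_boost=1.5, target_range=4.0):
    """radiance: HxWx3 float radiance map recovered above."""
    intensity = radiance.mean(axis=2) + 1e-8          # luminance proxy
    log_i = np.log10(intensity)

    # Structure (base) layer: edge-preserving smoothing of log intensity.
    structure = cv2.bilateralFilter(log_i.astype(np.float32), d=9,
                                    sigmaColor=0.4, sigmaSpace=8)
    detail = log_i - structure                        # detail layer = residual

    # Compress the structure layer into a fixed log range; boost the detail.
    scale = target_range / (structure.max() - structure.min())
    log_out = scale * (structure - structure.max()) + detail_boost * detail
    out_intensity = 10.0 ** log_out

    # Restore per-channel color ratios, then apply a global gamma for display.
    result = radiance / intensity[..., None] * out_intensity[..., None]
    return np.clip(result ** gamma, 0, 1)
```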

[Figures: for each of five scenes, the global gamma corrected result, the detail layer from the bilateral filter, the structure layer from the bilateral filter, the bilateral filter local tone mapping, and the bilateral filter local tone mapping with global gamma correction]

Bells & Whistles - Trying it on my own image

[Figures: my own exposure stack (0.02s, 0.008s, 0.002s, 0.0005s, 0.00025s), followed by the global gamma corrected result, the detail and structure layers from the bilateral filter, the bilateral filter local tone mapping, and the bilateral filter local tone mapping with global gamma correction]

Part 2: Eulerian Video Magnification

The idea behind Eulerian Video Magnification is to amplify temporal signals of interest in order to reveal changes that human eyes normally cannot see. For example, one might use this algorithm to recover the heartbeat of a baby in the ICU from a sequence of images. There are four parts: Laplacian Pyramid, Temporal Filtering, Pixel Change Magnification, and Image and Video Reconstruction.

Laplacian Pyramid

We should already be familiar with the idea of the Laplacian pyramid because we did something similar in project two, except that was a stack instead of a pyramid. To build the pyramid, the program downsamples the Gaussian pyramid as the level increases, and generates the corresponding Laplacian level by upsampling the next Gaussian level and subtracting it from the current one.
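A minimal sketch of this construction in Python with OpenCV; the function names and the default number of levels are illustrative assumptions:

```python
import cv2

def laplacian_pyramid(img, levels=4):
    """Build a Laplacian pyramid. Returns [L0, ..., L_{n-1}, G_n], where
    the last entry is the low-pass Gaussian residual needed later for
    reconstruction."""
    pyramid = []
    current = img.astype("float32")
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)   # band-pass residual at this level
        current = down
    pyramid.append(current)            # final low-pass residual
    return pyramid

def reconstruct(pyramid):
    """Collapse a Laplacian pyramid back into an image."""
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        current = cv2.pyrUp(current, dstsize=(lap.shape[1], lap.shape[0]))
        current += lap
    return current
```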

Temporal Filtering and Pixel Change Magnification

With the Laplacian pyramid at hand, we can extract components of similar frequency bands from the image, i.e. we can handle high- and low-frequency content separately. With this, we can temporally band-pass the frequencies of interest. I used a Butterworth band-pass filter in this case, which filters each pixel's signal across time. For example, we can set the low and high cutoff frequencies of the filter to 0.83 and 1 Hz to extract pixel changes that correspond to the heart pulse. After filtering out the frequencies of interest, we can amplify the pixel signals accordingly to reveal their hidden pattern.
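A sketch of this temporal filtering and magnification step with SciPy; the gain alpha and the use of zero-phase filtfilt are illustrative choices, not necessarily those of my implementation:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def temporal_bandpass_amplify(frames, fs, low=0.83, high=1.0,
                              order=2, alpha=50.0):
    """Temporally band-pass one pyramid level and amplify the result.
    frames: (T, H, W) array of one Laplacian level over time.
    fs: video frame rate in Hz. The 0.83-1 Hz cutoffs match the heart
    pulse example above; alpha is an illustrative gain."""
    nyquist = fs / 2.0
    b, a = butter(order, [low / nyquist, high / nyquist], btype="band")
    # filtfilt runs the filter forward and backward along time (axis 0),
    # giving zero phase shift so the amplified motion stays aligned.
    filtered = filtfilt(b, a, frames, axis=0)
    return frames + alpha * filtered   # add the amplified band back in
```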

For example, the temporal pattern of the pixel change is not obvious if the images are not appropriately filtered and magnified (top row). When the sequence of images is properly filtered and magnified, the temporal pattern of the pixel change is much more obvious (bottom row).

Image and Video Reconstruction

To reconstruct the filtered and amplified image, we merge the scaled Laplacian pyramid across levels and add in the last level of the scaled Gaussian pyramid to form the final image. After that, we create the video by merging the images in sequence.
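Tying the pieces together, here is a minimal end-to-end sketch that reuses the hypothetical helpers from the earlier sketches; which pyramid levels to amplify is a tunable choice, and this version amplifies every band-pass level:

```python
import numpy as np

def magnify_video(frames, fs, levels=4, alpha=50.0):
    """frames: (T, H, W) grayscale video; fs: frame rate in Hz.
    Assumes laplacian_pyramid, reconstruct, and temporal_bandpass_amplify
    from the sketches above."""
    # Build a pyramid per frame, then regroup each level across time.
    pyramids = [laplacian_pyramid(f, levels) for f in frames]
    out_levels = []
    for lvl in range(levels + 1):
        stack = np.stack([p[lvl] for p in pyramids])     # (T, h, w)
        if lvl < levels:   # amplify band-pass levels, not the residual
            stack = temporal_bandpass_amplify(stack, fs, alpha=alpha)
        out_levels.append(stack)
    # Collapse each frame's amplified pyramid back into an image.
    return np.stack([reconstruct([lev[t] for lev in out_levels])
                     for t in range(frames.shape[0])])
```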

Result

[Video: amplified; lower resolution due to memory limits]

Observation: by increasing alpha, the signal is clearly amplified, but the background noise is amplified at the same time. To reduce the amplification of the noise, I tried shrinking the range between the low and high cutoff frequencies.

[Video: amplified]

Please note that the noise is really bad in this case; I suspect it comes from the wide range between the cutoff frequencies.

[Video: amplified - E string]

[Video: amplified - A string]

Finally, we have a magnified fire on the stove.

Challenge

Although the concept of this algorithm is very straightforward, tuning it for good results is challenging. Unveiling the hidden patterns well takes a lot of trial and error across different parameters, including the low and high cutoff frequencies, the size of the Laplacian pyramid, the choice of band-pass filter, and the amplification factors for different channels. This was the most difficult part of the project. Moreover, the algorithm's large memory requirements for video limited my implementation: I had to process the video in batches, which may hurt the quality of the temporal filtering.

Final Remarks

I have learnt a ton from this class, and thanks to everyone for making this possible, especially in this crazy time - 2020.