The idea behind HDR is to recover an image of a scene that no single camera exposure could capture because of the scene's dynamic range, e.g. a blown-out background. To recover a high dynamic range image, the program first recovers a radiance map from a collection of images of the same scene taken at different exposures. It then converts the radiance map into a displayable image via either global or local tone mapping. For more detail on the math behind the algorithm, please see the website.
To recover the radiance, we pick a set of pixel locations across all images to estimate the scene exposure. Since we know the exposure times, we can recover the exposure by solving for the inverse response function g, which maps a pixel value (0 - 255) to the corresponding log radiance-exposure product. In addition, we add constraints that force the inverse function to behave according to our priors on g, namely that it should be smooth and that its tails should tend toward infinity. After setting up the problem, we can solve it as a least-squares problem with an L2 smoothness penalty (which reduces to ordinary least squares).
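As a sketch of how this least-squares setup might look (assuming the standard Debevec-Malik formulation with a hat-shaped weighting function; the function and variable names here are illustrative, not the actual code used for the results on this page):

```python
import numpy as np

def gsolve(Z, log_t, lam=100.0):
    """Recover the inverse response curve g by least squares (a sketch).

    Z      : (P, N) integer pixel values in [0, 255], for P pixel
             locations observed across N exposures.
    log_t  : (N,) log exposure times.
    lam    : weight of the L2 smoothness penalty on g.
    Returns g (256,) such that g[Z_ij] ~ log(E_i) + log(t_j).
    """
    n = 256
    P, N = Z.shape
    # Hat weighting: trust mid-range pixel values over clipped extremes.
    w = np.minimum(np.arange(n), n - 1 - np.arange(n)).astype(float)

    # Unknowns: g(0..255) followed by the P log irradiances log(E_i).
    A = np.zeros((P * N + n - 1, n + P))
    b = np.zeros(A.shape[0])

    k = 0
    for i in range(P):                # data-fitting equations
        for j in range(N):
            z = Z[i, j]
            A[k, z] = w[z]
            A[k, n + i] = -w[z]
            b[k] = w[z] * log_t[j]
            k += 1

    A[k, 128] = 1.0                   # pin the curve: g(128) = 0
    k += 1
    for z in range(1, n - 1):         # second-derivative smoothness rows
        A[k, z - 1] = lam * w[z]
        A[k, z] = -2.0 * lam * w[z]
        A[k, z + 1] = lam * w[z]
        k += 1

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:n]
```

The smoothness rows are what the L2 constraint reduces to: they simply become extra rows of the ordinary least-squares system.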
The left column shows the recovered radiance map, while the right column depicts the relative radiance map. We can explore the recovered g function further below.
It is interesting that the first row, generated from the church scene, does not have the infinite tail on the right. I think this is because there are relatively few bright pixels compared with dim ones, so the contribution of the truly bright pixels is very small. We can also see the impact of lambda here: with lambda=100 the curve in the top row is very smooth, while with lambda=10 the curve in the bottom row is not as smooth.
After recovering the radiance map, we transform it into a displayable image via either global or local tone mapping. One simple global tone mapping is gamma correction, in which every pixel of the image is raised to some power. Local tone mapping, on the other hand, adjusts different regions of the image differently. For example, bilateral tone mapping extracts the large-scale intensity from the image so that we can separately add contrast to the detail and structure layers.
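The two flavors can be sketched as follows. This is a minimal illustration, not the exact pipeline used for the results here: the local variant uses a plain Gaussian blur as a stand-in for the edge-preserving bilateral filter (so halos may appear near strong edges), but the base/detail split is the same idea.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def global_tone_map(radiance, gamma=0.4):
    """Global tone map: normalize the radiance to [0, 1] and apply
    gamma compression uniformly to every pixel."""
    r = np.asarray(radiance, dtype=float)
    r = (r - r.min()) / (r.max() - r.min() + 1e-8)
    return np.clip(r ** gamma, 0.0, 1.0)

def local_tone_map(radiance, contrast=0.5, gamma=0.5):
    """Durand-style local tone map sketch: split log radiance into a
    large-scale base layer and a detail layer, compress only the base,
    then recombine. Gaussian blur stands in for the bilateral filter."""
    log_r = np.log(np.asarray(radiance, dtype=float) + 1e-6)
    base = gaussian_filter(log_r, sigma=5)   # large-scale intensity
    detail = log_r - base                    # fine structure kept intact
    out = np.exp(contrast * base + detail)   # compress base layer only
    out = (out - out.min()) / (out.max() - out.min() + 1e-8)
    return out ** gamma
```

Because the detail layer is added back at full strength, local contrast survives even though the overall dynamic range is compressed.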
The idea behind Eulerian Video Magnification is to amplify temporal signals of interest to reveal changes that human eyes normally cannot see. For example, someone might use this algorithm to recover the heartbeat of a baby in the ICU from a sequence of images. There are four parts: Laplacian Pyramid, Temporal Filtering, Pixel Change Magnification, and Image and Video Reconstruction.
We should already be familiar with the idea of a Laplacian pyramid because we did something similar in project two, except that was a stack instead of a pyramid. To build the pyramid, the program scales down the Gaussian pyramid as the level increases, and generates each Laplacian level by upsampling the next Gaussian level and subtracting it from the current level.
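The construction above can be sketched like this (a minimal version, assuming the image dimensions divide evenly by 2 at every level so the up/downsampled shapes line up; the names are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(img, levels=4):
    """Build a Laplacian pyramid: blur + downsample for the Gaussian
    pyramid, then subtract the upsampled next level from each level.
    Assumes img dimensions are divisible by 2**(levels - 1)."""
    gauss = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(gauss[-1], sigma=1.0)
        gauss.append(blurred[::2, ::2])            # scale down by 2
    lap = []
    for level in range(levels - 1):
        up = zoom(gauss[level + 1], 2, order=1)    # scale back up
        lap.append(gauss[level] - up)              # band-pass residual
    lap.append(gauss[-1])                          # keep coarsest level
    return lap
```

Each residual level isolates one band of spatial frequencies, which is what lets the later steps filter and amplify bands independently.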
With the Laplacian pyramid at hand, we can extract components of similar frequency bands from the image, i.e. we can treat high- and low-frequency content separately. We can then temporally band-pass the frequencies of interest. I used a Butterworth band-pass filter, which filters each pixel's values across time. For example, we can set the low and high cutoff frequencies of the filter to 0.83 Hz and 1 Hz to extract pixel changes that correspond to a heart pulse. After filtering out the frequencies of interest, we amplify these pixel signals to reveal the hidden pattern.
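A hedged sketch of this temporal filtering and amplification step, using SciPy's Butterworth design (the function names and the zero-phase `filtfilt` choice are assumptions of this sketch, not necessarily the exact filtering used here):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def temporal_bandpass(frames, fps, low=0.83, high=1.0, order=2):
    """Butterworth band-pass each pixel's time series.

    frames : (T, H, W) array of one pyramid level over time.
    low/high in Hz; 0.83-1.0 Hz corresponds to ~50-60 beats per minute.
    """
    b, a = butter(order, [low, high], btype='band', fs=fps)
    return filtfilt(b, a, frames, axis=0)   # zero-phase filter over time

def magnify(frames, fps, alpha=50.0, **kw):
    """Amplify the band-passed signal and add it back to the original."""
    return frames + alpha * temporal_bandpass(frames, fps, **kw)
```

Narrowing the `[low, high]` window is exactly the knob mentioned later for suppressing amplified background noise.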
For example, the temporal pattern of the pixel change is not obvious when the images are not appropriately filtered and magnified (top row). When the sequence of images is properly filtered and magnified, the temporal pattern of the pixel change is much more obvious (bottom row).
To reconstruct the filtered and amplified image, we merge the scaled Laplacian pyramid across levels and add back the last layer of the scaled Gaussian pyramid to form the final image. After that, we create the video by merging the images into a sequence.
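The merge step can be sketched as follows, assuming the pyramid is stored as a list of levels with the coarsest (last Gaussian) layer at the end, as in the construction described earlier:

```python
import numpy as np
from scipy.ndimage import zoom

def collapse_pyramid(lap):
    """Reconstruct an image from a (scaled) Laplacian pyramid: start
    from the coarsest level and repeatedly upsample, then add the next
    finer band back in."""
    img = lap[-1]
    for band in reversed(lap[:-1]):
        img = zoom(img, 2, order=1) + band
    return img
```

Repeating this per frame and writing the frames out in order (e.g. with OpenCV's `VideoWriter`) then yields the final magnified video.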
Amplified - lower resolution due to memory limits
Observation: increasing alpha clearly amplifies the signal, but it also amplifies the background noise. To reduce the amplification of the noise, I tried shrinking the range between the low and high cutoff frequencies.
Amplified
Please note that the noise is really bad in this case; I suspect it comes from the wide range of cutoff frequencies.
Amplified - E String
Amplified - A String
Finally, we have a magnified fire on the stove.
Although the concept of this algorithm is straightforward, tuning it for good results is challenging. Revealing the hidden patterns well took a lot of trial and error across many parameters: the low and high cutoff frequencies, the size of the Laplacian pyramid, the choice of band-pass filter, and the amplification factors for different channels. This was the most difficult part of the project. Moreover, the algorithm requires a lot of memory per video, which limited my implementation: I had to process the video in batches, which may hurt the quality of the temporal filtering.
I have learned a ton from this class, and thanks to everyone who made this possible, especially in this crazy time - 2020.