**CS294-26 Project 2 - Fun with Filters and Frequencies**

By Neerja Thakkar

Part 1: Fun with Filters
===============================================================================

Finite Difference Operator
-----------------

First, I convolve my image with the finite difference operators:

$D_x = \begin{bmatrix} 1 & -1 \end{bmatrix}$ and $D_y = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$

This results in the x- and y-derivatives of the image:

![x-derivative](out/cameraman_grad_x.png width=200) ![y-derivative](out/cameraman_grad_y.png width=200)

Now, I combine these two images to get the gradient magnitude $||\nabla f||$, using the equation

$$ ||\nabla f|| = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 }$$

where $\frac{\partial f}{\partial x}$ is the x-derivative image and $\frac{\partial f}{\partial y}$ is the y-derivative image. This results in the following image:

![gradient magnitude](out/cameraman_grad_mag.png width=300)

Finally, I binarized the image: everything above a certain threshold was set to 1, while everything below that threshold became 0.

![binarized gradient magnitude](out/cameraman_edges.png width=300)

Not all of the noise was suppressed, but the result still shows the true edges reasonably well.

Derivative of Gaussian (DoG) Filter
-------------------------------------------------------------------------------

Now, we smooth the image with a low-pass Gaussian filter. This is the filter that I used:

![2d 11x11 Gaussian](out/2d_gaussian.png width=110)

And this is the resulting blurred cameraman image:

![blurred cameraman](out/gaussian_blurred_cameraman.png width=300)

Applying the same finite difference operators to this image, we get the following gradient magnitude and edges:

![blurred cameraman gradient magnitude](out/blurred_cameraman_grad_mag.png width=200)![blurred cameraman binarized gradient](out/blurred_cameraman_edges.png width=200)

We see that the edges are much stronger and there is much less noise.
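A minimal numpy/scipy sketch of this pipeline might look as follows (the kernel size, $\sigma$, and threshold here are illustrative, not necessarily the exact values I used):

```python
import numpy as np
from scipy.signal import convolve2d

Dx = np.array([[1.0, -1.0]])    # horizontal finite difference
Dy = np.array([[1.0], [-1.0]])  # vertical finite difference

def gaussian2d(ksize=11, sigma=2.0):
    """Separable 2-D Gaussian kernel, normalized to sum to 1."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gradient_edges(im, thresh=0.25, blur=False):
    """Gradient magnitude and binarized edge map, optionally pre-blurring."""
    if blur:
        im = convolve2d(im, gaussian2d(), mode="same", boundary="symm")
    gx = convolve2d(im, Dx, mode="same", boundary="symm")
    gy = convolve2d(im, Dy, mode="same", boundary="symm")
    mag = np.sqrt(gx**2 + gy**2)          # ||grad f||
    edges = (mag > thresh).astype(float)  # binarize at the chosen threshold
    return mag, edges
```

Since convolution is associative, blurring first and then differentiating gives the same result as convolving once with a derivative of Gaussian kernel, which is what the next step exploits.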
Now, I was able to set a threshold at which all of the true edges are easily visible, but the noise in the grass is not.

Next, I created a derivative of Gaussian (DoG) filter in each direction by convolving the Gaussian filter with each of the finite difference operators. (Note that these results are normalized for visualization.)

![Derivative of Gaussian in x](out/dog_x.png width=110)![Derivative of Gaussian in y](out/dog_y.png width=110)

Then, I convolved the cameraman image with each of these filters, and combined the resulting images into a gradient magnitude image. The results were almost exactly the same as first convolving the image with the Gaussian filter and then applying the finite difference operators, since convolution is commutative and associative.

![DoG gradient magnitude](out/dog_cameraman_grad_mag.png width=200)![DoG binarized gradient](out/dog_cameraman_edges.png width=200)

Image Straightening
-------------------------------------------------------------------------------

It would be great to be able to automatically straighten a photo so that it has as many vertical and horizontal edges as possible. To do this, I used the following method:

1. For an image, test rotation angles from -8 degrees to 8 degrees, in 1-degree increments. For each angle, do steps 2-7.
2. Rotate the image by the proposed angle.
3. Crop 15% from each side of the image so that we are using roughly the middle 70% of the image.
4. Compute the edges of the image by blurring it and then applying the finite difference operators, then binarize the result to find the true edges.
5. Compute the gradient angles over the entire image.
6. Treat the edges computed in step 4 as a mask, and keep the gradient angles from step 5 only at the edges.
7. From the gradient angles at the edges, count the number of pixels whose angles are within half a degree of -180, -90, 0, 90 or 180 degrees.
   Divide this count by the total number of edge pixels, so that we are comparing the *percentage* of "straight" angles, since the number of edge pixels can vary across rotations.
8. Pick the rotation that yields the maximum fraction of near-straight edges.

Here are some results, along with their orientation histograms (note that rotated images are cropped):

Facade image, rotated by -3 degrees

![facade - before](facade.jpg width=300)![facade - rotated](out/facaderotated_-3.png width=300)
![facade histogram - before](out/facadehist_0.png width=300)![facade histogram - rotated](out/facadehist_-3.png width=300)

Bodega image, rotated by -7 degrees

![bodega - before](cava.jpg width=300)![bodega - rotated](out/cavarotated_-7.png width=300)
![bodega histogram - before](out/cavahist_0.png width=300)![bodega histogram - rotated](out/cavahist_-7.png width=300)

Bed image, rotated by -2 degrees

![bed - before](bed.jpg width=300)![bed - rotated](out/bedrotated_-2.png width=300)
![bed histogram - before](out/bedhist_0.png width=300)![bed histogram - rotated](out/bedhist_-2.png width=300)

The following image is a failure case, because it does not contain many edges that should reasonably be "straight". It is rotated by -8 degrees:

![montserrat - before](montserrat.jpg width=300)![montserrat - rotated](out/montserratrotated_-8.png width=300)
![montserrat histogram - before](out/montserrathist_0.png width=300)![montserrat histogram - rotated](out/montserrathist_-8.png width=300)

Part 2: Fun with Frequencies
=======

Image Sharpening
-----------------

Here is the original image sharpened both with the subtract-and-add-back-high-frequencies method, and with the single unsharp mask filter.
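Both sharpening variants can be sketched as follows; the unsharp mask filter $(1+\alpha)e - \alpha g$ (with $e$ the unit impulse) folds the blur-subtract-add pipeline into a single convolution. The kernel size and $\sigma$ below are illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian2d(ksize=11, sigma=2.0):
    """Normalized separable Gaussian kernel."""
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def sharpen_two_step(im, alpha=0.3):
    """Blur, subtract to isolate the high frequencies, add them back scaled by alpha."""
    blurred = convolve2d(im, gaussian2d(), mode="same", boundary="symm")
    return im + alpha * (im - blurred)

def sharpen_unsharp_filter(im, alpha=0.3):
    """Same result from a single convolution with (1 + alpha)*e - alpha*G."""
    G = gaussian2d()
    e = np.zeros_like(G)
    e[G.shape[0] // 2, G.shape[1] // 2] = 1.0  # unit impulse
    return convolve2d(im, (1 + alpha) * e - alpha * G, mode="same", boundary="symm")
```

(For display, the result is clipped back to the valid intensity range.)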
![taj image original](taj.jpg width=300)![taj image high frequencies](out/sharpening_details.jpg width=300)
![taj image sharpened with addition of high details, $\alpha = 0.3$](out/taj_sharpened_subtraction.jpg width=300)![sharpened with unsharp mask filter, $\alpha = 0.3$](out/taj_unsharp_mask.jpg width=300)

I then played around with increasing the alpha value to get even more "sharpness".

![sharpened with unsharp mask, $\alpha = 1.0$](out/taj_unsharp_mask_alpha_1.jpg width=300)![sharpened with unsharp mask, $\alpha = 4.0$](out/taj_unsharp_mask_alpha_4.jpg width=300)

Next, I tried my sharpening on a few more images.

![original image](montserrat.jpg width=300)![sharpened image, $\alpha=1.0$](out/montserrat_sharp.jpg width=300)
![original image](zgz.jpg width=300)![sharpened image, $\alpha=1.0$](out/zgz_sharp.jpg width=300)

Finally, I tried first blurring an image and then "sharpening" it. Interestingly, the result does not look like the original image, even though the larger edges become more prominent: much of the high-frequency information is lost in the blurring, and this sharpening trick cannot recover it.

![blurred image](out/zgz_blurred.jpg width=300)![sharpened image, $\alpha=4.0$](out/zgz_blurred_sharp.jpg width=300)

Hybrid images
---------------

Here are my example results. For all of them, I played around with the low- and high-pass filter sigma values.
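The construction itself is just the sum of one image's low frequencies and the other's high frequencies; a minimal grayscale sketch (the $\sigma$ defaults here are per-example tuning knobs, not fixed values):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of im_low plus high frequencies of im_high.

    sigma_low sets the low-pass cutoff; sigma_high sets the high-pass cutoff."""
    low = gaussian_filter(im_low, sigma_low)                # keep coarse structure
    high = im_high - gaussian_filter(im_high, sigma_high)   # keep fine detail
    return np.clip(low + high, 0.0, 1.0)
```

Viewed up close the high-frequency image dominates; from far away only the low frequencies survive.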
Barack Obama and Kamala Harris

* Low frequency: Harris, $\sigma = 6$
* High frequency: Obama, $\sigma = 3$

![Obama](out/obama_smile.png width=200)![Harris](out/harris.png width=200)
![Obama-Harris hybrid](out/obama_harris_hybrid.png width=400)

Barack Obama frowning and smiling

* Low frequency: Obama frowning, $\sigma = 2$
* High frequency: Obama smiling, $\sigma = 1.5$

![obama smiling](out/obama_smile.png width=200)![obama frowning](out/obama_frown.png width=200)
![Obama expression hybrid](out/obama_expression_hybrid.png width=400)

Class example

* Low frequency: Derek, $\sigma = 8$
* High frequency: Nutmeg, $\sigma = 9$

![Derek](out/derek.png width=200)![Nutmeg](out/nutmeg.png width=200)
![Derek-Nutmeg hybrid](out/derek_nutmeg_hybrid.png width=400)

Failure case - Obama and Biden

* Low frequency: Biden, $\sigma = 3$
* High frequency: Obama, $\sigma = 3$

![Obama](out/obama_smile.png width=200)![Biden](out/biden.png width=200)
![Obama-Biden hybrid](out/obama_biden_hybrid.png width=400)

It was hard to get a good alignment of their faces using only two points for the transformation, since their face shapes and features don't align well. As a result, the images didn't blend nicely into each other, and the effect wasn't as convincing as some of the others. I did play with the $\sigma$ values until it worked reasonably well, but I was still less convinced by this example.

Here are the FFT images from my Obama changing-expression example.

![obama frowning](out/im1_fft.png width=200)![obama frowning, low pass filtered](out/im1_lp_fft.png width=200)
![obama smiling](out/im2_fft.png width=200)![obama smiling, high pass filtered](out/im2_hp_fft.png width=200)
![hybrid image](out/hybrid_fft.png width=300)

We can clearly see the low-pass filter removing high frequencies, while the high-pass filter removes low frequencies and keeps high ones. The hybrid image then contains both sets of frequencies.

I also made some color hybrid images.
I found that it worked best to use color for both parts, and the effect looked better than in grayscale!

![Obama-Harris hybrid](out/obama_harris_hybrid_color.png width=400)
![Obama expression hybrid](out/obama_expression_color.png width=400)

Gaussian and Laplacian Stacks
---------------------------------

Here are Gaussian and Laplacian stacks for the Dali painting. The Gaussian stack is in the top row, and the Laplacian stack is in the bottom row.

![Dali painting stacks](out/lincoln_stacks.png width=600)

And here they are for Mona Lisa's face.

![Mona Lisa painting stacks](out/mona_lisa_stacks.png width=600)

Here are the Gaussian and Laplacian stacks for a hybrid image.

![Hybrid image stacks](out/hybrid_stacks.png width=600)

Multiresolution Blending
-------------------

For bells and whistles, and for the fun effect, I implemented multiresolution blending in color. Here is my oraple:

![apple](apple.jpeg width=200)![orange](orange.jpeg width=200)![mask](out/mask1.png width=200)
![Oraple](out/oraple.png width=600)

Here is a blend of the same scene photographed in fall and in winter. Since the images aren't perfectly aligned, we get some ghosting artifacts, but the effect is still pretty nice.

![Fall](dartmouth_fall.jpg width=200)![Winter](dartmouth_winter.jpg width=200)![mask](out/mask2.png width=200)
![Changing seasons](out/blended_seasons.png width=600)

And here is a photo I took of the mountains in Montserrat, blended with a starry sky. For this one, I used an irregular mask that matched the shape of the mountains:

![Stars](stars.jpg width=200)![Mountains](montserrat.jpg width=200)![mask](out/mask_mountain.png width=200)
![Mountains blended with stars](out/blended_mountains.png width=600)

Obviously, the result is somewhat unrealistic, since mountains actually lit under a starry sky would have completely different lighting, but I think it's a cool effect!
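The stacks and the blend used in this section can be sketched as follows (the number of levels and $\sigma$ are illustrative; for color, the same operation is applied per channel):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, sigma=2.0):
    """Repeatedly blur without downsampling (a stack, not a pyramid)."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=5, sigma=2.0):
    """Band-pass layers (differences of adjacent Gaussian levels), with the
    final Gaussian level appended so the stack sums back to the image."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def multires_blend(im1, im2, mask, levels=5, sigma=2.0):
    """Blend each Laplacian band under a progressively smoother mask, then sum."""
    l1 = laplacian_stack(im1, levels, sigma)
    l2 = laplacian_stack(im2, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)
    return sum(m * a + (1.0 - m) * b for m, a, b in zip(gm, l1, l2))
```

Because the Laplacian stack sums back to the original image, a mask of all ones simply returns the first image; intermediate masks get softer seams at coarser bands, which is what hides the transition.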
To get the mask, I used scikit-image's implementation of Otsu thresholding to separate the foreground from the background, and then manually cleaned up the noise. Here's a visualization of some levels of the blending:

![Mountains level 1](out/ls1_0.png width=200)![Mountains level 3](out/ls1_2.png width=200)![Mountains level 5](out/ls1_4.png width=200)
![Stars level 1](out/ls2_0.png width=200)![Stars level 3](out/ls2_2.png width=200)![Stars level 5](out/ls2_4.png width=200)
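For reference, Otsu's method picks the threshold that maximizes the between-class variance of the intensity histogram. I used scikit-image's `threshold_otsu` directly; a self-contained numpy sketch of the same idea (the manual cleanup step is not shown) might look like:

```python
import numpy as np

def otsu_threshold(im, nbins=256):
    """Threshold maximizing between-class variance of the intensity histogram
    (the quantity that skimage.filters.threshold_otsu also maximizes)."""
    counts, bin_edges = np.histogram(im.ravel(), bins=nbins)
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    w0 = np.cumsum(counts)                    # pixels in the lower class
    w1 = np.cumsum(counts[::-1])[::-1]        # pixels in the upper class
    # class means on either side of each candidate split (guard empty classes)
    m0 = np.cumsum(counts * centers) / np.maximum(w0, 1)
    m1 = np.cumsum((counts * centers)[::-1])[::-1] / np.maximum(w1, 1)
    var_between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(var_between)]

def otsu_mask(im):
    """Binary foreground mask from Otsu's threshold."""
    return (im > otsu_threshold(im)).astype(float)
```

On a clearly bimodal image (bright sky vs. dark mountains), this separates the two modes; the irregular boundary then comes for free from the thresholded mask.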