Minjune Hwang, SID: 3032674400
This project develops a program that takes glass plate images captured through red, green, and blue filters and aligns the three channel images into a single color image using image processing techniques. I used the Prokudin-Gorskii glass plate images as input and generated colorized versions of them.
We first divide the glass plate image created with BGR filters into three parts corresponding to blue, green, and red. I simply (roughly) divided it into equally sized parts and generated an unaligned version of each image to compare later with the aligned version. Then, I examined different search methods and different metrics to find the ones that best produce the colorized image. I aligned the green-channel image and the red-channel image to the blue channel so that the aligned R and G channel images can be superimposed onto the original B channel image.
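The rough split described above can be sketched as follows. This is a minimal illustration, not the code used for the results: the function names `split_plate` and `stack_unaligned` are hypothetical, and it assumes the plate is a 2D array with the B, G, and R exposures stacked from top to bottom.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into equal B, G, R thirds.

    Assumes the exposures are stacked top to bottom in B, G, R order.
    """
    height = plate.shape[0] // 3      # leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_unaligned(b, g, r):
    """Stack the channels in RGB order with no alignment, for comparison."""
    return np.dstack([r, g, b])
```

Because the split is only approximate, each third still contains part of the plate border, which is why the alignment and cropping steps matter.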
For low-resolution images (i.e. the jpg files), an exhaustive search worked perfectly. We score how well the images match for every possible displacement in x and y, each ranging from -15 to 15 pixels. To measure how well the shifted image matches the target image, I tried two different metrics, SSD and NCC. SSD (sum of squared differences) measures the squared l-2 distance between the two images, while NCC measures the normalized correlation (i.e. cosine similarity) of the two images. I got almost identical results with both metrics, so I decided to use SSD since it was slightly faster to compute. I evaluated the metrics only on the central (interior) part of the images so that the shifted edges do not affect the score. The chosen displacement is the one that minimizes the distance between the shifted image and the target image.
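The two metrics and the exhaustive search can be sketched as below. This is an illustrative sketch under stated assumptions: the helper names (`crop_center`, `ssd`, `ncc`, `align_exhaustive`), the 10% interior crop, and the use of `np.roll` for shifting are my choices here, not details confirmed by the write-up. The images are assumed to be 2D float arrays.

```python
import numpy as np

def crop_center(img, frac=0.1):
    """Keep the interior, dropping `frac` of each dimension on every side.

    The dropped border should be wider than the largest shift so that
    pixels wrapped around by np.roll never enter the score.
    """
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def ssd(a, b):
    """Sum of squared differences (squared l-2 distance); lower is better."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Cosine similarity of the flattened images; higher is better."""
    a = a.ravel() / np.linalg.norm(a)
    b = b.ravel() / np.linalg.norm(b)
    return float(np.dot(a, b))

def align_exhaustive(channel, target, radius=15):
    """Try every (dx, dy) in [-radius, radius]^2 and keep the lowest SSD.

    dx shifts along the height (rows), dy along the width (columns).
    """
    best_score, best_shift = np.inf, (0, 0)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            shifted = np.roll(channel, (dx, dy), axis=(0, 1))
            score = ssd(crop_center(shifted), crop_center(target))
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

To use NCC instead, the search would keep the displacement with the highest score rather than the lowest. With a radius of 15 the loop evaluates 31 × 31 = 961 candidate shifts, which is cheap for the jpg images.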
For high-resolution images (i.e. the tif files), we cannot exhaustively search displacements because the images contain too many pixels. Thus, I used an image pyramid, which represents an image at multiple levels, each downscaled by a factor of 2. Starting from the coarsest (lowest-resolution) level, it finds the best alignment and then moves on to the next finer level. Since this essentially builds up the displacement like a binary representation, it significantly shrinks the search space: at each level I searched only in the range [-1, 1], and I used 6 or 7 levels, because some images were not aligned correctly with 6 levels. As mentioned above, I used the SSD over the central part of the shifted and target images to measure how poorly they match.
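A coarse-to-fine search of this kind can be sketched as follows. This is a minimal sketch, not the implementation behind the results: the names `center_ssd`, `downsample`, and `align_pyramid` are hypothetical, the 2×2 block averaging is a simple stand-in for whatever resizing was actually used, and `levels` here counts the coarser levels in addition to full resolution, which may differ from the level convention in the tables.

```python
import numpy as np

def center_ssd(a, b, frac=0.1):
    """SSD over the interior only, ignoring a border of `frac` per side."""
    h, w = a.shape
    dh, dw = int(h * frac), int(w * frac)
    diff = a[dh:h - dh, dw:w - dw] - b[dh:h - dh, dw:w - dw]
    return float(np.sum(diff ** 2))

def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def align_pyramid(channel, target, levels=6, radius=1):
    """Estimate the shift at half resolution first, then refine locally."""
    if levels > 0:
        cx, cy = align_pyramid(downsample(channel), downsample(target),
                               levels - 1, radius)
        base = (2 * cx, 2 * cy)   # a coarse pixel spans two finer pixels
    else:
        base = (0, 0)             # coarsest level: search around zero
    best_score, best_shift = np.inf, base
    for dx in range(base[0] - radius, base[0] + radius + 1):
        for dy in range(base[1] - radius, base[1] + radius + 1):
            shifted = np.roll(channel, (dx, dy), axis=(0, 1))
            score = center_ssd(shifted, target)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Under this sketch's convention, each level contributes one of {-1, 0, 1} times its scale, so `levels=6` can reach displacements up to 1 + 2 + ... + 64 = 127 pixels and `levels=7` up to 255. That would be consistent with some images needing the deeper pyramid when their true offset exceeds the smaller range, although the exact level convention used for the results is not stated. Each level evaluates only 9 candidates, so the total cost is dominated by the full-resolution level.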
Contents
For small images, I used an exhaustive search over a displacement window of [-15, 15].
Here, offsets are calculated relative to the B (blue) channel; they are the displacements I applied to the R and G channel images. In each offset, x is the shift along the image height and y is the shift along the image width.
Unaligned | Aligned | Offset (x=h, y=w) |
---|---|---|
| | | R: (12, 3), G: (5, 2) |
Unaligned | Aligned | Offset (x=h, y=w) |
---|---|---|
| | | R: (3, 2), G: (-3, 2) |
Unaligned | Aligned | Offset (x=h, y=w) |
---|---|---|
| | | R: (7, 3), G: (3, 3) |
For large images, I used an image pyramid with a factor of two; I used 6 or 7 levels, and at each level I searched over a displacement window of [-1, 1].
Depending on the image, I had to use a different number of pyramid levels to align it correctly; the level used is listed in the table for each image.
"emir.tif" was the only image that I couldn't align correctly this way; I had to use edge detection to produce the correct version (see the bells and whistles).
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (71, 39), G: (48, 24) | 6 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (123, 14), G: (59, 17) | 6 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (89, 23), G: (41, 17) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (114, 12), G: (55, 8) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (178, 14), G: (81, 10) | 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (108, 37), G: (51, 27) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (175, 37), G: (78, 29) | 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (111, 12), G: (52, 14) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (86, 32), G: (42, 6) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (137, 22), G: (64, 12) | 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (106, -12), G: (53, -1) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (28, 27), G: (10, 18) | 6 & 7 |
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (92, 30), G: (18, 18) | 6 & 7 |
In addition to the base project, I implemented some extra ideas to improve the results: edge-based alignment, automatic border cropping, and automatic contrasting.
I used the Sobel transform to detect edges in the image and found the displacements that minimize the SSD between the edge-transformed versions of the shifted (R & G) images and the target (B) image.
"emir.tif" was the only image that I couldn't align correctly in the previous part, so I used edge detection to align its channel images, and with it I could correctly align (colorize) emir. I used the align_edges_pyramid function, which aligns on edge maps using an image pyramid.
Unaligned | Aligned | Offset (x=h, y=w) | Level |
---|---|---|---|
| | | R: (107, 40), G: (49, 24) | 6 & 7 |
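The idea of matching edge maps instead of raw intensities can be sketched as below. This is not the align_edges_pyramid function itself: `sobel_magnitude` and `align_on_edges` are hypothetical names, the Sobel filter is written directly in NumPy, and for brevity the search is a single-level exhaustive search rather than a pyramid.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels (edge-padded borders)."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    return np.hypot(gx, gy)

def align_on_edges(channel, target, radius=15, frac=0.1):
    """Exhaustive SSD search on edge maps instead of raw intensities."""
    ce, te = sobel_magnitude(channel), sobel_magnitude(target)
    h, w = te.shape
    dh, dw = int(h * frac), int(w * frac)   # interior crop, as before
    best_score, best_shift = np.inf, (0, 0)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            shifted = np.roll(ce, (dx, dy), axis=(0, 1))
            diff = shifted[dh:h - dh, dw:w - dw] - te[dh:h - dh, dw:w - dw]
            score = float(np.sum(diff ** 2))
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Edge maps depend on where intensities change rather than on the intensities themselves, so two channels can still be matched when the same object is bright in one and dark in the other, which is a plausible reason raw SSD struggles on an image like emir.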
Again, I used the Sobel transform to detect edges, this time to automatically crop the border of the image. I did this by detecting the edge between the border and the image content, as shown in the crop_border function. I found where the vertical and horizontal averages of the edge-transformed side parts of the image rise above a certain threshold (between 0.02 and 0.04), and I used those points to cut the image.
Base (Aligned) | Cropped |
---|---|
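The thresholding step described above can be sketched as follows. This is an illustration of the idea rather than the crop_border function itself: the name `crop_border_sketch`, the 10% side strip, the default threshold of 0.03, and the Sobel scaling (divided by 4 so a unit step gives a response of about 1) are all assumptions, and a suitable threshold depends on how the edge map is scaled.

```python
import numpy as np

def sobel_magnitude(img):
    """Sobel gradient magnitude, scaled so a unit step gives roughly 1."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])) / 4.0
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])) / 4.0
    return np.hypot(gx, gy)

def crop_border_sketch(img, threshold=0.03, side=0.1):
    """Cut each side just inside the innermost strong edge line near it."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    edges = sobel_magnitude(gray)
    row_mean = edges.mean(axis=1)   # one average per row
    col_mean = edges.mean(axis=0)   # one average per column
    h, w = gray.shape
    mh, mw = int(h * side), int(w * side)

    def innermost(profile, margin):
        # Look only at the strip near the edge; return the cut position
        # just past the last line whose average exceeds the threshold.
        hits = np.nonzero(profile[:margin] > threshold)[0]
        return int(hits[-1]) + 1 if hits.size else 0

    top = innermost(row_mean, mh)
    bottom = h - innermost(row_mean[::-1], mh)
    left = innermost(col_mean, mw)
    right = w - innermost(col_mean[::-1], mw)
    return img[top:bottom, left:right]
```

Restricting the search to a strip near each side keeps strong edges inside the photograph from being mistaken for the border.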
Automatic contrasting is done by rescaling the RGB values so that the darkest value (minimum) is mapped to 0 and the brightest value (maximum) is mapped to 1. I used a linear mapping to automatically contrast the image.
I chose harvesters.tif, as I thought the color balance of that photo was poor. Unfortunately, this had a minimal effect on the image; it seems that we would instead have to stretch the intensity distribution so that it becomes closer to uniform.
Base (Aligned & Cropped) | Adjusted |
---|---|
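The linear mapping can be sketched in a few lines. The function name `auto_contrast` is hypothetical, and taking one global minimum and maximum over all channels (rather than per channel) is an assumption.

```python
import numpy as np

def auto_contrast(img):
    """Linearly map the darkest value to 0 and the brightest value to 1."""
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:                      # flat image: nothing to stretch
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)
```

This also suggests why the effect was minimal: if the image already contains a few pixels near 0 and a few near 1, the mapping is close to the identity and the bulk of the intensity distribution does not move.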