Parameters: shift (number of pixels to roll per step) = 10 and num_shifts (number of shifts per search space) = 10 by default. To keep the run time under 1 minute per image, num_shifts = 4 was used.
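As a rough illustration of how these two parameters could drive the search, here is a minimal single-level sketch. It assumes the candidate displacements are multiples of shift in the range -num_shifts to +num_shifts on each axis, and that channels are rolled with np.roll. The function names ncc and align are hypothetical, and the actual pyramid implementation may combine the parameters differently.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(channel, reference, shift=10, num_shifts=4):
    """Return the (row, col) roll that maximizes NCC against reference.

    Assumption: candidates are i * shift for i in [-num_shifts, num_shifts].
    """
    best_score, best_offset = -np.inf, (0, 0)
    steps = range(-num_shifts, num_shifts + 1)
    for i in steps:
        for j in steps:
            dy, dx = i * shift, j * shift
            # np.roll wraps pixels around the border instead of padding.
            rolled = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(rolled, reference)
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

Because the search only visits a coarse grid, the true optimum can fall between candidates, which is one way a search like this can settle on a local rather than global NCC maximum.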
Challenges
It was hard to align the red channel because the x and y shifts were taken from a local NCC maximum rather than the global maximum. This is clearly visible in lady.tif.
No single set of global parameters aligned all of the images. The solution depended heavily on the similarity measure (NCC), and the effect of window size was greatly magnified in the pyramid solution (see icon.tif).
Colorizing the images worked reasonably well overall, but was especially tricky for large images.
Additionally, red and blue are far enough apart in the color spectrum that NCC between them did not produce a distinct enough optimum. A fix was to align green to blue, then red to green (with green already aligned), because red and green values correlate better.
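The chained alignment described above can be sketched as follows. This is only an illustration: align_chain and its align_fn argument are hypothetical names, and align_fn stands for any routine that returns the (row, col) roll best matching a channel to a reference.

```python
import numpy as np

def align_chain(r, g, b, align_fn):
    """Align green to blue, then red to the already-aligned green.

    align_fn(channel, reference) is a hypothetical callable returning the
    (row, col) roll that best matches channel to reference.
    """
    g_offset = align_fn(g, b)
    g_aligned = np.roll(g, g_offset, axis=(0, 1))
    # The aligned green already sits in blue's frame, so the red offset
    # found here is also expressed relative to blue.
    r_offset = align_fn(r, g_aligned)
    r_aligned = np.roll(r, r_offset, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), r_offset, g_offset
```

Since blue is never moved, its offset stays at (0, 0), which matches how the offsets are reported in the results.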
Results
Example
Small (jpg)
Right: Output Cathedral
Red: (10, 0), Green: (0, 0), Blue: (0, 0)
Right: Output Monastery
Red: (10, 0), Green: (0, 0), Blue: (0, 0)
Right: Output Nativity
Red: (10, 0), Green: (10, 0), Blue: (0, 0). This failed to align well because of the high exposure.
Right: Output Settlers
Red: (20, 0), Green: (10, 0), Blue: (0, 0)
Large (tif)
Right: Output Emir
Red: (80, 10), Green: (30, -10), Blue: (0, 0)
Right: Output Harvesters
Red: (90, -10), Green: (60, 0), Blue: (0, 0). This didn't align because of the high exposure.
Right: Output Icon
Red: (90, 10), Green: (20, 20), Blue: (0, 0)
Right: Output Lady
Red: (90, -40), Green: (40, -10), Blue: (0, 0)
Right: Output Self Portrait
Shift: 15, Red: (135, 0), Green: (60, 0), Blue: (0, 0). This aligned, but not well. It seems that because there is a lot of green in the picture, the red and green channels correlate less with the blue channel and so align to it less precisely.
Right: Output Three Generations
Red: (90, 0), Green: (70, -10), Blue: (0, 0)
Right: Output Train
Red: (90, -20), Green: (20, 0), Blue: (0, 0)
Right: Output Turkmen
Red: (90, 10), Green: (50, 0), Blue: (0, 0)
Right: Output Village
Red: (90, 20), Green: (60, -10), Blue: (0, 0). This didn't align well, possibly because the highly exposed sky and the noticeably darker land bias NCC.
Chosen
Right: Output 00451u (tif)
Red: (90, -20), Green: (70, -10), Blue: (0, 0)
Right: Output 00998u (tif)
Red: (90, 0), Green: (60, 0), Blue: (0, 0). This didn't align well; it looks like the red channel didn't correlate strongly enough with either the blue or the green channel to shift in the right directions.
Right: Output 1520u (tif)
Red: (70, 0), Green: (20, 0), Blue: (0, 0)
Bells and Whistles
Automatic Cropping
Automatic Cropping with Canny Edge Detector
A Canny edge detector was used to find pixel positions that mark edges (a change between True and False values). To smooth out noise, a Gaussian blur (sigma=1) was applied to the image before it was cropped. Since color channel alignment left white borders on the output images, cropping removed the white borders first. The crop could be tuned with an "aggressiveness" level by looking for the next edge (moving towards the center of the image).
Aligned image with white borders.
Aligned image without white borders (cropped at the first set of edges).
Using Sobel featurization for cropping was very similar to using Canny edge detection.
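One way to turn a boolean edge map into crop bounds, including the "aggressiveness" level, is sketched below. It assumes the edge map has already been computed (for example with a Canny detector such as skimage.feature.canny at sigma=1) and that a border shows up as a row or column in which most pixels are edges. The function name crop_bounds_from_edges and the min_fraction threshold are hypothetical, not taken from the project code.

```python
import numpy as np

def crop_bounds_from_edges(edges, aggressiveness=0, min_fraction=0.5):
    """Find crop bounds (top, bottom, left, right) from a boolean edge map.

    A row or column counts as a border line when at least min_fraction of
    its pixels are edges. aggressiveness=0 picks the outermost border line
    on each side; higher values step to the next one toward the center.
    """
    def pick(profile):
        n = profile.size
        lines = np.flatnonzero(profile >= min_fraction)
        low = lines[lines < n // 2]           # near side, outermost first
        high = lines[lines >= n // 2][::-1]   # far side, outermost first
        start = low[min(aggressiveness, low.size - 1)] + 1 if low.size else 0
        stop = high[min(aggressiveness, high.size - 1)] if high.size else n
        return int(start), int(stop)

    top, bottom = pick(edges.mean(axis=1))   # fraction of edge pixels per row
    left, right = pick(edges.mean(axis=0))   # fraction of edge pixels per column
    return top, bottom, left, right
```

The cropped image would then be image[top:bottom, left:right]. If a side has fewer border lines than the requested aggressiveness, the sketch falls back to the innermost line found on that side.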
Automatic Cropping with Gradient
Compute two 1-D signals by taking the mean of all three channels along the x and y axes, respectively. Calculate the gradient of each signal, then smooth the gradient to get a more clearly defined boundary/edge.
Gradient (in blue), smoothed gradients (in red) across x axis.
Take the absolute value of the gradient signal for each axis. For each border along an axis, the position with the largest magnitude marks the edge. We can tune how aggressively we want to crop by picking the nth largest magnitude instead.
aggro=0 (pick the largest magnitude). Cropping the red channel as a dummy grayscale image.
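The gradient-based procedure above can be sketched roughly as follows. The moving-average smoothing, the window size, and splitting each signal into two halves (one per border) are assumptions made for this illustration. The name gradient_crop_bounds is hypothetical.

```python
import numpy as np

def gradient_crop_bounds(image, aggro=0, window=5):
    """Return (top, bottom, left, right) from smoothed 1-D gradients."""
    def bounds(signal):
        grad = np.gradient(signal)
        kernel = np.ones(window) / window           # moving-average smoothing
        smooth = np.abs(np.convolve(grad, kernel, mode="same"))
        half = signal.size // 2
        # Rank positions in each half by gradient magnitude, largest first.
        first = np.argsort(smooth[:half])[::-1]
        second = np.argsort(smooth[half:])[::-1] + half
        return int(first[aggro]), int(second[aggro])

    # Average the color channels if present, then average along each axis.
    gray = image.mean(axis=2) if image.ndim == 3 else image
    top, bottom = bounds(gray.mean(axis=1))
    left, right = bounds(gray.mean(axis=0))
    return top, bottom, left, right
```

With aggro=0 the largest magnitude in each half is used. Larger values pick the nth largest, which often lies right next to the strongest peak, so the crop moves only gradually.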
Auto Contrast
I cropped the image, then averaged the aligned r, g, b channels into a grayscale image. The approach is to rescale the intensity (brightness) of the grayscale image so that the darkest pixel (on the darkest color channel) maps to 0 and the brightest pixel (on the brightest color channel) maps to 1.
Left: Before, Right: After
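A minimal sketch of this kind of contrast stretch is shown below. It assumes a single linear rescale using the minimum and maximum taken over all channels, so the darkest value on any channel maps to 0 and the brightest to 1. The function name auto_contrast is hypothetical.

```python
import numpy as np

def auto_contrast(image):
    """Linearly stretch intensities so the global min maps to 0 and max to 1."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image has no contrast to stretch.
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)
```

Using one scale for all channels keeps the color balance unchanged. Stretching each channel separately would also shift the colors.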