My first attempt at solving the problem followed the project specification exactly: I implemented a sum of squared differences (SSD) metric to measure similarity, which is minimized to find the best alignment. Next, I tried a normalized correlation (NC) metric, which is maximized instead.
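To make the two metrics concrete, here is a minimal NumPy sketch. The function names `ssd` and `ncc` are hypothetical, and subtracting the mean before correlating is an assumption of this sketch; the original write-up does not say whether the channels were mean-centered.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Normalized correlation; higher means more similar (at most 1.0)."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    # Assumption: mean-center each channel so brightness offsets do not matter.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant image has zero norm; treat it as uncorrelated.
    return float(a @ b / denom) if denom > 0 else 0.0
```

When searching over candidate offsets, SSD is used with an argmin and NC with an argmax, so one simple design is to negate NC and minimize in both cases.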
First, I cropped the border (roughly 8%) off the image. Then I implemented a pyramid search, halving the problem size at each step and returning [0, 0] for the offset once the image became smaller than 30x30. I used a search window spanning [-2, 4] to [2, -4] at each recursive step. In experimenting with different window sizes, I found that my images had more trouble aligning along the y-coordinate, and increasing the search space along that dimension improved the matching.
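The recursive search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's code: the helper names are hypothetical, downscaling is done by 2x2 block averaging, offsets are (rows, columns), and the window is read as ±4 rows by ±2 columns.

```python
import numpy as np

def ssd(a, b):
    return float(np.sum((a - b) ** 2))

def crop_border(img, frac=0.08):
    """Drop roughly `frac` of the image on every side before scoring."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (an assumed method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(moving, ref, min_size=30, win=(4, 2)):
    """Return the (dy, dx) shift of `moving` that best matches `ref`."""
    if min(ref.shape) < min_size:
        return (0, 0)  # base case: image too small to search
    # Estimate the shift at half resolution, then scale it back up.
    cy, cx = pyramid_align(downscale2(moving), downscale2(ref), min_size, win)
    cy, cx = 2 * cy, 2 * cx
    best, best_score = (cy, cx), np.inf
    # Refine within a small window around the coarse estimate.
    for dy in range(cy - win[0], cy + win[0] + 1):
        for dx in range(cx - win[1], cx + win[1] + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ssd(crop_border(shifted), crop_border(ref))
            if score < best_score:
                best, best_score = (dy, dx), score
    return best
```

Because each level only searches a small window, the total cost stays close to that of a few full-resolution comparisons, while the reachable offset doubles with every pyramid level.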
I attempted two different bells and whistles.
First, finding edges. For this, I used the Canny edge filter. This edge detector first applies a Gaussian filter to blur the image and suppress noise. Next, it calculates the gradients to detect rapid changes in intensity. Finally, it determines the orientation of each edge and keeps only the edges whose strength is above some threshold. The main parameter that I modified here was the standard deviation (sigma) of the Gaussian filter. In my exploration, I did not find the standard deviation to have any significant visible impact. Note: the edges-only images are shown without the borders, since those are always edges, which explains the odd sizes.
Green offsets, in groups of four (apparently one group per example image, in the order they appeared):
Green Offset: [5, 2], [5, 2], [5, 2], [5, 2]
Green Offset: [49, 24], [49, 24], [49, 23], [49, 23]
Green Offset: [60, 16], [60, 16], [357, 1714], [60, 18]
Green Offset: [40, 17], [40, 17], [38, 16], [41, 16]
Green Offset: [55, 8], [55, 8], [56, 10], [57, 9]
Green Offset: [-3, 2], [-3, 2], [-3, 2], [-3, 2]
Green Offset: [3, 1], [3, 1], [3, 1], [3, 1]
Green Offset: [79, 29], [79, 29], [77, 29], [80, 31]
Green Offset: [7, 0], [7, 0], [7, 0], [7, 0]
Green Offset: [54, 14], [54, 14], [56, 12], [57, 17]
Green Offset: [44, 6], [44, 6], [40, 8], [40, 8]
Green Offset: [56, 21], [56, 21], [57, 22], [836, -621]
Green Offset: [65, 12], [65, 12], [67, 12], [64, 10]
church Green Offset: [55, 16]
gifts Green Offset: [26, 23]
levee Green Offset: [14, 11]
railway Green Offset: [66, 20]
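For the edge feature, the sketch below covers only the first two Canny stages (Gaussian smoothing, then gradient magnitude) in plain NumPy. It is a simplification, not the full Canny detector: the thinning and thresholding stages are omitted, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 standard deviations."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: filter along rows, then along columns."""
    k = gaussian_kernel(sigma)
    img = img.astype(np.float64)
    # mode="same" zero-pads, so the image border itself shows up as an edge.
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def gradient_magnitude(img, sigma=1.0):
    """Edge strength: magnitude of the gradient of the blurred image."""
    gy, gx = np.gradient(gaussian_blur(img, sigma))
    return np.hypot(gx, gy)
```

Aligning on an edge map like this instead of raw intensities makes the metric less sensitive to the brightness differences between color channels. A library routine such as scikit-image's `skimage.feature.canny`, which takes a `sigma` argument, performs the full pipeline; the write-up does not say which implementation was used.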
Second, automatically adjusting the contrast of the images. The technique I used was adaptive histogram equalization, which looks at the histogram of the pixel values (over local regions of the image, in the adaptive variant) and stretches them to occupy the entire 0-255 pixel range. In my opinion, the high-contrast images almost always looked the best!
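To illustrate the histogram-stretching idea, here is a sketch of plain global histogram equalization for a single 8-bit channel. Note that this is the simpler, non-adaptive variant, swapped in for brevity; the adaptive method applies the same mapping per local tile. The function name is hypothetical, and applying it to each color channel separately would be a further assumption.

```python
import numpy as np

def equalize_histogram(img):
    """Global histogram equalization for a uint8 grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest pixel present
    denom = cdf[-1] - cdf_min
    if denom == 0:
        return img.copy()          # constant image: nothing to stretch
    # Map each intensity through the normalized CDF onto 0..255.
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255)
    return lut.astype(np.uint8)[img]
```

For the adaptive version, scikit-image provides `skimage.exposure.equalize_adapthist`; whether that is the routine used for the results here is not stated.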
Examples
NC only
Red Offset: [0, 0]
Time: 0.10738587379455566
SSD only
Red Offset: [0, 0]
Time: 0.10652327537536621
SSD + 1sig edges
Red Offset: [12, 3]
Time: 0.20093011856079102
NC + 3sig edges
Red Offset: [11, 30]
Time: 0.19147968292236328
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [8, -818]
Time: 12.516111135482788
SSD only
Red Offset: [8, -818]
Time: 13.754069328308105
SSD + 1sig edges
Red Offset: [107, 40]
Time: 19.000343084335327
NC + 3sig edges
Red Offset: [107, 40]
Time: 20.790307760238647
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [123, 13]
Time: 13.211221933364868
SSD only
Red Offset: [123, 13]
Time: 13.29931092262268
SSD + 1sig edges
Red Offset: [477, -1280]
Time: 20.408152103424072
NC + 3sig edges
Red Offset: [540, -1154]
Time: 20.764681100845337
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [89, 23]
Time: 12.891743898391724
SSD only
Red Offset: [89, 23]
Time: 13.350109100341797
SSD + 1sig edges
Red Offset: [90, 22]
Time: 21.427409887313843
NC + 3sig edges
Red Offset: [90, 22]
Time: 21.1442768573761
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [116, 11]
Time: 12.378465175628662
SSD only
Red Offset: [116, 11]
Time: 13.042216777801514
SSD + 1sig edges
Red Offset: [120, 13]
Time: 20.597461938858032
NC + 3sig edges
Red Offset: [120, 13]
Time: 19.158432960510254
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [3, 2]
Time: 0.14462900161743164
SSD only
Red Offset: [3, 2]
Time: 0.16754770278930664
SSD + 1sig edges
Red Offset: [3, 2]
Time: 0.27144694328308105
NC + 3sig edges
Red Offset: [3, 2]
Time: 0.25899696350097656
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [7, 0]
Time: 0.0902400016784668
SSD only
Red Offset: [7, 0]
Time: 0.10136890411376953
SSD + 1sig edges
Red Offset: [8, 0]
Time: 0.2027890682220459
NC + 3sig edges
Red Offset: [7, 0]
Time: 0.19925713539123535
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [176, 36]
Time: 13.444835186004639
SSD only
Red Offset: [176, 36]
Time: 13.952284812927246
SSD + 1sig edges
Red Offset: [175, 37]
Time: 21.82030701637268
NC + 3sig edges
Red Offset: [175, 37]
Time: 20.310746908187866
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [14, -1]
Time: 0.17267799377441406
SSD only
Red Offset: [14, -1]
Time: 0.16895604133605957
SSD + 1sig edges
Red Offset: [14, -1]
Time: 0.26313018798828125
NC + 3sig edges
Red Offset: [15, -1]
Time: 0.2500460147857666
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [111, 11]
Time: 13.114645957946777
SSD only
Red Offset: [111, 11]
Time: 13.328622102737427
SSD + 1sig edges
Red Offset: [111, 8]
Time: 19.907155990600586
NC + 3sig edges
Red Offset: [115, 12]
Time: 19.552978038787842
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [86, 32]
Time: 13.66454005241394
SSD only
Red Offset: [86, 32]
Time: 13.210375785827637
SSD + 1sig edges
Red Offset: [85, 29]
Time: 21.304930686950684
NC + 3sig edges
Red Offset: [85, 34]
Time: 19.69898819923401
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [116, 28]
Time: 14.648415088653564
SSD only
Red Offset: [116, 28]
Time: 13.084060907363892
SSD + 1sig edges
Red Offset: [1183, -990]
Time: 19.927490949630737
NC + 3sig edges
Red Offset: [903, -609]
Time: 18.67224407196045
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
NC only
Red Offset: [37, -80]
Time: 13.821240663528442
SSD only
Red Offset: [37, -80]
Time: 13.436881303787231
SSD + 1sig edges
Red Offset: [138, 22]
Time: 20.5000479221344
NC + 3sig edges
Red Offset: [137, 21]
Time: 18.735404014587402
Blue Channel Edges (1 sig)
Best(Edges, No Edges) + Contrast
Other Examples
Professional Recreation
1sig edges + SSD + Contrast
Red Offset: [131, 25]
Time: 29.131471872329712
Professional Recreation
1sig edges + SSD + Contrast
Red Offset: [71, 33]
Time: 30.516719818115234
Professional Recreation
1sig edges + SSD + Contrast
Red Offset: [39, 19]
Time: 27.758996963500977
Professional Recreation
1sig edges + SSD + Contrast
Red Offset: [133, 16]
Time: 29.60896611213684