I first used the provided get_harris_corners() function to find interest points, then implemented the paper's Adaptive Non-Maximal Suppression (ANMS) algorithm: compute a suppression radius $r$ for each point, then sort by $r$ to keep only a well-distributed subset of the collected interest points. Here's a figure of the Harris corners overlaid on the image:
imgsB, coordsB = single_scale_feature_selection('9b.jpg', min_harris=20, discard=40)
imgsC, coordsC = single_scale_feature_selection('9c.jpg', min_harris=20, discard=40)
plot_interest_points([imgsB], [coordsB], idx=0, layer=0)
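The ANMS step above can be sketched as follows; the function name, signature, and the robustness constant are my own illustration, not the project's exact API:

```python
import numpy as np

def anms(coords, strengths, keep, c_robust=0.9):
    """Adaptive Non-Maximal Suppression (Brown et al.), a minimal sketch.

    For each point, r_i is the distance to the nearest point whose
    robustified corner strength dominates it; keeping the `keep` points
    with the largest r_i spreads survivors across the image.
    """
    n = len(strengths)
    r = np.full(n, np.inf)
    for i in range(n):
        # points that sufficiently dominate point i
        stronger = strengths * c_robust > strengths[i]
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            r[i] = d.min()
    order = np.argsort(-r)  # largest suppression radius first
    return coords[order[:keep]], r
```

A point that no other point dominates gets an infinite radius, so the global maximum always survives.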
For each interest point, in the single-scale case I sampled a 40x40 patch around it, then took every 5th pixel to downsize it to an 8x8 patch; in the multi-scale case (see Bells & Whistles), the patch is sampled around the corresponding point one layer above in the image pyramid.
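The patch extraction can be sketched like this; the helper name is hypothetical, and I assume a grayscale float image with the point at least 20 pixels from every border:

```python
import numpy as np

def extract_descriptor(img, y, x, win=20, step=5):
    """Sample a 40x40 window around (y, x), subsample every 5th pixel
    to get an 8x8 patch, then bias/gain-normalize it."""
    patch = img[y - win:y + win:step, x - win:x + win:step]
    patch = patch - patch.mean()       # remove bias
    std = patch.std()
    return patch / std if std > 0 else patch  # remove gain
```

The bias/gain normalization makes the descriptor robust to brightness and contrast changes between images.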
Below are some of the extracted features:
dessA = descriptor(imgsA, coordsA)
dessB = descriptor(imgsB, coordsB)
dessC = descriptor(imgsC, coordsC)
fig, axs = plt.subplots(1,14,figsize=(15,3))
for i in range(14):
axs[i].imshow(dessA[0][i], cmap="gray")
Here I set a threshold that bounds the acceptable distance between matched descriptors, making use of the provided dist2() function. After matching we are left with even fewer interest points, as shown below (both the matched descriptors and the matched points):
mA, mB, pairs = descriptor_matching(dessA, coordsA, dessB, coordsB, idx=0, thres=6)
plot_matched_points(mA, mB, imgsA, imgsB, layer=0)
show_matched_descriptors(pairs)
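The thresholded nearest-neighbor matching can be sketched as below. I re-implement a stand-in for the provided dist2() so the snippet is self-contained; the actual descriptor_matching() interface differs:

```python
import numpy as np

def dist2(a, b):
    """Pairwise squared Euclidean distances between rows of a and b
    (stand-in for the provided helper)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

def match_descriptors(des1, des2, thres):
    """For each descriptor in des1, find its nearest neighbor in des2
    and keep the pair only if the distance is below the threshold."""
    d = dist2(des1, des2)
    nn = d.argmin(axis=1)
    return [(i, j) for i, j in enumerate(nn) if d[i, j] < thres]
```

Lowering the threshold trades match count for match reliability, which matters later for RANSAC.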
I implemented RANSAC, which scores each candidate homography matrix H by the number of inliers it produces and keeps the best one. I ran 1000+ iterations per matching pass to make sure the point set is sampled sufficiently. I also found it easier to sanity-check by warping a small image patch first and examining the result.
box = (0, 3024, -3000, 1000)
H, auto_warp = auto_warp_pair(mA, imgsA[0], mB, imgsB[0], delta=3, box=box, itr=1500)
left = warpImage(imgsA[0], H, box)
plt.imshow(left)
canvas = np.zeros((3024, 4032+3000, 3))
canvas[:, :4000,:] = left
canvas[:, 3000:,:] = imgsB[0]/imgsB[0].max()
plt.imshow(canvas)
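The warpImage() call above can be sketched as an inverse warp: map every output pixel back through H⁻¹ and look it up in the source. This nearest-neighbor version is an assumption for brevity (a real implementation would interpolate):

```python
import numpy as np

def warp_image(img, H, box):
    """Inverse-warp `img` by homography H into output bounds
    box = (ymin, ymax, xmin, xmax); nearest-neighbor sampling."""
    ymin, ymax, xmin, xmax = box
    ys, xs = np.mgrid[ymin:ymax, xmin:xmax]
    # map every output pixel back into the source image
    src = np.linalg.inv(H) @ np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1)
    sx = (src[0] / src[2]).round().astype(int)
    sy = (src[1] / src[2]).round().astype(int)
    out = np.zeros((ymax - ymin, xmax - xmin) + img.shape[2:])
    ok = (sy >= 0) & (sy < img.shape[0]) & (sx >= 0) & (sx < img.shape[1])
    out.reshape(-1, *img.shape[2:])[ok] = img[sy[ok], sx[ok]]
    return out
```

Inverse warping avoids the holes that forward-mapping source pixels would leave in the output.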
I chose to warp the surrounding images onto one center image, which makes the final coordinates easier to compute, then stitched them together to produce the following images:
names = ['2.jpg', '5.jpg', '7.jpg', '8.jpg']
imgs = [skio.imread(n) for n in names]
_, axs = plt.subplots(4,1,figsize=(100,40))
for i in range(4):
axs[i].imshow(imgs[i])
fig, axs = plt.subplots(1,3,figsize=(20,40), sharex=True, sharey=True)
axs[0].imshow(skio.imread("2a_cyl.jpg"))
axs[1].imshow(skio.imread("2b_cyl.jpg"))
axs[2].imshow(skio.imread("2c_cyl.jpg"))
As shown below, the warped image gives us a second version of the original image in the overlap region, and we can use this overlap to blend the two better. I came up with a probability-based assignment: for each pixel in the overlap, randomly pick a source image, with the choice influenced by the difference between the two images' pixel values.
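One plausible reading of that rule is sketched below; the exp() bias toward the left image where the exposures disagree is my assumption, not necessarily the exact rule used:

```python
import numpy as np

def random_blend(left, right, rng=np.random.default_rng(0)):
    """Randomized seam dithering in the overlap region: where the two
    images agree the source is a fair coin flip, and the flip grows
    biased toward `left` as the pixel values diverge."""
    diff = np.abs(left - right)
    p_right = 0.5 * np.exp(-diff)            # fair flip when diff == 0
    take_right = rng.random(left.shape) < p_right
    return np.where(take_right, right, left), take_right
```

The per-pixel randomness turns one sharp seam into high-frequency noise, which the eye tolerates far better than a straight edge.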
Below visualizes the pixel assignment (bright pixels come from the right image, dark from the left), and compares naive stitching with the results of this blending method:
fig, axs = plt.subplots(3,1,figsize=(40,20))
axs[0].imshow(skio.imread("8vis.jpg"))
axs[1].imshow(skio.imread("8naive.jpg"))
axs[2].imshow(skio.imread("8random.jpg"))
Like the one above, images with big empty areas of sky often show a very visible seam between neighboring images. Borrowing from the descriptor idea, I sample a random "sky patch", slide it over the mosaic, detect similar sky areas by SSD, then smooth those regions toward the sky color. Choosing the threshold delta well is important, and this is the best-looking result I produced:
fig, axs = plt.subplots(1,1,figsize=(25,15))
plt.imshow(skio.imread("8skyed.jpg"))
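The detection step of this sky fix can be sketched as a sliding-SSD search; the function name and the non-overlapping stride are my simplifications (the smoothing step would then repaint the flagged windows with the sky color):

```python
import numpy as np

def find_sky(mosaic, patch, delta):
    """Slide a reference sky patch over a grayscale mosaic and flag
    every window whose SSD to the patch is below `delta`."""
    ph, pw = patch.shape
    H, W = mosaic.shape
    hits = []
    for y in range(0, H - ph + 1, ph):     # non-overlapping windows
        for x in range(0, W - pw + 1, pw):
            ssd = ((mosaic[y:y + ph, x:x + pw] - patch) ** 2).sum()
            if ssd < delta:
                hits.append((y, x))
    return hits
```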
Using a Gaussian pyramid with scale 2, I was able to split image processing across multiple layers. As printed and plotted below, we start from the bottom, i.e. the smallest and blurriest image, search over many points, and select only a subset of interest points; their coordinates are then scaled up and reselected at the layer above. For the feature descriptor, this pyramid structure lets us sample from one layer above, so each descriptor is a normalized patch sampled from the coarser image. This multi-scale processing gives both faster runtime and more robust descriptors.
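The coarse-to-fine selection can be sketched as follows. Here `strength_fn` stands in for the Harris response (the project uses get_harris_corners), and the exhaustive candidate grid at the coarsest layer is my simplification:

```python
import numpy as np
from skimage.transform import pyramid_gaussian

def multi_scale_select(img, strength_fn, keep=100, layers=3):
    """Score points on the smallest pyramid layer, keep the top `keep`,
    then double their coordinates and re-rank at each finer layer."""
    pyr = list(pyramid_gaussian(img, max_layer=layers - 1, downscale=2))
    h, w = pyr[-1].shape
    ys, xs = np.mgrid[:h, :w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)
    for level in range(layers - 1, -1, -1):
        s = strength_fn(pyr[level])[coords[:, 0], coords[:, 1]]
        coords = coords[np.argsort(-s)[:keep]]   # keep the strongest
        if level > 0:
            coords = coords * 2                  # lift to the finer layer
    return coords
```

Because the expensive dense search happens only on the smallest layer, the finer layers just re-rank a short list.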
layer, idx = 3, 0
imgs1, caches1 = imgs[idx], caches[idx]
fig, axs = plt.subplots(1,layer, figsize=(20, 20*layer))
for i in range(layer):
cache = caches1[i]
axs[i].imshow(imgs1[i])
axs[i].scatter( cache[1,:], cache[0,:], s=20)
#dess_ls = [descriptor(img_py, cache) for img_py, cache in zip(imgs, caches)]
idx1, idx2 = 0, 1
layer = -1
dess1, cache1 = dess_ls[idx1], caches[idx1]
dess2, cache2 = dess_ls[idx2], caches[idx2]
m1, m2, pairs = descriptor_matching(dess1, cache1, dess2, cache2, idx=layer, thres=1)
imgs1, imgs2 = imgs[idx1], imgs[idx2]
plot_matched_points(m1, m2, imgs1, imgs2, layer=layer)
imgs1, imgs2 = imgs[idx1], imgs[idx2]
plot_matched_points(m1, m2, imgs1, imgs2, layer=0)
According to the paper, our selected descriptors should already be rotation invariant; we can check this by rotating one of the images above and seeing whether we still get the same selection/detection result.
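The mechanism behind that invariance can be sketched as orientation normalization: estimate the dominant gradient angle at the keypoint and rotate the sampling window to cancel it. This is a simplified stand-alone illustration (the paper uses a smoothed gradient and samples from a coarser pyramid level):

```python
import numpy as np
from scipy.ndimage import rotate

def oriented_patch(img, y, x, win=20, step=5):
    """Orientation-normalized 8x8 descriptor: rotate the neighborhood
    so its dominant gradient direction is canceled before sampling."""
    gy, gx = np.gradient(img)
    theta = np.degrees(np.arctan2(gy[y, x], gx[y, x]))
    big = img[y - 2 * win:y + 2 * win, x - 2 * win:x + 2 * win]
    aligned = rotate(big, -theta, reshape=False, order=1)
    c = 2 * win                               # keypoint center in `big`
    patch = aligned[c - win:c + win:step, c - win:c + win:step]
    patch = patch - patch.mean()
    return patch / (patch.std() + 1e-8)
```

Since the same physical corner yields the same canonical orientation before and after an image rotation, the sampled patches should match.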
layer, idx = 3, -1
imgs1, caches1 = imgs[idx], caches[idx]
fig, axs = plt.subplots(1,layer, figsize=(20, 20*layer))
for i in range(layer):
cache = caches1[i]
axs[i].imshow(imgs1[i])
axs[i].scatter( cache[1,:], cache[0,:], s=20)
The way I implemented the algorithm is this: it takes in an unordered list of unwarped images and recovers their stitching order by matching features between pairs.
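The ordering idea can be sketched as below; this greedy chain over pairwise match counts is a guess at the spirit of recognize(), not its actual implementation:

```python
import numpy as np

def order_images(match_counts):
    """Given a matrix of pairwise feature-match counts, start from the
    best-connected image and repeatedly append the unplaced image with
    the most matches to the end of the current chain."""
    m = np.array(match_counts, dtype=float)
    np.fill_diagonal(m, -1)                    # ignore self-matches
    order = [int(m.sum(axis=1).argmax())]      # most matches overall
    while len(order) < len(m):
        row = m[order[-1]].copy()
        row[order] = -1                        # skip already-placed images
        order.append(int(row.argmax()))
    return order
```

Overlapping images share many matched features while unrelated ones share few, so the match counts alone reveal the layout.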
# names = ["2b_cyl.jpg", "2c_cyl.jpg", "2a_cyl.jpg"]#, '5c.jpg']
# imgs, caches = [], []
# for name in names:
# img, cach = multi_scale_feature_selection(name, layer=4)
# imgs.append(img)
# caches.append(cach)
ordered, dess_ordered, imgs_ordered, caches_ordered, canvas = \
recognize(imgs, caches, thres=3, layer=-1)
After this we can proceed to warp or stitch images
n = len(caches)
fig, axs = plt.subplots(2, n, figsize=(15, 5*n))
for i in range(n):
axs[0,i].imshow(imgs[i][-2])
axs[1,i].imshow(imgs[ordered[i]][-2])
It's really important to have a good workflow, clean code structure, and visualization utilities.
Geometry is cool: once you know the math, the projection follows naturally.
## Some relic plots, just leaving them here
import numpy as np
import matplotlib.pyplot as plt
import skimage.io as skio
from skimage.transform import pyramid_gaussian as p_g
from proj5A import *
from harris import *
from plot_utils import *
from proj5B import *