In [1]:
from skimage import color, transform, filters
import skimage.io as skio
import numpy as np
import os
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import pickle
import cv2

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

Project 5A: Image Warping and Mosaicing

David Yi

Shoot and Digitize Pictures

My goal for the first part of the project was to create image panoramas through manually computed homographies. I took the pictures below near my house with the aperture and exposure locked, but since they were shot handheld on my smartphone, the point of view is not perfectly constant. Regardless, the perspective transforms turned out fairly well, as you will see in later parts.

In [58]:
matplotlib.use('module://ipykernel.pylab.backend_inline')
# matplotlib.use('TkAgg')

# img = plt.imread('garage_r.jpg')
# plt.imshow(img)
# pts = plt.ginput(n=13, timeout=0)

# with open('garage_r.p', 'wb') as fp:
#     pickle.dump(pts, fp)

Recovering Homographies

The homography transform between a set of corresponding points between two images is defined by:

$p' = Hp$, where $H$ is a $3 \times 3$ matrix with 8 unknowns (it is only defined up to scale). Four correspondence pairs give the 8 equations needed to recover the unknowns exactly, but this minimal recovery is unstable and prone to noise. Alternatively, we can solve an overdetermined system by providing more than 4 correspondences and rearranging $H$ into a $9 \times 1$ vector. Our new system is $PH = \vec{0}$, where $P$ is the $2N \times 9$ matrix composed by stacking the following $2 \times 9$ submatrix $p_i$ for each correspondence pair. Source: http://6.869.csail.mit.edu/fa12/lectures/lecture13ransac/lecture13ransac.pdf

\begin{equation*} p_i = \begin{bmatrix} -x_i & -y_i & -1 & 0 & 0 & 0 & x_ix_i' & y_ix_i' & x_i' \\ 0 & 0 & 0 & -x_i & -y_i & -1 & x_iy_i' & y_iy_i' & y_i' \end{bmatrix} \end{equation*}

and

\begin{equation*} H = \begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \\ h_5 \\ h_6 \\ h_7 \\ h_8 \\ h_9 \end{bmatrix} \end{equation*}

where the least squares solution for $H$ is the right singular vector of $P$ corresponding to its smallest singular value (this minimizes $\lVert PH \rVert$ subject to $\lVert H \rVert = 1$).

In [4]:
def computeH(im1_pts, im2_pts):
    # Build the 2N x 9 system P and take the right singular vector with the
    # smallest singular value as the least squares homography (up to scale).
    P = np.empty((0, 9))
    for i in range(len(im1_pts)):
        im1 = im1_pts[i]
        im2 = im2_pts[i]
        p_1 = np.array([-im1[0], -im1[1], -1, 0, 0, 0, im1[0]*im2[0], im1[1]*im2[0], im2[0]])
        p_2 = np.array([0, 0, 0, -im1[0], -im1[1], -1, im1[0]*im2[1], im1[1]*im2[1], im2[1]])
        P = np.vstack((P, p_1.reshape(1, 9)))
        P = np.vstack((P, p_2.reshape(1, 9)))
    return np.linalg.svd(P)[2][-1].reshape(3, 3)
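
As a quick sanity check (a hypothetical test cell, not part of the pipeline below), we can construct a known homography, push points through it, and confirm that computeH recovers it up to scale:

# Hypothetical check: recover a known homography from exact correspondences.
H_true = np.array([[1.2, 0.1, 30.0],
                   [0.05, 0.9, -20.0],
                   [1e-4, 2e-4, 1.0]])
src = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 70]], dtype=float)
dst = (H_true @ np.column_stack((src, np.ones(len(src)))).T).T
dst = dst[:, :2] / dst[:, 2:]  # divide out the homogeneous coordinate

# The SVD solution is defined only up to scale, so normalize before comparing.
H_est = computeH(src, dst)
print(np.allclose(H_est / H_est[2, 2], H_true / H_true[2, 2], atol=1e-6))  # True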
In [5]:
def warpImage(im, H):
    # Code adapted from: https://stackoverflow.com/questions/46520123/how-do-i-use-opencvs-remap-function
    H_inv = np.linalg.inv(H)
    H_inv = H_inv / np.sqrt(np.sum(H_inv**2))
    h, w = im.shape[:2]
    indy, indx = np.indices((h, w), dtype=np.float32)
    lin_homg_ind = np.array([indx.ravel(), indy.ravel(), np.ones_like(indx).ravel()])

    # warp the coordinates of src to those of true_dst
    map_ind = H_inv @ lin_homg_ind
    map_x, map_y = map_ind[:-1]/map_ind[-1]  # ensure homogeneity
    map_x = map_x.reshape(h, w).astype(np.float32)
    map_y = map_y.reshape(h, w).astype(np.float32)

    # remap!
    dst = cv2.remap(im, map_x, map_y, cv2.INTER_LINEAR)
    return dst

Image Rectification (Warping)

Although I warp all of my left images to align with the right image, I actually use the inverse homography matrix $H^{-1}$ to perform an inverse warp, filling in pixels with linear interpolation. Before warping the images shown above, I first test that my homography matrix $H$ is correct by "rectifying" a couple of other images to be front-parallel, guessing where the correspondence points should map. The hand-picked source points are shown as blue dots in the original image, and their front-parallel targets as red dots.

In [6]:
box = skio.imread('box.jpg')
matplotlib.use('module://ipykernel.pylab.backend_inline')
matplotlib.use('TkAgg')

# plt.imshow(box, cmap='gray')

# pts = plt.ginput(n=4, timeout=0)

# with open('box_p.p', 'wb') as fp:
#     pickle.dump(pts, fp)
In [13]:
matplotlib.use('module://ipykernel.pylab.backend_inline')
with open('box_p.p', 'rb') as fp:
    box_p = np.array(pickle.load(fp))

new_pts = np.array([(250, 1500), (2750, 1500), (250, 2500), (2750, 2500)])

f,axs = plt.subplots(1, 2, figsize=(8, 8))

plt.subplot(1, 2, 1)
plt.imshow(box, cmap='gray')
plt.scatter(box_p[:, 0], box_p[:, 1], s=50)
plt.scatter(new_pts[:, 0], new_pts[:, 1], s=50, color='r')
plt.axis('off')

plt.subplot(1, 2, 2)
H = computeH(box_p, new_pts)
plt.imshow(warpImage(box, H), cmap='gray')
plt.axis('off');
In [14]:
frame = skio.imread('frame.jpg')

matplotlib.use('module://ipykernel.pylab.backend_inline')
matplotlib.use('TkAgg')

# plt.imshow(frame, cmap='gray')

# pts = plt.ginput(n=4, timeout=0)

# with open('frame_p.p', 'wb') as fp:
#     pickle.dump(pts, fp)
In [18]:
matplotlib.use('module://ipykernel.pylab.backend_inline')
with open('frame_p.p', 'rb') as fp:
    frame_p = np.array(pickle.load(fp))

new_pts = np.array([(500, 500), (2500, 500), (500, 3500), (2500, 3500)])

f,axs = plt.subplots(1, 2, figsize=(8, 8))

plt.subplot(1, 2, 1)
plt.axis('off')
plt.imshow(frame, cmap='gray')
plt.scatter(frame_p[:, 0], frame_p[:, 1], s=50)
plt.scatter(new_pts[:, 0], new_pts[:, 1], s=50, color='r')

plt.subplot(1, 2, 2)
plt.axis('off')
H = computeH(frame_p, new_pts)
plt.imshow(warpImage(frame, H), cmap='gray');

These were fairly simple examples; let us now put our algorithm to the test by creating mosaics from our original images!

Blending the Images into a Mosaic

My process for creating a mosaic is as follows (condensed into a single sketch after this list):

  1. Manually assign correspondence points between the left and right images.
  2. Create a "bounding box" for the left image by warping its corner points with the homography matrix $H$.
  3. Recompute the homography matrix $H$ and the correspondence points to account for the displacement from padding.
  4. Warp the left image to align with the right image.
  5. Combine the two images by taking the element-wise maximum of overlapping pixels, then crop!
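
For reference, here is the whole procedure condensed into a single hedged sketch. It reuses the computeH and warpImage functions defined above; the make_mosaic name and the pad margin are hypothetical, and the per-image cells below are what actually produced the figures.

def make_mosaic(im_l, im_r, pts_l, pts_r, pad=2000):
    # Step 2: warp the left image's corners to size the output canvas.
    H = computeH(pts_l, pts_r)
    h, w = im_l.shape[:2]
    corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], dtype=float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]
    new_w = int(warped[0].max() - min(warped[0].min(), 0)) + pad
    new_h = int(warped[1].max() - min(warped[1].min(), 0))

    # Step 3: pad both images and shift the correspondences by the same offset.
    dx, dy = (new_w - w) // 2, (new_h - h) // 2
    canvas_l = np.zeros((new_h, new_w, 3))
    canvas_r = np.zeros((new_h, new_w, 3))
    canvas_l[dy:dy + h, dx:dx + w] = im_l / 255
    canvas_r[dy:dy + h, dx:dx + w] = im_r / 255

    # Step 4: warp the padded left image with the recomputed homography.
    H_new = computeH(pts_l + [dx, dy], pts_r + [dx, dy])
    warped_l = warpImage(canvas_l, H_new)

    # Step 5: blend with a per-pixel maximum; cropping is left to the caller.
    return np.maximum(warped_l, canvas_r)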

Below, I show the warped left image on the left and the final panoramic image on the right.

The algorithm works fairly well, although we can definitely see small issues with blurriness and brightness in some of the overlapping regions.

In [41]:
with open('window_l.p', 'rb') as fp:
    window_lp = np.array(pickle.load(fp))
with open('window_r.p', 'rb') as fp:
    window_rp = np.array(pickle.load(fp))
window_l = plt.imread('window_l.jpg')
window_r = plt.imread('window_r.jpg')
In [42]:
# matplotlib.use('module://ipykernel.pylab.backend_inline')

# plt.imshow(window_l)
# plt.scatter(window_lp[:, 0], window_lp[:, 1], c='r')
In [43]:
# plt.imshow(window_r)
# plt.scatter(window_rp[:, 0], window_rp[:, 1], c='r')
In [44]:
H = computeH(window_lp, window_rp)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0)) + 2000  # extra margin so the warp fits
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))
In [45]:
x_diff = int((new_x - window_l.shape[1])/2)
y_diff = int((new_y - window_l.shape[0])/2)

f, axs = plt.subplots(1, 2, figsize=(20, 15))

plt.subplot(1, 2, 1)
plt.axis('off')
window_l_padded = np.zeros((new_y, new_x, 3))
window_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = window_l/255
window_r_padded = np.zeros((new_y, new_x, 3))
window_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = window_r/255

window_lp[:, 0] += x_diff
window_lp[:, 1] += y_diff
window_rp[:, 0] += x_diff
window_rp[:, 1] += y_diff
new_H = computeH(window_lp, window_rp)
window_lnew = warpImage(window_l_padded, new_H)
plt.imshow(window_lnew)

plt.subplot(1, 2, 2)
plt.axis('off')
window_mosaic = np.maximum(window_lnew, window_r_padded)
window_mosaic = window_mosaic[500:3500, 2000:, :]
plt.imshow(window_mosaic);
In [83]:
with open('piano_l.p', 'rb') as fp:
    piano_lp = np.array(pickle.load(fp))
with open('piano_r.p', 'rb') as fp:
    piano_rp = np.array(pickle.load(fp))
piano_l = plt.imread('piano_l.jpg')
piano_r = plt.imread('piano_r.jpg')
In [84]:
H = computeH(piano_lp, piano_rp)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0)) + 2000
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))
In [86]:
f, ax = plt.subplots(1, 2, figsize=(20, 15))

x_diff = int((new_x - piano_l.shape[1])/2)
y_diff = int((new_y - piano_l.shape[0])/2)

plt.subplot(1, 2, 1)
plt.axis('off')
piano_l_padded = np.zeros((new_y, new_x, 3))
piano_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = piano_l/255
piano_r_padded = np.zeros((new_y, new_x, 3))
piano_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = piano_r/255
piano_lp[:, 0] += x_diff
piano_lp[:, 1] += y_diff
piano_rp[:, 0] += x_diff
piano_rp[:, 1] += y_diff
new_H = computeH(piano_lp, piano_rp)
piano_lnew = warpImage(piano_l_padded, new_H)
plt.imshow(piano_lnew)

plt.subplot(1, 2, 2)
plt.axis('off')
piano_mosaic = np.maximum(piano_lnew, piano_r_padded)
piano_mosaic = piano_mosaic[1000:4000, :6000, :]
plt.imshow(piano_mosaic);
In [59]:
with open('garage_l.p', 'rb') as fp:
    garage_lp = np.array(pickle.load(fp))
with open('garage_r.p', 'rb') as fp:
    garage_rp = np.array(pickle.load(fp))
garage_l = plt.imread('garage_l.jpg')
garage_r = plt.imread('garage_r.jpg')
In [60]:
H = computeH(garage_lp, garage_rp)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0)) + 2000
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))
In [63]:
f, axs = plt.subplots(1, 2, figsize=(20, 15))

x_diff = int((new_x - garage_l.shape[1])/2)
y_diff = int((new_y - garage_l.shape[0])/2)

plt.subplot(1, 2, 1)
plt.axis('off')
garage_l_padded = np.zeros((new_y, new_x, 3))
garage_l_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = garage_l/255
garage_r_padded = np.zeros((new_y, new_x, 3))
garage_r_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = garage_r/255
garage_lp[:, 0] += x_diff
garage_lp[:, 1] += y_diff
garage_rp[:, 0] += x_diff
garage_rp[:, 1] += y_diff
new_H = computeH(garage_lp, garage_rp)
garage_lnew = warpImage(garage_l_padded, new_H)
plt.imshow(garage_lnew)

plt.subplot(1, 2, 2)
plt.axis('off')
garage_mosaic = np.maximum(garage_lnew, garage_r_padded)
garage_mosaic = garage_mosaic[500:3400, :5200, :]
plt.imshow(garage_mosaic);

What I've learned

The coolest thing from part A is definitely image rectification; it really seems like magic. You can take a picture and essentially re-render the scene from any perspective just by marking a few correspondence points on the image.

Project 5B: Feature Matching for Autostitching

In this part, we attempt to automate the correspondence-point matching that we did manually in part (A). The techniques in this part are strongly influenced by the paper Multi-Image Matching using Multi-Scale Oriented Patches.

Step 1: Harris Corner Detection

We begin by identifying "corners" in our images with the Harris corner detection algorithm. Although the algorithm does a great job of finding corners, it clearly returns far too many candidate points, including many outliers.

In [64]:
import numpy as np
from skimage.feature import corner_harris, peak_local_max, corner_peaks


def get_harris_corners(im, edge_discard=20):
    """
    This function takes a b&w image and an optional amount to discard
    on the edge (default is 20 pixels), and finds all harris corners
    in the image. Harris corners near the edge are discarded and the
    coordinates of the remaining corners are returned. A 2d array (h)
    containing the h value of every pixel is also returned.

    h is the same shape as the original image, im.
    coords is 2 x n (ys, xs).
    """

    assert edge_discard >= 20

    # find harris corners
    h = corner_harris(im, method='eps', sigma=1)
    coords = corner_peaks(h, min_distance=1, indices=True)

    # discard points on edge
    edge = edge_discard  # pixels
    mask = (coords[:, 0] > edge) & \
           (coords[:, 0] < im.shape[0] - edge) & \
           (coords[:, 1] > edge) & \
           (coords[:, 1] < im.shape[1] - edge)
    coords = coords[mask].T
    return h, coords


def dist2(x, c):
    """
    dist2  Calculates squared distance between two sets of points.

    Description
    D = DIST2(X, C) takes two matrices of vectors and calculates the
    squared Euclidean distance between them.  Both matrices must be of
    the same column dimension.  If X has M rows and N columns, and C has
    L rows and N columns, then the result has M rows and L columns.  The
    I, Jth entry is the  squared distance from the Ith row of X to the
    Jth row of C.

    Adapted from code by Christopher M Bishop and Ian T Nabney.
    """
    
    ndata, dimx = x.shape
    ncenters, dimc = c.shape
    assert dimx == dimc, 'Data dimension does not match dimension of centers'

    return (np.ones((ncenters, 1)) * np.sum((x**2).T, axis=0)).T + \
            np.ones((   ndata, 1)) * np.sum((c**2).T, axis=0)    - \
            2 * np.inner(x, c)
In [65]:
window_lbw = color.rgb2gray(window_l)
h_l, corners_l = get_harris_corners(window_lbw)

window_rbw = color.rgb2gray(window_r)
h_r, corners_r = get_harris_corners(window_rbw)
In [66]:
fig, ax = plt.subplots(1, 2, figsize=(20, 15))

plt.subplot(1, 2, 1)
plt.axis('off')
plt.imshow(window_l)
plt.scatter(corners_l[1], corners_l[0], s=20, color='r')

plt.subplot(1, 2, 2)
plt.axis('off')
plt.imshow(window_r)
plt.scatter(corners_r[1], corners_r[0], s=20, color='r');

Step 2: Adaptive Non-Maximal Suppression

We can limit the number of correspondence points with the adaptive non-maximal suppression (ANMS) technique. Essentially, the ANMS algorithm attempts to find the most "influential" corners by keeping those with the strongest Harris response within their local radius. In practice, we can simply calculate the radius of each corner as: $$r_i = \min_{j} \operatorname{dist}(x_i, x_j) \quad \text{s.t.} \quad f(x_i) < 0.9\,f(x_j)$$ where $f(x_i)$ is the Harris response of the $i$th corner. I simply keep the corners corresponding to the top 500 radii, as done in the original paper.
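
As a design note, the radii can also be computed without the explicit double loop used in the cells below, by reusing the dist2 helper from above. This anms_radii sketch is hypothetical but should be equivalent up to tie-breaking:

def anms_radii(coords, h, c_robust=0.9):
    # coords is 2 x n (ys, xs), as returned by get_harris_corners.
    pts = coords.T.astype(float)
    f = h[coords[0], coords[1]]                         # Harris response per corner
    d2 = dist2(pts, pts)                                # pairwise squared distances
    suppressed_by = f[None, :] * c_robust > f[:, None]  # j suppresses i if f_i < 0.9 f_j
    d2 = np.where(suppressed_by, d2, np.inf)            # ignore non-suppressing corners
    return d2.min(axis=1)                               # squared suppression radius per corner

Keeping the 500 corners with the largest radii then gives the points shown below.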

In [67]:
hl_max = np.unravel_index(np.argmax(h_l), h_l.shape)

all_lr = []
for p_i in corners_l.T:
    stronger = []
    for p_j in corners_l.T:
        if h_l[p_i[0]][p_i[1]] < 0.9*h_l[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hl_max))  # add the global max so every corner has a suppressor
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_lr.append([p_i, np.min(r)])

hr_max = np.unravel_index(np.argmax(h_r), h_r.shape)

all_rr = []
for p_i in corners_r.T:
    stronger = []
    for p_j in corners_r.T:
        if h_r[p_i[0]][p_i[1]] < 0.9*h_r[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hr_max))
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_rr.append([p_i, np.min(r)])
In [68]:
fig, ax = plt.subplots(1, 2, figsize=(20, 15))

best_lcorners = sorted(all_lr, key=lambda x: -x[1])
best_lcorners = np.stack(np.array(best_lcorners)[:, 0], axis=0)[:500]

plt.subplot(1, 2, 1)
plt.axis('off')
plt.imshow(window_l)
plt.scatter(best_lcorners[:, 1], best_lcorners[:, 0], s=20, color='r')

best_rcorners = sorted(all_rr, key=lambda x: -x[1])
best_rcorners = np.stack(np.array(best_rcorners)[:, 0], axis=0)[:500]

plt.subplot(1, 2, 2)
plt.axis('off')
plt.imshow(window_r)
plt.scatter(best_rcorners[:, 1], best_rcorners[:, 0], s=20, color='r');

Step 3: Extracting Feature Descriptions

After identifying the prominent corners, I "featurize" each corner by extracting the 40x40 image patch surrounding the pixel and downsampling it to an 8x8 patch using a Gaussian filter. The descriptor is then bias/gain-normalized (zero mean, unit variance) to mitigate the effect of lighting differences and noise.
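
The per-image cells below inline these few lines for each picture; gathered into one reusable sketch (the mops_descriptor name is hypothetical):

def mops_descriptor(im_bw, corner):
    # 40x40 window around the corner, blurred, downsampled to 8x8,
    # then bias/gain-normalized to zero mean and unit variance.
    y, x = corner
    patch = im_bw[y - 20:y + 20, x - 20:x + 20]
    patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
    return (patch - patch.mean()) / patch.std()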

In [183]:
l_features = []
for corner in best_lcorners:
    if corner[0]-20 > 0 and corner[0]+20 < window_lbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < window_lbw.shape[1]:
        patch = window_lbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        l_features.append((corner, patch))
In [184]:
r_features = []
for corner in best_rcorners:
    if corner[0]-20 > 0 and corner[0]+20 < window_rbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < window_rbw.shape[1]:
        patch = window_rbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        r_features.append((corner, patch))

Step 4: Matching Feature Descriptions

After featurizing the corners in both images, I match correspondences between the left image A and the right image B by finding, for each feature in A, the feature in B that minimizes $SSD(A, B)$, the sum of squared differences between the two patches. Additionally, I ensure that each match is distinctive by taking the ratio of the SSDs of the first and second nearest neighbors in B and keeping only matches where the ratio is at most 0.7. This is a fairly lenient threshold, but the RANSAC algorithm in the next part removes most of the remaining outliers and extraneous corners.
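
Vectorized with the dist2 helper, the matching step looks roughly like the hypothetical sketch below; the loop cells that follow are what I actually ran:

def match_descriptors(l_feats, r_feats, thresh=0.7):
    # l_feats / r_feats are lists of (corner, 64-vector) pairs as built above.
    L = np.array([f[1] for f in l_feats])
    R = np.array([f[1] for f in r_feats])
    ssd = dist2(L, R)                      # pairwise SSD between descriptors
    order = np.argsort(ssd, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(L))
    ratio = ssd[rows, best] / ssd[rows, second]
    keep = np.where(ratio <= thresh)[0]    # Lowe-style ratio test
    return [(l_feats[i][0], r_feats[best[i]][0]) for i in keep]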

In [185]:
new_lcorners = []
new_rcorners = []
for l_feat in l_features:
    all_ssd = [] 
    
    for r_feat in r_features:
        ssd = np.sum((l_feat[1]-r_feat[1])**2)
        all_ssd.append(ssd)
        
    all_ssd = np.array(all_ssd)
    ssd_cpy = all_ssd.copy()
    best_arg = r_features[ssd_cpy.argmin()][0]
    x, y = best_arg
    best_ssd = ssd_cpy.min()
    ssd_cpy[ssd_cpy.argmin()] = np.inf
    ratio = best_ssd / ssd_cpy.min()
    if ratio <= 0.7 and (x, y) not in new_rcorners:
        new_lcorners.append((l_feat[0][0], l_feat[0][1]))
        new_rcorners.append((x, y))
In [186]:
new_lcorners = np.stack(np.array(new_lcorners), axis=0)
new_rcorners = np.stack(np.array(new_rcorners), axis=0)
In [366]:
f, axs = plt.subplots(1, 2, figsize=(20, 15))
plt.subplot(1, 2, 1)
plt.imshow(window_l)
plt.scatter(new_lcorners[:, 1], new_lcorners[:, 0], c='r')
plt.axis('off');

plt.subplot(1, 2, 2)
plt.imshow(window_r)
plt.scatter(new_rcorners[:, 1], new_rcorners[:, 0], c='r')
plt.axis('off');
In [367]:
most_inliers = 0
left_inliers = []
right_inliers = []

for i in range(1000):
    four_random = np.random.choice(a=range(new_lcorners.shape[0]), size=4, replace=False)
    four_l = new_lcorners[four_random, :]
    four_r = new_rcorners[four_random, :]
    H = computeH(four_l, four_r)

    l_inliers = []
    r_inliers = []
    for i in range(len(new_lcorners)):
        warped = H @ np.append(new_lcorners[i], 1)
        warped = warped[:2] / warped[2]
        dist = np.sqrt((warped[0] - new_rcorners[i][0])**2 + (warped[1] - new_rcorners[i][1])**2)
        if dist <= 2:
            l_inliers.append(tuple(new_lcorners[i]))
            r_inliers.append(tuple(new_rcorners[i]))

    if len(l_inliers) > most_inliers:
        most_inliers = len(l_inliers)
        left_inliers = l_inliers
        right_inliers = r_inliers
        
left_inliers = np.array(left_inliers)
right_inliers = np.array(right_inliers)

Step 5: The RANSAC Algorithm

After the feature matching in step 4, there are still far too many bad correspondences. I use the following RANSAC algorithm to further narrow down the set of correspondence points:

  1. Randomly sample four correspondence pairs and compute a candidate homography matrix $H'$.
  2. Transform all of the corners in the left image, A, with $H'$ and count the number of "inliers": transformed points that land "close" to their corresponding point in the right image, B. I count every pair with a Euclidean distance of at most 2 pixels as an inlier.
  3. Keep this inlier set if it is larger than the best set found so far.

The process above is repeated 1,000 times, though the number of iterations is fairly arbitrary. The final homography matrix $H$ is computed from the largest inlier set found (a condensed sketch of the loop follows below). With the new homography matrix, I can finally create new (and hopefully improved!) mosaics. The final set of correspondence points is shown below:
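
Wrapped into a single function, the RANSAC loop (implemented cell-by-cell above and repeated for each image pair later) looks roughly like this hypothetical sketch:

def ransac_homography(pts_l, pts_r, n_iters=1000, eps=2.0):
    # pts_l, pts_r: n x 2 arrays of matched points in a consistent (x, y) order.
    best = np.zeros(len(pts_l), dtype=bool)
    for _ in range(n_iters):
        idx = np.random.choice(len(pts_l), size=4, replace=False)
        H = computeH(pts_l[idx], pts_r[idx])
        proj = (H @ np.column_stack((pts_l, np.ones(len(pts_l)))).T).T
        proj = proj[:, :2] / proj[:, 2:]                   # back to inhomogeneous
        inliers = np.linalg.norm(proj - pts_r, axis=1) <= eps
        if inliers.sum() > best.sum():
            best = inliers
    # Refit on the largest inlier set to get the final homography.
    return computeH(pts_l[best], pts_r[best]), best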

In [368]:
f, axs = plt.subplots(1, 2, figsize=(20, 15))
plt.subplot(1, 2, 1)
plt.imshow(window_l)
plt.scatter(left_inliers[:, 1], left_inliers[:, 0], color='r')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(window_r)
plt.scatter(right_inliers[:, 1], right_inliers[:, 0], color='r')
plt.axis('off');
In [69]:
H = computeH(left_inliers, right_inliers)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0)) + 2000
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))

x_diff = int((new_x - window_l.shape[1])/2)
y_diff = int((new_y - window_l.shape[0])/2)

window_l_padded = np.zeros((new_y, new_x, 3))
window_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = window_l/255

window_r_padded = np.zeros((new_y, new_x, 3))
window_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = window_r/255

left_inliers[:, 0] += y_diff
left_inliers[:, 1] += x_diff
right_inliers[:, 0] += y_diff
right_inliers[:, 1] += x_diff

left_inliers = np.array(list(zip(left_inliers[:, 1], left_inliers[:, 0])))
right_inliers = np.array(list(zip(right_inliers[:, 1], right_inliers[:, 0])))

new_H = computeH(left_inliers, right_inliers)
window_lauto = warpImage(window_l_padded, new_H)
plt.imshow(window_lauto)

Automatic Mosaic Composition

The final results are shown below. The mosaics on the left were created by the autostitching algorithm, and the mosaics on the right are the manually created mosaics from part A.

In [373]:
plt.subplots(1, 2, figsize=(20, 15))

plt.subplot(1, 2, 1)
window_mosaic_auto = np.maximum(window_lauto, window_r_padded)
window_mosaic_auto = window_mosaic_auto[600:3600, 1300:, :]
plt.axis('off')
plt.imshow(window_mosaic_auto)

plt.subplot(1, 2, 2)
plt.axis('off')
plt.imshow(window_mosaic);
In [466]:
piano_lbw = color.rgb2gray(piano_l)
h_l, corners_l = get_harris_corners(piano_lbw)

piano_rbw = color.rgb2gray(piano_r)
h_r, corners_r = get_harris_corners(piano_rbw)

hl_max = np.unravel_index(np.argmax(h_l), h_l.shape)
all_lr = []
for p_i in corners_l.T:
    stronger = []
    for p_j in corners_l.T:
        if h_l[p_i[0]][p_i[1]] < 0.9*h_l[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hl_max))
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_lr.append([p_i, np.min(r)])

hr_max = np.unravel_index(np.argmax(h_r), h_r.shape)
all_rr = []
for p_i in corners_r.T:
    stronger = []
    for p_j in corners_r.T:
        if h_r[p_i[0]][p_i[1]] < 0.9*h_r[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hr_max))
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_rr.append([p_i, np.min(r)])

best_lcorners = sorted(all_lr, key=lambda x: -x[1])
best_lcorners = np.stack(np.array(best_lcorners)[:, 0], axis=0)[:500]
best_rcorners = sorted(all_rr, key=lambda x: -x[1])
best_rcorners = np.stack(np.array(best_rcorners)[:, 0], axis=0)[:500]

l_features = []
for corner in best_lcorners:
    if corner[0]-20 > 0 and corner[0]+20 < piano_lbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < piano_lbw.shape[1]:
        patch = piano_lbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        l_features.append((corner, patch))

r_features = []
for corner in best_rcorners:
    if corner[0]-20 > 0 and corner[0]+20 < piano_rbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < piano_rbw.shape[1]:
        patch = piano_rbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        r_features.append((corner, patch))
In [477]:
new_lcorners = []
new_rcorners = []
for l_feat in l_features:
    all_ssd = []
    
    for r_feat in r_features:
        ssd = np.sum((l_feat[1]-r_feat[1])**2)
        all_ssd.append(ssd)
        
    all_ssd = np.array(all_ssd)
    ssd_cpy = all_ssd.copy()
    best_arg = r_features[ssd_cpy.argmin()][0]
    x, y = best_arg
    best_ssd = ssd_cpy.min()
    ssd_cpy[ssd_cpy.argmin()] = np.inf
    ratio = best_ssd / ssd_cpy.min()
    if ratio <= 0.7 and (x, y) not in new_rcorners:
        new_lcorners.append((l_feat[0][0], l_feat[0][1]))
        new_rcorners.append((x, y))

new_lcorners = np.stack(np.array(new_lcorners), axis=0)
new_rcorners = np.stack(np.array(new_rcorners), axis=0)

most_inliers = 0
left_inliers = []
right_inliers = []

for i in range(1000):
    four_random = np.random.choice(a=range(new_lcorners.shape[0]), size=4, replace=False)
    four_l = new_lcorners[four_random, :]
    four_r = new_rcorners[four_random, :]
    H = computeH(four_l, four_r)

    l_inliers = []
    r_inliers = []
    for i in range(len(new_lcorners)):
        warped = H @ np.append(new_lcorners[i], 1)
        warped = warped[:2] / warped[2]
        dist = np.sqrt((warped[0] - new_rcorners[i][0])**2 + (warped[1] - new_rcorners[i][1])**2)
        if dist <= 2:
            l_inliers.append(tuple(new_lcorners[i]))
            r_inliers.append(tuple(new_rcorners[i]))

    if len(l_inliers) > most_inliers:
        most_inliers = len(l_inliers)
        left_inliers = l_inliers
        right_inliers = r_inliers
        
left_inliers = np.array(left_inliers)
right_inliers = np.array(right_inliers)

H = computeH(left_inliers, right_inliers)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0) + 2000)
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))

x_diff = int((new_x - piano_l.shape[1])/2)
y_diff = int((new_y - piano_l.shape[0])/2)

if new_x - 2*x_diff > 4032:
    piano_l_padded = np.zeros((new_y, new_x, 3))
    piano_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = piano_l/255
    piano_r_padded = np.zeros((new_y, new_x, 3))
    piano_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = piano_r/255
elif new_y - 2*y_diff > 3024:
    piano_l_padded = np.zeros((new_y, new_x, 3))
    piano_l_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = piano_l/255
    piano_r_padded = np.zeros((new_y, new_x, 3))
    piano_r_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = piano_r/255
else:
    piano_l_padded = np.zeros((new_y, new_x, 3))
    piano_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = piano_l/255
    piano_r_padded = np.zeros((new_y, new_x, 3))
    piano_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = piano_r/255
left_inliers[:, 0] += y_diff
left_inliers[:, 1] += x_diff
right_inliers[:, 0] += y_diff
right_inliers[:, 1] += x_diff

left_inliers = np.array(list(zip(left_inliers[:, 1], left_inliers[:, 0])))
right_inliers = np.array(list(zip(right_inliers[:, 1], right_inliers[:, 0])))

new_H = computeH(left_inliers, right_inliers)
piano_lauto = warpImage(piano_l_padded, new_H)
In [481]:
plt.subplots(1, 2, figsize=(20, 15))

plt.subplot(1, 2, 1)
piano_mosaic_auto = np.maximum(piano_lauto, piano_r_padded)
piano_mosaic_auto = piano_mosaic_auto[1000:4200, :6500, :]
# plt.axis('off')
plt.imshow(piano_mosaic_auto)

plt.subplot(1, 2, 2)
plt.axis('off')
plt.imshow(piano_mosaic);
In [508]:
garage_lbw = color.rgb2gray(garage_l)
h_l, corners_l = get_harris_corners(garage_lbw)

garage_rbw = color.rgb2gray(garage_r)
h_r, corners_r = get_harris_corners(garage_rbw)

hl_max = np.unravel_index(np.argmax(h_l), h_l.shape)
all_lr = []
for p_i in corners_l.T:
    stronger = []
    for p_j in corners_l.T:
        if h_l[p_i[0]][p_i[1]] < 0.9*h_l[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hl_max))
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_lr.append([p_i, np.min(r)])

hr_max = np.unravel_index(np.argmax(h_r), h_r.shape)
all_rr = []
for p_i in corners_r.T:
    stronger = []
    for p_j in corners_r.T:
        if h_r[p_i[0]][p_i[1]] < 0.9*h_r[p_j[0]][p_j[1]]:
            stronger.append(p_j)
    stronger.append(np.array(hr_max))
    r = dist2(p_i.reshape(1, 2), np.array(stronger))
    all_rr.append([p_i, np.min(r)])

best_lcorners = sorted(all_lr, key=lambda x: -x[1])
best_lcorners = np.stack(np.array(best_lcorners)[:, 0], axis=0)[:500]
best_rcorners = sorted(all_rr, key=lambda x: -x[1])
best_rcorners = np.stack(np.array(best_rcorners)[:, 0], axis=0)[:500]

l_features = []
for corner in best_lcorners:
    if corner[0]-20 > 0 and corner[0]+20 < garage_lbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < garage_lbw.shape[1]:
        patch = garage_lbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        l_features.append((corner, patch))

r_features = []
for corner in best_rcorners:
    if corner[0]-20 > 0 and corner[0]+20 < garage_rbw.shape[0] and corner[1]-20 > 0 and corner[1]+20 < garage_rbw.shape[1]:
        patch = garage_rbw[corner[0]-20:corner[0]+20, corner[1]-20:corner[1]+20]
        patch = cv2.resize(cv2.GaussianBlur(patch, (5, 5), 0), (8, 8)).flatten()
        patch = (patch-patch.mean())/patch.std()
        r_features.append((corner, patch))
In [509]:
new_lcorners = []
new_rcorners = []
for l_feat in l_features:
    all_ssd = []
    
    for r_feat in r_features:
        ssd = np.sum((l_feat[1]-r_feat[1])**2)
        all_ssd.append(ssd)
        
    all_ssd = np.array(all_ssd)
    ssd_cpy = all_ssd.copy()
    best_arg = r_features[ssd_cpy.argmin()][0]
    x, y = best_arg
    best_ssd = ssd_cpy.min()
    ssd_cpy[ssd_cpy.argmin()] = np.inf
    ratio = best_ssd / ssd_cpy.min()
    if ratio <= 0.7 and (x, y) not in new_rcorners:
        new_lcorners.append((l_feat[0][0], l_feat[0][1]))
        new_rcorners.append((x, y))

new_lcorners = np.stack(np.array(new_lcorners), axis=0)
new_rcorners = np.stack(np.array(new_rcorners), axis=0)

most_inliers = 0
left_inliers = []
right_inliers = []

for i in range(1000):
    four_random = np.random.choice(a=range(new_lcorners.shape[0]), size=4, replace=False)
    four_l = new_lcorners[four_random, :]
    four_r = new_rcorners[four_random, :]
    H = computeH(four_l, four_r)

    l_inliers = []
    r_inliers = []
    for i in range(len(new_lcorners)):
        warped = H @ np.append(new_lcorners[i], 1)
        warped = warped[:2] / warped[2]
        dist = np.sqrt((warped[0] - new_rcorners[i][0])**2 + (warped[1] - new_rcorners[i][1])**2)
        if dist <= 2:
            l_inliers.append(tuple(new_lcorners[i]))
            r_inliers.append(tuple(new_rcorners[i]))

    if len(l_inliers) > most_inliers:
        most_inliers = len(l_inliers)
        left_inliers = l_inliers
        right_inliers = r_inliers
        
left_inliers = np.array(left_inliers)
right_inliers = np.array(right_inliers)

H = computeH(left_inliers, right_inliers)

coords = np.zeros((0, 2))
points = np.array([[0, 0, 1], [4032, 0, 1], [0, 3024, 1], [4032, 3024, 1]])

for p in points:
    new_p = H @ np.array(p)
    new_p = new_p[:2] / new_p[2]
    coords = np.vstack((coords, new_p))
    
new_x = int(np.max(coords[:, 0]) - min(np.min(coords[:, 0]), 0) + 2000)
new_y = int(np.max(coords[:, 1]) - min(np.min(coords[:, 1]), 0))

x_diff = int((new_x - garage_l.shape[1])/2)
y_diff = int((new_y - garage_l.shape[0])/2)

if new_x - 2*x_diff > 4032:
    garage_l_padded = np.zeros((new_y, new_x, 3))
    garage_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = garage_l/255
    garage_r_padded = np.zeros((new_y, new_x, 3))
    garage_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff-1, :] = garage_r/255
elif new_y - 2*y_diff > 3024:
    garage_l_padded = np.zeros((new_y, new_x, 3))
    garage_l_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = garage_l/255
    garage_r_padded = np.zeros((new_y, new_x, 3))
    garage_r_padded[y_diff:new_y-y_diff-1, x_diff:new_x-x_diff, :] = garage_r/255
else:
    garage_l_padded = np.zeros((new_y, new_x, 3))
    garage_l_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = garage_l/255
    garage_r_padded = np.zeros((new_y, new_x, 3))
    garage_r_padded[y_diff:new_y-y_diff, x_diff:new_x-x_diff, :] = garage_r/255
left_inliers[:, 0] += y_diff
left_inliers[:, 1] += x_diff
right_inliers[:, 0] += y_diff
right_inliers[:, 1] += x_diff

left_inliers = np.array(list(zip(left_inliers[:, 1], left_inliers[:, 0])))
right_inliers = np.array(list(zip(right_inliers[:, 1], right_inliers[:, 0])))

new_H = computeH(left_inliers, right_inliers)
garage_lauto = warpImage(garage_l_padded, new_H)
In [515]:
plt.subplots(1, 2, figsize=(20, 15))

plt.subplot(1, 2, 1)
garage_mosaic_auto = np.maximum(garage_lauto, garage_r_padded)
garage_mosaic_auto = garage_mosaic_auto[500:3200, :5500, :]
# plt.axis('off')
plt.imshow(garage_mosaic_auto)

plt.subplot(1, 2, 2)
plt.axis('off')
plt.imshow(garage_mosaic);

Final Takeaways

I think the coolest part of part (B) is the RANSAC algorithm. It is an extremely simple but effective way of removing outliers and extraneous features from a dataset. I do wonder how much use it sees elsewhere in data science, though, since overfitting is not a huge problem there and methods like model validation already do a pretty good job. Also, in the age of big data, it's clear that neural networks are not picky and will take any data that they can get.

In [70]:
FILE = "main.html"

with open(FILE, 'r') as html_file:
    content = html_file.read()

# Get rid of prompts and source code
content = content.replace("div.input_area {\n\tdisplay: none;",
                          "div.input_area {\n\tdisplay: none;\n\tdisplay: none;")
content = content.replace(".prompt {\n\tdisplay: none;",
                          ".prompt {\n\tdisplay: none;\n\tdisplay: none;")

with open(FILE, 'w') as f:
    f.write(content)