To compute the homography matrix H mapping a set of points (x, y) to another (x', y'), we solve the below equations:

w * x'y' = xy @ H.T, where H[-1, -1] = 1

-AKA- (writing H = [[a, b, c], [d, e, f], [g, h, 1]], with w the projective scale)

ax + by + c = wx'
dx + ey + f = wy'
gx + hy + 1 = w

Substituting w into the first two equations and rearranging so only the unknowns are on the left:

ax + by + c = (gx + hy + 1)x'
dx + ey + f = (gx + hy + 1)y'

ax + by + c - gxx' - hyx' = x'
dx + ey + f - gxy' - hyy' = y'

Stacking two such rows per correspondence gives a linear system M @ h = x'y' in the eight unknowns, which we solve by least squares.
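The derivation above can be sketched in numpy as follows. This is a minimal sketch rather than the exact project code; the function and variable names are my own.

```python
import numpy as np

def compute_homography(src, dst):
    """Solve for H (with H[2, 2] = 1) from N >= 4 point pairs via least squares.

    src, dst: (N, 2) arrays of (x, y) points; dst holds the primed points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    # Each correspondence contributes two rows of M (the rearranged equations).
    M = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        M[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        M[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i]     = xp
        b[2 * i + 1] = yp
    # h = [a, b, c, d, e, f, g, h]; append the fixed H[-1, -1] = 1.
    h, *_ = np.linalg.lstsq(M, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With more than four correspondences the system is overdetermined, and least squares averages out small labeling errors.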
homography matrix:
[[ 5.43205351e+00  1.08981878e+00  4.33645260e+02]
 [-6.05716714e-01  1.03122818e+00  1.17001073e+03]
 [ 7.45561496e-04 -9.55204138e-04  1.00000000e+00]]
average homography error: 8.462772680970875
difference from `cv2.findHomography`: -0.007752398240070041
Image warping is done by inverse sampling: for every pixel in the target image, we look up the corresponding source location and bilinearly interpolate over the four surrounding source pixels. This requires computing and applying the homography from the target image to the source image.
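The inverse-sampling loop can be vectorized in numpy. Below is a minimal sketch for a grayscale float image, assuming H maps source coordinates to target coordinates (so we apply its inverse to every target pixel); it is not the exact project code.

```python
import numpy as np

def inverse_warp(src, H, out_shape):
    """Warp `src` into an image of shape (h, w) = out_shape by inverse sampling.

    H maps source coords -> target coords, so H^-1 sends each target pixel
    back into the source, where we bilinearly interpolate.
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    # Homogeneous target coordinates, one column per pixel: (3, N).
    tgt = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src_pts = np.linalg.inv(H) @ tgt
    sx = src_pts[0] / src_pts[2]
    sy = src_pts[1] / src_pts[2]
    # Integer corner and fractional offset for bilinear interpolation.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    fx, fy = sx - x0, sy - y0
    h_src, w_src = src.shape
    valid = (x0 >= 0) & (x0 < w_src - 1) & (y0 >= 0) & (y0 < h_src - 1)
    out = np.zeros(h_out * w_out)
    x0v, y0v, fxv, fyv = x0[valid], y0[valid], fx[valid], fy[valid]
    # Weighted sum of the four neighbouring source pixels.
    out[valid] = (src[y0v, x0v] * (1 - fxv) * (1 - fyv)
                  + src[y0v, x0v + 1] * fxv * (1 - fyv)
                  + src[y0v + 1, x0v] * (1 - fxv) * fyv
                  + src[y0v + 1, x0v + 1] * fxv * fyv)
    return out.reshape(h_out, w_out)
```

Pixels whose preimage falls outside the source stay zero, which is what produces the black borders in the warped results.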
See the results below, where some tiles and a book are "rectified" to lie on a plane parallel to the camera's image plane.
For mosaicing, all images are warped into alignment with the target image, then blended using the multi-frequency blending we implemented in project 2.
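The project uses the multi-frequency blending from project 2; as a simplified illustration of the idea, here is a two-band sketch (my own reduction, assuming scipy is available): low frequencies are mixed with a soft mask to hide the seam, while high frequencies are picked with a hard mask to avoid ghosting.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_band_blend(im1, im2, mask, sigma=5.0):
    """Blend two aligned grayscale float images along a seam.

    mask: boolean array, True where im1 should dominate.
    Low frequencies are cross-faded with a blurred mask; high frequencies
    are taken from whichever image owns the pixel.
    """
    low1, low2 = gaussian_filter(im1, sigma), gaussian_filter(im2, sigma)
    high1, high2 = im1 - low1, im2 - low2
    soft = gaussian_filter(mask.astype(float), sigma)
    low = soft * low1 + (1 - soft) * low2
    high = np.where(mask, high1, high2)
    return low + high
```

A full Laplacian-pyramid blend repeats this split over several frequency bands, cross-fading each band with a correspondingly blurred mask.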
Note how there are slight artifacts due to camera postprocessing / exposure being different in the two frames. It's most visible in the kitchen scene, where the two images have visibly different "warmth" or blue-orange balance.
One thing I learned here is how important it is that the constraints of projective alignment are respected (the scene must be planar if the camera position varies, or the camera may only rotate). If they aren't, nasty artifacts appear because the 3D structure of the scene makes a valid homography unsolvable.
im1_feature idx11, loc[352 270] im2_feature idx11, loc[363 33] nn1/nn2 0.3987499959318181 nn1_dist 4.558091104563188
im1_feature idx29, loc[410 231] im2_feature idx123, loc[379 26] nn1/nn2 0.4758341835146864 nn1_dist 6.475559728160555
im1_feature idx37, loc[336 267] im2_feature idx41, loc[345 28] nn1/nn2 0.1888189817077809 nn1_dist 1.0175526751197754
im1_feature idx41, loc[391 287] im2_feature idx45, loc[399 54] nn1/nn2 0.47619068502730066 nn1_dist 7.785778434153011
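The nn1/nn2 column in the log above is Lowe's ratio test: a match is kept only when the nearest-neighbour distance is well below the second-nearest. A minimal sketch of that matching step (my own function names, not the project's exact code):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.5):
    """Match descriptor rows of desc1 to desc2 using Lowe's ratio test.

    desc1, desc2: (N, D) arrays of feature descriptors.
    Returns (i, j, nn1_dist) triples where nn1/nn2 < ratio.
    """
    # Pairwise Euclidean distances between every pair of descriptors.
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(d[i])
        nn1, nn2 = d[i, order[0]], d[i, order[1]]
        if nn1 / nn2 < ratio:
            matches.append((i, order[0], nn1))
    return matches
```

A low ratio threshold discards ambiguous features that resemble several candidates, at the cost of keeping fewer matches overall.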
The most interesting thing I learned from part B was how fragile the features being used can be. When the results were compared to the hand-labeled mosaics, there were very noticeable artifacts due to issues like slight positional mismatches of features, or features sitting in ambiguous or transient places. Examples include edge features along a wall, or junctions produced by intersections / occlusions of objects.