CS194-26 Project 1

1. overview:

The main idea of the project is to align the R, G, and B channels to form a single color image. The displacement of each channel is chosen with one of two score functions: SSD (sum of squared differences) or NCC (normalized cross-correlation).
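As a rough illustration of the two score functions, here is a minimal sketch. The function names ssd and ncc and the zero-mean form of NCC are assumptions made for this example, not necessarily what the project code does.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Normalized cross-correlation: higher means a better match (at most 1)."""
    # Assumption: zero-mean variant, so a constant brightness offset is ignored.
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # a flat image carries no structure to correlate
    return float(np.dot(a, b) / denom)
```

Note the opposite directions: a search would minimize ssd but maximize ncc.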

2. for small .jpg files:

For the small images, we can simply try every displacement in the search window [-15, 15] along each axis, which gives 31 * 31 = 961 candidates for each of the green-blue and red-blue pairs. We can then use np.dstack to stack the three channels into the final result. I accidentally implemented my own dstack (maybe not exactly the same as np.dstack, as I didn’t zero out the rolled-over values). Also, I implemented a crop_border_further function and a crop_border function. The former crops the image by one tenth of its axis = 1 length on each of the four sides. The latter hard-codes the crop that removes the white and black borders of the original image.
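The exhaustive search could look roughly like the sketch below. The names align and compose, the use of SSD as the score, and the convention that x is the column shift (axis 1) and y is the row shift (axis 0) are all assumptions made for illustration. Like the approach described above, it shifts with np.roll and does not zero out the pixels that wrap around.

```python
import numpy as np

def align(channel, reference, radius=15):
    """Return the (dx, dy) shift in [-radius, radius] that minimizes SSD."""
    ch = channel.astype(np.float64)
    ref = reference.astype(np.float64)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border; they are not zeroed here.
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def compose(r, g, b, radius=15):
    """Align g and r to b, then stack the channels into an RGB image."""
    gx, gy = align(g, b, radius)
    rx, ry = align(r, b, radius)
    g_aligned = np.roll(g, (gy, gx), axis=(0, 1))
    r_aligned = np.roll(r, (ry, rx), axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), (gx, gy, rx, ry)
```

With radius=15 each call to align scores 961 candidate shifts, which is cheap for the small .jpg images but too slow for full-resolution scans.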

monastery.jpg

[gx = -3, gy = 2, rx = 3, ry = 2]
cathedral.jpg

[gx = 5, gy = 2, rx = 12, ry = 3]
tobolsk.jpg

[gx = 3, gy = 3, rx = 6, ry = 3]

3. for larger .tif files

For the larger images I use the image pyramid algorithm described in the spec. Processing starts at the coarsest scale and moves down the pyramid, updating the displacement at each level. This is only a slight modification of the simple update method used for the .jpg files.
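A coarse-to-fine search of this kind can be sketched as follows. This is only an illustration under stated assumptions: the helper names (downscale2, search, pyramid_align), the 2x2 block averaging used to halve the resolution, the SSD score, and the refinement radius of 2 are all hypothetical choices, not taken from the project code.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (assumed rescale method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(ch, ref, center, radius):
    """Best (dx, dy) by SSD within `radius` of the `center` estimate."""
    cx, cy = center
    best_score, best = np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref) ** 2)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def pyramid_align(ch, ref, levels=5, coarse_radius=15, refine_radius=2):
    """Estimate (dx, dy) at the coarsest level, then refine going down."""
    ch = ch.astype(np.float64)
    ref = ref.astype(np.float64)
    if levels <= 1:
        # Coarsest level: full search around zero displacement.
        return search(ch, ref, (0, 0), coarse_radius)
    cx, cy = pyramid_align(downscale2(ch), downscale2(ref),
                           levels - 1, coarse_radius, refine_radius)
    # One pixel at the coarser level is two pixels here, so double the
    # estimate and only search a small window around it.
    return search(ch, ref, (2 * cx, 2 * cy), refine_radius)
```

Only the coarsest level pays for the full window; every finer level checks just (2 * refine_radius + 1)^2 shifts, which is where the speedup over the plain exhaustive search comes from.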

Note: (1) For emir.tif, I chose to crop the original image using only the crop_border function, which makes the result look much better. (2) I first tried 7 pyramid levels, but that was slow (more than 1 minute per .tif image), so I lowered the pyramid to 5 levels, which is faster and looks nearly the same as the 7-level results.

workshop.tif
gx, gy, rx, ry: 0 52 -12 104
emir.tif
gx, gy, rx, ry: 22 32 36 28
three_generations.tif