In [5]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')
Out[5]:
The raw code for this IPython notebook is by default hidden for easier reading. To toggle on/off the raw code, click here.
In [2]:
import matplotlib.pyplot as plt
%matplotlib inline

Project 1: Images of the Russian Empire

In this project, we use image processing techniques to automatically produce a color image with as few visual artifacts as possible. Here is what we implemented:

  1. single-scale alignment for low-resolution images (jpg images)
  2. pyramid search for high-resolution images (tif images)
  3. automatic cropping of the artificial borders

Issues I encountered and how I solved them:
The runtime of the pyramid algorithm was initially quite long. I found two ways to address this:

  1. When calculating the SSD, I crop the images and only compute the SSD over the central 80% region (see the sketch after this list). This makes the alignment more accurate, since it removes the black and white borders, and it also reduces the runtime.
  2. The two parameters "search times" and "window size" both matter for the runtime. A large window size or a large number of search times leads to a long runtime, but values that are too small do not align the images well. By varying these two parameters, I found that window size = 3 and search times = 6 work well.
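For concreteness, here is a minimal sketch of the cropped SSD score from point 1. The function name cropped_ssd, the NumPy-based implementation, and the exact cropping arithmetic are illustrative assumptions rather than the project's exact code; it assumes two channels of equal shape.

import numpy as np

def cropped_ssd(a, b, keep=0.8):
    # Sum of squared differences over the central `keep` fraction of both
    # images, so the black/white borders do not dominate the score.
    h, w = a.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    a_c = a[dh:h - dh, dw:w - dw].astype(float)
    b_c = b[dh:h - dh, dw:w - dw].astype(float)
    return np.sum((a_c - b_c) ** 2)

The same score is reused by the single-scale and pyramid searches described below.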

1. Single-scale alignment

For low-resolution images (jpg images), we implement an exhaustive search to align the three color channels (r, g, b) and produce a color image. Algorithm (a code sketch follows the list):

  1. We first read an image from disk and separate it into three channels (r, g, b). We then align r and g with respect to b.
  2. We shift the r image over a user-defined window, say $[-15,15]$ pixels. For each shifted image, we calculate a score that quantifies how well it is aligned with b. The score is the Sum of Squared Differences (SSD), which is simply sum(sum((image1-image2).^2)). Note that we crop the images by 10% on each side when calculating the SSD so the image borders are excluded. A lower SSD means better alignment. We search over the whole window and pick the displacement that gives the smallest SSD.
  3. We shift the r image by the displacement found in step 2, then repeat the process to find the displacement of g.
  4. Stack r, g, b together to produce the color image.
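Here is a minimal sketch of the exhaustive search above. It reuses the hypothetical cropped_ssd helper from the previous section and shifts with np.roll; the default window of 15 pixels follows the example in step 2, but the names and details are illustrative, not the project's exact code.

import numpy as np

def align_exhaustive(channel, reference, window=15):
    # Try every (row, column) shift in [-window, window] and keep the one
    # with the smallest cropped SSD against the reference channel.
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
            score = cropped_ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

With the two displacements found this way, step 4 amounts to something like np.dstack([shifted_r, shifted_g, b]) to assemble the color image.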

Here are the results for the jpg images; the displacements are indicated in brackets:

In [24]:
img1=plt.imread('out_cathedral.jpg')
img2=plt.imread('out_monastery.jpg')
img3=plt.imread('out_obolsk.jpg')

f=plt.figure(figsize=(20, 20))
ax1=f.add_subplot(1,3,1)
ax2=f.add_subplot(1,3,2)
ax3=f.add_subplot(1,3,3)

ax1.imshow(img1)
ax2.imshow(img2)
ax3.imshow(img3)
ax1.set_axis_off()
ax2.set_axis_off()
ax3.set_axis_off()
ax1.set_title('cathedral.jpg, g[5,2], r[12,3]')
ax2.set_title('monastery.jpg, g[-3,2], r[3,2]')
ax3.set_title('tobolsk.jpg, g[3,3], r[6,3]')

plt.show()

2. Pyramid search

For high-resolution images (tif images), we implement a pyramid search by recursively calling the exhaustive search algorithm to align the three color channels (r, g, b) and produce a color image. Algorithm (a code sketch follows the list):

  1. To align r with respect to b, we first generate an image pyramid for r and b. An image pyramid represents the image at multiple scales, with each level downscaled by a factor of 2 relative to the previous one. Suppose we build a pyramid of r with 6 levels (more levels means more search steps and a longer runtime).
  2. We start at the 6th level, i.e. we scale the r and b images by a factor of 1/2**(6-1) = 1/32. We then call the exhaustive search algorithm to align the scaled r and b images based on the SSD. At this scale the search window can be small, say $[-3,3]$, to take full advantage of the pyramid and save runtime. This gives the displacement between r and b at the 6th level.
  3. We then align r and b at the 5th level by calling the exhaustive search algorithm again, but this time the search window is centered at the shift found at the 6th level. Note that the shift needs to be multiplied by 2 to account for the different scale between the two levels. We then update the displacement between r and b.
  4. We repeat this process until we reach level 0, i.e. the original image, and follow the same steps to align g relative to b.
  5. Stack the shifted r, g, b channels to produce the color image.
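Here is a minimal sketch of the recursive pyramid search above, reusing the hypothetical align_exhaustive from the single-scale section. The use of skimage.transform.rescale for the factor-of-2 downscaling is an assumption for illustration; any halving of the resolution would serve.

import numpy as np
from skimage.transform import rescale

def align_pyramid(channel, reference, levels=6, window=3):
    # Base case: at the coarsest level, just run the exhaustive search.
    if levels == 1:
        return align_exhaustive(channel, reference, window)
    # Estimate the shift at half resolution, then scale it up by 2.
    dy, dx = align_pyramid(rescale(channel, 0.5), rescale(reference, 0.5),
                           levels - 1, window)
    dy, dx = 2 * dy, 2 * dx
    # Refine around the scaled-up estimate with a small search window.
    shifted = np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
    ddy, ddx = align_exhaustive(shifted, reference, window)
    return dy + ddy, dx + ddx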

Here are the results for the tif images; the displacements are indicated in brackets:

In [37]:
filenames = ['out_emir.jpg', 'out_harvesters.jpg', 'out_con.jpg',
             'out_lady.jpg', 'out_melons.jpg', 'out_onion_church.jpg',
             'out_self_portra.jpg', 'out_hree_generations.jpg',
             'out_rain.jpg', 'out_village.jpg', 'out_worksho.jpg']
images = [plt.imread(name) for name in filenames]
In [39]:
titles = ['emir.tif, g[49,24], r[103,57]',
          'harvester.tif, g[59,16], r[124,13]',
          'icon.tif, g[41,17], r[90,23]',
          'lady.tif, g[56,8], r[116,11]',
          'melons.tif, g[82,11], r[178,13]',
          'onion_church.tif, g[51,27], r[108,36]',
          'self_portrait.tif, g[79,29], r[176,36]',
          'three_generations.tif, g[53,14], r[112,11]',
          'train.tif, g[42,6], r[87,32]',
          'village.tif, g[65,12], r[138,22]',
          'workshop.tif, g[53,0], r[105,-12]']

f = plt.figure(figsize=(20, 20))
for i, (img, title) in enumerate(zip(images, titles)):
    ax = f.add_subplot(4, 3, i + 1)
    ax.imshow(img)
    ax.set_axis_off()
    ax.set_title(title)

plt.show()

Other example images downloaded from the Prokudin-Gorskii collection

In [4]:
img1=plt.imread('out_00082a.jpg')
img2=plt.imread('out_00162a.jpg')
img3=plt.imread('out_00163v.jpg')
img4=plt.imread('out_00170v.jpg')

f=plt.figure(figsize=(20, 20))
ax1=f.add_subplot(1,4,1)
ax2=f.add_subplot(1,4,2)
ax3=f.add_subplot(1,4,3)
ax4=f.add_subplot(1,4,4)

ax1.imshow(img1)
ax2.imshow(img2)
ax3.imshow(img3)
ax4.imshow(img4)
ax1.set_axis_off()
ax2.set_axis_off()
ax3.set_axis_off()
ax4.set_axis_off()
ax1.set_title('tif 1, g[32,4], r[79,7]')
ax2.set_title('tif 2, g[35,3], r[98,4]')
ax3.set_title('jpg 1, g[-3,1], r[-4,1]')
ax4.set_title('jpg 2, g[4,-2], r[-6,-6]')

plt.show()

3. Automatic cropping

We implement automatic cropping to remove white, black, or other colored borders. The algorithm (a code sketch follows the list) is:

  1. For an input image, we first detect edges in each of the three channels using a Sobel filter.
  2. For the three Sobel-filtered images, we calculate the row averages. If the average of a row deviates from the overall mean by more than 2 standard deviations (2 is a threshold that can be changed in the code), that row is likely a white/black/colored border that needs to be removed.
  3. We calculate the column averages in the same way and remove the vertical borders.
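Here is a minimal sketch of the row-wise border test in steps 1 and 2, assuming scipy.ndimage.sobel as the edge filter; the function name border_rows and the exact statistics are illustrative rather than the project's exact code. Columns are handled the same way by averaging along the other axis (step 3).

import numpy as np
from scipy.ndimage import sobel

def border_rows(channel, threshold=2.0):
    # Edge strength per pixel (derivative across rows), then the average
    # edge strength per row.
    edges = np.abs(sobel(channel.astype(float), axis=0))
    row_means = edges.mean(axis=1)
    # Rows whose average deviates from the overall mean by more than
    # `threshold` standard deviations are flagged as border rows.
    return np.abs(row_means - row_means.mean()) > threshold * row_means.std()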
In [41]:
img1=plt.imread('out_con.jpg')
img2=plt.imread('crop_out_con.jpg')

f=plt.figure(figsize=(20, 20))
ax1=f.add_subplot(1,2,1)
ax2=f.add_subplot(1,2,2)

ax1.imshow(img1)
ax2.imshow(img2)
ax1.set_axis_off()
ax2.set_axis_off()
ax1.set_title('icon.jpg, raw image')
ax2.set_title('crop_icon.jpg, cropped image')

plt.show()