Images of the Russian Empire

CS194-26 (CS294-26): Project 1

Overview:

Sergei Mikhailovich Prokudin-Gorskii was taking color photographs before there was any practical way to print them. He recorded scenes of everyday life on glass plates as three separate exposures, taken through red (R), green (G), and blue (B) filters.

The aim of this project is therefore to reconstruct color photos from Prokudin-Gorskii's R, G, and B plates by aligning the channels with a displacement search and applying other image processing techniques.

Approach:

  • The initial implementation was very slow on the large .tif files, so I implemented an image pyramid for them. Essentially, I recursively applied my displacement method to rescaled versions of the original image, cutting the resolution by a factor of 2 each time.

  • SSD and NCC gave similar results, so I used SSD. With blue as the reference channel, SSD is essentially $\sum \sum (r - b)^2$ or $\sum \sum (g - b)^2$, summed over all pixel rows and columns.

  • When the recursion hits its base case, I stop rescaling. I then work back up through the larger scales and stop only once the full-resolution image has been processed.
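The two metrics compared above can be sketched as follows. This is a minimal illustration and not the notebook's actual code: the function names `ssd` and `ncc` are hypothetical, and the NCC shown is the zero-mean variant, which is an assumption.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; a lower value means a better match."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Zero-mean normalized cross-correlation; a higher value means a better match."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant image has zero norm after mean removal; report no correlation.
    return float(a @ b / denom) if denom > 0 else 0.0
```

SSD is minimized while NCC is maximized, so a search loop has to flip its comparison when switching between them.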

Naive Approach:

My initial approach worked really well for low-resolution photos but was terribly slow for high-resolution ones. It used a displacement heuristic based on SSD.
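A minimal sketch of such an exhaustive SSD search is shown here. The name `align_exhaustive`, the `max_shift` window of 15 pixels, and the use of `np.roll` (which wraps pixels around the border) are all assumptions made for illustration, not details taken from the notebook.

```python
import numpy as np

def align_exhaustive(channel, reference, max_shift=15):
    """Return the (dy, dx) roll of `channel` that minimizes SSD against `reference`."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Shift rows by dy and columns by dx, wrapping at the edges.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The cost grows with both the window area and the image area, which is consistent with the slowdown on high-resolution scans: a window of 15 already means 961 full-image comparisons per channel.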

In [57]:
#without cropping
low_res_implementation('cathedral.jpg');
green align (x,y) (5, 2)
red align (x,y) (12, 3)
In [51]:
low_res_implementation('monastery.jpg');
green align (x,y) (-3, 2)
red align (x,y) (3, 2)

Image Pyramid Approach:

Exhaustive search is very slow for large images. Therefore, I implemented a faster search function called 'pyramid', whose results are shown below.

Result of my algorithm on example images:

In [28]:
 
green align (x,y) (5, 2)
red align (x,y) (12, 3)
green align (x,y) (-3, 2)
red align (x,y) (3, 2)
green align (x,y) (3, 3)
red align (x,y) (6, 3)
green align (x,y) 49 ,  24
red align (x,y) 103 ,  55
green align (x,y) 59 ,  16
red align (x,y) 123 ,  13
green align (x,y) 41 ,  17
red align (x,y) 89 ,  23
green align (x,y) 51 ,  9
red align (x,y) 112 ,  11
green align (x,y) 81 ,  10
red align (x,y) 178 ,  13
green align (x,y) 51 ,  26
red align (x,y) 108 ,  36
green align (x,y) 78 ,  29
red align (x,y) 176 ,  37
green align (x,y) 53 ,  14
red align (x,y) 112 ,  11
green align (x,y) 42 ,  5
red align (x,y) 87 ,  32
green align (x,y) 56 ,  21
red align (x,y) 116 ,  28
FileNotFoundError: [Errno 2] No such file or directory: 'village.tifwood_boat.tif'
In [32]:
## Sorry, please ignore the error above. Here are the rest:
green align (x,y) 64 ,  12
red align (x,y) 137 ,  22
green align (x,y) 34 ,  -14
red align (x,y) 121 ,  -25
green align (x,y) 53 ,  0
red align (x,y) 105 ,  -12

Additional images are included above.

Description of Bells and Whistles:

(1) Auto-cropping:

  • I limited the cropping to at most 10% of the edge length on each side.
  • Then I set a minimum threshold value and tested each row and column against it.
  • I cropped any rows or columns that exceeded the threshold.
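One way to realize these steps is sketched below. The source does not say which statistic was thresholded, so this sketch assumes a row or column counts as border when its mean intensity is very dark or very bright. The name `auto_crop` and the `dark` and `bright` values are hypothetical; only the 10% cap comes from the description above.

```python
import numpy as np

def auto_crop(im, max_frac=0.10, dark=0.15, bright=0.95):
    """Crop border rows and columns from a grayscale float image in [0, 1]."""
    def border_width(means):
        # Look at most max_frac of the way in from this edge.
        limit = int(len(means) * max_frac)
        width = 0
        for i in range(limit):
            # Assumed test: a border line is nearly black or nearly white.
            if means[i] < dark or means[i] > bright:
                width = i + 1
        return width

    row_means = im.mean(axis=1)
    col_means = im.mean(axis=0)
    top = border_width(row_means)
    bottom = border_width(row_means[::-1])
    left = border_width(col_means)
    right = border_width(col_means[::-1])
    return im[top:im.shape[0] - bottom, left:im.shape[1] - right]
```

Cropping before alignment keeps the plate borders out of the SSD sum, which may explain why the cathedral offsets reported after cropping differ from those before.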

After Cropping

In [54]:
low_res_implementation('cathedral.jpg');
green align (x,y) (1, -1)
red align (x,y) (7, -1)

Before Cropping

In [52]:
low_res_implementation('cathedral.jpg');
green align (x,y) (5, 2)
red align (x,y) (12, 3)

(2) Contrast Adjustment

  • I applied a sigmoid function to every pixel of the R, G, and B channels.
  • Then I merged the channels back together.
  • Finally, I tuned the parameters until I got the best results.
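A per-pixel sigmoid contrast curve could look like the sketch below. The names `sigmoid_contrast` and `adjust_contrast`, the `gain` and `midpoint` parameters and their defaults, and the rescaling that pins 0 to 0 and 1 to 1 are assumptions; the notebook's tuned values are not given in the source.

```python
import numpy as np

def sigmoid_contrast(channel, gain=8.0, midpoint=0.5):
    """Apply an S-shaped contrast curve to a float channel in [0, 1]."""
    x = np.asarray(channel, dtype=np.float64)
    out = 1.0 / (1.0 + np.exp(-gain * (x - midpoint)))
    # Rescale so that input 0 maps to 0 and input 1 maps to 1.
    lo = 1.0 / (1.0 + np.exp(gain * midpoint))
    hi = 1.0 / (1.0 + np.exp(-gain * (1.0 - midpoint)))
    return (out - lo) / (hi - lo)

def adjust_contrast(r, g, b, gain=8.0, midpoint=0.5):
    """Apply the curve to each channel, then merge into an H x W x 3 image."""
    return np.dstack([sigmoid_contrast(c, gain, midpoint) for c in (r, g, b)])
```

A larger `gain` steepens the curve around `midpoint`, pushing midtones apart while compressing shadows and highlights, so tuning amounts to trying a few values and comparing the merged images.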

After contrast:

In [55]:
low_res_implementation('cathedral.jpg');
green align (x,y) (1, -1)
red align (x,y) (7, -1)

Before contrast

In [58]:
low_res_implementation('cathedral.jpg');
green align (x,y) (5, 2)
red align (x,y) (12, 3)