Overview
The goal is to take the digitized Prokudin-Gorskii glass-plate images and automatically produce a color image with as few artifacts as possible. Each plate contains three grayscale exposures captured through different color filters stacked vertically (top→bottom: B, G, R). I extract the three channels, align G and R to B under an (x, y) translation model, and compose an RGB image.
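A minimal sketch of the split-and-compose step, assuming the plate is loaded as a 2D NumPy array whose height is roughly three times the height of one exposure. The names `split_plate` and `compose_rgb` are illustrative, and the circular shift via `np.roll` and the convention that (dx, dy) means (columns, rows) are assumptions rather than details taken from my implementation.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into equal-height (b, g, r) strips."""
    h = plate.shape[0] // 3          # leftover rows at the bottom are dropped
    b = plate[0:h]
    g = plate[h:2 * h]
    r = plate[2 * h:3 * h]
    return b, g, r

def compose_rgb(b, g, r, g_shift, r_shift):
    """Shift G and R by (dx, dy) and stack the channels in R, G, B order."""
    # np.roll wraps pixels around the border (assumed acceptable here).
    g_al = np.roll(g, (g_shift[1], g_shift[0]), axis=(0, 1))
    r_al = np.roll(r, (r_shift[1], r_shift[0]), axis=(0, 1))
    return np.dstack([r_al, g_al, b])
```

With this convention, `compose_rgb(b, g, r, (dx_g, dy_g), (dx_r, dy_r))` takes the two displacement vectors reported for each image.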
Full-resolution plates are large, so I begin with smaller images or downsampled versions, then scale up via a coarse-to-fine image pyramid. For each image I report the displacement vectors used to align G→B and R→B.
Core Idea — Binarized High-Pass Features
Binarization is the distinctive choice in my pipeline. Rather than comparing raw intensities across channels (which often differ in brightness/contrast), I compare structure by computing a high-pass map, taking absolute value, and thresholding to a binary mask. This emphasizes edges/texture that are present in all channels and works well with normalized cross-correlation (NCC) to reduce sensitivity to gain/offset changes.
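The feature map described above could look roughly like the sketch below. The text does not fix the high-pass filter or the threshold rule, so the box-blur subtraction, the `radius` parameter, and the quantile-based threshold are all assumptions chosen for illustration; `binarized_highpass` is a hypothetical name.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def binarized_highpass(img, radius=2, quantile=0.9):
    """High-pass (image minus blur), absolute value, then threshold to 0/1."""
    img = img.astype(np.float64)
    hp = np.abs(img - box_blur(img, radius))
    # Assumed rule: keep the pixels above the given quantile of |high-pass|.
    thresh = np.quantile(hp, quantile)
    return (hp > thresh).astype(np.float64)
```

Because the mask depends only on where local structure is strongest, a positive gain or a constant offset applied to a channel changes the raw intensities but leaves the selected pixels essentially the same.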
Example: Raw → High-Pass → Binarized
Approach
Single-scale baseline. Split B/G/R strips and align G and R to B by exhaustively searching displacements within a window using NCC on high-pass binarized features. Apply the best shift to the original channel and compose RGB.
Coarse-to-fine pyramid. For large displacements, build an image pyramid, estimate at coarse scale, and refine toward full resolution. Cropping a small border before scoring helps avoid plate-edge bias.
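Both steps above can be sketched with one windowed search routine: the single-scale baseline is a search centered at (0, 0), and the pyramid calls the same search at each level around the doubled estimate from the coarser level. This is a sketch under assumptions, not my exact code: the function names, the 2x2 block-average downsampling, the `min_size`, `coarse_radius`, and `fine_radius` defaults, and the circular shift via `np.roll` are illustrative, and the border crop is omitted for brevity. The inputs are expected to be feature maps (for example the binarized high-pass masks), not raw intensities.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def search(moving, reference, center=(0, 0), radius=15):
    """Test every shift within +/- radius of center; return the best (dx, dy)."""
    cx, cy = center
    best_score, best_shift = -np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            # np.roll wraps around the border (assumed acceptable here).
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def downsample2(img):
    """Halve resolution by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(moving, reference, min_size=32, coarse_radius=8, fine_radius=2):
    """Estimate (dx, dy) at the coarsest level, then double and refine per level."""
    if min(moving.shape) <= min_size:
        return search(moving, reference, (0, 0), coarse_radius)
    dx, dy = align_pyramid(downsample2(moving), downsample2(reference),
                           min_size, coarse_radius, fine_radius)
    return search(moving, reference, (2 * dx, 2 * dy), fine_radius)
```

The returned shift is the one to apply to the original G or R channel before composing the RGB image.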
Iteration 1 — Results (Baseline)
Aligned outputs for the required set. Displacements (dx, dy) are reported for G→B and R→B.
Displacement Offsets
| Image | G → B (dx, dy) | R → B (dx, dy) |
|---|---|---|
Iteration 2 — Results (Coarse-to-Fine + Simple Crop)
Iteration 2 wraps the same high-pass + threshold + NCC matching in a coarse-to-fine alignment that accumulates shifts from coarse to fine levels, with a simple inner crop to suppress plate borders. It showed little to no improvement over the baseline, and in some cases it performed worse.
Displacement Offsets (Iteration 2)
| Image | G → B (dx, dy) | R → B (dx, dy) |
|---|---|---|
Additional Examples from the Collection
Discussion: Failures & Fixes
Brightness mismatch (e.g., Emir). Cross-channel intensity differences can confuse raw-pixel similarity, which is why I match high-pass binarized features with NCC instead of raw intensities. Increasing the crop margin or switching to gradient-magnitude features may further improve robustness.
Border artifacts. Strong horizontal borders from plate edges can dominate similarity. Cropping a fixed margin (e.g., 5–10% per side) before scoring helps.
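Large shifts. A pyramid with a small search window at each level recovers large motions efficiently, because a shift of a few pixels at a coarse level corresponds to a much larger shift at full resolution.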
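The crop-before-scoring and gradient-magnitude fixes discussed above can be sketched as follows. This is an illustration under assumptions rather than my exact code: the helper names, the default 10% margin, and the use of `np.gradient` for the gradient are choices made for the example. The crop is applied after shifting, so rows and columns that wrap around in `np.roll` fall outside the scored region for shifts smaller than the margin.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop a margin of `frac` of the height/width from each side (frac < 0.5)."""
    h, w = img.shape[:2]
    my, mx = int(round(h * frac)), int(round(w * frac))
    return img[my:h - my, mx:w - mx]

def gradient_magnitude(img):
    """Per-pixel gradient magnitude; unaffected by a constant brightness offset."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def score_shift(moving, reference, shift, frac=0.1):
    """Score one candidate (dx, dy) on the cropped interior only."""
    dx, dy = shift
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    return ncc(crop_interior(shifted, frac), crop_interior(reference, frac))
```

To try gradient features, `gradient_magnitude` can be applied to both channels before calling `score_shift`; whether that beats the binarized high-pass features on a given plate would need to be checked empirically.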