I’m currently working on a computer vision project to implement seam carving, which the technique’s inventors explain quickly in this video. The goal of seam carving is content-aware image resizing. Part of the algorithm is a numerical method for identifying the ‘important’ content of an image (at least with respect to resizing). I wanted to see what this method would identify as ‘important’ in some great artworks from history, so I ran it on some of my favorite paintings (low-res versions from the internet) and am sharing the outputs here.
Brief Technical Explanation (skip if not interested)
The short story is that seam carving lets us resize images without distorting the important details. It works by finding connected paths of “low energy” pixels running through the image (call these paths seams) and removing those seams one at a time. Removing a full column or row instead might cut straight through something important in the image, like a person’s face or the edge of a large object. The easiest way to get an overview is to watch the video above, but hopefully that’s a succinct enough explanation for anyone impatient like me.
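To make the idea of a seam concrete, here is a minimal sketch of finding and removing one vertical seam with dynamic programming, which is the usual way to do it. This is not code from my project: the energy values in `E` are made up for illustration, the variable names are placeholders, and how real energy values are computed is covered in the next paragraph.

```matlab
% Toy energy matrix with made-up values; rows are y, columns are x.
E = [3 1 4 1;
     5 9 2 6;
     5 3 5 8;
     9 7 9 3];
[rows, cols] = size(E);

% M(y,x) holds the lowest total energy of any connected path
% from the top row down to pixel (y,x).
M = E;
for y = 2:rows
    for x = 1:cols
        lo = max(x - 1, 1);
        hi = min(x + 1, cols);
        M(y, x) = E(y, x) + min(M(y - 1, lo:hi));
    end
end

% Start at the cheapest pixel in the bottom row and walk back up,
% always stepping to the cheapest of the (up to) three pixels above.
seam = zeros(rows, 1);
[seam_cost, start_col] = min(M(rows, :));
seam(rows) = start_col;
for y = (rows - 1):-1:1
    lo = max(seam(y + 1) - 1, 1);
    hi = min(seam(y + 1) + 1, cols);
    [~, offset] = min(M(y - 0, lo:hi));
    seam(y) = lo + offset - 1;
end

% Delete one pixel per row, which narrows the matrix by one column.
carved = zeros(rows, cols - 1);
for y = 1:rows
    carved(y, :) = E(y, [1:(seam(y) - 1), (seam(y) + 1):cols]);
end
```

Here `seam(y)` is the column removed from row `y`. Resizing a real image would repeat this on the image itself, recomputing the energy after each removal.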
The first step in implementing the seam carving algorithm is to compute what’s called the energy function of the image’s pixel gradients. We can do this by storing the image’s grayscale values (0-255) in a 2D matrix, applying a filter to estimate the edge gradient in both the x-direction and the y-direction, and then computing the magnitude of each pixel’s gradient, weighting the x-gradient and y-gradient equally, i.e. sqrt(x^2 + y^2). If that doesn’t make a ton of sense yet, don’t worry. The video is the easiest way to understand the concept, and reading the paper is the easiest way to understand the process. This is just the super succinct version. (I also probably explained it badly.)
The Energy Function Code
    function [ energy_matrix ] = energy_image( image_matrix_input )
    %ENERGY_IMAGE Computes the energy at each pixel of an n-by-m-by-3 image matrix
    %   Outputs a 2D matrix of class double containing the energy at each pixel

    % convert image to grayscale first
    G = rgb2gray(image_matrix_input);
    % convert to double
    G2 = im2double(G);
    % create X and Y filters
    horizontal_filter = [1 0 -1; 2 0 -2; 1 0 -1];
    vertical_filter = [1 2 1; 0 0 0; -1 -2 -1];
    % using imfilter to get our gradient in each direction
    filtered_x = imfilter(G2, horizontal_filter);
    filtered_y = imfilter(G2, vertical_filter);
    energy_matrix = zeros(size(G2,1), size(G2,2));
    % compute the energy at each pixel using the magnitude of the x and y
    % gradients: sqrt((dI/dX)^2 + (dI/dY)^2)
    for y = 1:size(G2, 1)
        for x = 1:size(G2, 2)
            % calculate energy function
            y_magnitude = filtered_y(y,x) ^ 2;
            x_magnitude = filtered_x(y,x) ^ 2;
            energy_output = sqrt( y_magnitude + x_magnitude );
            % fill energy matrix with our calculation
            energy_matrix(y,x) = energy_output;
        end
    end
    end
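As a small worked example of the same calculation, here is a sketch that runs the two filters over a toy image with a single vertical edge. It is a stand-in, not the function above: it uses `conv2` instead of `imfilter` so that it needs no toolbox, and it computes the magnitude in one vectorized line rather than a loop. The toy image `G2` is an assumption for illustration.

```matlab
% Toy grayscale image in [0,1]: dark left half, bright right half.
G2 = [zeros(5, 3), ones(5, 3)];

horizontal_filter = [1 0 -1; 2 0 -2; 1 0 -1];
vertical_filter   = [1 2 1; 0 0 0; -1 -2 -1];

% conv2 flips the kernel before sliding it over the image. For these
% two filters the flip only changes the sign of the result, and the
% squaring below removes the sign, so the energy is unaffected.
filtered_x = conv2(G2, horizontal_filter, 'same');
filtered_y = conv2(G2, vertical_filter, 'same');

% Gradient magnitude at every pixel, without an explicit loop.
energy_matrix = sqrt(filtered_x .^ 2 + filtered_y .^ 2);

% Energy along the middle row of the toy image.
middle_row = energy_matrix(3, :);
```

In the middle row the energy is 0 in the flat regions and 4 at the two pixels on either side of the edge. The last pixel of that row also comes out as 4, because the image is padded with zeros at its border and a bright border pixel next to that padding looks like an edge.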
The Good Stuff
Okay, so now we can compute the energy function of any image.
The output of this energy function is supposed to be a map in which the important details of the image (the ones we don’t want to carve out when we resize) show up as brighter intensities, closer to white. The less important details show up in darker shades of gray and black.
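One plausible way to turn an energy matrix into that kind of grayscale picture is to scale it so the highest energy maps to white. This is a sketch under assumptions, not necessarily how the images below were rendered: the values in `energy_matrix` are stand-ins, and the output filename is hypothetical.

```matlab
% Stand-in energy values; in practice this would be the output of energy_image.
energy_matrix = [0 2 4; 1 8 0];

% Scale so the highest-energy pixel becomes 1 (white) and zero energy stays 0 (black).
peak = max(energy_matrix(:));
if peak > 0
    energy_display = energy_matrix ./ peak;
else
    % A completely flat image has no energy anywhere; show it as all black.
    energy_display = zeros(size(energy_matrix));
end

% To save the result as an image (hypothetical filename):
% imwrite(energy_display, 'energy.png');
```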
So my question is: what happens when we run historical artworks through this algorithm? Does it tell us anything new about the artist, their techniques, and so on? I’ve always heard that expert art historians can confirm or deny the authenticity of an artwork by looking at brush strokes and the like. I’ve also read stories about x-rays of famous artworks revealing hidden paintings beneath.
For the most part I’ll let you decide, but in any case, I found some of the outputs to be really beautiful and interesting in their own right. Enough talking for now, here’s the imagery.
You’ll have to click through to see the images in full-size (bottom right button in popup after clicking on the thumbnail), and some of the more interesting details need magnification, which requires downloading the images, just as a heads up.
There’s a lot more that can be done with this, of course, and this post has no real analysis, conclusions, or anything of the like. But I wanted to share what I was playing around with, in case anyone else finds it interesting, like I do. -t