I was never much into image processing. Sure, like most programmers I dabbled in it for cropping images or doing some fancy-schmancy filtering effects stuff. I even wrote a Flickr clone for my last book which has a rather impressive photo editor (mashed up from Pixlr, not mine). But I never thought much about how those effects were done or who came up with them in the first place. That is, until I met Irwin Sobel.
For those who know their image processing, this should ring bells immediately. Yes, it’s that Sobel. But first, a minute to give some background: Irwin is a colleague of mine working in the Mobile and Immersive Experience Lab in HP Labs. I was visiting about two weeks ago and was introduced to him and his current projects. Inevitably someone talked about the Sobel operator, a commonly used algorithm for edge detection. I was, unfortunately, totally clueless about what it was. Not good. So, not surprisingly, I ended up Googling for ‘Sobel operator’ at the first possible chance and found out what it was.
The Sobel operator is an algorithm for edge detection in images. Edge detection, for those who are not familiar with the term, is an image processing technique to discover the boundaries between regions in an image. It’s an important part of detecting features and objects in an image. Simply put, edge detection algorithms help us to determine and separate objects from the background in an image.
The Sobel operator does this in a rather clever way. An image gradient is a change in intensity (or color) of an image (I’m oversimplifying but bear with me). An edge in an image occurs where the gradient is greatest, and the Sobel operator makes use of this fact to find the edges in an image. The Sobel operator calculates the approximate image gradient at each pixel by convolving the image with a pair of 3×3 filters. These filters estimate the gradients in the horizontal (x) and vertical (y) directions, and the magnitude of the gradient combines these 2 gradients.
The magnitude of the gradient, which is what we use, is calculated using: G = sqrt(Gx² + Gy²), where Gx and Gy are the horizontal and vertical gradients.
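As a rough illustration of how the two directional gradients turn into one magnitude, here is a tiny Ruby sketch. It uses the same square-root-of-squares calculation as the Ruby implementation later in this post; the method name gradient_magnitude and the sample numbers are made up for the example.

```ruby
# Combine the horizontal (gx) and vertical (gy) gradient estimates
# into a single magnitude: sqrt(gx^2 + gy^2).
# The method name is hypothetical, chosen only for this sketch.
def gradient_magnitude(gx, gy)
  Math.sqrt((gx * gx) + (gy * gy))
end

puts gradient_magnitude(3, 4)     # 5.0
puts gradient_magnitude(1020, 0)  # 1020.0
```

Note that with 8-bit grayscale input the magnitude can be much larger than 255, so it usually needs to be capped or scaled before it is written back as a pixel value.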
That’s the simplified, 2-paragraph theory behind the algorithm. If this fascinates you, you should grab a couple of books on image processing and computer vision and go through them.
Let’s look at how to implement the Sobel operator. This is done simply by creating the 2 filters and running them over each pixel in the image, starting from the left and going right. Note that because the filter is a 3×3 matrix, the pixels in the first and last rows as well as the first and last columns cannot be estimated, so the edge-detected area will be 1 pixel smaller than the original image on every side.
To calculate the pixel on the right side of the equation (the one with coordinates 1,1) with the x filter, the following equation is used, where each pair of coordinates is [row, column]:
output pixel [1,1] = ([0,0] x -1) + ([0,1] x 0) + ([0,2] x 1) + ([1,0] x -2) + ([1,1] x 0) + ([1,2] x 2) + ([2,0] x -1) + ([2,1] x 0) + ([2,2] x 1)
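To make the equation above a bit more concrete, here is a small Ruby sketch that applies the x filter to the centre of a single 3×3 patch. The patch values and the method name weighted_sum are made up for this example; indices are [row][column], matching the equation.

```ruby
# Hypothetical 3x3 grayscale patch with a vertical edge: dark on the left,
# bright on the right. Indexed as patch[row][column].
patch = [[0, 0, 255],
         [0, 0, 255],
         [0, 0, 255]]

sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Multiply each pixel by the filter weight at the same position and add
# everything up. This is the value of the output pixel at the patch centre.
def weighted_sum(patch, filter)
  sum = 0
  3.times do |row|
    3.times do |col|
      sum += patch[row][col] * filter[row][col]
    end
  end
  sum
end

puts weighted_sum(patch, sobel_x)  # 255 + (2 * 255) + 255 = 1020
```

A completely flat patch (all pixels the same value) gives 0, because the weights of the filter cancel each other out.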
To simplify matters even more, the grayscale version of the original image is usually used.
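If the source image is in color, one common way to get a grayscale value is a weighted sum of the red, green and blue channels. The sketch below assumes the widely used luma weights 0.299, 0.587 and 0.114; these are my assumption for illustration and are not what the implementation in this post does (it expects an image that is already grayscale). The method name to_gray is also made up.

```ruby
# Convert one RGB pixel (each channel 0..255) to a single grayscale value.
# The weights are the commonly used luma weights, an assumption of this
# sketch rather than something taken from the post.
def to_gray(r, g, b)
  ((0.299 * r) + (0.587 * g) + (0.114 * b)).round
end

puts to_gray(255, 255, 255)  # white stays at full intensity
puts to_gray(255, 0, 0)      # pure red becomes a fairly dark gray
```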
Now let’s look at the Ruby implementation:
    require 'chunky_png'

    # Reopen ChunkyPNG::Image to read the gray value (0-255) of a pixel.
    # to_grayscale_bytes expects a pixel whose r, g and b values are equal.
    class ChunkyPNG::Image
      def at(x,y)
        ChunkyPNG::Color.to_grayscale_bytes(self[x,y]).first
      end
    end

    img = ChunkyPNG::Image.from_file('engine.png')

    # The 2 Sobel filters, for the horizontal (x) and vertical (y) gradients
    sobel_x = [[-1,0,1],
               [-2,0,2],
               [-1,0,1]]

    sobel_y = [[-1,-2,-1],
               [0,0,0],
               [1,2,1]]

    edge = ChunkyPNG::Image.new(img.width, img.height, ChunkyPNG::Color::TRANSPARENT)

    # Skip the 1-pixel border, where the 3x3 neighbourhood is incomplete
    for x in 1..img.width-2
      for y in 1..img.height-2
        pixel_x = (sobel_x[0][0] * img.at(x-1,y-1)) + (sobel_x[0][1] * img.at(x,y-1)) + (sobel_x[0][2] * img.at(x+1,y-1)) +
                  (sobel_x[1][0] * img.at(x-1,y))   + (sobel_x[1][1] * img.at(x,y))   + (sobel_x[1][2] * img.at(x+1,y)) +
                  (sobel_x[2][0] * img.at(x-1,y+1)) + (sobel_x[2][1] * img.at(x,y+1)) + (sobel_x[2][2] * img.at(x+1,y+1))

        pixel_y = (sobel_y[0][0] * img.at(x-1,y-1)) + (sobel_y[0][1] * img.at(x,y-1)) + (sobel_y[0][2] * img.at(x+1,y-1)) +
                  (sobel_y[1][0] * img.at(x-1,y))   + (sobel_y[1][1] * img.at(x,y))   + (sobel_y[1][2] * img.at(x+1,y)) +
                  (sobel_y[2][0] * img.at(x-1,y+1)) + (sobel_y[2][1] * img.at(x,y+1)) + (sobel_y[2][2] * img.at(x+1,y+1))

        # Gradient magnitude, capped at 255 so it stays a valid gray value
        val = [Math.sqrt((pixel_x * pixel_x) + (pixel_y * pixel_y)).ceil, 255].min
        edge[x,y] = ChunkyPNG::Color.grayscale(val)
      end
    end

    edge.save('engine_edge.png')
The first thing you’d notice is that I used a library called ChunkyPNG, which is a PNG manipulation library implemented in pure Ruby. While wrappers over ImageMagick (like RMagick) are probably the de facto image processing and manipulation libraries in Ruby, I thought it would be kind of pointless to do a Sobel operator with ImageMagick since it already has its own edge detection implementation.
To simplify the implementation, I opened up the Image class in ChunkyPNG and added a new method that returns the grayscale value of the pixel at a specific location. Then I created the 2 Sobel filters as arrays of arrays. I created 2 nested loops to iterate through each pixel, column by column then row by row, and at each pixel I used the equation above to calculate the gradient by applying the x filter, then the y filter. Finally I combined the two into the gradient magnitude and set a grayscale pixel based on that value on a new image.
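The same walk-through can be sketched without any image library, on a plain array of rows of gray values. This is only an illustration under my own assumptions: the method names weighted_sum_at and sobel, the mode argument and the cap at 255 are all made up here. The mode argument is one possible way to get x-only or y-only results, equivalent to treating the other gradient as zero.

```ruby
# Weighted sum of the 3x3 neighbourhood centred on gray[row][col].
def weighted_sum_at(gray, filter, row, col)
  sum = 0
  3.times do |i|
    3.times do |j|
      sum += filter[i][j] * gray[row - 1 + i][col - 1 + j]
    end
  end
  sum
end

# gray is an array of rows, each an array of 0..255 intensities.
# mode is :x, :y or :both. Border pixels are left at 0 because the
# 3x3 neighbourhood is incomplete there.
def sobel(gray, mode = :both)
  sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
  sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
  height = gray.length
  width = gray.first.length
  out = Array.new(height) { Array.new(width, 0) }

  (1..height - 2).each do |row|
    (1..width - 2).each do |col|
      gx = weighted_sum_at(gray, sobel_x, row, col)
      gy = weighted_sum_at(gray, sobel_y, row, col)
      val = case mode
            when :x then gx.abs
            when :y then gy.abs
            else Math.sqrt((gx * gx) + (gy * gy)).ceil
            end
      out[row][col] = [val, 255].min  # cap so the result is a valid gray value
    end
  end
  out
end

# A tiny made-up image with a vertical edge down the middle.
gray = Array.new(4) { [0, 0, 255, 255] }
sobel(gray, :x).each { |row| puts row.inspect }
```

For this sample the x filter picks up the vertical edge while the y filter sees nothing, which is the kind of difference the x-only and y-only images show.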
Here you can see the original image, which I reused from the Wikipedia entry on the Sobel operator.
And the edge detected image with the x filter applied only.
This is the edge detected image with the y filter only.
Finally this is the edge detected image with both x and y filters applied.
This short exercise might not be technically challenging but it made me appreciate the pioneers who invented things that we now take for granted. Here’s a final picture, one with myself and Irwin (he is the guy who’s sitting opposite me), and a bunch of other colleagues at HP Labs Palo Alto over lunch. Thanks Irwin, for the Sobel operator!