This fun little project was inspired by a forum post for the Coursera course Discrete Inference and Learning in Artificial Vision.
I use the method outlined in Graphcut Textures: Image and Video Synthesis Using Graph Cuts.
The toy problem is as follows. Given two images overlaid on top of each other, with say a 50% overlap, find the best seam cut through the topmost image to produce the best “blended” looking image. No actual alpha blending is performed though!
This is a task that can be done physically with two photos and a pair of scissors.
The problem is illustrated with the example below. Here the second image I want to blend with is a duplicate of the first image. The aim is to find a suitable seam cut in the top image such that when I merge the two images it produces the smoothest transition. This may seem unintuitive without alpha blending, but it is possible depending on the image; not all types of images will work with this method.
By formulating the problem as a graph cut problem we get the following result.
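In the Graphcut Textures formulation, each pixel in the overlap region becomes a graph node, and the edge between two adjacent pixels s and t is weighted by how much the two images disagree at both endpoints, so the min-cut seam prefers to pass through regions where the images already agree. A minimal sketch of that edge cost for grayscale pixels (the function name is mine, not from the SeamCut code):

```cpp
#include <cmath>

// Matching cost for the edge between adjacent overlap pixels s and t.
// a_s, b_s: values of images A and B at pixel s; a_t, b_t: values at pixel t.
// Zero cost means the two images agree perfectly at both pixels, so the
// min-cut seam can pass through here for free.
double seamEdgeWeight(double a_s, double b_s, double a_t, double b_t)
{
    return std::fabs(a_s - b_s) + std::fabs(a_t - b_t);
}
```

Running maxflow/min-cut on a graph with these edge weights (plus source/sink links pinning each image's non-overlap side) yields the seam.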
The actual seam cut is shown as a thin red line. You might have to squint.
If you look closely you’ll see some strangeness at the seam; it’s not perfect. But from a distance it’s pretty convincing.
Here are some more examples using the same method as above, that is: duplicate the input image, shift it by 50% in the x direction, and find the seam cut in the top layer image.
This one is very realistic.
Who likes penguins?
You’ll need OpenCV 2.x installed.
I’ve also included the maxflow library from http://vision.csd.uwo.ca/code/ for convenience.
To run, call
$ ./SeamCut img.jpg
I’ve been mucking around with video stabilization for the past two weeks after a masters student got me interested in the topic. The algorithm is pretty simple yet produces surprisingly good stabilization for panning videos and forward-moving ones (e.g. on a motorbike looking ahead). The algorithm works as follows:
- Find the transformation from the previous to the current frame using optical flow for all frames. The transformation consists of only three parameters: dx, dy, da (angle). Basically, a rigid Euclidean transform: no scaling, no shearing.
- Accumulate the transformations to get the “trajectory” for x, y, angle, at each frame.
- Smooth out the trajectory using a sliding average window. The user defines the window radius, where the radius is the number of frames used for smoothing.
- Create a new transformation such that new_transformation = transformation + (smoothed_trajectory - trajectory).
- Apply the new transformation to the video.
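Steps 2 to 4 above can be sketched for a single parameter; the same code runs independently for dx, dy and da. This is my own standard-library sketch, not the actual videostab.cpp code, and the function names are mine:

```cpp
#include <vector>
#include <cstddef>
#include <algorithm>

// Step 2: accumulate per-frame deltas into a running trajectory.
std::vector<double> accumulate(const std::vector<double>& deltas)
{
    std::vector<double> traj;
    double sum = 0.0;
    for (double d : deltas) { sum += d; traj.push_back(sum); }
    return traj;
}

// Step 3: sliding average with the given radius (window = 2*radius + 1),
// clamped at the two ends of the sequence.
std::vector<double> smooth(const std::vector<double>& traj, int radius)
{
    std::vector<double> out(traj.size());
    for (int i = 0; i < (int)traj.size(); ++i) {
        int lo = std::max(0, i - radius);
        int hi = std::min((int)traj.size() - 1, i + radius);
        double sum = 0.0;
        for (int j = lo; j <= hi; ++j) sum += traj[j];
        out[i] = sum / (hi - lo + 1);
    }
    return out;
}

// Step 4: correct each per-frame delta by the smoothing residual:
// new_transformation = transformation + (smoothed_trajectory - trajectory).
std::vector<double> correct(const std::vector<double>& deltas,
                            const std::vector<double>& traj,
                            const std::vector<double>& smoothed)
{
    std::vector<double> out(deltas.size());
    for (std::size_t i = 0; i < deltas.size(); ++i)
        out[i] = deltas[i] + (smoothed[i] - traj[i]);
    return out;
}
```

The corrected deltas are what get applied to each frame in step 5.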
Here’s an example video of the algorithm in action using a smoothing radius of +- 30 frames.
We can see what’s happening under the hood by plotting some graphs for each of the steps mentioned above on the example video.
This graph shows the dx, dy transformation from previous to current frame, at each frame. I’ve omitted da (angle) because it’s not particularly interesting for this video, since there is very little rotation. You can see it’s quite a bumpy graph, which correlates with our observation of the video being shaky, though still orders of magnitude better than Hollywood’s shaky cam effect. I’m looking at you, Bourne Supremacy.
Steps 2 and 3
I’ve shown both the accumulated x and y, and their smoothed version so you get a better idea of what the smoothing is doing. The red is the original trajectory and the green is the smoothed trajectory.
It is worth noting that the trajectory is a rather abstract quantity that doesn’t necessarily have a direct relationship to the motion induced by the camera. For a simple panning scene with static objects it probably has a direct relationship with the absolute position of the image, but for scenes with a forward-moving camera, e.g. on a car, it’s hard to see any such relationship.
The important thing is that the trajectory can be smoothed, even if it doesn’t have any physical interpretation.
This is the final transformation applied to the video.
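For step 5, each frame’s corrected (dx, dy, da) is turned back into the 2x3 rigid transform matrix that an affine warp (e.g. OpenCV’s warpAffine) expects. A small sketch of that matrix construction (helper name is mine):

```cpp
#include <array>
#include <cmath>

// Build the 2x3 rigid (Euclidean) transform, row-major:
// [ cos(da)  -sin(da)  dx ]
// [ sin(da)   cos(da)  dy ]
std::array<double, 6> rigidTransform(double dx, double dy, double da)
{
    std::array<double, 6> m = {{ std::cos(da), -std::sin(da), dx,
                                 std::sin(da),  std::cos(da), dy }};
    return m;
}
```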
videostabKalman.cpp (live version by Chen Jia using a Kalman Filter)
You just need OpenCV 2.x or above.
Once compiled, run it from the command line via
Footage I took during my travels.
We (being myself and my buddy Jay) have been working on a fun vision pet project over the past few months. The project started from a little boredom and lots of discussion over wine back in July 2013. We’ve finally got the video done. It demonstrates our vision based localisation system (no GPS) on a car.
The idea is simple, to use the horizon line as a stable feature when performing image matching. The experiments were carried out on the freeway at 80-100 km/h (hence the name of the project). The freeway is just one long straight road, so the problem is simplified and constrained to localisation on a 1D path.
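Since the path is 1D, localisation reduces to finding the stored frame whose appearance best matches the live frame. The following is my own toy simplification of that idea, not the actual system: each stored frame along the road is reduced to a descriptor vector, and the live frame is localised by a nearest-neighbour search under sum-of-squared-differences.

```cpp
#include <vector>
#include <cstddef>
#include <limits>

// Return the index of the database descriptor closest to the query
// descriptor (SSD distance). The index corresponds to a position
// along the 1D path. Assumes all descriptors have the same length.
std::size_t localise(const std::vector<std::vector<double>>& database,
                     const std::vector<double>& query)
{
    std::size_t best = 0;
    double bestCost = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < database.size(); ++i) {
        double cost = 0.0;
        for (std::size_t k = 0; k < query.size(); ++k) {
            double d = database[i][k] - query[k];
            cost += d * d;
        }
        if (cost < bestCost) { bestCost = cost; best = i; }
    }
    return best;
}
```

The real system constrains this search with a motion model rather than scanning the whole database, as discussed below.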
Now, without further ado, the video
We’re hoping to do more work on this project if time permits. The first thing we want to improve is the motion model. At the moment, the system assumes the car travels at the same speed as in the previously collected video (which is true most of the time, but not always, e.g. in bad traffic). We have plans to determine the speed of the vehicle more accurately.
Don’t forget to visit Jay’s website as well!