I’ve been mucking around with video stabilization for the past two weeks after a master’s student got me interested in the topic. The algorithm is pretty simple yet produces surprisingly good stabilization for panning videos and forward-moving videos (e.g. on a motorbike looking ahead). The algorithm works as follows:
- Find the transformation from previous to current frame using optical flow for all frames. The transformation only consists of three parameters: dx, dy, da (angle). Basically, a rigid Euclidean transform: no scaling, no shearing.
- Accumulate the transformations to get the “trajectory” for x, y, angle, at each frame.
- Smooth out the trajectory using a sliding average window. The user defines the window radius, where the radius is the number of frames used for smoothing.
- Create a new transformation such that new_transformation = transformation + (smoothed_trajectory - trajectory).
- Apply the new transformation to the video.
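The trajectory, smoothing and correction steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code used for the videos: the function names (`accumulate`, `smooth`, `corrected_transforms`) and the toy input are made up for the example, and it assumes the smoothing radius counts frames on each side of the current one, with the window simply clipped at the start and end of the video.

```python
def accumulate(transforms):
    """Running sum of per-frame (dx, dy, da) tuples, i.e. the trajectory."""
    trajectory = []
    x = y = a = 0.0
    for dx, dy, da in transforms:
        x += dx
        y += dy
        a += da
        trajectory.append((x, y, a))
    return trajectory


def smooth(trajectory, radius):
    """Centered moving average; the window is clipped at both ends."""
    n = len(trajectory)
    smoothed = []
    for i in range(n):
        window = trajectory[max(0, i - radius):min(n, i + radius + 1)]
        count = len(window)
        smoothed.append(tuple(sum(p[k] for p in window) / count for k in range(3)))
    return smoothed


def corrected_transforms(transforms, radius):
    """new_transformation = transformation + (smoothed_trajectory - trajectory)."""
    trajectory = accumulate(transforms)
    smoothed = smooth(trajectory, radius)
    return [
        tuple(t[k] + (s[k] - c[k]) for k in range(3))
        for t, s, c in zip(transforms, smoothed, trajectory)
    ]


# Toy input: a steady 1 px/frame pan in x with alternating 0.5 px of shake.
shaky = [(1.0 + (0.5 if i % 2 == 0 else -0.5), 0.0, 0.0) for i in range(10)]
fixed = corrected_transforms(shaky, radius=2)
```

With a radius of 0 the smoothed trajectory equals the original one, so the transforms come back unchanged; larger radii pull each frame further toward the local average path.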
Here’s an example video of the algorithm in action using a smoothing radius of ±30 frames.
We can see what’s happening under the hood by plotting some graphs for each of the steps mentioned above on the example video.
Step 1
This graph shows the dx, dy transformation for previous to current frame, at each frame. I’ve omitted da (angle) because it’s not particularly interesting for this video since there is very little rotation. You can see it’s quite a bumpy graph, which correlates with our observation of the video being shaky, though still orders of magnitude better than Hollywood’s shaky cam effect. I’m looking at you Bourne Supremacy.
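For reference, here is a rough sketch of how the three parameters relate to a 2x3 transform matrix. It assumes the matrix is a pure rotation plus translation; `rigid_matrix` and `decompose` are hypothetical helper names used only for this illustration.

```python
import math


def rigid_matrix(dx, dy, da):
    """2x3 matrix for a rotation by da (radians) followed by a shift (dx, dy)."""
    c, s = math.cos(da), math.sin(da)
    return [[c, -s, dx],
            [s, c, dy]]


def decompose(T):
    """Recover (dx, dy, da) from a 2x3 rigid transform matrix."""
    dx = T[0][2]
    dy = T[1][2]
    da = math.atan2(T[1][0], T[0][0])
    return dx, dy, da


# Round trip: build a matrix from known parameters, then read them back.
T = rigid_matrix(3.0, -1.5, math.radians(2.0))
dx, dy, da = decompose(T)
```

Using atan2 on the first column keeps the angle well defined in every quadrant, which a plain arccos or arcsin would not.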
Steps 2 and 3
I’ve shown both the accumulated x and y, and their smoothed version so you get a better idea of what the smoothing is doing. The red is the original trajectory and the green is the smoothed trajectory.
It is worth noting that the trajectory is a rather abstract quantity that doesn’t necessarily have a direct relationship to the motion induced by the camera. For a simple panning scene with static objects it probably maps directly to the absolute position of the image, but for scenes with a forward-moving camera, e.g. on a car, it’s hard to see any such relationship.
The important thing is that the trajectory can be smoothed, even if it doesn’t have any physical interpretation.
Step 4
This is the final transformation applied to the video.
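To make concrete what applying a (dx, dy, da) transform does to a single pixel coordinate, here is a small sketch. `apply_rigid` is a hypothetical helper written for this example, and it treats the transform as a forward mapping from source to output coordinates.

```python
import math


def apply_rigid(point, dx, dy, da):
    """Rotate a point by da (radians) about the origin, then shift by (dx, dy)."""
    x, y = point
    c, s = math.cos(da), math.sin(da)
    return (c * x - s * y + dx, s * x + c * y + dy)


# A pure shift, and a pure quarter turn, applied to the same point.
moved = apply_rigid((10.0, 0.0), dx=2.0, dy=-1.0, da=0.0)
turned = apply_rigid((10.0, 0.0), dx=0.0, dy=0.0, da=math.pi / 2)
```

The rotation is about the image origin (the top-left corner) rather than the image centre, which is fine here because the angles involved are small.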
videostabKalman.cpp (live version by Chen Jia using a Kalman Filter)
You just need OpenCV 2.x or above.
Once compiled, run it from the command line via
./videostab input.avi
More videos
Footage I took during my travels.
#!/usr/bin/env python
import sys

import cv2
import numpy as np
import pandas as pd

if len(sys.argv) < 2:
    print("Usage: [input file]")
    sys.exit(1)

fin = sys.argv[1]
cap = cv2.VideoCapture(fin)
N = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

status, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
(h, w) = prev.shape[:2]

# Step 1: estimate the previous-to-current transform (dx, dy, da) for every frame
last_T = None
prev_to_cur_transform = []
for k in range(N - 1):
    status, cur = cap.read()
    if not status:
        break  # the reported frame count can be larger than what is readable
    cur_gray = cv2.cvtColor(cur, cv2.COLOR_BGR2GRAY)

    prev_corner = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01,
                                          minDistance=30.0, blockSize=3)
    cur_corner, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, prev_corner, None)

    # Keep only the corners that were tracked successfully
    prev_corner2 = []
    cur_corner2 = []
    for i, st in enumerate(status):
        if st == 1:
            prev_corner2.append(prev_corner[i])
            cur_corner2.append(cur_corner[i])
    prev_corner2 = np.array(prev_corner2)
    cur_corner2 = np.array(cur_corner2)

    # Note: estimateRigidTransform was removed in OpenCV 4;
    # cv2.estimateAffinePartial2D is the closest replacement there.
    T = cv2.estimateRigidTransform(prev_corner2, cur_corner2, False)
    if T is None:
        # No transform found: reuse the last good one (identity on the first frame)
        T = last_T if last_T is not None else np.eye(2, 3)
    last_T = T[:]

    dx = T[0, 2]
    dy = T[1, 2]
    da = np.arctan2(T[1, 0], T[0, 0])
    prev_to_cur_transform.append([dx, dy, da])

    prev = cur[:]
    prev_gray = cur_gray[:]

prev_to_cur_transform = np.array(prev_to_cur_transform)

# Step 2: accumulate the transforms to get the trajectory
trajectory = np.cumsum(prev_to_cur_transform, axis=0)

# Step 3: smooth the trajectory (pd.rolling_mean no longer exists in current pandas)
trajectory = pd.DataFrame(trajectory)
smoothed_trajectory = trajectory.rolling(window=30).mean()
smoothed_trajectory = smoothed_trajectory.bfill()

# Step 4: new transform = transform + (smoothed trajectory - trajectory)
new_prev_to_cur_transform = prev_to_cur_transform + (smoothed_trajectory - trajectory)
new_prev_to_cur_transform = np.array(new_prev_to_cur_transform)

# Step 5: apply the new transforms to the video
T = np.zeros((2, 3))
cap = cv2.VideoCapture(fin)
out = cv2.VideoWriter('out.avi', cv2.VideoWriter_fourcc('P', 'I', 'M', '1'), fps, (w, h), True)
for k in range(len(new_prev_to_cur_transform)):
    status, cur = cap.read()
    if not status:
        break
    T[0, 0] = np.cos(new_prev_to_cur_transform[k][2])
    T[0, 1] = -np.sin(new_prev_to_cur_transform[k][2])
    T[1, 0] = np.sin(new_prev_to_cur_transform[k][2])
    T[1, 1] = np.cos(new_prev_to_cur_transform[k][2])
    T[0, 2] = new_prev_to_cur_transform[k][0]
    T[1, 2] = new_prev_to_cur_transform[k][1]
    cur2 = cv2.warpAffine(cur, T, (w, h))
    out.write(cur2)

cap.release()
out.release()
In this line: cv::calcOpticalFlowPyrLK(prev_grey, cur_grey, prev_corner, cur_corner, status, err);
All elements in status are 0 meaning no match found, however the grey images seem fine. I had to convert them to CV_8U:
cur_grey.convertTo(cur_grey, CV_8U);
otherwise the program crashes with an assertion failure.
Any ideas?
Maybe you’re loading non-8-bit images?
Hello. Help me solve the problem, I need to determine whether a person is breathing on video. I want to plot the motion of points by comparing each frame to get a curve. Can I somehow use your code? If I will display your trajectory on a graph, can you assess the person’s breathing?
No. There was an academic project that did exactly this a few years ago, I can’t remember the authors though.
Thanks, it helped a lot in my project …! you are awesome
in my code i ended up with this namespace:
namespace vst {

const size_t SMOOTH_FRAMES(2);

typedef struct TRANSFORM {
    TRANSFORM(const double x, const double y, const double a);
    TRANSFORM& operator =(const TRANSFORM& source);
    TRANSFORM& operator +=(const TRANSFORM& source);
    TRANSFORM& operator -=(const TRANSFORM& source);
    bool operator ==(const TRANSFORM& source) const;
    bool isId() const;
    void dump(std::ostream& file) const;

    double x; // horizontal translation //
    double y; // vertical translation //
    double a; // anti-clockwise rotation //
} TRANSFORM;

TRANSFORM operator +(const TRANSFORM& left, const TRANSFORM& right);
TRANSFORM operator -(const TRANSFORM& left, const TRANSFORM& right);
TRANSFORM operator /(const TRANSFORM& left, const double right);
TRANSFORM operator *(const TRANSFORM& left, const double right);
TRANSFORM operator *(const double left, const TRANSFORM& right);

typedef std::vector<TRANSFORM> STABILIZATION;

void dump(std::ostream& file, const STABILIZATION& list);
void save(const std::string path, const STABILIZATION& list);
void load(const std::string path, STABILIZATION& list);
void vst(const std::string path, STABILIZATION& list, const bool autoLoad = true);

} // namespace vst //
