Last update: 10th May 2013
Fixed a mistake in handling reflection case.
Finding the optimal/best rotation and translation between two sets of corresponding 3D point data, so that they are aligned/registered, is a common problem I come across. An illustration of the problem is shown below for the simplest case of 3 corresponding points (the minimum required points to solve).
The corresponding points have the same colour, R is the rotation and t is the translation. We want to find the best rotation and translation that will align the points in dataset A to dataset B. Here, ‘optimal’ or ‘best’ is in terms of least square errors. This transformation is sometimes called the Euclidean or Rigid transform, because it preserves the shape and size. This is in contrast to an affine transform, which includes scaling and shearing.
This problem arises especially in tasks like 3D point cloud data registration, where the data is obtained from hardware like a 3D laser scanner or the popular Kinect device. The solution I’ll be presenting is from the paper ‘A Method for Registration of 3-D Shapes’, by Besl and McKay, 1992.
We’re solving for R,t in the equation:
B = R*A + t
Where R,t are the transforms applied to dataset A to align it with dataset B, as best as possible.
Finding the optimal rigid transformation matrix can be broken down into the following steps:
- Find the centroids of both dataset
- Bring both dataset to the origin then find the optimal rotation, (matrix R)
- Find the translation t
Finding the centroids
This bit is easy, the centroids are just the average point and can be calculated as follows:
Here, and are points in dataset A and B respectively. We will use these values in the next step.
Finding the optimal rotation
There are a few ways of finding optimal rotations between points. The easiest way I found is using Singular Value Decomposition (SVD), because it’s a function that is widely available in many programming languages (Matlab, Octave, C using LAPACK, C++ using OpenCV …). SVD is like this powerful magical wand in linear algebra for solving all sorts of numerical problems, but tragically wasn’t taught when I was at uni. I won’t go into details on how it works but rather how to use it. You only need to know that the SVD will decompose/factorise a matrix (call it E), into 3 other matrices, such that:
If E is a square matrix then U, S and V are the same size as well. We’re only interested in square matrices for this problem so I won’t go into detail about rectangular ones.
To find the optimal rotation we first re-centre both dataset so that both centroids are at the origin, like shown below.
H is the familiar covariance matrix.
At this point you may be thinking, “what the, that easy?!”, and indeed you would be right. One thing to be careful is that you calculate H correctly. It should end up being a 3×3 matrix, not a 1×1 matrix. Pay close attention to the transpose symbol. It’s doing a multiplication between 2 matrices where the dimensions effectively are, 3×1 and 1×3, respectively. The ordering of the multiplication is also important, doing it the other way will find a rotation from B to A instead.
Special reflection case
There’s a special case when finding the rotation matrix that you have to take care of. Sometimes the SVD will return a ‘reflection’ matrix, which is numerically correct but is actually nonsense in real life. This is addressed by checking the determinant of R (from SVD above) and seeing if it’s negative (-1). If it is then the 3rd column of V is multiplied by -1.
if determinant(R) < 0 multiply 3rd column of V by -1 recompute R end if
An alternative check that is possibly more robust was suggested by Nick Lambert
if determinant(R) < 0 multiply 3rd column of R by -1 end if
where R is the rotation matrix.
A big thank you goes to Klass Jan Russcher and Nick Lambert for these solutions.
The translation is
The centroids are 3×1 column vectors.
How did I get this? If you remember back, to transform A to B we first had to centre A to its origin. This is where the -centroid_A comes from, though I put the minus sign at the front. We then rotate A, hence R. Then finally translate it to dataset B’s origin, the + centroid_B bit.
And we’re done!
Note on usage
The solution presented can be used on any size dataset as long as there are at least 3 points. When there are more than 3 points a least square solution is obtained, such that the following error is minimised:
Here, the operator || || is the Euclidean distance between two vectors, a scalar value. err is just the square distance error between the points in the two dataset.
This script has been tested in Octave. It should work in Matlab but it has not been tested. I’ve also done a Python version (still learning the language). Both scripts come with an example on how to use.
rigid_transform_3D.py_ (rename the file to remove the trailing _, added to stop the webserver from parsing it)
Read the comments in the Octave file for usage.