NAR Demo 0.2.0

No sooner did I released NAR Demo 0.1.0 I wrote up a new version a few days later! Good thing I’m using a 3 decimal versioning system …

I was unhappy with the amount of jitter in the first released and decided to bring back some optical flow code I experimented in the past. NAR Demo 0.2.0 should now jitter far less and handle greater out of plane rotation, provided you start from an easy angle and rotate to the harder one.

Check out NAR Demo 0.2.0 here.

Posted in Technical | Leave a comment

NAR Demo out

Added NAR (Nghia’s Augmented Reality) Demo is out and can be found under the computer vision section.

Posted in Technical | Leave a comment

New markeless augmented reality demo coming soon!

I’ve been working a new markerless augmented reality demo and hope to release it Real Soon. It’s completely CPU code, unlike my first attempt which uses CUDA. So this should make it more accessible to everyone and it runs just as fast, if not faster. The new code is very different to my first demo, with many new improvements and features. To name a few:

  • Multi-threaded system
  • Uses a primitive version of difference of Gaussian features, similar to SIFT but not the same (making it patent free?!). I originally used FAST but found it a bit unreliably with a noisy webcam, which I sadly own.
  • More intuitive parameters for fine tuning. The previous demo had a few unintuitive parameters that I replaced.
  • Uses Irrlicht Engine for display. This allowed me to add some nifty game model, lighting, and animation to make the demo more interesting. Not only that, it’s cross platform!
  • Works with OpenCV 2.3.x
  • Clean up of code and bug fixes. Valgrind used to check for errors.

Posted in Technical | Leave a comment

Rotation invariance using Harris corners

This post on how to take advantage of the Harris corner to calculate dominant orientations for a feature patch, thus achieving rotation invariance. Some popular examples are the SIFT/SURF descriptors. I’ll present an alternative way that is simple to implement, especially if you’re already using Harris corners.

Background

The Harris corner detector is an old school feature detector that is still used today. Given an NxN pixel patch, and the horizontal/vertical derivatives extracted from it (via Sobel for example), it accumulates the following matrix

A = \frac{1}{N^2} \left[\begin{array}{cc} I_{x}^2 & I_{x}I_{y} \\ I_{x}I_{y} & I_{y}^2 \end{array} \right]

where

I_{x} is the summation of the derivatives in the x direction and I_{y} in the y direction for every pixel. The \frac{1}{N^2} averages the summation. Mathematically you don’t really need to do this, but in practice due to numerical overflow you want to keep the values in the matrix reasonably small. The 2×2 matrix above is the result of doing the following operation

Let B = \left[\begin{array}{cc} ix_{1} & iy_{1} \\ ix_{2} & iy_{2} \\ . & . \\ . & . \\ ix_{n} & iy_{n} \end{array} \right]

where ix_{n} and iy_{n} are the derivatives at pixel n.

Using B we get

A = B^{T}B

You can think of matrix A as the covariance matrix, and the values in B are assumed to be centred around zero.

Once the matrix A is calculated there are a handful of ways to calculate the corner response of the patch, which I won’t be discussing here.

Rotation invariance

With the matrix A, the orientation of the patch can be calculated using the fact that the eigenvectors of A can be directly converted to a rotation angle as follows (note: matrix are index as A(row,col) )

t = trace(A) = A(1,1) + A(2,2)

d = determinant(A) = A(1,1)*A(2,2) - A(1,2)*A(2,1)

eig1 = t/2 + \sqrt{t^{2}/4 - d}

angle = atan2\left(A(2,1), eig1 - A(2,2)\right)

eig1 is the larger of the two eigenvalues, which corresponds to the eigenvector

v = \left[\begin{array}{c} eig1 - d \\ A(2,1) \end{array} \right]

The eigenvalue/eigenvector was calculated using an algebraic formula I found here.

I’ve found in practice that the above equation results in an uncertainty in the angle, giving two possibilities

angle1 = angle

angle2 = angle + 180 degrees

I believe this is because an eigenvector can point in two directions, both of which are correct. If v is an eigenvector then -v is legit as well. A negation means a 180 degree rotation. So there are in fact two dominant orientations for the patch. So how do we resolve this? We don’t, keep them both!

Example

Here’s an example of a small 64×64 patch rotated from 0 to 360, every 45 degrees. The top row is the rotated patch, the second row is the rotation corrected patch using angle1 and the third row using angle2. The numbers at the top of the second/third rows are the angles in degrees of the rotated patches. You can see there are in fact only two possible appearances for the patch after it has been rotated using the dominant orientation.

Interestingly, the orientation angles seem to have a small error. For example, compare patch 1 and patch 5 (counting from the left). Patch 1 and patch 5 differ by 180 degrees, yet the orientations are 46 and 49 degrees respectively, a 3 degree difference. I think this might be due to the bilinear interpolation when I was using the imrotate function in Octave. I’ve tried using an odd size patch eg. 63×63, thinking it might be a centring issue when rotating but still the same results. For now it’s not such a big deal.

Rotation invariance using Harris corners

Implementation notes

I used a standard 3×3 Sobel filter to get the pixel derivatives. When accumulating the A matrix, I only use pixels within the largest circle (radius 32) enclosed by the 64×64 patch, instead of all pixels. This makes the orientation more accurate, since the corners sometime appear off the image when they are rotated.

Code

Here is the Octave script and image patch used to generate the image above (minus the annotation). Right click and save as to download.

rotation_invariance.m

patch.png

Posted in Technical | Leave a comment

Approximating a Gaussian using a box filter

I came across this topic while searching for a fast Gaussian implementation. It’s approximating a Gaussian by using multiple box filters. It’s old stuff but cool nonetheless. I couldn’t find a reference showing it in action visually that I liked so I decided to whip one up quickly.

The idea is pretty simple, blur the image multiple times using a box filter and it will approximate a Gaussian blur. The box filter convolution mask in 1D looks something like [1 1 1 1] * 0.25 , depending how large you want the blurring mask to be. Basically it just calculates the average value inside the mask.

Alright enough yip yapping, lets see it in action! Below shows 6 graphs. The first one labelled ‘filter’ is the box filter used. It is a box 19 units wide, with height 1/19. Subsequent graphs are the result of recursively convolving the box filter with itself. The blue graph is the result of the convolution, while the green is the best Gaussian fit for the data.

Gaussian approximation using box filter

Gaussian approximation using box filter

After the 1st iteration the plot starts to look like a Gaussian very quickly. This link from Wikipedia says 3 iterations will approximate a Gaussian to within roughly 3%. It also gives a nice rule of thumb for calculating the length of the box based on the desired standard deviation.

What’s even cooler is that this works with ANY filter, provided all the values are positive! The graph below shows convolving with a filter made up of random positive values.

Gaussian approximation using a random filter

Gaussian approximation using a random filter

The graph starts to smooth out after the 3rd iteration.

Another example, but with an image instead of a 1D signal.

For the theory behind why this all works have a search for the Central Limit Theorem. The Wikipedia article is a horrible read if you’re not a maths geek. Instead, I recommend watching this Khan Academy video instead to get a good idea.

The advantage of the box filter is that it can be implemented very efficiently for 2D blurring by first separating the mask into 1D horizontal/vertical mask and re-using calculated values. This post on Stackoverflow has a good discussion on the topic.

Code

You can download my Octave script to replicate the results above or fiddle around with. Download by right clicking and saving the file.

box_demo.m
box_demo2.m

Posted in Technical | 2 Comments

Quick and easy connected component (blob) using OpenCV

OpenCV is up to version 2.3.1 and it still lacks a basic connected component function for binary blob analysis. A quick Google will bring up cvBlobsLib (http://opencv.willowgarage.com/wiki/cvBlobsLib), an external lib that uses OpenCV calls. At the time when I needed such functionality I wasn’t too keen on linking to more libraries for something so basic. So instead I quickly wrote my own version using existing OpenCV calls. It uses cv:floodFill with 4 connected neighbours. Here is the code

void FindBlobs(const cv::Mat &binary, vector < vector<cv::Point2i>  > &blobs)
{
    blobs.clear();
 
    // Fill the label_image with the blobs
    // 0  - background
    // 1  - unlabelled foreground
    // 2+ - labelled foreground
 
    cv::Mat label_image;
    binary.convertTo(label_image, CV_32FC1); // weird it doesn't support CV_32S!
 
    int label_count = 2; // starts at 2 because 0,1 are used already
 
    for(int y=0; y < binary.rows; y++) {
        for(int x=0; x < binary.cols; x++) {
            if((int)label_image.at(y,x) != 1) {
                continue;
            }
 
            cv::Rect rect;
            cv::floodFill(label_image, cv::Point(x,y), cv::Scalar(label_count), 
                          &rect, cv::Scalar(0), cv::Scalar(0), 4);
 
            std::vector   blob;
 
            for(int i=rect.y; i < (rect.y+rect.height); i++) {
                for(int j=rect.x; j < (rect.x+rect.width); j++) {
                    if((int)label_image.at(i,j) != label_count) {
                        continue;
                    }
 
                    blob.push_back(cv::Point2i(j,i));
                }
            }
 
            blobs.push_back(blob);
 
            label_count++;
        }
    }
}

Example usage

Here is an example that reads in a binary image (0 and 255) and outputs randomly coloured label blobs.

int main(int argc, char **argv)
{
    cv::Mat img = cv::imread("blob.png", 0); // force greyscale
 
    if(!img.data) {
        cout << "File not found" << endl;
        return -1;
    }
 
    cv::namedWindow("binary");
    cv::namedWindow("labelled");
 
    cv::Mat output = cv::Mat::zeros(img.size(), CV_8UC3);
 
    cv::Mat binary;
    vector < vector > blobs;
 
    cv::threshold(img, binary, 0.0, 1.0, cv::THRESH_BINARY);
 
    FindBlobs(binary, blobs);
 
    // Randomy color the blobs
    for(size_t i=0; i < blobs.size(); i++) {
        unsigned char r = 255 * (rand()/(1.0 + RAND_MAX));
        unsigned char g = 255 * (rand()/(1.0 + RAND_MAX));
        unsigned char b = 255 * (rand()/(1.0 + RAND_MAX));
 
        for(size_t j=0; j < blobs[i].size(); j++) {
            int x = blobs[i][j].x;
            int y = blobs[i][j].y;
 
            output.at(y,x)[0] = b;
            output.at(y,x)[1] = g;
            output.at(y,x)[2] = r;
        }
    }
 
    cv::imshow("binary", img);
    cv::imshow("labelled", output);
    cv::waitKey(0);
 
    return 0;
}

The results …

Posted in Technical | 1 Comment

Generating all possible views of a planar object

So it’s Boxing Day and I haven’t got much on, so why not another blog post! yay!

Today’s post is about generating synthetic views of planar objects, such as a book. I needed to do whilst implementing my own version of the Fern algorithm. Here are some references, for your reference …

  • M. Ozuysal, M. Calonder, V. Lepetit and P. Fua, Fast Keypoint Recognition using Random Ferns, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, Nr. 3, pp. 448 – 461, March 2010.
  • M. Ozuysal, P. Fua and V. Lepetit, Fast Keypoint Recognition in Ten Lines of Code, Conference on Computer Vision and Pattern Recognition, Minneapolis, MI, June 2007.

Also check out their website here.

In summary it’s a cool technique for feature matching that is rotation, scale, lighting, and affine viewpoint invariant, much like SIFT, but does it in a simpler way. Fern does this by generating lots of random synthetic views of the planar object and learns (semi-naive Bayes) the features extracted at each view. Because it has seen virtually every view possible, the feature descriptor can be very simple and does not need to be invariant to all the properties mentioned earlier. In fact, the feature descriptor is made up of random binary comparisons. This is in contrast to the more complex SIFT descriptor that has to cater for all sorts of invariance.

I wrote a non-random planar view generator, which is much easier to interpret from a geometric point of view. I find the idea of random affine transformations tricky to interpret, since they’re a combination of translation/rotation/scalings/shearing. My version treats the planar object as a flat piece of paper in 3D space (with z=0), applies a 3×3 rotation matrix (parameterise by yaw/pitch/roll), then re-projects to 2D using an orthographic projection/affine camera (by keeping x,y and ignoring z). I do this process for all combinations of scaling, yaw, pitch, roll I’m interested in.

I use the following settings for Fern

  • scaling of 1.0, 0.5, 0.25
  • yaw, 0 to 60 degrees, increments of 10 degrees
  • pitch, 0 to 60 degrees, increments of 10 degrees
  • roll, 0 to 360 degrees, increments of 10 degrees

There’s not much point going beyond 60 degrees for yaw/pitch, you can hardly see the object.

Here are some example outputs

Download

You can download my demo code here SyntheticPlanarView.tar.gz

You will need OpenCV 2.x installed, and compile using the included CodeBlocks project or an IDE of your choice. Run it on the command line with an image as the argument. Hit any key to step through all the view generated.

Posted in Technical | Leave a comment

Added tutorial on PCA for data visualisation

I added a new tutorial on using PCA for data visualisation, explaining it from an intuitively (hopefully) geometric point of view, instead of just a bunch of maths equation all over the place. It can be found in the Notes section.

Posted in Technical | Leave a comment

K-Means for Octave

Googling for K-Means for Octave brought me to this Matlab/Octave script http://www.christianherta.de/kmeans.html by Christian Herta. Works great, except that it ran very slow. Here is my improved version to run faster, most of the expensive for loops have been replaced with faster ones. The new version runs orders of magnitude faster.

function[centroid, pointsInCluster, assignment]= myKmeans(data, nbCluster)
% usage
% function[centroid, pointsInCluster, assignment]=
% myKmeans(data, nbCluster)
%
% Output:
% centroid: matrix in each row are the Coordinates of a centroid
% pointsInCluster: row vector with the nbDatapoints belonging to
% the centroid
% assignment: row Vector with clusterAssignment of the dataRows
%
% Input:
% data in rows
% nbCluster : nb of centroids to determine
%
% (c) by Christian Herta ( www.christianherta.de )
% Modified by Nghia Ho to improve speed
 
data_dim = length(data(1,:));
nbData   = length(data(:,1));
 
 
% init the centroids randomly
data_min = min(data);
data_max = max(data);
data_diff = data_max .- data_min ;
 
% every row is a centroid
centroid = rand(nbCluster, data_dim);
centroid = centroid .* repmat(data_diff, nbCluster, 1) + repmat(data_min, nbCluster, 1);
 
% no stopping at start
pos_diff = 1.;
 
% main loop until
 
while pos_diff > 0.0
  % E-Step
  assignment = [];
 
  % assign each datapoint to the closest centroid
 
  if(nbCluster == 1) % special case
	assignment = ones(size(data,1), 1);
  else
	  dists = [];
	  for c = 1: nbCluster
		d = data - repmat(centroid(c,:), size(data,1), 1);
		d = d .* d;
		d = sum(d, 2); % sum the row values
 
		dists = [dists d];
	  end
 
	  [a, assignment] = min(dists');
 
	  assignment = assignment';
  end
 
 
  % for the stoppingCriterion
  oldPositions = centroid;
 
  % M-Step
  % recalculate the positions of the centroids
  centroid = zeros(nbCluster, data_dim);
  pointsInCluster = zeros(nbCluster, 1);
 
  for c = 1: nbCluster
	indexes = find(assignment == c);
	d = data(indexes,:);
	centroid(c,:) = sum(d);
	pointsInCluster(c,1) = length(d);
 
    if( pointsInCluster(c, 1) != 0)
      centroid( c , : ) = centroid( c, : ) / pointsInCluster(c, 1);
    else
      % set cluster randomly to new position
      centroid( c , : ) = (rand( 1, data_dim) .* data_diff) + data_min;
    end
  end
 
  %stoppingCriterion
  pos_diff = sum (sum( (centroid .- oldPositions).^2 ) );
end
end
Posted in Technical | Leave a comment

Fast approximate arctan/atan function

While searching for a fast arctan approximation I came across this paper:

Efficient approximations for the arctangent function”, Rajan, S. Sichun Wang Inkol, R. Joyal, A., May 2006

Unfortunately I no longer have access to the IEEE papers (despite paying for yearly membership, what a joke …), but fortunately the paper appeared in a book that Google has for preview (for selected pages), “Streamlining digital signal processing: a tricks of the trade guidebook”. Even luckier, Google had the important pages on preview. The paper presents 7 different approximation, each with varying degree of accuracy and complexity.

Here is one algorithm I tried, which has a reported maximum error 0.0015 radians (0.085944 degrees), lowest error in the paper.

double FastArcTan(double x)
{
    return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
}

The valid range for x is between -1 and 1. Comparing the above with the standard C atan function for 1,000,000 calls using GCC gives:

Time ms
FastArcTan 17.315
Standard C atan 60.708

About 3x times faster, pretty good!

Posted in Technical | Leave a comment