RunSFM 1.4.2 released

No major update in this release, just one bug fixed in my modified PMVS and a minor clean up in RunSFM.sh.

Posted in Technical | Leave a comment

Arduino Uno R3 + MaxSonar EZ3 calibration

I got my Arduino Uno R3 board this week and simply love how easy it is to use! My first muck-around project used a MaxSonar EZ3 sonar sensor to display range readings on a serial LED segment display.

The MaxSonar’s output of (Vcc/512) per inch was not as accurate as I’d like it to be (I assumed Vcc = 5V). I decided to calibrate it by measuring a few distances with a tape measure and recording the raw reading from the 10-bit analog pin at each one, hoping to fit a straight line. Plotting this data gave the following graph

As I hoped, it’s a nice linear relationship. I’ve only collected data up to 120cm; ideally you would do it up to the range you’re interested in working to. Fitting a y = mx + c line gave the following values

m = 1.2275

c = 8.8524
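For anyone wanting to redo the fit on their own measurements, here is a minimal sketch of an ordinary least-squares line fit in plain C++ (desktop, not Arduino). This is only one way to get m and c and not necessarily how the values above were produced; the names Line and fitLine are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Slope and intercept of the line y = m*x + c.
struct Line {
    double m;
    double c;
};

// Ordinary least-squares fit of y = m*x + c.
// x: raw analog readings, y: tape-measure distances in cm.
// Assumes x and y have the same length and at least two distinct x values.
Line fitLine(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    double sumX = 0.0, sumY = 0.0, sumXX = 0.0, sumXY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sumX += x[i];
        sumY += y[i];
        sumXX += x[i] * x[i];
        sumXY += x[i] * y[i];
    }
    // Closed-form solution of the normal equations for a straight line.
    const double m = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    const double c = (sumY - m * sumX) / n;
    return {m, c};
}
```

Feed it the (analog reading, centimetres) pairs and it returns the slope and intercept to plug into the conversion formula.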

So now I have a nice formula for converting the analog reading to range in centimetres. In practice I use fixed point integer maths.

centimetres = 1.2275*analog_reading + 8.8524

The full sketch is

void setup()
{
  Serial.begin(9600); 
}
 
int init_loop = 0;
 
void loop()
{  
  // NOTE to myself: Arduino Uno int is 16 bit
  const int sonarPin = A0;
 
  int dist = analogRead(sonarPin);
 
  // linear equation relating distance vs analog reading
  // y = mx + c
  // m = 1.2275
  // c = 8.8524
 
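  // fixed precision scaling by 100
  // done in a 32 bit long: dist*123 overflows a 16 bit int once dist goes above 259
  long scaled = (long)dist*123 + 885;
  dist = scaled / 100;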
 
  // output the numbers
  int thousands = dist / 1000;
  int hundreds = (dist - thousands*1000)/100;
  int tens = (dist - thousands*1000 - hundreds*100)/10;
  int ones = dist - thousands*1000 - hundreds*100 - tens*10;
 
  // Do a few times to make sure the segment display receives the data ...
  if(init_loop < 5) {
    // Clear the dots
    // Serial.write sends the raw byte; Serial.print(0x77) would send the text "119"
    Serial.write((uint8_t)0x77);
    Serial.write((uint8_t)0x00);
 
    // Set brightness
    Serial.print("z");
    Serial.write((uint8_t)0x00); // maximum brightness
 
     init_loop++;
  }
 
  // reset position, needed in case the serial transmission stuffs up
  Serial.print("v");
 
  // if thousand digit is zero don't display anything
  if(thousands == 0) {
    Serial.print("x"); // blank
  }
  else {
    Serial.print(thousands);  
  }
 
  // if thousand digit and hundred digit are zero don't display anything
  if(thousands == 0 && hundreds == 0) {
    Serial.print("x");
  }
  else {
    Serial.print(hundreds);  
  }
 
  // always display the tens and ones
  Serial.print(tens);  
  Serial.print(ones);
 
  delay(100);
}

To get the raw analog reading, comment out the bit where it calculates the distance using the linear equation.

Below is a picture of the hardware setup. The MaxSonar is pointing up at the ceiling, which I measured with a tape measure to be about 219cm. The readout from the Arduino is 216cm, so it’s not bad despite calibrating up to 120cm only.

Quirks and issues

It took me a while to realise an int is 16bit on the Arduino! (fail …) I was banging my head wondering why some simple integer maths was failing. I have to be conscious of my scaling factors when doing fixed point maths.
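A quick way to see where 16 bit maths runs out with a scaling factor of 100 is to work out the largest analog reading that still fits. The snippet below is a rough desktop C++ sketch with hypothetical function names; it uses the fixed-width types from <cstdint> so the sizes are explicit (on the Uno, an int is 16 bits and a long is 32 bits).

```cpp
#include <cstdint>

// Calibration constants scaled by 100 (m = 1.2275, c = 8.8524, rounded).
const int32_t kSlopeX100 = 123;
const int32_t kInterceptX100 = 885;

// Largest analog reading for which reading*123 + 885 still fits
// in a signed 16 bit integer (maximum value 32767).
int32_t maxReadingFor16Bit() {
    return (INT16_MAX - kInterceptX100) / kSlopeX100;
}

// The same conversion carried out in 32 bits, which is safe for the
// whole 0..1023 range of the 10-bit analog pin.
int32_t toCentimetres(int32_t reading) {
    return (reading * kSlopeX100 + kInterceptX100) / 100;
}
```

With these constants the 16 bit limit is a reading of 259, which converts to 327cm. Anything further away needs either a smaller scaling factor or a wider integer type.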

After compiling and uploading code to the Arduino I found myself having to unplug and plug it back in for the serial LED segment display to function correctly. The upload process seems to leave the display in an unusable state.


Multivariate decision tree (decision tree + linear classifier)

I’ve been mucking around with a multivariate decision tree for the past few days in Octave, mostly out of curiosity. It’s basically a decision tree that uses a hyperplane to split the data at each node, instead of splitting on a single attribute. For example, say we have 2D data (x,y), a node in a standard decision tree might look like

if x > 5

whereas a multivariate node might be

if \theta_{1}x + \theta_{2}y + \theta_{3} > 0

The idea seems to have been around since the early 90s but for some reason you don’t hear about them nowadays. Maybe for good reason? Either way, still an interesting idea. My implementation uses RANSAC to find a good hyperplane to split the data and uses the ‘Information Gain’/Entropy formula to measure the “goodness” of the hyperplane.

Below are two simple synthetic test datasets that a decision tree and a linear classifier might each have trouble with on their own, but when combined together they perform quite well. The green lines are the individual linear decision boundaries at each node and the red is the final boundary. The green samples are labelled “positive” and blue is “negative”. The accuracy of the classification is shown in the title of the graph.
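To make the “goodness” measure concrete, here is a small C++ sketch of how the information gain of one candidate hyperplane could be scored for 2D, two-class data. It is an illustration under my own assumptions, not the Octave implementation from the download; Sample, entropy and informationGain are hypothetical names, and the RANSAC step that proposes the candidate thetas is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Sample {
    double x;
    double y;
    bool positive;  // class label
};

// Binary entropy in bits of a set with `pos` positive samples out of `total`.
double entropy(std::size_t pos, std::size_t total) {
    if (total == 0 || pos == 0 || pos == total) return 0.0;
    const double p = static_cast<double>(pos) / static_cast<double>(total);
    return -p * std::log2(p) - (1.0 - p) * std::log2(1.0 - p);
}

// Information gain of splitting `data` with theta1*x + theta2*y + theta3 > 0.
double informationGain(const std::vector<Sample>& data,
                       double theta1, double theta2, double theta3) {
    if (data.empty()) return 0.0;
    std::size_t aboveTotal = 0, abovePos = 0, belowTotal = 0, belowPos = 0;
    for (const Sample& s : data) {
        if (theta1 * s.x + theta2 * s.y + theta3 > 0) {
            ++aboveTotal;
            if (s.positive) ++abovePos;
        } else {
            ++belowTotal;
            if (s.positive) ++belowPos;
        }
    }
    const double n = static_cast<double>(data.size());
    const double parent = entropy(abovePos + belowPos, data.size());
    // Parent entropy minus the size-weighted entropy of the two children.
    return parent - (aboveTotal / n) * entropy(abovePos, aboveTotal)
                  - (belowTotal / n) * entropy(belowPos, belowTotal);
}
```

A search would then keep whichever candidate hyperplane scores the highest gain at each node.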


The first dataset above cannot be separated using a single linear decision boundary, whereas a standard decision tree will probably zig-zag along the diagonal boundary, producing a bigger tree than necessary. The multivariate decision tree separates the data in 3 cuts. That is close to what I would consider the best, which would be 2 cuts along the diagonal boundaries. This is interesting, because it seems to suggest that to get the best decision boundary a sub-optimal cut might be required at some stage! I wonder if there’s a way to re-visit the boundary lines and simplify them …

The second dataset consists of positive samples shaped in a circle enclosed by negative samples. Pretty typical synthetic dataset. Again, no single linear decision boundary will separate this. That said, a standard decision tree given only 4 cuts would probably produce a similar result to what I got.

Download

Download the Octave script. You might need to right click and save as.

MultivariateDecisionTree.7z

Extract and call Run.m (that’s a capital R, not ‘run’)!


NAR Demo 0.2.1 released

Minor bug fixed in DOG.cpp.

Added a PDF documenting the computer vision aspect in NAR. Still a work in progress.


RunSFM 1.4.1

I’ve updated RunSFM to address compiling issues on Ubuntu 11.10. Mostly small issues like missing header files for standard C++ functions.


NAR Demo 0.2.0

No sooner had I released NAR Demo 0.1.0 than I wrote up a new version a few days later! Good thing I’m using a 3 decimal versioning system …

I was unhappy with the amount of jitter in the first release and decided to bring back some optical flow code I experimented with in the past. NAR Demo 0.2.0 should now jitter far less and handle greater out of plane rotation, provided you start from an easy angle and rotate to the harder one.

Check out NAR Demo 0.2.0 here.


NAR Demo out

NAR (Nghia’s Augmented Reality) Demo is out and can be found under the computer vision section.


New markerless augmented reality demo coming soon!

I’ve been working on a new markerless augmented reality demo and hope to release it Real Soon. It’s completely CPU code, unlike my first attempt which uses CUDA. So this should make it more accessible to everyone and it runs just as fast, if not faster. The new code is very different to my first demo, with many new improvements and features. To name a few:

  • Multi-threaded system
  • Uses a primitive version of difference of Gaussian features, similar to SIFT but not the same (making it patent free?!). I originally used FAST but found it a bit unreliable with a noisy webcam, which I sadly own.
  • More intuitive parameters for fine tuning. The previous demo had a few unintuitive parameters that I replaced.
  • Uses Irrlicht Engine for display. This allowed me to add some nifty game model, lighting, and animation to make the demo more interesting. Not only that, it’s cross platform!
  • Works with OpenCV 2.3.x
  • Clean up of code and bug fixes. Valgrind used to check for errors.


Rotation invariance using Harris corners

This post is about how to take advantage of the Harris corner detector to calculate dominant orientations for a feature patch, thus achieving rotation invariance. Some popular examples of rotation invariant features are the SIFT/SURF descriptors. I’ll present an alternative way that is simple to implement, especially if you’re already using Harris corners.

Background

The Harris corner detector is an old school feature detector that is still used today. Given an NxN pixel patch, and the horizontal/vertical derivatives extracted from it (via Sobel for example), it accumulates the following matrix

A = \frac{1}{N^2} \left[\begin{array}{cc} I_{x}^2 & I_{x}I_{y} \\ I_{x}I_{y} & I_{y}^2 \end{array} \right]

where

I_{x}^2 is the sum over every pixel in the patch of the squared x derivative, I_{y}^2 is the same for the y derivative, and I_{x}I_{y} is the sum of their products. The \frac{1}{N^2} averages the summation. Mathematically you don’t really need to do this, but in practice you want to keep the values in the matrix reasonably small to avoid numerical overflow. Ignoring the averaging factor, the 2×2 matrix above is the result of doing the following operation

Let B = \left[\begin{array}{cc} ix_{1} & iy_{1} \\ ix_{2} & iy_{2} \\ . & . \\ . & . \\ ix_{n} & iy_{n} \end{array} \right]

where ix_{n} and iy_{n} are the derivatives at pixel n.

Using B we get

A = B^{T}B

You can think of matrix A as the covariance matrix, and the values in B are assumed to be centred around zero.
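As a rough illustration of the accumulation, here is a C++ sketch that builds the three unique entries of A from per-pixel derivatives. It assumes the derivatives have already been computed (for example with a Sobel filter) and flattened into two equal-length arrays; Structure2x2 and accumulate are hypothetical names, not taken from the Octave script.

```cpp
#include <cstddef>
#include <vector>

// Symmetric 2x2 matrix A; a12 doubles as A(2,1).
struct Structure2x2 {
    double a11;  // average of ix*ix
    double a12;  // average of ix*iy
    double a22;  // average of iy*iy
};

// ix, iy: derivatives at each pixel of one patch, same length
// (N*N entries for an NxN patch).
Structure2x2 accumulate(const std::vector<double>& ix,
                        const std::vector<double>& iy) {
    Structure2x2 a = {0.0, 0.0, 0.0};
    if (ix.empty()) return a;
    for (std::size_t i = 0; i < ix.size(); ++i) {
        a.a11 += ix[i] * ix[i];
        a.a12 += ix[i] * iy[i];
        a.a22 += iy[i] * iy[i];
    }
    // Divide by the pixel count to keep the values reasonably small.
    const double n = static_cast<double>(ix.size());
    a.a11 /= n;
    a.a12 /= n;
    a.a22 /= n;
    return a;
}
```

This is the same thing as forming B^{T}B and dividing by the number of rows in B, just without storing B.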

Once the matrix A is calculated there are a handful of ways to calculate the corner response of the patch, which I won’t be discussing here.

Rotation invariance

With the matrix A, the orientation of the patch can be calculated using the fact that the eigenvectors of A can be directly converted to a rotation angle as follows (note: matrices are indexed as A(row,col) )

t = trace(A) = A(1,1) + A(2,2)

d = determinant(A) = A(1,1)*A(2,2) - A(1,2)*A(2,1)

eig1 = t/2 + \sqrt{t^{2}/4 - d}

angle = atan2\left(A(2,1), eig1 - A(2,2)\right)

eig1 is the larger of the two eigenvalues, which corresponds to the eigenvector

v = \left[\begin{array}{c} eig1 - A(2,2) \\ A(2,1) \end{array} \right]

The eigenvalue/eigenvector was calculated using an algebraic formula I found here.

I’ve found in practice that the above equation results in an uncertainty in the angle, giving two possibilities

angle1 = angle

angle2 = angle + 180 degrees

I believe this is because an eigenvector can point in two directions, both of which are correct. If v is an eigenvector then -v is legit as well. A negation means a 180 degree rotation. So there are in fact two dominant orientations for the patch. So how do we resolve this? We don’t, keep them both!
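The formulas above translate almost line for line into code. Below is a minimal C++ sketch; dominantAngle and oppositeAngle are hypothetical names, angles are in radians, and the special case for a zero off-diagonal term is my own addition, since the eigenvector formula collapses to (0,0) when A(2,1) is zero and A(2,2) is the larger diagonal entry.

```cpp
#include <cmath>

const double kPi = 3.14159265358979323846;

// Dominant orientation in radians of a patch, given the symmetric
// matrix A = [a11 a12; a12 a22].
double dominantAngle(double a11, double a12, double a22) {
    // Added guard, not part of the original formulas: with no
    // off-diagonal term the axes are already the eigenvectors.
    if (a12 == 0.0) return (a11 >= a22) ? 0.0 : kPi / 2.0;

    const double t = a11 + a22;              // trace
    const double d = a11 * a22 - a12 * a12;  // determinant
    double disc = t * t / 4.0 - d;
    if (disc < 0.0) disc = 0.0;              // absorb rounding error
    const double eig1 = t / 2.0 + std::sqrt(disc);
    return std::atan2(a12, eig1 - a22);
}

// The second candidate orientation, 180 degrees away from the first.
double oppositeAngle(double angle) {
    return angle + kPi;
}
```

For a descriptor you would then extract the patch twice, once for each of the two angles.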

Example

Here’s an example of a small 64×64 patch rotated from 0 to 360, every 45 degrees. The top row is the rotated patch, the second row is the rotation corrected patch using angle1 and the third row using angle2. The numbers at the top of the second/third rows are the angles in degrees of the rotated patches. You can see there are in fact only two possible appearances for the patch after it has been rotated using the dominant orientation.

Interestingly, the orientation angles seem to have a small error. For example, compare patch 1 and patch 5 (counting from the left). Patch 1 and patch 5 differ by 180 degrees, yet the orientations are 46 and 49 degrees respectively, a 3 degree difference. I think this might be due to the bilinear interpolation when I was using the imrotate function in Octave. I’ve tried using an odd size patch eg. 63×63, thinking it might be a centring issue when rotating but still the same results. For now it’s not such a big deal.

Rotation invariance using Harris corners

Implementation notes

I used a standard 3×3 Sobel filter to get the pixel derivatives. When accumulating the A matrix, I only use pixels within the largest circle (radius 32) enclosed by the 64×64 patch, instead of all pixels. This makes the orientation more accurate, since the corners sometimes fall outside the image when the patch is rotated.

Code

Here is the Octave script and image patch used to generate the image above (minus the annotation). Right click and save as to download.

rotation_invariance.m

patch.png


Approximating a Gaussian using a box filter

I came across this topic while searching for a fast Gaussian implementation. It’s approximating a Gaussian by using multiple box filters. It’s old stuff but cool nonetheless. I couldn’t find a reference showing it in action visually that I liked so I decided to whip one up quickly.

The idea is pretty simple: blur the image multiple times using a box filter and it will approximate a Gaussian blur. The box filter convolution mask in 1D looks something like [1 1 1 1] * 0.25, depending on how large you want the blurring mask to be. Basically it just calculates the average value inside the mask.

Alright, enough yip yapping, let’s see it in action! Below are 6 graphs. The first one, labelled ‘filter’, is the box filter used. It is a box 19 units wide, with height 1/19. Subsequent graphs are the result of repeatedly convolving the box filter with itself. The blue graph is the result of the convolution, while the green is the best Gaussian fit for the data.
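If you want to reproduce the curves without Octave, the experiment boils down to a plain discrete convolution in a loop. Here is a small C++ sketch; convolve and repeatedBox are illustrative names, and the Gaussian fitting shown in green is not included.

```cpp
#include <cstddef>
#include <vector>

// Full discrete convolution of two 1D signals.
// The output has a.size() + b.size() - 1 samples.
std::vector<double> convolve(const std::vector<double>& a,
                             const std::vector<double>& b) {
    if (a.empty() || b.empty()) return {};
    std::vector<double> out(a.size() + b.size() - 1, 0.0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            out[i + j] += a[i] * b[j];
        }
    }
    return out;
}

// A box filter of the given width (height 1/width) convolved with
// itself `iterations` times. Zero iterations returns the box itself.
std::vector<double> repeatedBox(std::size_t width, int iterations) {
    const std::vector<double> box(width, 1.0 / static_cast<double>(width));
    std::vector<double> result = box;
    for (int i = 0; i < iterations; ++i) {
        result = convolve(result, box);
    }
    return result;
}
```

Calling repeatedBox(19, k) for increasing k gives curves you can plot against a Gaussian. Each iteration widens the result by 18 samples, and the values always sum to 1.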

Gaussian approximation using box filter


After the 1st iteration the plot starts to look like a Gaussian very quickly. This link from Wikipedia says 3 iterations will approximate a Gaussian to within roughly 3%. It also gives a nice rule of thumb for calculating the length of the box based on the desired standard deviation.
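One relationship that is easy to derive yourself (and is not necessarily the rule of thumb from the linked page): a discrete box of width w has variance (w^2 - 1)/12, and variances add under convolution, so n box passes behave like a Gaussian with standard deviation sqrt(n(w^2 - 1)/12). A tiny C++ helper, with the hypothetical name sigmaAfterPasses, makes that concrete.

```cpp
#include <cmath>

// Standard deviation, in samples, of the filter you get by applying
// a box filter of `width` taps `passes` times in total.
double sigmaAfterPasses(int width, int passes) {
    // Variance of a discrete uniform distribution over `width` taps.
    const double boxVariance =
        (static_cast<double>(width) * width - 1.0) / 12.0;
    // Variances add each time the box is convolved in again.
    return std::sqrt(passes * boxVariance);
}
```

For example, three passes of the 19 wide box used above give a standard deviation of roughly 9.5 samples.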

What’s even cooler is that this works with ANY filter, provided all the values are positive! The graph below shows convolving with a filter made up of random positive values.

Gaussian approximation using a random filter


The graph starts to smooth out after the 3rd iteration.

Another example, but with an image instead of a 1D signal.

For the theory behind why this all works, have a search for the Central Limit Theorem. The Wikipedia article is a horrible read if you’re not a maths geek. I recommend watching this Khan Academy video instead to get a good idea.

The advantage of the box filter is that it can be implemented very efficiently for 2D blurring by first separating the mask into 1D horizontal/vertical mask and re-using calculated values. This post on Stackoverflow has a good discussion on the topic.
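Here is a rough C++ sketch of the “re-using calculated values” part for one row of pixels: a sliding sum that adds the sample entering the window and subtracts the one leaving it, so the cost per sample does not depend on the box width. The name boxBlur1D is hypothetical, and repeating the edge value outside the signal is just one possible border policy that I picked for the example.

```cpp
#include <cstddef>
#include <vector>

// 1D box blur of width 2*radius + 1 using a sliding sum.
// Samples outside the signal are taken to be the nearest edge value
// (an assumption made for this sketch).
std::vector<double> boxBlur1D(const std::vector<double>& in, int radius) {
    const int n = static_cast<int>(in.size());
    std::vector<double> out(in.size());
    if (n == 0) return out;

    // Clamped read access for indices that fall off either end.
    auto at = [&](int i) -> double {
        const int clamped = i < 0 ? 0 : (i >= n ? n - 1 : i);
        return in[static_cast<std::size_t>(clamped)];
    };

    const double width = 2.0 * radius + 1.0;

    // Sum of the window centred on sample 0.
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) sum += at(i);

    for (int i = 0; i < n; ++i) {
        out[static_cast<std::size_t>(i)] = sum / width;
        // Slide the window right: add the entering sample, drop the leaving one.
        sum += at(i + radius + 1) - at(i - radius);
    }
    return out;
}
```

For a 2D image you would run this over every row and then over every column of the result, and repeat the whole thing a few times to approach a Gaussian blur.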

Code

You can download my Octave scripts to replicate the results above or to fiddle around with. Download by right clicking and saving the file.

box_demo.m
box_demo2.m
