OpenCV vs. Armadillo vs. Eigen vs. more! Round 3: pseudoinverse test

Okay, the title of this post is getting longer and sillier, but this is the third installment in my series comparing different libraries for everyday matrix operations. The last two posts compared basic operations such as multiplication, transposition, and inversion in isolation, which is probably not a good reflection of real-life usage. So I decided to come up with a new test that combines different matrix operations. I chose the pseudoinverse because it is something I use every now and then, and it combines multiplication, transposition, and inversion, which seems like a good test.

For benchmarking I’m going to be solving the following overdetermined linear system:

AX = B

and solving for X using

X = \left(A^TA\right)^{-1}A^{T}B

A is an NxM matrix, where N is much larger than M. I’ll be using N=1,000,000 data points, with M (the dimension of the data) varying from 2 to 16.

B is an Nx1 matrix.
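To make the formula concrete, here is a minimal library-free sketch of the same computation in plain C++. It is not the code used in the benchmark and not how any of the tested libraries implement it. The function names (solve_dense, solve_normal_equations) are made up for illustration, and instead of explicitly forming the inverse of A^T A it solves the small MxM system (A^T A) X = A^T B by Gaussian elimination, which gives the same X.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solve the m x m system G x = r by Gaussian elimination with partial pivoting.
// G is row-major; G and r are passed by value because they are modified.
std::vector<double> solve_dense(std::vector<double> G, std::vector<double> r) {
    const std::size_t m = r.size();
    assert(G.size() == m * m);
    for (std::size_t k = 0; k < m; ++k) {
        // Pick the row with the largest pivot in column k to limit round-off.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < m; ++i)
            if (std::fabs(G[i * m + k]) > std::fabs(G[p * m + k])) p = i;
        if (p != k) {
            for (std::size_t j = 0; j < m; ++j) std::swap(G[k * m + j], G[p * m + j]);
            std::swap(r[k], r[p]);
        }
        // Eliminate column k from the rows below the pivot.
        for (std::size_t i = k + 1; i < m; ++i) {
            const double f = G[i * m + k] / G[k * m + k];
            for (std::size_t j = k; j < m; ++j) G[i * m + j] -= f * G[k * m + j];
            r[i] -= f * r[k];
        }
    }
    // Back substitution on the upper-triangular system.
    std::vector<double> x(m);
    for (std::size_t k = m; k-- > 0;) {
        double s = r[k];
        for (std::size_t j = k + 1; j < m; ++j) s -= G[k * m + j] * x[j];
        x[k] = s / G[k * m + k];
    }
    return x;
}

// Least-squares solution of A x = b via the normal equations (A^T A) x = A^T b.
// A is n x m in row-major order; b has n entries.
std::vector<double> solve_normal_equations(const std::vector<double>& A,
                                           const std::vector<double>& b,
                                           std::size_t n, std::size_t m) {
    assert(A.size() == n * m && b.size() == n);
    std::vector<double> AtA(m * m, 0.0), Atb(m, 0.0);
    // Accumulate A^T A and A^T b one data row at a time.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            const double aij = A[i * m + j];
            Atb[j] += aij * b[i];
            for (std::size_t k = 0; k < m; ++k) AtA[j * m + k] += aij * A[i * m + k];
        }
    }
    return solve_dense(AtA, Atb);
}
```

Since N is much larger than M, almost all of the time goes into accumulating A^T A and A^T B; the final MxM solve is tiny by comparison.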

The matrix values are randomly generated from 0 to 1, with uniform noise in [-1,1] added to B. The values are kept to a small range to avoid any significant numerical problems that can come from computing the pseudoinverse this way, not that I care too much for this benchmark. Each test is performed for 10 iterations, but the time is not averaged, since I’m not interested in absolute time but in time relative to the other libraries.
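As a rough sketch of that setup (not the actual benchmark code), the data generation and timing could look something like the following. The names Problem, make_problem and time_total_ms are hypothetical. How B is built is also an assumption on my part: here a ground-truth X is drawn from [0,1) and B is set to AX plus the noise.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

struct Problem {
    std::vector<double> A;       // n x m, row-major, entries in [0, 1)
    std::vector<double> B;       // n x 1
    std::vector<double> x_true;  // m x 1, assumed ground truth (illustration only)
};

// Build a random overdetermined system. B = A * x_true + noise is an assumption.
Problem make_problem(std::size_t n, std::size_t m, unsigned seed) {
    assert(n > 0 && m > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_real_distribution<double> noise(-1.0, 1.0);

    Problem p;
    p.A.resize(n * m);
    p.B.resize(n);
    p.x_true.resize(m);
    for (double& v : p.x_true) v = unit(rng);
    for (std::size_t i = 0; i < n; ++i) {
        double s = 0.0;
        for (std::size_t j = 0; j < m; ++j) {
            p.A[i * m + j] = unit(rng);
            s += p.A[i * m + j] * p.x_true[j];
        }
        p.B[i] = s + noise(rng);  // uniform noise in [-1, 1) added to B
    }
    return p;
}

// Run `work` a fixed number of times and return the TOTAL elapsed time in
// milliseconds (deliberately not averaged, matching how the results are reported).
template <typename F>
double time_total_ms(F&& work, int iterations = 10) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A library under test would then be wrapped in a lambda and passed to time_total_ms, with the same Problem reused for every library so the comparison stays relative.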

Just to make the benchmark more interesting I’ve added GSL and OpenBLAS to the test, since they were just an apt-get away on Ubuntu.

Results

The following libraries were used

  • OpenCV 2.4.3 (compiled from source)
  • Eigen 3.1.2 (C++ headers from website)
  • Armadillo 3.4.4 (compiled from source)
  • GSL 1.15 (Ubuntu 12.10 package)
  • OpenBLAS 1.13 (Ubuntu 12.10 package)
  • Atlas 3.8.4 (Ubuntu 12.10 package)

My laptop has an Intel i7 1.60GHz with 6GB of RAM.

All values reported are in milliseconds. Each pseudoinverse test is performed 10 times but NOT averaged out. Lower is better. Just as a reminder, each test deals with 1,000,000 data points of varying dimensions.

Library \ M                  2         3         4         5         6         7         8         9
OpenCV                 169.619   321.204     376.3   610.043   873.379   1185.82   1194.12   1569.16
Eigen                  152.159   258.069   253.844   371.627   423.474   577.065   555.305   744.016
Armadillo + Atlas      162.332   184.834   273.822   396.629   528.831   706.238    848.51   1088.47
Armadillo + OpenBLAS    79.803   118.718   147.714   298.839   372.235   484.864   411.337    507.84
GSL                    507.052   787.429   1102.07   1476.67   1866.33   2321.66   2831.36   3237.67

Library \ M                 10        11        12        13        14        15        16
OpenCV                 1965.95   2539.57   2495.63    2909.9   3518.22   4023.67   4064.92
Eigen                  814.683   1035.96   993.226    1254.8   1362.02   1632.31   1615.69
Armadillo + Atlas      1297.01   1519.04   1792.74   2064.77   1438.16   1720.64   1906.79
Armadillo + OpenBLAS   534.947   581.294   639.175   772.382   824.971    825.79   893.771
GSL                    3778.44   4427.47   4917.54   6037.29   6303.08    7187.5   7280.27

Ranking from best to worst

  1. Armadillo + OpenBLAS
  2. Eigen
  3. Armadillo + Atlas (no multi-core support out of the box???)
  4. OpenCV
  5. GSL

All I can say is, holy smokes Batman! Armadillo + OpenBLAS wins out for every single dimension! Last is GSL; okay, no surprise there for me. It never boasted being the fastest car on the track.

The cool thing about Armadillo is that switching the BLAS engine only requires linking a different library, with no recompilation of Armadillo. What is surprising is that the Atlas library doesn’t seem to support multi-core by default. I’m probably not doing it right. Maybe I’m missing an environment variable setting?

OpenBLAS is based on GotoBLAS and is actually a ‘made in China’ product, except this time I don’t get to make any jokes about the quality. It is fast because it takes advantage of multi-core CPUs, while the others appear to use only one CPU core.

I’m rather sad OpenCV is not that fast since I use it heavily for computer vision tasks. My compiled version actually uses Eigen, but that doesn’t explain why it’s slower than Eigen! Back in the old days OpenCV used to use BLAS/LAPACK, something they might need to consider bringing back.

Code

test_matrix_pseudoinverse.cpp (right click save as)

Edit the code to #define in the libraries you want to test. Make sure you don’t turn on Armadillo + GSL together, because they have conflicting enums. Instructions for compiling are at the top of the cpp file, but here they are again for reference.

To compile using ATLAS:

g++ test_matrix_pseudoinverse.cpp -o test_matrix_pseudoinverse -L/usr/lib/atlas-base -L/usr/lib/openblas-base -lopencv_core -larmadillo -lgomp -fopenmp -lcblas -llapack_atlas -lgsl -lgslcblas -march=native -O3 -DARMA_NO_DEBUG -DNDEBUG -DHAVE_INLINE -DGSL_RANGE_CHECK_OFF

To compile with OpenBLAS:

g++ test_matrix_pseudoinverse.cpp -o test_matrix_pseudoinverse -L/usr/lib/atlas-base -L/usr/lib/openblas-base -lopencv_core -larmadillo -lgomp -fopenmp -lopenblas -llapack_atlas -lgsl -lgslcblas -march=native -O3 -DARMA_NO_DEBUG -DNDEBUG -DHAVE_INLINE -DGSL_RANGE_CHECK_OFF

New hand SFM dataset

A few months back I took some pics of my hand to see how well SFM would work on them. The point cloud came out pretty good. It managed to capture the depth of my finger pretty accurately, about 1 cm in diameter. You can get the dataset from the structure from motion page. Here are some screenshots of the results.