Five point algorithm for essential matrix, 1 year later …

In 2011, some time around April, I had a motorcycle accident which left me with a broken right hand. For the next few weeks I was forced to take public transport to work, *shudder* (it takes me twice as long to get to work using Melbourne’s public transport than driving). I killed time on the train reading comic books and academic papers. One interesting one I came across was Five-Point Algorithm Made Easy by Hongdong Li and Richard Hartley for calculating the essential matrix using five points (minimum). Now here is something you don’t see everyday in an academic paper title, ‘easy’. I’ve always be meaning to get around to learning the five-point algorithm and this one boasts an easier implementation than David Nister’s version. Seemed like the perfect opportunity to try it out, I mean how hard could it be? Well …

Over one year later of on and off programming I finally finished implementing this damn algorithm! A fair bit of time was spent on figuring out how to implement symbolic matrices in C++. I was up for using existing open source symbolic packages but found that they all struggled to calculate the determinant of a 10×10 symbolic matrix of polynomials. If you do the maths, doing it the naive way is excruciatingly slow, at least in the orders of 10 factorial. I ended up writing my own simple symbolic matrix class. I also wrote a Maxima and PHP script to generate parts of the symbolic maths step into C++ code, fully expanded out in all its glory. The end result is a rather horrendous looking header file with 202 lines of spaghetti code (quite proud of that one actually).

The second major road black was actually a one liner bug in my code. I was doing the SVD of a 5×9 matrix in OpenCV. By default OpenCV’s SVD doesn’t do the full decomposition for Vt unless you tell it to eg. cv::SVD svd(F, cv::SVD::FULL_UV). So I was using the 5×9 version of Vt instead of 9×9 to grab the null basis, which was incorrect.

It was quite a hellish journey, but in the end I learnt something new and here’s the code for you all to enjoy.


Last update: 25th August 2013

5point-0.1.5.tar.gz (requires OpenCV)

5Point-Eigen-0.1.5.tar.gz (requires Eigen)

The OpenCV and Eigen version produce identical output. The Eigen version was ported by Stuart Nixon. The file 5point.cpp in the Eigen version has a special #ifdef ROBUST_TEST that you can turn on to enable a more robust test against outliers. I have not thoroughly test this feature so it is turned off by default.

Given 5 or more points, the algorithm will calculate all possible essential matrix solutions and return the correct one(s) based on depth testing. As a bonus it even returns the 3×4 projection matrix. The whole process takes like 0.4 milliseconds on my computer. There’s a lot of room for speed improvement using more efficient maths. David Nister’s paper list some improvements in the appendix.

90 thoughts on “Five point algorithm for essential matrix, 1 year later …”

    1. The original C++/mex source code of Hartley and Li ‘s 5-point algorithm can be found on the authors’s website, while works just fine and extremely fast.

  1. Hi thank you for contributing such a nice code.

    I think I caught a severe bug in your code.

    in 5point.cpp, there is a line like

    pt3d = TriangulatePoint(cv::Point2d(pts1[0], pts1[1]), cv::Point2d(pts2[0], pts2[2]), P_ref, P[j]);

    I suspect it should be

    pt3d = TriangulatePoint(cv::Point2d(pts1[0], pts1[1]), cv::Point2d(pts2[0], pts2[1]), P_ref, P[j]);

    The difference is the second index of pts2

  2. Hi,
    trying to use your functions.
    I noticed that when you create the projection matrices you do not check if the rotation matrix is a mirroring rotation matrix (det < 0). Is that something you have though about?

    1. I’m flicking through the multiple view geometry book and it doesn’t seem to bring up the issue. Are you seeing det < 0 ?

    2. This is the result I get when compiling your program with the test points in the main.cpp, octave shows that the return P is a mirroring projection matrix:

      P =

      -0.30440 0.16280 -0.93853 0.44225
      0.93299 0.24959 -0.25931 -0.51762
      -0.19203 0.95457 0.22786 -0.73245

      octave:2> R = P(:,1:3)
      R =

      -0.30440 0.16280 -0.93853
      0.93299 0.24959 -0.25931
      -0.19203 0.95457 0.22786

      octave:3> det(R)
      ans = -1.00000

  3. Be sure to check out five-point.cpp in the master branch of opencv;

    if (determinant(U) < 0) U *= -1.;
    if (determinant(Vt) < 0) Vt *= -1.;

    1. I’ve used this version in the past with success

      if determinant(R) < 0 multiply 3rd column of R by -1 end if

      1. HI
        Thank you very much for your implementation because i am trying to convert my application from Matlab to OpenCV. In Matlab version, i used code from

        However, the result might not exactly the same.
        When i executed your sample (num_pts = 5), the result is
        1.0161 -0.4041 -0.3996
        0.3996 -0.7028 0.7615
        0.3310 0.2526 -0.7795
        Is this result 100% correct?
        Besides, what does num_pts mean?
        How it will influence the result?
        Moreover, i tried to solve your sample with Nister’s code.
        The result is

        E =

        0.0553 -0.6078 0.3496
        0.3687 -0.0010 0.0642
        -0.5916 -0.1183 -0.0360

        and it needs 10 points.

        I don’t quite understand the difference or the difference could be ignored?

        1. num_pts is “number of points”. My sample code has 10 pts but I tested it on the first 5 just to confirm it works with the bare minimum. I like to think my code is solid, keyword “think” 🙂 There does seem to be a big difference in the E matrix. How many E matrix does their method return?

          1. You implementation are different from the 5 point algorithm in bundler too. The result from bundler are similar to Nister’s code,but much different from you. So I think your code may have some bugs.

        2. Hi!
          I am looking for the code about essential matrix.I saw a link about its matlab version in your reply,but the link is failed.Could you send the code to my mailbox?
          My mailbox is,
          thank you!

  4. If I want to use your code into my solution then do I need to change only the corresponding points?
    5point.mac will work as it is or I need to modify that?

  5. Does the code calculate P and E from only the corresponding points? Does this mean we can retrieve the calibration matrix K from decomposing P calculated from the 5 point algorithm?

    1. The algorithm assumes you give it normalised image points, meaning you are expected to know K before hand already. The P matrix returned does not include K.

  6. Hello,

    Me too, i think that you have some bugs in your implementation of 5 pts algorithm because I compared the result of your code with the 5 points implementation in Matlab and it was not same..

  7. Hi, thanks for your code!!
    But I have found an issue when choosing the Projection matrix with the most points in front of the camera, which could be why some people find a bug.

    In fact in Opencv when you do:
    ret_E = E;
    ret_P = P[j];
    it puts only pointers on these fields, and the value can update itself while you don’t want to. I replaced it by:

    And it works much better!!!
    It might be something that has changed with the newer versions of Opencv.
    Thanks again for your awesome work!!

    1. Ah that silly matrix problem gets me every time in OpenCV! Thanks for spotting that one. Updated the code now.

      On a more general note I’ve added a print out when it detects a reflection in the rotation matrix. Can someone confirm I’m doing the right thing to fix it. Else I’ll leave the annoying print out until it gets resolved 🙂

      1. Hey again,
        About the reflections I had this issue too, especially with rotation but with translation too sometimes!
        I still think that there is a possible inversion in the Essential matrix, because the ProjectionsFromEssential function is correct.
        For the moment I just use heuristics, if you find the solution I am fan!
        Otherwise I was wondering:

        For the moment when you choose between the Essential matrices to have the best one, you pick the one that leads to a projection matrix leaving the most points in front of the camera. But with the possibles reflection, this method can sometimes fail: I printed out all the projections matrices with ground truth known, and the right one was discarded.

        I still don’t know how to solve this in a general fashion, I use heuristics for my problem and then pick the best one. An idea could be to minimize of form of criterion rather than just “number in front of the camera” but I can’t figure out a good way to do it except Sampson error…

        I’ll keep you informed if I find a better way to set this up!!

        1. Hey,

          I wonder if it’s possible to apply the rotation “correction” prior to doing the test to find the best one. That way you’re always testing valid matrices.

    1. I refer to K as the 3×3 intrinsic camera matrix, these are the calibrated parameters. This algorithm does not find or return K.

      1. I have 2 different cameras with their intrinsic parameters (2 different K matrix). Also I have correspondence points from both the cameras. Now to use this code where should I initialize these intrinsic parameters (K matrix for both cameras)?

        1. Apply them to the points before running the code eg.

          ptn1 = K1*pt1
          ptn2 = K2*pt2

          ptn1, ptn2 becomes input to the code

          1. how should I calculate inverse(K1)*pt1?
            inverse(K1) is a 3×3 matrix and pt1 is in of form (x,y) values. will it be a matrix multiplication? I guess this is not a scalar multiplication..

          2. Shall it be like below code..

            Point3d u1(pt1.x, pt1.y, 1.0);
            Mat_ um1 = K1inv * Mat_(u1);
            u1.x = um1(0); u1.y = um1(1); u1.z = um1(2);
            ptn1[] = {u1.x/u1.z, u1.y/u1.z};

            The 2D point first converted to a 3D point with z value 1 and then after multiply and get back the 2D by x/z and y/z

          3. This is correct. The division is redundant because it’ll be dividing by 1 all the time.

          4. One more question.
            I have a set of around 1000 correspondence points between 2 images. With different set of 5 points out of these 1000 points, I get different output of E and P. How do you suggest to take an average output of these?

          5. Use the RANSAC framework.

            1. Randomly pick a set of 5 points
            2. Calculate E
            3. Apply E to all the points and see how well they “fit” (up to you to define, check Sampson distance)
            4. Keep track of which E produces the best fit
            5. Repeat until satisfied

            You can do one extra step and refine it further by taking all the points that “fit” well and calculate E using all the points (instead of only 5), this will find an average E.

          6. Can you please explain how to apply the E to all other points?
            And how to calculate E using all “fit” points..

          7. My function already handles using points to find E:

            bool Solve5PointEssential(double *pts1, double *pts2, int num_pts, cv::Mat &ret_E, cv::Mat &ret_P, int &best_inliers)

            set num_pts to how many points you have.

  8. I’m not having the same results as Stewenius matlab version ( for this dataset :

    -1.84061 -2.66844
    -0.907628 -2.73298
    -1.59856 -2.32285
    -2.20069 -2.7138
    -1.41456 -2.11016
    -1.22827 -2.35542
    -1.45023 -2.11384
    -1.55311 -2.32163
    -1.78152 -2.57716
    -2.24741 -2.19773

    -1.0908 -2.26306
    -1.77424 -1.77768
    -1.45032 -2.31666
    -0.839 -2.44396
    -1.71278 -2.31398
    -1.66077 -2.34513
    -1.64307 -2.4623
    -1.46829 -2.35682
    -1.17333 -2.3116
    -1.09173 -2.63488

    The points are already normalized. Do you experience the same results ?

  9. Hello,
    I’m just looking at your 5 point code – looks very promising. Thanks for releasing this. Just for your information, I’m getting the following gcc compile warning, which I’ve yet to look into.

    src/5point/Rpoly.cpp: In function ‘void RealIT_ak1(int*, int*, double*, int, double*, int, double*, double*, double*, double*, double*)’:
    src/5point/Rpoly.cpp:660:13: warning: ‘omp’ may be used uninitialized in this function [-Wuninitialized]
    src/5point/Rpoly.cpp: In function ‘void QuadIT_ak1(int, int*, double, double, double*, double*, double*, double*, double*, int, double*, double*, double*, double*, double*, double*, double*, double*, double*, double*, double*, double*, double*, double*)’:
    src/5point/Rpoly.cpp:569:13: warning: ‘omp’ may be used uninitialized in this function [-Wuninitialized]

    1. It should be a harmless warning because omp gets used after it is initialised. Just follow the execution path.

  10. Nghia,

    As you note, the initialized value warnings are OK; I’ve set the values to 0 to shut the warnings up.

    I’ve been testing your 5 point code using RANSAC and comparing with a normalized 8 point algorithm with known correct test data. I’ve run into some issues, and have yet to sort them out. I’m happy to send you some test data and further comments; perhaps you could contact me via email (assuming you can access my email address I used to post this message).

    Some of the issues I’ve found are that:
    – The reflection test code is not quite right in my test case – one needs to invert 00, 01, 10, 11 and 22 values in the R matrix to get the correct results for my test data.
    – The 3rd value (Z) in the translation is inverted from the correct value (as extracted from the P matrix).
    – I am not getting the correct E matrix out so RANSAC is failing.
    – The polynomial code sometimes reports unable to converge after 20 shifts (perhaps because of bad random points selected for RANSAC).

    I suspect the 2nd issue (inverted Z value for translation in P) is also present in the E matrix, as I’m still not getting correct E matrix values out. I’ll continue to dig into this, but it would be helpful to continue this discussion via email and to give you some test data to see what you think of it all.

    I’m interested in resolving these problems, as your code is clean and fast.

    Kind regards,

      1. Hi. That should be a valid email domain, but try the one I used to post this message instead.

        I’m beginning to wonder if the problem might be related to RHS vs LHS coordinate system (I use photogrammetry style right-handed coordinate system not OpenCV style left handed coordinates). I’m building test cases to try and narrow the problem down. I’m also looking to swap your code over to Eigen3 for matrix work, which should make it even faster and remove the OpenCV dependancy (I’m leaving that until I get things working as expected first). Be good to continue discussions via email to chat about the issues I’m seeing and to provide you with some test data via email.

  11. Hi, I test your code for 2 pictures (of redskirt dataset).

    I manually select 8 matching:
    double pts1[] = {548, 119,

    double pts2[] = {507,110,
    This is the result:
    Average execution time: 717.139 us

    Success! Found 2 possible solutions
    The best one has the highest inliers. An inlier is a point that is in front of both cameras.

    Solution 1/2

    E = [-0.001483281169646533, 0.0798129447617992, -16.00316603240302;
    -0.0728808162683079, 0.0001959859646169806, 50.90835405921668;
    15.58680315258395, -51.03736078741671, -0.002534495378497037]

    P = [0.9999666308722415, -0.008168190689744725, -0.0001334274124399082, -0.9539744194915745;
    0.008168191808671788, 0.9999666397340765, 7.843250090652126e-06, -0.2998842742424013;
    0.0001333588961035814, -8.932849065581964e-06, 0.9999999910678045, -0.001492989558391173]

    inliers = 7/8

    Solution 2/2

    E = [-0.5049975712191607, 0.9597433748533721, 136.6575441673005;
    0.2489707601260553, -0.02233064866215852, -313.4228545544773;
    131.6783043503389, -315.5458153141706, 0.5385152633007909]

    Detected a reflection in P. Multiplying all values by -1.
    P = [-0.6920270232728676, -0.7218672225694188, 0.002512377361682321, 0.9166507454123589;
    -0.7218704435458069, 0.6920279437574728, -0.0006227309082056209, 0.3996796151553508;
    -0.001289106308433324, -0.002244557577137604, -0.9999966500774935, 0.002759739885880337]

    inliers = 8/8

    dataset’s given projection matrixes for these images:

    1.000000 0.000000 0.000000 0.000000
    0.000000 1.000000 0.000000 0.000000
    0.000000 0.000000 1.000000 0.000000

    0.954262 -0.140546 0.263872 -6.723805
    0.144974 0.989431 0.002719 -0.643127
    -0.261466 0.035661 0.964552 1.309906

    would you pls help me find out what’s wrong with it?

      1. thanks for reply, I’ve normalized points using K: as below
        K * [x y 1]’

        K = [891.913000 2.962990 507.609000
        0.0 894.045000 379.563000
        0.0 0.0 1.0]

        these are my new points:
        double pts1[] = {532777.943232259,144497.667369159,

        double pts2[] = {489204.057059098,130628.554880249,

        when I run it, it Could not find a valid essential matrix. why!?

          1. thanks, I’ve normalized points using inv(K).
            but I get different result for E and P from your code and Nistér code.

            normalized points:
            double pts1[] = {0.0912160582538460,-0.237107884256455,

            double pts2[] = {0.0460994198507981,-0.251058163244206,
            E from nister code:

            0.0130 -0.5186 -0.2433
            0.5335 0.0176 -0.4023
            0.2246 0.4175 0.0152

            determinant constraint

            trace constraint
            1.0e-014 *

            -0.3792 -0.1887 0.3025
            -0.1887 0.4805 0.3109
            0.3136 0.3109 0.0243

            reprojection errors

            -0.0106 0.1304 0.0865
            -0.2081 -0.0313 -0.6695
            -0.1939 0.6672 0.0122

            determinant constraint

            trace constraint
            1.0e-013 *

            0.0444 0.0289 0.0225
            0.0447 -0.1676 -0.0333
            0.0519 -0.0233 0.1656

            reprojection errors

            your code result:

            Success! Found 2 possible solutions
            The best one has the highest inliers. An inlier is a point that is in front of both cameras.

            Solution 1/2

            E = [0.01148409963975677, 0.2256736888650271, 0.2102266638882955;
            -0.1414274834444046, 0.03390718827838227, -0.7235860545601634;
            -0.0937650329119874, 0.7260337842465426, -0.01320226081599484]

            Detected a reflection in P. Multiplying all values by -1.
            P = [0.9806301797531383, -0.1578850635032568, 0.1159170275667039, -0.9154292901533225;
            0.1571445712945679, 0.9874530609946233, 0.01555750767299127, -0.2713866487172902;
            -0.1169189217790431, 0.002959570056834053, 0.9931370533189793, 0.2972179362477677]

            inliers = 8/8

            Solution 2/2

            E = [-0.03591391645196601, -1.477446351363468, -0.6218820848126339;
            1.436193382639371, -0.04882832844378427, -1.156326639471349;
            0.6737905241157682, 1.11404943074441, -0.04208835387859701]

            Detected a reflection in P. Multiplying all values by -1.
            P = [0.9996397798815372, -0.01490468428539671, -0.02231951757418038, -0.5741231241023349;
            0.0143598572020866, 0.999599751239123, -0.02437481946228576, 0.3359458978504272;
            0.02267388320352681, 0.02404553407674684, 0.9994537044362969, -0.7466746226358877]

            inliers = 7/8

            dataset projections matrixes mentioned above.

            I’ve no idea of which one would be correct?
            also would you tell me a bit about deep test?

          2. The E matrix is determined up to an unknown scale. Divide all the E matrices by E(3,3) to get a better comparison. I get nearly identical results with Nister’s except the matrix is a transpose of mine (no idea why).

  12. yes that’s right. thanks.
    to compute projections matrix, it gives us 4 answer that one of them is correct.
    M1 = R1 * [I T];
    M2 = R2 * [I -1*T];
    M3 = R1 * [I T];
    M4 = R2 * [I -1*T];

    which one is correct? one way is to use each Mi and reconstruct matching points using them and check if z has a positive value. (that show us a 3d point infront of camera)
    when I implement this way, it removes 2 wrong answers, but one wrong answer remains.
    Is there any way else?

    1. also I’ve got same result as you between my possible answers for projection matrixes.

      but our result is not same as dataset given one. any idea why it happened? (I’ve normalized all of them using p(3,4) to compare)

        1. dataset’s given projection matrixes for these images:

          1.000000 0.000000 0.000000 0.000000
          0.000000 1.000000 0.000000 0.000000
          0.000000 0.000000 1.000000 0.000000

          0.954262 -0.140546 0.263872 -6.723805
          0.144974 0.989431 0.002719 -0.643127
          -0.261466 0.035661 0.964552 1.309906

          1. Not sure on this one. It seems the translation has been scaled, the 4th column usually has a length of 1 for unknown scale. Do you have a reference for the dataset?

          2. There seems to be two distortion coefficents in the text file. Did you take them into consideration when normalising the points?

          3. oh really?
            dataset doesn’t have any documentation and I couldnt understand it. what are these distortion coefficents and how can I consider them in normalizing points. thank you very much for your time …..

    2. From my correspondence with another person who did some research on this I think the conclusion is there’s not obvious way to resolve all ambiguities. It’s one of those degenerate problems. You’ll probably have to resort to some heuristic.

  13. hi!, i don’t understand the meaing Of symbolic maths ‘M’,in Mblock.h .I also don’t konw how you calculate the determinant. could you give some explain . thanks!

    1. The M is an arbitrary name. It’s the result of a expanding the symbolic matrix conditions. See the original paper to see what I mean. I made a reference to the determinant calculation in README.txt.

      1. in Mblock.h, the element of M matrix only include e** which is the result of SVD . In the coefficients matrix M , I don’t find the hidden variable z.

        1. The determinant calculation kicks in around line 99 in 5point.cpp. I wrote a class called PolyMatrix, to store the symbolic matrix coefficients.

  14. With two constraints ,it can get 10 equations about x y z. The max power is 3. why regard x^3, y^3, x^2*y, …,x,y as Linear element, compute though determinant and SVD ,instead of linearized equation. Thanks!

    1. Hi,

      I’m not entirely sure what you are asking, but there’s a good chance I won’t be able to answer your question either. I simply just followed the instructions outlined in the original paper. I don’t know too much about the underlying theories behind it. The paper might answer your question, it’s only 4 pages.

  15. I have one short question and I’m sorry if this is not the right pace for it. Lets say I used the 5point algorithm to get the relative movement between the first two frames and i have a third frame and want to know the continued movement of the camera i cant just use the algorithm again on the second and third frame or can I? Is there a simple solution for this or should i use solvePnP with the triangulated 3d points from the first 2 frames to get the camera pos in the third one?

    1. I think your suggestion is the easiest I can think of. That is triangulate the 3D points from the first two and get the pos of the third one.

  16. Which Matrix is to be chosen out of the ten solutions for Essential Matrix that the algorithm is returning?
    I am unable to decide. Would you please suggest me on how to resolve this ambiguity. Thank you.

    1. Been a while since I looked at this. But start with projecting a bunch of points and see how many are in front of the camera (as oppose to behind it).

      1. Thanks for the reply, Mr. Nghia Ho. I want to use the algorithm for motion estimation in my academic project. I am stuck with this question and unable to proceed further as it will be basically integrated with RANSAC also. Any effort form your side to resolve this would be really helpful. Thank you.

  17. This is my simple code that is supposed to test your implementation.

    1) create 5 distinct 3d points
    2) create 2 separate camera poses
    3) project 3d points from different camera poses (using identity matrix as camera matrix, using 0 distortion) in order to create two arrays of 2d points
    4) use these two arrays of 2d points as input to Solve5PointEssential

    vector containing essential matrices is empty
    vector containing projection matrices is empty

    Quick debugging info:
    When populating matrix `b` in 5point.cpp at line 115, cv::determinant(ret_eval) always returns 0.
    So, `b` ends up a vector of eleven zeros.

    1. Hmm been a while since I looked at this algorithm. From memory I think it should handle pure translation. As a test try adding a rotation to one of the camera and see what happens.

  18. Hello thanks for the nice algorithm you’ve made. It is being very helpful. I just have a question, what is this for loop with 1000 iteration stands for?

Leave a Reply

Your email address will not be published. Required fields are marked *