Please see this page for the most up-to-date information regarding this post and code.

A few months ago I wrote an OpenCV version of the pose estimation algorithm found in the paper Robust Pose Estimation from a Planar Target (2006) by Gerald Schweighofer and Axel Pinz using their Matlab code from the link in the paper.

I originally ported it so I could play around with Augmented Reality, specifically to track a planar marker (a business card). Although that project hasn’t received much attention lately, I figured someone might find the pose estimation part useful. I was about to post it up last night when I came across a small bug, which took 2-3 hours to hunt down. It seems the new OpenCV Mat class has some subtle gotchas, but all good now. I checked my results against the original Matlab code and they appear numerically identical, which is always a good sign.

Download RobustPlanarPose-1.1.2.zip

Look at demo.cpp to see how it works.

**UPDATE: 12 March 2012**

If you plan to use this for Augmented Reality applications then you should modify Line 23 in RPP.cpp from:

ObjPose(model_3D, iprts, NULL, &Rlu, &tlu, &it1, &obj_err1, &img_err1);

to

ObjPose(model_3D, iprts, &Rlu, &Rlu, &tlu, &it1, &obj_err1, &img_err1);

This initialises the guess of the rotation matrix to the last estimate, giving smoother and more accurate results.

Hi Ngiaho,

Can you explain to me the use of the model points? I can extract the normalized points of an object of interest from a scene image, but for the model points I am not sure what they refer to. If my object of interest is a square, can I use constants for the model points, with them representing a square? Thanks.

Yep, you are correct about the model points. You decide what 3D points you want to define for your model square, and these stay constant. For example, you could pick the bottom-left corner to be (-1,-1,0) and the top-right to be (1,1,0). Here ‘z’ is chosen to be 0 for all points.

Oh thanks for the info, but is it necessary for the assumed model points to form a square? The pose estimation for my application works when the projected points do not form a square but the assumed model points form a square.

The model points can be arbitrary. You just pick the appropriate 3D model co-ordinates for the given model shape. For example, if it’s a 20×10 cm rectangle, you might choose:

x from [-10, 10]

y from [-5, 5]

z = 0

(0,0,0) at the centre. Whatever you find convenient. The results from the code will give you a rotation/translation relative to those co-ordinates.

I tried that before. What I do is get the ratio between the width and height of the rectangle, multiply it by a value, and use that for the x and y ranges.

e.g. if the relationship between the height and width is h = 0.5w,

the range would be:

x:[-0.5w,0.5w]

y:[-0.25w,0.25w]

But the result is a very jittery estimation, especially when the camera is perpendicular to the marker. I suspect it is the distortion caused by the camera I am using, making the marker look taller than it is supposed to be.

Is there any way, in your opinion, to actually estimate the model points from the projected points? I used OpenCV calibration to calibrate my camera using 30 frames of a checkerboard.

I get jittery estimation as well. I think the results depend a lot on the quality of the features being matched. Any slight pixel error might affect it greatly. I haven’t verified this yet.

OpenCV calibration should give you decent results, though I have had better experience with the Matlab calibration toolbox. Not sure how much OpenCV has improved their calibration stuff.

I’m actually using ARToolKit for the detection part, hence the projected points are actually the vertices of the marker. It makes me wonder why the pose estimation only works for a square marker and not a rectangular marker. I strongly believe it has to do with the model points, but if I use a rectangle then I need to compute the ratio of the height and width, and it didn’t work even though I did it in real time for every frame.

I think I found out why it’s jittering so much. The code has an option to initialise the rotation matrix; by default (as in the original Matlab code) it uses a heuristic to find it. But if you feed back the previously estimated rotation matrix you’ll get smoother results. On line 24 in RPP.cpp, change

ObjPose(model_3D, iprts, NULL, &Rlu, &tlu, &it1, &obj_err1, &img_err1);

to

ObjPose(model_3D, iprts, &Rlu, &Rlu, &tlu, &it1, &obj_err1, &img_err1);

Oh! Thanks, I will try the solution later =D

Sorry for the dumb question, but can you please clarify what you mean by normalized image points? Is it the case that this matrix is just the detected image points multiplied by the inverse intrinsics of the camera, leaving the model points in some arbitrary range ((-1,0,0) for the top-left, etc.)?

Last question: I’m using this on mobile. Does anyone know where I can find a comparison between all the (mainstream) pose estimation algorithms?

Appreciate any help.

Yup, that’s correct. The ‘normalized’ image points are the image points multiplied by the inverse intrinsic matrix of the camera.

Try the original paper that I ported the code from. It should have references to popular methods.

Thanks for your response. Unfortunately, just by multiplying the image points with the inverse camera intrinsic matrix I get some strange results.

Do I need to apply any transformation to the object points?

Thanks for your help,

Josh

Hello, may I know what software requirements need to be installed (for example, OpenCV or GLUT) before we can run the code? P.S.: can the code be run using VS2008?

Hi

You only need OpenCV 2.x installed, which will run in vs2008.

Thanks, it works! May I know which file you used to locate the data input?

The demo.cpp program has some hard-coded values that I used to verify against the Matlab code. More documentation can be found on my dedicated page: go to Computer Vision -> Pose Estimation For Planar Target.

Hi,

I found one bug in file RPP.cpp

line 1037-1039:

opt_t.at(1,1) += ((v21*r2+(v22-1)*r5+v23*r8)*py+(-v21*r1-(v22-1)*r4-v23*r7)*px+(-v21*r3-(v22-1)*r6-v23*r9)*pz);

opt_t.at(1,2) += ((2*v21*r1+2*(v22-1)*r4+2*v23*r7)*pz+(-2*v21*r3-2*(v22-1)*r6-2*v23*r9)*px);

opt_t.at(1,3) += (v21*r1+(v22-1)*r4+v23*r7)*px+(v21*r3+(v22-1)*r6+v23*r9)*pz+(v21*r2+(v22-1)*r5+v23*r8)*py;

should be

opt_t.at(1,0) += ((v21*r2+(v22-1)*r5+v23*r8)*py+(-v21*r1-(v22-1)*r4-v23*r7)*px+(-v21*r3-(v22-1)*r6-v23*r9)*pz);

opt_t.at(1,1) += ((2*v21*r1+2*(v22-1)*r4+2*v23*r7)*pz+(-2*v21*r3-2*(v22-1)*r6-2*v23*r9)*px);

opt_t.at(1,2) += (v21*r1+(v22-1)*r4+v23*r7)*px+(v21*r3+(v22-1)*r6+v23*r9)*pz+(v21*r2+(v22-1)*r5+v23*r8)*py;

because opt_t is only a 3×3 matrix, and you can’t write to opt_t.at(1,3).

Hi,

Thanks for that! I’ll update the code immediately. Subtle bug; how did you come across it?

Hi,

I am a newbie in augmented reality and I am a student learning C# from India.

I would like your help developing an application in C# like the one shown in this video: http://www.youtube.com/watch?v=i_bZNVmhJ2o

Please help in developing this.

Regards,

Bhavesh

That’s quite an advanced-looking piece of software and is outside my expertise. Can’t help you there.

Hi, I just compiled your demo.cpp code and I got errors like:

rpp.cpp(642): error C2065: ‘M_PI_2’ : undeclared identifier

Any help?

Add _USE_MATH_DEFINES to your project defines and recompile. Or do this:

http://msdn.microsoft.com/en-us/library/4hwaceh6.aspx

Thank you, it works!

But there is a bug at lines 1030-1032:

opt_t.at(1,1) += ((v21*r2+(v22-1)*r5+v23*r8)*py+(-v21*r1-(v22-1)*r4-v23*r7)*px+(-v21*r3-(v22-1)*r6-v23*r9)*pz);

opt_t.at(1,2) += ((2*v21*r1+2*(v22-1)*r4+2*v23*r7)*pz+(-2*v21*r3-2*(v22-1)*r6-2*v23*r9)*px);

opt_t.at(1,3) += (v21*r1+(v22-1)*r4+v23*r7)*px+(v21*r3+(v22-1)*r6+v23*r9)*pz+(v21*r2+(v22-1)*r5+v23*r8)*py;

Should it be:

opt_t.at(1,0) += ((v21*r2+(v22-1)*r5+v23*r8)*py+(-v21*r1-(v22-1)*r4-v23*r7)*px+(-v21*r3-(v22-1)*r6-v23*r9)*pz);

opt_t.at(1,1) += ((2*v21*r1+2*(v22-1)*r4+2*v23*r7)*pz+(-2*v21*r3-2*(v22-1)*r6-2*v23*r9)*px);

opt_t.at(1,2) += (v21*r1+(v22-1)*r4+v23*r7)*px+(v21*r3+(v22-1)*r6+v23*r9)*pz+(v21*r2+(v22-1)*r5+v23*r8)*py;

Oops, I actually fixed all the bugs you mentioned, plus the M_PI_2 one, last year but forgot to update the page to point to the recent file 🙂 Can you download the new one?

Hi,

thanks for sharing your code.

I have a couple of questions:

1) what is the reference system used for the camera and for the object?

2) How can I translate the rotation matrix into Euler angles?

Thanks

1) This one is determined by the K matrix, such that image_points = K*[R|t]*3d_points, I think. I use it in an augmented reality demo here: http://nghiaho.com/?page_id=1273

2) See if this helps http://nghiaho.com/?page_id=846