Last Updated on November 3, 2011 by nghiaho12
While searching for a fast arctan approximation I came across this paper:
“Efficient approximations for the arctangent function”, S. Rajan, Sichun Wang, R. Inkol, A. Joyal, May 2006
Unfortunately I no longer have access to the IEEE papers (despite paying for yearly membership, what a joke …), but fortunately the paper also appeared in a book that Google has available for preview (selected pages only), “Streamlining digital signal processing: a tricks of the trade guidebook”. Even luckier, the important pages were included in the preview. The paper presents 7 different approximations, each with a different trade-off between accuracy and complexity.
Here is one algorithm I tried, which has a reported maximum error of 0.0015 radians (0.085944 degrees), the lowest error in the paper.
double FastArcTan(double x) { return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x)); }
The valid range for x is between -1 and 1. Comparing the above with the standard C atan function for 1,000,000 calls using GCC gives:
Function | Time (ms)
FastArcTan | 17.315
Standard C atan | 60.708
About 3.5 times faster, pretty good!
Hi, thanks for sharing this. Helped me to find out that atan function is correct on my GPU in OpenCL. I tested this because another function, tan, turned out to be buggy on this GPU.
Glad you found it useful
Thanks for posting this, it’s very useful if you don’t have time for the standard C function but can deal with a small accuracy loss!
What is M_PI_4?
Thanks for the resource!
PI/4
Ahh makes sense. I figured, just didn’t want to assume.
Thanks once more, will use this in some sand dune simulations!
Oh and one more thing, as far as speed is concerned: if you make this a macro (using #define in C/C++), it’ll theoretically be even faster than your regular function (making it inline should have a similar effect).
Depends. The compiler may be smart enough to inline such a small function. But doesn’t hurt to be explicit I guess.
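To make the two options concrete, here is a rough sketch of both. The names `FAST_ARCTAN` and `FastArcTanInline` are invented for this example. Whether either one is actually faster than a plain function depends on the compiler and flags, so treat this as an illustration, not a recommendation.

```c
#include <math.h>

#ifndef M_PI_4
#define M_PI_4 0.78539816339744830962 /* pi/4, in case math.h omits it */
#endif

/* Macro form (hypothetical name): expanded textually at each use.
   The argument appears four times in the expansion, so something like
   FAST_ARCTAN(i++) or FAST_ARCTAN(f()) would evaluate it repeatedly. */
#define FAST_ARCTAN(x) \
    (M_PI_4*(x) - (x)*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x)))

/* Inline function form (hypothetical name): the argument is evaluated
   once and type-checked, and the compiler is still free to inline it. */
static inline double FastArcTanInline(double x)
{
    double ax = fabs(x);
    return M_PI_4*x - x*(ax - 1)*(0.2447 + 0.0663*ax);
}
```

The inline function avoids the macro's repeated-evaluation pitfall, which is one reason to prefer it unless profiling shows a difference.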
To drop-in replace the standard C function, should `x` here be `y / x` in the standard implementation? http://www.cplusplus.com/reference/cmath/atan2/
Yes, as long as y / x stays between -1 and 1, which is the valid range of the approximation.
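One caveat: the approximation is only valid for inputs between -1 and 1, so passing `y / x` directly works only when |y| <= |x|. Below is a sketch of one way to cover the other cases, assuming the standard identity atan(z) = pi/2 - atan(1/z) for z > 0 and the usual quadrant corrections. The name `FastArcTan2` is made up for this example, and it does not reproduce the signed-zero, infinity or NaN behaviour of the standard `atan2`.

```c
#include <math.h>

#ifndef M_PI
#define M_PI   3.14159265358979323846
#endif
#ifndef M_PI_2
#define M_PI_2 1.57079632679489661923
#endif
#ifndef M_PI_4
#define M_PI_4 0.78539816339744830962
#endif

/* The approximation from the post; valid for -1 <= x <= 1. */
static double FastArcTan(double x)
{
    return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
}

/* Hypothetical atan2-style wrapper. The ratio handed to FastArcTan is
   always within [-1, 1]. */
static double FastArcTan2(double y, double x)
{
    if (x == 0.0 && y == 0.0) {
        return 0.0; /* chosen here by convention */
    }
    if (fabs(y) <= fabs(x)) {
        /* |y/x| <= 1: use the ratio directly, then fix the quadrant. */
        double a = FastArcTan(y / x);
        if (x > 0.0) return a;
        return (y >= 0.0) ? a + M_PI : a - M_PI;
    }
    /* |y| > |x|: work with x/y, which is within (-1, 1), and measure
       the angle from the positive or negative y axis. */
    double a = FastArcTan(x / y);
    return (y > 0.0) ? M_PI_2 - a : -M_PI_2 - a;
}
```

The result should stay within the same error bound as `FastArcTan` itself, since each branch only adds an exact constant to the approximated value.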
Dear,
I used the formula for the atan function, but the precision is not enough. One question:
M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
Would more precise values for the coefficients 0.2447 and 0.0663 give a more precise result? If yes, could you share more precise values for 0.2447 and 0.0663?
Thanks in advance.
Sorry I don’t have a solution.
Why not
return x*(M_PI_4 - (fabs(x) - 1)*(0.2447 + 0.0663*fabs(x)));
which only uses 3 multiplications instead of 4? For MCUs without FPU, add/sub is cheaper than multiplications I guess.
You could even split up the last multiplication for the same reason:
return x*((M_PI_4 + 0.2447) - fabsf(x)*((0.2447 - 0.0663) + 0.0663*fabsf(x)));
And if the domain is [0..1] then replace fabs(x) with just x:
return x*((M_PI_4 + 0.2447) - x*((0.2447 - 0.0663) + 0.0663*x));
This doubles the speed of the original formula.
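The rewritten forms in these comments are algebraically the same as the original, which can be checked numerically. Here is a small sketch that does so. The names `FastArcTanFactored`, `FastArcTanFolded` and `MaxDifference` are invented for this example, and `fabs` is used throughout because the argument is a `double`. It checks equivalence only and makes no claim about speed, which needs measuring on the target hardware.

```c
#include <math.h>

#ifndef M_PI_4
#define M_PI_4 0.78539816339744830962 /* pi/4, in case math.h omits it */
#endif

/* Original form from the post: 4 multiplications. */
static double FastArcTan(double x)
{
    return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
}

/* x factored out: 3 multiplications. */
static double FastArcTanFactored(double x)
{
    return x*(M_PI_4 - (fabs(x) - 1)*(0.2447 + 0.0663*fabs(x)));
}

/* Inner product expanded: still 3 multiplications, but the sums
   (M_PI_4 + 0.2447) and (0.2447 - 0.0663) are constants the compiler
   can evaluate ahead of time, leaving one fewer add/sub at run time. */
static double FastArcTanFolded(double x)
{
    double ax = fabs(x);
    return x*((M_PI_4 + 0.2447) - ax*((0.2447 - 0.0663) + 0.0663*ax));
}

/* Hypothetical helper: largest disagreement between two variants over
   `steps` evenly spaced intervals covering [-1, 1]. */
static double MaxDifference(double (*f)(double), double (*g)(double), int steps)
{
    double worst = 0.0;
    for (int i = 0; i <= steps; i++) {
        double x = -1.0 + 2.0*i/steps;
        double d = fabs(f(x) - g(x));
        if (d > worst) worst = d;
    }
    return worst;
}
```

Any difference between the variants should be at the level of floating-point rounding, far below the 0.0015 radian error of the approximation itself.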