C++ FAQ Celebrating Twenty-One Years of the C++ FAQ!!!
(Click here for a personal note from Marshall Cline.)
Section 29:
[29.17] Why doesn't my floating-point comparison work?

Because floating point arithmetic is different from real number arithmetic.

Bottom line: Never use == to compare two floating point numbers.

Here's a simple example:

double x = 1.0 / 10.0;
double y = x * 10.0;
if (y != 1.0)
  std::cout << "surprise: " << y << " != 1\n";
The above "surprise" message will appear on some (but not all) compilers/machines. But even if your particular compiler/machine doesn't cause the above "surprise" message (and if you write me telling me whether it does, you'll show you've missed the whole point of this FAQ), floating point will surprise you at some point. So read this FAQ and you'll know what to do.

The reason floating point will surprise you is that float and double values are normally represented using a finite precision binary format. In other words, floating point numbers are not real numbers. For example, in your machine's floating point format it might be impossible to exactly represent the number 0.1. By way of analogy, it's impossible to exactly represent the number one third in decimal format (unless you use an infinite number of digits).

To dig a little deeper, let's examine what the decimal number 0.625 means. This number has a 6 in the "tenths" place, a 2 in the "hundreths" place, and a 5 in the "thousanths" place. In other words, we have a digit for each power of 10. But in binary, we might, depending on the details of your machine's floating point format, have a bit for each power of 2. So the fractional part might have a "halves" place, a "quarters" place, an "eighths" place, "sixteenths" place, etc., and each of these places has a bit.

Let's pretend your machine represents the fractional part of floating point numbers using the above scheme (it's normally more complicated than that, but if you already know exactly how floating point numbers are stored, chances are you don't need this FAQ to begin with, so look at this as a good starting point). On that pretend machine, the bits of the fractional part of 0.625 would be 101: 1 in the ½-place, 0 in the ¼-place, and 1 in the ⅛-place. In other words, 0.625 is ½ + ⅛.

But on this pretend machine, 0.1 cannot be represented exactly since it cannot be formed as a sum of a finite number of powers of 2. You can get close but you can't represent it exactly. In particular you'd have a 0 in the ½-place, a 0 in the ¼-place, a 0 in the ⅛-place, and finally a 1 in the "sixteenths" place, leaving a remainder of 1/10 - 1/16 = 3/80. Figuring out the other bits is left as an exercise (hint: look for a repeating bit-pattern, analogous to trying to represent 1/3 or 1/7 in decimal format).

The message is that some floating point numbers cannot always be represented exactly, so comparisons don't always do what you'd like them to do. In other words, if the computer actually multiplies 10.0 by 1.0/10.0, it might not exactly get 1.0 back.

That's the problem. Now here's the solution: be very careful when comparing floating point numbers for equality (or when doing other things with floating point numbers; e.g., finding the average of two floating point numbers seems simple but to do it right requires an if/else with at least three cases).

Here's the wrong way to do it:

void dubious(double x, double y)
  if (x == y)  // Dubious!
If what you really want is to make sure they're "very close" to each other (e.g., if variable a contains the value 1.0 / 10.0 and you want to see if (10*a == 1)), you'll probably want to do something fancier than the above:
void smarter(double x, double y)
  if (isEqual(x, y))  // Smarter!
There are many ways to define the isEqual() function, including:
#include <cmath>  /* for std::abs(double) */

inline bool isEqual(double x, double y)
  const double epsilon = /* some small number such as 1e-5 */;
  return std::abs(x - y) <= epsilon * std::abs(x);
  // see Knuth section 4.2.2 pages 217-218
Note: the above solution is not completely symmetric, meaning it is possible for isEqual(x,y) != isEqual(y,x). From a practical standpoint, does not usually occur when the magnitudes of x and y are significantly larger than epsilon, but your mileage may vary.

For other useful functions, check out the following (listed alphabetically):

  • Isaacson, E. and Keller, H., Analysis of Numerical Methods, Dover.
  • Kahan, W., http.cs.berkeley.edu/~wkahan/.
  • Knuth, Donald E., The Art of Computer Programming, Volume II: Seminumerical Algorithms, Addison-Wesley, 1969.
  • LAPACK: Linear Algebra Subroutine Library, www.siam.org
  • NETLIB: the collected algorithms from ACM Transactions on Mathematical Software, which have all been refereed, plus a great many other algorithms that have withstood somewhat less formal scrutiny from peers, www.netlib.org
  • Numerical Recipes, by Press et al. Although note some negative reviews, such as amath.colorado.edu/computing/Fortran/numrec.html
  • Ralston and Rabinowitz, A First Course in Numerical Analysis: Second Edition, Dover.
  • Stoer, J. and Bulirsch, R., Introduction to Numerical Analysis, Springer Verlag, in German.

Double-check your assumptions, including "obvious" things like how to compute averages, how to solve quadratic equations, etc., etc. Do not assume the formulas you learned in High School will work with floating point numbers! For insights on the underlying ideas and issues of floating point computation, start with David Goldberg's paper, What Every Computer-Scientist Should Know About Floating Point Arithmetic or here in PDF format. You might also want to read this supplement by Doug Priest. The combined paper + supplement is also available. You might also want to go here for links to other floating-point topics.