Desyncs and FPU synchronization.

If you’ve ever written anything that tries to keep perfect synchronization across systems, odds are pretty good you’ve run into desyncs. Some of them sensible, some of them leading to weeks of hair pulling.

And of course, the most obscure and unexpected kind comes from floating point issues, where a single off-by-0.000000001 error will quickly snowball into something much larger. Traditional wisdom has been to simply not use floating point math for anything related to logic, but this is not practical in every situation.

It turns out that the traditional wisdom is not entirely true, and perfect synchronization is attainable if you understand the pitfalls. Warning: This gets fairly technical.

So a quick rundown of how floating point values work in practice is in order. Rather than storing the number raw, floating point values store a value and a base 2 exponent, so the real value is something more like value*pow(2,exponent). This means that if a number needs more digits than the value field can hold, you will lose precision but retain magnitude.

Single precision floating point values are often referred to by the ‘float’ data type. These are 32 bit numbers separated into a one bit sign, an 8 bit exponent, and a 23 bit value. Since the leading bit of the value is implied rather than stored, you effectively get 24 bits, which means whole numbers are only exact up to 2^24, or 16777216; past that you start losing precision. In contrast to this, double precision floating point values are 64 bit numbers with an 11 bit exponent and a 52 bit value, so you get a lot more room for error.
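To make the 24 bit limit concrete, here is a minimal sketch, assuming an IEEE 754 single precision float. The helper name float_holds_exactly is my own and purely for illustration.

```c
#include <stdint.h>

/* Returns 1 if the whole number n survives a round trip through a 32 bit
   float unchanged, 0 if it gets rounded. Hypothetical helper; intended for
   values well below 2^31 so the cast back to int32_t cannot overflow. */
static int float_holds_exactly(int32_t n)
{
    /* volatile forces the value to really be stored as a 32 bit float
       instead of lingering in a wider register. */
    volatile float f = (float)n;
    return (int32_t)f == n;
}
```

Feeding it 16777216 should report an exact round trip, while 16777217 should not, since it needs a 25th bit.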

The catch here is when you are dealing with values that cannot be written as a finite sum of powers of 2, as they have to be stored as the nearest value that can. 0.1f + 0.2f, worked out at higher precision, is not actually 0.3 but is 0.300000004470, as neither input is stored exactly. For this reason you should basically never, ever try to test floating point values for equality directly when they have had arithmetic done on them.
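A small sketch of the comparison advice above, plus one wrinkle: whether 0.1f + 0.2f matches 0.3f depends on where the rounding happens. Assuming IEEE 754 floats, rounding the sum back to 32 bits happens to land exactly on 0.3f, while keeping the sum at double precision gives the 0.300000004470 figure. The function names and the eps threshold are hypothetical, picked just for this example.

```c
/* Compare with a tolerance instead of ==. eps is a hypothetical,
   application-chosen threshold. */
static int nearly_equal(float a, float b, float eps)
{
    float d = a - b;
    if (d < 0.0f) d = -d;
    return d <= eps;
}

/* 0.1f + 0.2f with the sum rounded back to a 32 bit float. */
static int sum_equals_in_float(void)
{
    volatile float a = 0.1f, b = 0.2f, c = 0.3f;
    volatile float s = a + b;   /* the store forces rounding to 32 bits */
    return s == c;
}

/* Same inputs, but the addition is carried out at double precision. */
static int sum_equals_in_double(void)
{
    volatile float a = 0.1f, b = 0.2f, c = 0.3f;
    double s = (double)a + (double)b;
    return s == (double)c;
}
```

The two sum functions disagree even though the inputs are identical, which is exactly the kind of thing that turns into a desync when two machines keep intermediates at different widths.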

This gets extra fun when you realize that the floating point constants are converted to binary form by the compiler itself, and there can be slight variance. I have not yet encountered this in the wild myself, but I wouldn’t be surprised to find it!



And that is the first gotcha for dealing with desyncs.

Storage form is not the same as data loaded into the registers. The x87 FPU on x86 based systems uses a stack of registers for its operations, and when you load data from memory into one of these registers it will be handled at the current internal precision mode regardless of what it was originally. Normally this is harmless and actually improves accuracy at little to no storage cost, but it will inevitably become a problem as the type of internal storage is, effectively, undefined across platforms.

Fortunately for us, the fix here is relatively easy: x86 based processors can have a precision mode set that defines the precision of the values being processed. This can be done with the assembly FLDCW instruction, and Windows has the _controlfp/_control87 functions available. Always, always do this if you need FPU synchronization, and make sure it doesn’t get unset by anything. Running FINIT and FLDCW immediately before your logic process doesn’t hurt to make sure that everything is working correctly.
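As a rough sketch of what setting the precision mode can look like, the following pins the x87 precision control to 53 bits (double). It assumes either 32 bit MSVC, using _controlfp_s (the checked sibling of _controlfp), or a GCC/Clang style compiler on x86 with inline assembly around FNSTCW/FLDCW. The function name force_double_precision is my own, and on any other platform it just reports that nothing was done.

```c
#if defined(_MSC_VER) && defined(_M_IX86)
#include <float.h>

/* 32 bit MSVC: ask the runtime to set the precision control field. */
static int force_double_precision(void)
{
    unsigned int current = 0;
    return _controlfp_s(&current, _PC_53, _MCW_PC) == 0;
}

#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* GCC/Clang on x86: read the x87 control word, set the precision control
   bits (8 and 9) to 10b, which selects a 53 bit value, and load it back.
   Note that on x86-64, SSE arithmetic does not look at this setting. */
static int force_double_precision(void)
{
    unsigned short cw;
    __asm__ volatile ("fnstcw %0" : "=m" (cw));
    cw = (unsigned short)((cw & ~0x0300u) | 0x0200u);
    __asm__ volatile ("fldcw %0" : : "m" (cw));
    __asm__ volatile ("fnstcw %0" : "=m" (cw));   /* read back to verify */
    return (cw & 0x0300u) == 0x0200u;
}

#else

/* Unknown platform: nothing to set, report that nothing was changed. */
static int force_double_precision(void)
{
    return 0;
}

#endif
```

Calling this once at startup and again right before each logic step is cheap insurance against some library quietly changing the mode behind your back.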

Note that many higher level languages, such as C#, do not allow you to set the precision value as it is not system independent. In these cases, you absolutely cannot depend on floating point values to give the same answers on all systems.


The next problem up is that while the IEEE 754 floating point math standard defines a lot of the process, a fair number of standard library functions that get used practically every day are not fully defined. Often there’s a recommendation of what the result should be, but it doesn’t need to be strictly followed. Of particular note here are the transcendental functions, or basically anything related to trigonometry like calculating sine and cosine.

This means that you absolutely, positively cannot use math library trigonometry functions if you need consistency. Rounding errors across platforms are an inevitability, and no amount of extended precision can save you from a tiny rollover that cascades up the chain; consider the case of a value that sits right on the boundary of being rounded in a specific direction and is treated differently on separate processors. Without a strict definition by the standard, there is no telling what answer you will get.

The best answer here, as far as consistency goes, is just to roll your own variants of sin/cos, which will be slower but reliable.
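One possible shape for a home-rolled sine, offered as a sketch rather than a tuned implementation: it only uses add, subtract, multiply, divide and an integer cast, all of which have defined results, and approximates with a Taylor polynomial up to x^11. The name det_sin, the choice of polynomial, and the roughly 1e-7 accuracy are my own assumptions. It also assumes the compiler is not fusing the multiplies and adds behind your back.

```c
/* Deterministic sine sketch built only from basic arithmetic.
   Hypothetical helper; accuracy is roughly 1e-7, and the input must be
   small enough that x / (2*pi) fits in a long long. */
static double det_sin(double x)
{
    const double PI      = 3.14159265358979323846;
    const double HALF_PI = 1.57079632679489661923;
    const double TWO_PI  = 6.28318530717958647692;
    long long turns;
    double x2;

    /* Range reduction: subtract whole turns to land in [-pi, pi]. */
    turns = (long long)(x / TWO_PI + (x < 0.0 ? -0.5 : 0.5));
    x = x - (double)turns * TWO_PI;

    /* Fold into [-pi/2, pi/2] using sin(pi - x) = sin(x). */
    if (x > HALF_PI)
        x = PI - x;
    else if (x < -HALF_PI)
        x = -PI - x;

    /* Taylor series x - x^3/3! + x^5/5! - ... - x^11/11!, Horner form. */
    x2 = x * x;
    return x * (1.0 + x2 * (-1.0 / 6.0
             + x2 * (1.0 / 120.0
             + x2 * (-1.0 / 5040.0
             + x2 * (1.0 / 362880.0
             + x2 * (-1.0 / 39916800.0))))));
}
```

A cosine can be had the same way by calling det_sin(x + HALF_PI), at the cost of one more rounding step.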

Anything specifically related to rounding or, say, fabs has no problems. Surprisingly, the calculation of square roots has a defined rounding standard, so there are no issues with using it.


Modern processors and compilers are very smart and very efficient, but this also means they can outsmart an inattentive developer.

One particular optimization that’s a hassle is converting several simple operations into one complex instruction. For example, there’s the Multiply-Accumulate operation, which converts the mathematical construct a = b + (c * d) into a single instruction. As a fused version of this has only one rounding step, this is different from doing the computations separately. This is, effectively, really bad. It’s best to assume that you should disable any such operations.
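Here is a sketch of how one rounding versus two can change the answer. Rather than rely on a hardware FMA, the "fused" path below imitates the single rounding by doing the work in double, where the product of two floats is always exact. That is my own stand-in, and it only matches a real fused multiply-add when the addition is also exact in double, as it is for the values suggested afterwards. The function names are hypothetical.

```c
/* b + (c * d) with the product rounded to float before the add,
   i.e. two separate rounding steps. */
static float mul_then_add(float b, float c, float d)
{
    volatile float product = c * d;      /* first rounding */
    volatile float sum = product + b;    /* second rounding */
    return sum;
}

/* Stand-in for a fused multiply-add: the product is kept exact in a
   double, and only the final result is rounded to float. */
static float fused_style(float b, float c, float d)
{
    double exact = (double)c * (double)d + (double)b;
    return (float)exact;                 /* the only rounding */
}
```

With c = d = 1 + 1/4096 and b = -(1 + 1/2048), the separate version rounds the product down to 1 + 1/2048 and returns exactly zero, while the fused style version keeps the extra bit and returns 2^-24. One of those is going to fail an "is it zero" check on somebody's machine.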

Similar such ‘intelligent’ optimizations can exist, such as converting a*(b+1) into (a*b)+a, which consumes less code but produces different results. Going from a debug build to a release build can also make the optimizer be smarter than you think, so be careful.
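The two forms of that expression can be compared directly. This is a minimal sketch assuming IEEE 754 floats, with volatile used to pin every intermediate to 32 bits; the function names are mine.

```c
/* a * (b + 1), rounding after the add and after the multiply. */
static float factored(float a, float b)
{
    volatile float t = b + 1.0f;
    volatile float r = a * t;
    return r;
}

/* (a * b) + a, rounding after the multiply and after the add. */
static float expanded(float a, float b)
{
    volatile float t = a * b;
    volatile float r = t + a;
    return r;
}
```

With a = 3 and b = 16777216, the factored form loses the +1 entirely and gives 50331648, while the expanded form gives 50331652. Both are "correct" as far as the math on paper goes, which is why an optimizer that feels free to swap one for the other is a problem.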

VC++ has an /fp:strict math mode which disables these optimizations, and GCC is generally okay if -ffast-math is not enabled (adding -ffp-contract=off also rules out fused multiply-adds), though it is still worth keeping an eye on the generated assembly.


Once you iron these issues out of your own application, it’s a good time to realize that you need to iron them out entirely, even from stuff you rely on.

Embedded scripting languages, like Lua, frequently use floating point for their default number types. You need to control and limit the usage of these, if at all possible. And as mentioned before, C#, being a platform agnostic language by design, won’t let you do this directly. (I can’t speak for Java but I suspect it’s the same way.)

In the end, it may turn out to just be easiest to not bother with floating point for your logic at all, using only integer values or by using fixed point math. Simply multiplying all of your values by an integer constant and pretending it’s a decimal point is perfectly viable if you don’t need to multiply/divide them by each other in production. Though make sure you don’t escape the precision limit!
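For the "multiply by a constant and pretend it's a decimal point" approach, a minimal sketch might look like this, assuming three decimal places are enough. The type name fixed_t, the FX_SCALE of 1000, and the helper names are all hypothetical choices for illustration.

```c
#include <stdint.h>

/* A value is stored as (real value * FX_SCALE), so 1500 means 1.5. */
typedef int32_t fixed_t;
#define FX_SCALE 1000

static fixed_t fx_from_int(int32_t whole)
{
    return whole * FX_SCALE;
}

/* Addition and subtraction are plain integer operations. */
static fixed_t fx_add(fixed_t a, fixed_t b)
{
    return a + b;
}

/* Multiplying two scaled values scales twice, so divide once to undo it.
   The 64 bit intermediate keeps the product from overflowing; the
   division truncates toward zero, identically on every platform. */
static fixed_t fx_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) / FX_SCALE);
}
```

Here 0.1 + 0.2 is fx_add(100, 200), which is exactly 300 everywhere, no tolerance required. The range is the trade-off: with this scale an int32_t tops out a little past two million.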

Vaguely hoping this article saves someone the headaches it gave me, one way or another…
