
> Now, if a and b are already float, then it's equivalent.

Not necessarily! Floating-point contraction is allowed essentially within a statement but not across statements. By assigning the result of a * b to a variable, you prevent the multiplication from being contracted with the addition into an FMA.

In practice, every compiler has fast-math flags that say "stuff it" and allow all of these optimizations to occur across statements and even across inlining boundaries.

(Then there's also the issue of FLT_EVAL_METHOD, another area where what the standard says and what compilers actually do are fairly diametrically opposed.)




The first mention of contraction in the standard (I'm looking at the N3220 draft that I have handy) is:

> A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.

If you're making a language that generates C, it's probably a good idea to pin down which C compilers are supported, and control the options passed to them. Then you can more or less maintain the upper hand on issues like this.
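As a rough sketch of what "controlling it" can look like in the generated C itself: the pragma below is the one the standard text above refers to, while mul_add is just a hypothetical name for illustration. As far as I know, Clang honors the pragma, but GCC ignores it (possibly with a warning), so for GCC you would pass -ffp-contract=off on the command line instead.

```c
/* Ask the compiler not to contract floating expressions from here on.
   Assumption: the compiler implements this pragma (Clang does; GCC is
   known to ignore it, so use -ffp-contract=off there instead). */
#pragma STDC FP_CONTRACT OFF

/* Hypothetical helper: with contraction disabled, this is a rounded
   multiply followed by a rounded add, never a single fused operation. */
float mul_add(float a, float b, float c)
{
    return a * b + c;
}
```

Pinning the flag in the build is the more portable half of this; the pragma only helps on compilers that implement it.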


The magic in the wording here is that it is a floating expression that may be contracted. In C, you write things like:

    float a = /* expression */;
    float b = /* expression */;
    float c = a * b;

Contraction can only happen within the expression on the right-hand side of each statement, not across the separate statements for a and b. This means the C spec does not permit treating inlining, or even plain variable substitution, as equivalent to the written-out inlined version. I.e., you can't transform:

    float a = v1 * v2;
    float b = v3 * v4;
    float c = a + b;

into

    float c = (v1 * v2) + (v3 * v4);

since they aren't semantically equivalent, and compilers do follow these rules. The same applies to inline functions. It's one of the reasons why macros can sometimes perform better than force-inlined functions.

In practice, the big 3 compilers all perform floating-point contraction according to the spec, and this is a major issue if you want good GPGPU performance, because it extends to OpenCL and C-style languages in general (like CUDA).


It seems to me that either you want to allow contraction everywhere, or not at all. Allowing it only sometimes is the worst of both worlds.

If you allow contraction after inlining, whether or not an FMA gets formed becomes subject to the vicissitudes of inlining and other hard-to-predict compiler decisions. It turns out to be a much harder problem to solve than it appears at first glance.


