[Omp] Is this code wrong?
Phil Lucido
philiplu at microsoft.com
Wed May 11 11:58:41 PDT 2005
The problem isn't that your code is wrong so much as you're expecting
more from floating point math than is reasonable. Take a look at David
Goldberg's classic paper "What Every Computer Scientist Should Know
About Floating-Point Arithmetic"
(http://portal.acm.org/citation.cfm?id=103163) for the deep detail. But
more quickly, when dealing in FP math, you can't assume (x+y)+z ==
x+(y+z). Order of evaluation matters, because FP math is an
approximation. By parallelizing an FP calculation, you've changed the
order of evaluation in unpredictable ways, which means that round-off
errors will propagate differently. Since you're using floats with only
7 or so digits of precision to begin with, it's not surprising that you
only get 4-5 digits of agreement between runs.
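To make the grouping point concrete, here is a minimal sketch (not taken from your code) with three hand-picked floats. The helper names are hypothetical, and the volatile temporaries are only there on the assumption that the compiler might otherwise keep intermediates in a wider register.

```cpp
#include <cstdio>

// Evaluate (x + y) + z in single precision.
// Hypothetical helper, for illustration only.
float add_left_first(float x, float y, float z) {
    volatile float t = x + y;   // volatile: force rounding to float here
    return t + z;
}

// Evaluate x + (y + z) in single precision.
float add_right_first(float x, float y, float z) {
    volatile float t = y + z;   // the 1.0f is lost when added to -1.0e8f
    return x + t;
}

// Print both groupings for one hand-picked triple.
void show_grouping_difference() {
    const float x = 1.0e8f, y = -1.0e8f, z = 1.0f;
    std::printf("(x+y)+z = %g\n", add_left_first(x, y, z));
    std::printf("x+(y+z) = %g\n", add_right_first(x, y, z));
}
```

Near 1.0e8 adjacent floats are 8 apart, so y+z rounds back to -1.0e8f and the second grouping loses z entirely. Calling show_grouping_difference() prints the two different sums.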
There are ways to change your code that, at least in VC++ 2005, give
consistent answers, though that's a side-effect of the OpenMP
implementation. Try this instead:

    float result = 0.0f;
    #pragma omp parallel for reduction(+: result)
    for (int x=0; x<1000; x++)
    {
        for (int i=0; i<5000; i++)
        {
            result += cosf(i*0.1f) * sinf(i*1.0f) *
                      cosf(i*0.1f) * sinf(i*0.1f);
        }
    }
    LOG("Test1 - end - result=%f\n", result);

This has a number of changes from your original code:
* The barrier after the parallel region is unnecessary - the main thread
won't leave the parallel region until all the team threads finish
executing the region, and anyway a barrier in a serial region does
nothing.
* You might as well use a "parallel for" directive instead of splitting
it up, though that's more a cosmetic change, at least in VC++'s
implementation.
* It makes more sense for 'result' to be a reduction variable instead of
a shared one. That way, each team thread will operate on a local copy
of 'result', then perform a single cross-team reduction at the end.
That removes the need for the atomic operation each time through the
inner loop, speeding the code significantly.
Note that my version produces consistent results. That's because the
same order of evaluation is used on each run, thanks to implementation
details that aren't necessarily required by OpenMP:
* each run is using the same number of team threads
* the "for" directive defaults to a static schedule, so each team thread
will be assigned exactly the same iterations on each run
* the use of reduction instead of shared means that each team thread's
contribution to the final "result" is added only once, at the end of
the parallel region, and in a specific order (that's an implementation
detail), so you always get the same "result".
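If you'd rather not lean on those implementation details, the same ordering can be spelled out by hand. The following is a rough sketch, not what VC++ does internally: the function name sum_fixed_order and the "chunks" parameter are made up for illustration, and it assumes 1 <= chunks <= 1000.

```cpp
#include <cmath>
#include <vector>

// Hypothetical sketch: split the 1000 outer iterations into `chunks`
// contiguous blocks, accumulate each block into its own partial sum,
// then add the partial sums serially in block order.
// Assumes 1 <= chunks <= 1000.
float sum_fixed_order(int chunks) {
    const int outer = 1000;
    std::vector<float> partial(chunks, 0.0f);

    // Each block is computed by exactly one thread, so the order of
    // additions inside a block never changes from run to run.
    #pragma omp parallel for schedule(static)
    for (int c = 0; c < chunks; c++) {
        const int begin = outer * c / chunks;
        const int end = outer * (c + 1) / chunks;
        float local = 0.0f;
        for (int x = begin; x < end; x++) {
            for (int i = 0; i < 5000; i++) {
                local += std::cos(i * 0.1f) * std::sin(i * 1.0f) *
                         std::cos(i * 0.1f) * std::sin(i * 0.1f);
            }
        }
        partial[c] = local;
    }

    // Combine on one thread, always in the same order.
    float result = 0.0f;
    for (int c = 0; c < chunks; c++) {
        result += partial[c];
    }
    return result;
}
```

For a given "chunks" value this returns the same float every time, however many threads actually run the loop. Different "chunks" values can still give slightly different answers, since each one is a different order of evaluation.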
Final note - even though you get consistent results in my version, you
still get a different result from running this with OpenMP disabled.
Again, that's because parallelizing the code changes the order of
evaluation. Consistent is not the same as "correct", whatever that
means.
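A stripped-down illustration of that last point, using a simpler sum than yours (adding the same float many times). The function names are hypothetical, and the sketch assumes "count" is a multiple of "chunks".

```cpp
#include <cstdio>

// Add `value` to a float accumulator `count` times in one long chain,
// the way a serial loop would.
float sum_one_chain(float value, int count) {
    float acc = 0.0f;
    for (int i = 0; i < count; i++) {
        acc += value;
    }
    return acc;
}

// The same additions split into `chunks` equal partial sums that are
// combined in chunk order, roughly what a static schedule plus a
// reduction does. Assumes count is a multiple of chunks.
float sum_in_chunks(float value, int count, int chunks) {
    float total = 0.0f;
    for (int c = 0; c < chunks; c++) {
        total += sum_one_chain(value, count / chunks);
    }
    return total;
}

// Reference value computed in double precision.
double sum_reference(float value, int count) {
    return static_cast<double>(value) * count;
}

// Print all three for a million additions of 0.1f.
void show_consistent_vs_correct() {
    const int n = 1000000;
    std::printf("one chain: %f\n", sum_one_chain(0.1f, n));
    std::printf("4 chunks:  %f\n", sum_in_chunks(0.1f, n, 4));
    std::printf("reference: %f\n", sum_reference(0.1f, n));
}
```

Each function is perfectly repeatable, yet the one-chain and four-chunk sums disagree with each other, and both are off from the double-precision reference.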
...Phil
Visual C++
> -----Original Message-----
> From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org]
> On Behalf Of John van der Burg
> Sent: Wednesday, May 11, 2005 10:24 AM
> To: omp at openmp.org
> Subject: [Omp] Is this code wrong?
>
>
> Hi,
>
> I am new to OpenMP and am experimenting a bit using the Visual
> Studio .NET 2005 beta 2 compiler. First of all I think it is really
> a great API!
>
> I made the following piece of test code while playing around
> with OpenMP. I call this code a couple of times in a row (it
> is inside a function and I call this function 4 times in a
> row). The code is totally useless and is just a test.
>
> -------------------------------------
> float result = 0.0f;
> #pragma omp parallel shared(result)
> {
>     #pragma omp for
>     for (int x=0; x<1000; x++)
>     {
>         for (int i=0; i<5000; i++)
>         {
>             #pragma omp atomic
>             result += cosf(i*0.1f) *
>                 sinf(i*1.0f) * cosf(i*0.1f) * sinf(i*0.1f);
>         }
>     }
> }
>
> #pragma omp barrier
>
> LOG("Test1 - end - result=%f", result);
> -------------------------------------
>
>
>
> However, the problem I encounter is that the value of the variable
> "result" is not always exactly the same, as you can see in the
> following output:
>
> Test1 - end - result=-91.146599
> Test1 - end - result=-91.146812
> Test1 - end - result=-91.147438
> Test1 - end - result=-91.146141
>
> It is not far off, but I expected the same values. What am I
> doing wrong? :)
>
> Kind regards,
> - John