OpenMP and floating-point operations

General OpenMP discussion

Re: OpenMP and floating-point operations

Postby ftinetti » Mon Feb 11, 2013 7:08 am

Hi Heinz,

I checked the code and I didn't find a bug, an uninitialized variable, or anything else. No reductions are present. The problem really seems to be caused by the way the floating-point operations are done.

Do you have a (reduced yet complete) example to play around with (compile, run)? It's good you examined the code, but I think it's unlikely the compiler changes the kind of floating-point operations... Maybe something along the lines of what Mark suggests is happening, but in any case, the problem seems to be ill-conditioned (or something similar), in which case parallel computing has exposed it, hopefully in the early stages...

HTH,

Fernando.
ftinetti
 
Posts: 575
Joined: Wed Feb 10, 2010 2:44 pm

Re: OpenMP and floating-point operations

Postby HeinzM » Mon Feb 11, 2013 9:46 am

Hi Mark,

I didn't know that 80-bit precision is used internally. Maybe that is the problem. I do CFD computations, and using KIND=8 variables has worked well for many, many years. This is the first time such a problem has arisen. I also asked Intel premium support. They have been thinking about it for over a week now ;-)

Thanks a lot!
Best regards,
Heinz
HeinzM
 
Posts: 17
Joined: Sat Nov 03, 2012 7:22 am

Re: OpenMP and floating-point operations

Postby HeinzM » Mon Feb 11, 2013 10:02 am

Hi Fernando

ftinetti wrote:Do you have a (reduced yet complete) example to play around with (compile, run)?

It is a finite-element code for computing water flow over floodplains. The problem occurs while assembling the element matrices. A few hundred of them show the problem, but there are hundreds of thousands. So reducing the code will likely not reproduce the problem. I don't believe you want to walk through the whole code, right? :)


ftinetti wrote:It's good you examined the code, but I think it's unlikely the compiler changes the kind of floating-point operations... Maybe something along the lines of what Mark suggests is happening, but in any case, the problem seems to be ill-conditioned (or something similar), in which case parallel computing has exposed it, hopefully in the early stages...

If you are right, I have a hidden problem and no idea how to find it, not even where to look ...

Thank you and best regards,
Heinz
HeinzM
 
Posts: 17
Joined: Sat Nov 03, 2012 7:22 am

Re: OpenMP and floating-point operations

Postby ftinetti » Mon Feb 11, 2013 11:25 am

Hi again,

HeinzM wrote:A few hundred of them show the problem, but there are hundreds of thousands. So reducing the code will likely not reproduce the problem. I don't believe you want to walk through the whole code, right?

Hmmm... I don't know, feel free to send the code, I would like to see it, anyway. I do not guarantee any result, of course...

ftinetti wrote:
It's good you examined the code, but I think it's unlikely the compiler changes the kind of floating-point operations... Maybe something along the lines of what Mark suggests is happening, but in any case, the problem seems to be ill-conditioned (or something similar), in which case parallel computing has exposed it, hopefully in the early stages...

HeinzM wrote:
If you are right

It's a big "if" ...

HeinzM wrote:
I have a hidden problem and no idea how to find it, not even where to look ...

You are right, and it is bug-like behavior.

Fernando.
ftinetti
 
Posts: 575
Joined: Wed Feb 10, 2010 2:44 pm

Re: OpenMP and floating-point operations

Postby Oldboy » Thu Feb 14, 2013 3:41 am

Indeed, the errors Heinz has found are not small. Here are examples of smaller errors:

http://people.sc.fsu.edu/~jburkardt/f_s ... output.txt
http://people.sc.fsu.edu/~jburkardt/f_s ... output.txt
http://people.sc.fsu.edu/~jburkardt/f_s ... output.txt
http://en.wikipedia.org/wiki/IEEE-754
Observe that gfortran and Intel Fortran don't show the same errors when using OpenMP.
The serial version is gfortran only.
IEEE 754 specifies five rounding rules. I have not seen any compiler directives for selecting one of them.
Oldboy
 
Posts: 17
Joined: Wed Oct 31, 2012 2:39 am

Re: OpenMP and floating-point operations

Postby zerothi » Fri Jun 07, 2013 12:49 pm

HeinzM wrote:Hello,

I found out what the problem could be.

Here is an example piece of code:

Real (KIND=8) :: A, VEX, VEY, R1, R2, R3, R4

A=0.002
VEX=0.7071063969767235057872767 ! result of another computation
VEY=0.707107165396162584691808

R1=A * VEX * VEX
R2=A * VEY * VEX
R3=A * VEX * VEY
R4=A * VEY * VEY

write(6,'(6F30.25)') VEX,VEY,R1,R2,R3,R4


This results in:
R1=0.0009999989607882072296247
R2=0.0010000000474968604152054
R3=0.0010000000474968606320458
R4=0.0010000011342066949474733


R2 and R3 should be identical (commutative law, VEX * VEY = VEY * VEX).
They are identical when compiling without OpenMP,
but they are not when compiling with OpenMP, even without any OMP directive.

The differences disturb the solution of the sparse matrix.
The solutions with and without OpenMP differ considerably.



But this is not always the case. When using the following numbers, for example,
everything is OK:

A=0.002
VEX=0.7071159333124231727296660
VEY=0.7070976289422138405527107

results in:

R1=0.0010000259337872793603125
R2=0.0010000000471624013922284
R3=0.0010000000471624013922284
R4=0.0009999741612076232504663

And the effect is not there when setting A=1.0


So there must be a difference in how the floating-point operations are handled.

I know that KIND=8 variables are only precise to about 15 digits.
But the digits far behind the decimal point have an effect on the solution.


Do you have an idea what to do?


Thanks and best regards,
Heinz

When dealing with double-precision kinds, you really need to specify that your literal constants are double precision.
This is absolutely CRUCIAL!

Your code should look like this:
Code: Select all
A=0.002_8
VEX=0.7071063969767235057872767_8   ! result of another computation
VEY=0.707107165396162584691808_8

or even better:
Code: Select all
integer, parameter :: dp = selected_real_kind(p=10)
A = 0.002_dp
VEX = 0.7071063969767235057872767_dp
VEY = 0.707107165396162584691808_dp

This is a huge problem in your code if you expect double precision but your problem is defined purely by single-precision constants.

When specifying
Code: Select all
A = 0.002

the compiler stores the constant in single precision and only then converts it to double precision (thus discarding any knowledge of the trailing digits). This will not even make A equal to 0.002; the trailing digits are effectively garbage (not strictly random, but...).
On my machine if I do:

Code: Select all
write(6,'(1F30.15)') A
write(6,'(6F30.15)') VEX,VEY,R1,R2,R3,R4


I get:
Code: Select all
             0.002000000094995
             0.707106411457062             0.707107186317444             0.000999999001745             0.001000000097562             0.001000000097562             0.001000001193381
             0.002000000000000
             0.707106396976724             0.707107165396163             0.000999998913291             0.000999999999999             0.000999999999999             0.001000001086709

where the last two lines are with my correction.

I don't know how you compile your code, but one can force the compiler to treat default reals as double precision (via a promotion flag), which removes the need for _dp; however, I believe this is bad practice.

I am sad to see this in so many places on this forum; I have only just joined, and the first four pieces of code I looked at had this problem!

Thus, I would not call this bug-like behavior: any discrepancies become more visible when using OpenMP, since increasing the number of threads changes how the work is partitioned and the order in which partial results are combined, which leads to a higher dependency on initial conditions (if we are dealing with self-consistent iterations of some sort?).

Regards
zerothi
 
Posts: 3
Joined: Fri Jun 07, 2013 12:25 pm
