OpenMP and floating-point operations


Postby HeinzM » Wed Jan 23, 2013 2:06 am

Hello,

I have a problem with numerical accuracy, I think. I use Fortran.

I solve a sparse matrix with a direct method (Cholesky). All storage is done in a large one-dimensional array placed in a common block. The problem (CFD) is non-linear, so after solving the matrix the coefficients have to be updated and the matrix has to be reassembled and solved again and again. When this is done about 100 times on not-too-large matrices, everything is OK. But on large matrices with several hundred iterations, the results are not identical to those of a computation without OpenMP.

Without OpenMP the code works: running it several times, the results are identical each time. But when I compile it with OpenMP, the results are not identical; they differ between runs. This is even the case when I remove all !$OMP directives or compute with one thread only, so I'm sure I have no race conditions.

I use the latest Intel-Fortran-compiler.

I guess numerical precision is handled differently when using OpenMP and rounding errors occur, or something else.

Do you have an idea or some advice for me?

Thank you and best regards,
Heinz
HeinzM
 
Posts: 15
Joined: Sat Nov 03, 2012 7:22 am

Re: OpenMP and floating-point operations

Postby MarkB » Wed Jan 23, 2013 4:37 am

Hi Heinz,

If I understand correctly, you are seeing different numerical results from the sequential code (i.e. no OpenMP directives) just depending on whether you compile with -openmp or not. Am I right?

In general this might happen because the OpenMP compiler flag changes the optimisation level, but I don't think this is the case for the Intel compiler, so we need to look for another explanation!

Mark.
MarkB
 
Posts: 408
Joined: Thu Jan 08, 2009 10:12 am

Re: OpenMP and floating-point operations

Postby HeinzM » Wed Jan 23, 2013 4:52 am

Hi Mark,

Yes, you are right. And the differences depend on how large the matrix is, i.e. how much computational load there is. For example: a sparse matrix with about 150000 unknowns needs about 300 iterations when compiled without OpenMP, and about 700 iterations when compiled with OpenMP but without any !$OMP directive.

My first thought was that numerical precision causes the problem.

Re: OpenMP and floating-point operations

Postby MarkB » Wed Jan 23, 2013 6:13 am

It is unclear to me why compiling with -openmp should cause a change in numerical behaviour, since without any directives, it doesn't do much!
Compiling with -openmp does however have the effect of forcing all automatic variables to be stack allocated. This can expose bugs (e.g. variables which ought to have the SAVE attribute, but don't) or cause stack overflows: it would be worth eliminating these possibilities.
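A minimal sketch of the bug class Mark describes, written in C for illustration (the thread's code is Fortran): a Fortran local without the SAVE attribute is like a C automatic variable, and compiling with -openmp forces such locals onto the stack, so code that silently relied on them keeping their value between calls breaks. The function name here is hypothetical.

```c
#include <assert.h>

/* Correct version: 'total' has static storage, so it keeps its value
   between calls -- the C analogue of  REAL(KIND=8), SAVE :: total.
   If 'static' were dropped (a plain automatic variable), the value on
   re-entry would be indeterminate; that is exactly the latent bug that
   stack allocation under -openmp can expose. */
static double accumulate_saved(double x) {
    static double total = 0.0;
    total += x;
    return total;
}
```

Each call adds into the persistent total; without `static` (or SAVE in Fortran) the second call would read an indeterminate value.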

Re: OpenMP and floating-point operations

Postby HeinzM » Wed Jan 23, 2013 10:49 am

Hi Mark,

I understand. I will try to check ......

Many thanks and regards,
Heinz

Re: OpenMP and floating-point operations

Postby Oldboy » Wed Jan 23, 2013 12:32 pm

Have you tested anything with better than double precision?

http://people.sc.fsu.edu/~jburkardt/f_s ... dmath.html

80- or 128-bit reals should be possible with a 32-bit compiler, and 128-bit with a 64-bit compiler.
Oldboy
 
Posts: 17
Joined: Wed Oct 31, 2012 2:39 am

Re: OpenMP and floating-point operations

Postby ftinetti » Wed Jan 23, 2013 1:47 pm

Hi,

80 or 128 bit should be possible with 32-bit compiler and 128 bit with 64-bit compiler

I would not suggest going this way. I think Mark is right: there may be some bug(s) that the OpenMP flag exposes.

HTH,

Fernando.
ftinetti
 
Posts: 558
Joined: Wed Feb 10, 2010 2:44 pm

Re: OpenMP and floating-point operations

Postby HeinzM » Sat Feb 02, 2013 11:38 am

Hello,

I found out what the problem could be.

Here is an example piece of code:

Real (KIND=8) :: A, VEX, VEY
Real (KIND=8) :: R1, R2, R3, R4  ! declaration needed: implicit typing would make these single precision

! the _8 kind suffix keeps the extra digits; a default-real literal would be truncated to single precision
A = 0.002_8
VEX = 0.7071063969767235057872767_8  ! result of another computation
VEY = 0.707107165396162584691808_8

R1 = A * VEX * VEX
R2 = A * VEY * VEX
R3 = A * VEX * VEY
R4 = A * VEY * VEY

write(6,'(6F30.25)') VEX, VEY, R1, R2, R3, R4


This results in:
R1=0.0009999989607882072296247
R2=0.0010000000474968604152054
R3=0.0010000000474968606320458
R4=0.0010000011342066949474733


R2 and R3 should be identical (commutative law: VEX * VEY = VEY * VEX). Note, though, that with left-to-right evaluation R2 is computed as (A*VEY)*VEX and R3 as (A*VEX)*VEY, so the two expressions actually differ in association.
They are identical when compiling without OpenMP,
but they are not when compiling with OpenMP, even without any OMP directive.

The differences disturb the solution of the sparse matrix:
the solutions with and without OpenMP differ considerably.
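A side note on the R2/R3 comparison: IEEE-754 multiplication is exactly commutative, but floating-point arithmetic is not associative, so expressions that differ only in grouping can legitimately differ in the last bit. The following sketch (in C for illustration; KIND=8 reals are the same IEEE doubles) demonstrates this with the classic addition example:

```c
/* Left-to-right evaluation, as both Fortran and C do for a op b op c. */
static double ltr3(double a, double b, double c) {
    return (a + b) + c;
}

/* Commutativity is exact in IEEE-754: x*y and y*x round identically. */
static int commutes(double x, double y) {
    return x * y == y * x;
}
```

(0.1+0.2)+0.3 and (0.3+0.2)+0.1 differ in the last bit even though real arithmetic says they are equal; reassociation or re-rounding of intermediates under different compiler settings causes exactly this kind of drift.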



But this is not always the case. When using the following numbers, for example,
everything is OK:

A=0.002
VEX=0.7071159333124231727296660
VEY=0.7070976289422138405527107

results in:

R1=0.0010000259337872793603125
R2=0.0010000000471624013922284
R3=0.0010000000471624013922284
R4=0.0009999741612076232504663

And the effect is not there when setting A = 1.0.

So there must be a difference in how floating-point operations are handled.

I know that KIND=8 variables are only precise to about 15 digits,
but the digits far behind the decimal point still have an effect on the solution.


Do you have an idea what to do?


Thanks and best regards,
Heinz

Re: OpenMP and floating-point operations

Postby HeinzM » Sat Feb 09, 2013 1:54 am

Hello all,

I checked the code and I didn't find a bug, an uninitialized variable, or anything else. No reductions are present. The problem really seems to be caused by the way the floating-point operations are done.

I changed the numerical formulation of my differential equation so that I now do fewer floating-point operations. This slows the convergence, but it was the only way to get the program to produce correct results when using OpenMP.

This seems to be a peril: will I always have to watch out for instability when using OpenMP? I don't believe I'm the only one who has noticed such problems.

Regards
Heinz

Re: OpenMP and floating-point operations

Postby MarkB » Mon Feb 11, 2013 4:15 am

Hi Heinz,

Congratulations on your detective work!
My guess is that compiling with the OpenMP flag means that some variables are getting stored back to memory (and hence rounded to 64 bits) instead of being kept in the extended-precision 80-bit x87 registers. I'm not exactly sure why this is happening, but in any case, writing code that relies on extended-precision arithmetic is probably not a good idea in general (and not very portable).

Best wishes,
Mark.
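If x87 extended precision is indeed the cause, the usual remedy is to ask the compiler for value-safe, consistent floating point rather than to restructure the numerics. A sketch of the relevant flags (names as documented for ifort and gfortran; `solver.f90` stands in for the actual source file):

```shell
# Intel Fortran: value-safe FP model, consistent with and without -openmp
ifort -openmp -fp-model precise solver.f90

# gfortran: do arithmetic in SSE registers (64-bit doubles throughout,
# no 80-bit x87 intermediates) ...
gfortran -fopenmp -mfpmath=sse -msse2 solver.f90

# ... or force every intermediate to be stored and re-rounded to its
# declared precision (slower, but reproducible on x87)
gfortran -fopenmp -ffloat-store solver.f90
```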
