Crash : race condition ?

General OpenMP discussion

Crash : race condition ?

Postby FJacq » Tue Jun 19, 2012 12:21 am

The following piece of Fortran code is called very often in // loops. Most of the time, the value ndat(imp) is greater than zero. When ndat(imp)
has been set to a positive value, it does not change anymore (this is also true for the associated vector idat(:,imp)).

To avoid to spend time in a critical region (the one which initializes ndat(imp) and idat(:,imp)), I have decided to put it within the test block
instead of outside. But I observe crashes from time to time and I don't understand why.

Code: Select all
     
      SUBROUTINE mdbhashinfo(cset,cprop
     O                      ,info)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      INCLUDE 'mdbhash.h'
      INTEGER kmp,imp
      CHARACTER*(*) cset,cprop
      INTEGER info(*),n
      VOLATILE ndat,idat

      CALL mdbhashval(cset,cprop,init,kmp)
      CALL mdbhashpos(kmp,imp)

      IF(ndat(imp).LE.0) THEN
C$OMP CRITICAL
        CALL mdbhashpos(kmp,imp)
        IF(ndat(imp).LE.0) THEN
           CALL mdbhashload(cset,cprop
     O                     ,n,idat(1,imp))
C$OMP      FLUSH(idat)       
           ndat(imp)=n
C$OMP      FLUSH(ndat)       
        ENDIF
C$OMP END CRITICAL
      ENDIF
         
      DO k=1,ndat(imp)
         info(k)=idat(k,imp)
      ENDDO
     
      END
F. Jacq
FJacq
 
Posts: 2
Joined: Mon Jun 18, 2012 1:11 am
Location: France

Re: Crash : race condition ?

Postby MarkB » Tue Jun 19, 2012 4:30 am

This is an example of a pattern called double checked locking ( http://en.wikipedia.org/wiki/Double-checked_locking ).
It is well known to be dangerous, and should in general be avoided! In particular, in your code there is nothing to prevent the new value of ndat being visible to other threads before values in idat (neither the flush of idat, nor the volatile declarations are sufficient to guarantee this). Furthermore, a thread which skips the block never flushes anything, so is not guaranteed to see updated values.

I strongly recommend that you don't try to fix your code to make the double checked locking correct: your chances of having a portable solution are not good. If the overhead of the critical section really is a problem, you will probably need to try find another way round this, e.g. by initialising idat and ndat prior to some earlier thread synchronisation point, or making each thread call this code once and storing the result in (thread)private data.
MarkB
 
Posts: 422
Joined: Thu Jan 08, 2009 10:12 am


Return to Using OpenMP

Who is online

Users browsing this forum: No registered users and 8 guests