A question about OMP3.0

Post by huizhanyi » Wed Jun 11, 2008 8:58 pm

Hi, in the new 3.0 spec there are two examples about the flush directive, A.2.2 and A.2.3, which the spec says are not correct.
I think both examples are valid OpenMP programs, and I see many people asking why they are wrong, but no one has been able to tell me.
Can anyone give a further explanation of why these cases are wrong?
huizhanyi
 
Posts: 4
Joined: Wed Jun 11, 2008 8:51 pm

Re: A question about OMP3.0

Post by GregBronevetsky » Fri Jun 20, 2008 12:01 pm

The thing to pay attention to when thinking about these examples is the following text from Section 1.4.2:
The flush operation provides a guarantee of consistency between a thread’s temporary
view and memory. Therefore, the flush operation can be used to guarantee that a value
written to a variable by one thread may be read by a second thread. To accomplish this,
the programmer must ensure that the second thread has not written to the variable since
its last flush of the variable, and that the following sequence of events happens in the
specified order:
1. The value is written to the variable by the first thread.
2. The variable is flushed by the first thread.
3. The variable is flushed by the second thread.
4. The value is read from the variable by the second thread.
What you have to remember is that the above sequence is the only way to ensure that OpenMP returns something reasonable in response to a read. More relaxed sequences may result in literally any return value. This is where examples 2.2 and 2.3 run into trouble.
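The four steps above can be sketched in C, assuming a compiler with OpenMP enabled (for example via -fopenmp). Rather than writing explicit flush directives, this sketch relies on the flush implied by the barrier at the end of a single construct, which also orders the writer's flush before each reader's flush. The function name publish_with_barrier and the variable names are illustrative assumptions, not taken from the spec.

```c
/* Returns the number of threads that failed to read 42 (expected: 0). */
int publish_with_barrier(void)
{
    int data = 0;
    int wrong = 0;

    #pragma omp parallel shared(data) reduction(+:wrong)
    {
        /* Step 1: exactly one thread writes the value. */
        #pragma omp single
        {
            data = 42;
        }
        /* Steps 2 and 3: the implicit barrier that ends the single
           construct implies a flush on every thread, and no thread
           leaves the barrier before the writer has reached it. */

        /* Step 4: every thread reads the value after its own flush. */
        if (data != 42) {
            wrong++;
        }
    }
    return wrong;
}
```

Because the barrier supplies the ordering between the two flushes, no thread needs to spin on a flag here, which is the part that examples 2.2 and 2.3 get wrong.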


Example 2.3:
The problem with this example can be seen in the following execution. Threads 1 and 2 execute their initial flushes before the atomic update on thread 0. Threads 1 and 2 then both execute the read of flag in the while loop. Since these reads are not ordered relative to thread 0's write by a write-flush-flush-read sequence, they can return any value. In this execution they return values large enough that both threads exit their loops immediately. At this point the printf calls on the two threads may execute in either order. The problem is that the spin-locks used in this algorithm rely on the ability to accurately read the value of flag, and in OpenMP spin-locks cannot reliably do that. The only thing that is possible is a binary test of whether the value of flag has changed. Anything else can cause unexpected results.

Code:
Thread 0                   Thread 1                   Thread 2
--------                   --------                   --------
                           flush(flag)                flush(flag)
atomic: flag++
flush(flag)
                           while(flag<1)              while(flag<2)
                           printf("...")              printf("...")
                           atomic: flag++
                           flush(flag)


Example 2.2:
The problem is highlighted by the execution below. Thread 0 executes the write to data, the first flush and the write to flag, but stalls for some reason before executing its final flush operation. Meanwhile, thread 1 executes its first flush and begins the while loop. In the first iteration it reads flag as >=1 because of the race with thread 0's write to flag, exits the loop and tries to read flag and data. Unfortunately, the execution has not established the proper write-flush-flush-read sequence for either variable: data is missing a flush on thread 1, and flag is missing both flushes. The problem here is that although this example overcomes the spin-lock problem in example 2.3, it does not place a flush between the end of the spin-lock and the subsequent code.

Thread 1 then executes another flush and tries to read data and flag again. This time, there is a guaranteed write-flush-flush-read ordering between thread 0's write to data and thread 1's read, meaning that the read is guaranteed to return 42. However, thread 1's read of flag is still undefined because of the missing flush of flag on thread 0. The problem here is that the successful exit from thread 1's spin-lock loop does not mean that thread 0 has actually finished flushing flag. As such, the value of flag remains undefined until additional synchronizations are performed using other flag variables that prove that thread 0 has in fact executed the flush(flag) that follows its write to flag.

Code:
Thread 0                   Thread 1
--------                   --------
                           flush(flag, data)
data=42
flush(flag, data)
flag=1    <----| Race
               |------->   while(flag < 1) [flag read as >=1 because of the race]

                           printf("...") reads of flag and data undefined because
                              of race

                           flush(flag, data)

                           read of data is defined because proper
                              write-flush-flush-read order has been established
                           read of flag still undefined because thread 0's
                              post-write flush has not yet executed
                           ...
flush(flag)
GregBronevetsky
 

Re: A question about OMP3.0

Post by Guest » Fri Jun 20, 2008 8:12 pm

Why must everything go through a write-flush-flush-read ordering? I do not think that is logical.
In example 2.3, threads 1 and 2 both execute the read of flag in the while loop, and they should get either '0' (the initial value) or '1'. Can you give a case that leads to a value larger than 1?
Guest
 

Re: A question about OMP3.0

Post by Chun Huang » Fri Jun 20, 2008 8:41 pm

Strictly speaking, Example 2.2f violates the write-flush-flush-read ordering, because the master (thread 0) does not assign an initial value to flag before the parallel region. In Example 2.2c, the master sets flag = 0 in the serial region, and then the second thread executes while (flag < 1) flush(flag). Of course, the second thread may flush flag at the same time as the master. But I do not think it is a 'bad' race: the second thread may read flag=0 at first, and then it keeps looping until the master has flushed flag and it reads flag=1.
Chun Huang
 

Re: A question about OMP3.0

Post by lfm » Sun Jun 22, 2008 10:33 am

If the writes and reads are not atomic (for example, if they are done with multiple smaller reads and writes) then it is possible to get a value other than 0 or 1. This is a hardware-dependent issue that is not addressed by OpenMP.
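To make the "multiple smaller reads and writes" case concrete, here is a single-threaded simulation in C of a torn read. It is only a sketch: the split into two 16-bit halves, the order in which the writer updates them, and the name simulate_torn_read are all assumptions for illustration, and it does not claim that any particular processor behaves this way.

```c
#include <stdint.h>

/* Simulates a 32-bit value that is stored as two 16-bit halves,
   with a reader sampling it between the writer's two stores. */
uint32_t simulate_torn_read(void)
{
    uint16_t low  = 0xFFFF;   /* old value is 0x0000FFFF (65535) */
    uint16_t high = 0x0000;

    /* The writer wants to store 0x00010000 (65536) and happens to
       update the high half first. */
    high = 0x0001;

    /* A reader that samples both halves at this moment sees a mix of
       the new high half and the old low half. */
    uint32_t observed = ((uint32_t)high << 16) | low;

    /* The writer finishes by updating the low half. */
    low = 0x0000;
    (void)low;

    return observed;   /* 0x0001FFFF: neither the old nor the new value */
}
```

The observed value, 0x0001FFFF, is larger than both the old value and the new one, which is the kind of result a spin loop testing flag against a threshold cannot tolerate.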
lfm
 
Posts: 135
Joined: Sun Oct 21, 2007 4:58 pm
Location: OpenMP ARB

Re: A question about OMP3.0

Post by yhz » Sun Jun 22, 2008 5:31 pm

If that is the case, how can we write synchronization code?
Does OpenMP provide any way to do it?
yhz
 

Re: A question about OMP3.0

Post by lfm » Mon Jun 23, 2008 7:58 pm

Well, you can always do it with locks; that is certainly the safest. Greg's posting implies another way (relying only on the fact that the location has changed, not that it has any particular value) but I haven't tried to program it. Also, the standard requires that each implementation document the conditions under which reads are atomic (p. 14 lines 4-8) so for any given implementation it should be possible to write a fairly "standard" program that works correctly, but that same program isn't guaranteed to work across all implementations.
As a practical matter, I suspect that aligned 32-bit flag values will work properly in most implementations for general purpose processors (e.g., x86, powerpc, sparc) but again you would have to consult the documentation for the specific implementation.
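A minimal sketch of the lock-style approach in C, assuming a compiler with OpenMP enabled. It uses a named critical construct instead of the omp_set_lock and omp_unset_lock routines, purely to keep the example free of runtime-library calls; that substitution, and the names handoff_with_critical and handoff_lock, are assumptions for illustration. Every access to data and ready happens inside the same critical region, so there is no unsynchronized read of the flag.

```c
/* Returns the number of threads that received a value other than 42
   (expected: 0). */
int handoff_with_critical(void)
{
    int data = 0;
    int ready = 0;
    int mismatches = 0;

    #pragma omp parallel shared(data, ready) reduction(+:mismatches)
    {
        int seen = 0;
        int value = 0;

        /* One thread acts as the producer; nowait lets the other
           threads go straight on to polling instead of waiting at
           the barrier that single would otherwise imply. */
        #pragma omp single nowait
        {
            #pragma omp critical(handoff_lock)
            {
                data = 42;
                ready = 1;
            }
        }

        /* Every thread, including the producer, polls under the same
           critical name. Entry to and exit from the critical region
           imply a flush, so ready and data are always read
           consistently. */
        while (!seen) {
            #pragma omp critical(handoff_lock)
            {
                if (ready) {
                    value = data;
                    seen = 1;
                }
            }
        }

        if (value != 42) {
            mismatches++;
        }
    }
    return mismatches;
}
```

The design choice here is to pay for mutual exclusion on every poll in exchange for never reading flag outside a synchronized region, so the code does not depend on an implementation's atomicity guarantees.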
lfm
 

Re: A question about OMP3.0

Post by hzy » Tue Aug 19, 2008 4:50 pm

For Example 2.3, the update to flag in thread 0 is atomic, it is impossible for the other threads to read other value.
To lfm
hzy
 

Re: A question about OMP3.0

Post by lfm » Thu Aug 28, 2008 12:09 pm

hzy wrote:
For Example 2.3, the update to flag in thread 0 is atomic, it is impossible for the other threads to read other value.
To lfm

If you mean "If the update to flag in thread 0 is atomic, it is impossible for the other threads to read other value" then yes, that's correct. OpenMP by itself does not guarantee that the update or the reads of flag are atomic. The implementation may provide such a guarantee.
lfm
 

Re: A question about OMP3.0

Post by samking » Wed Mar 11, 2009 1:30 pm

Gentlemen:

Are there any plans to enable OpenMP to control Nvidia's new multiprocessor Tesla boards? Nvidia currently offers its CUDA programming language but, from what I understand, it is a very low-level programming language. The new Tesla GPU boards offer several hundred GPUs, with four or more boards per computer. That gives serious multiprocessor capability right now, and it would be so nice if we could program them through OpenMP or an extension of OpenMP, rather than learn a whole new language. We are currently programming a four-processor CPU with OpenMP and have a new computer with Nvidia Tesla GPU boards on the way.

Thank you,
Sam King
samking
 
Posts: 1
Joined: Wed Mar 11, 2009 12:54 pm

