[Omp] schedule(static)
Dieter an Mey
anmey at rz.rwth-aachen.de
Fri Jan 21 09:39:41 PST 2005
Hi,
Before the feedback phase for version 2.5 of the OpenMP specification
ends on Jan. 31, I'd like to start (or rather continue) a discussion
about schedule(static).
During EWOMP2004 Mark Bull pointed out that, according to the OpenMP
specification, the following toy program contains a data race:
  program example
    integer, parameter :: n=11
    real, dimension(n) :: a, b
    !$omp parallel num_threads(2)
    !$omp do schedule(static)
    do i = 1, n
      a(i) = 1.0
    end do
    !$omp end do nowait
    !$omp do schedule(static)
    do i = 1, n
      b(i) = a(i) + 1.0
    end do
    !$omp end do
    !$omp end parallel
    write(*,*) b
  end program example
The problem is the nowait clause. The draft V2.5 specs say on page 34,
lines 11-18:
"When schedule(static, chunk_size) is specified, iterations are divided
into chunks of size chunk_size, and the chunks are statically assigned
to threads in the team in a round-robin fashion in the order of the
thread number.
Note that the last chunk to be assigned may have a smaller number of
iterations.
When no chunk_size is specified, the iteration space is divided into
chunks which are approximately equal in size, and such that each thread
is assigned at most one chunk."
The interpretation is that if the chunk_size is specified, the work
distribution is unambiguously defined and the nowait clause is safe.
In the above example, using
  !$omp do schedule(static,6)
on both loops would fix the problem.
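The round-robin rule quoted above can be sketched as a small simulation.
This is only an illustration of the quoted wording, not code from any
OpenMP runtime; the function name static_chunked_owner is hypothetical,
and iterations are numbered from 1 as in the Fortran example.

```python
def static_chunked_owner(n, num_threads, chunk_size):
    """Map each 1-based iteration to a thread for schedule(static, chunk_size).

    Chunks of chunk_size consecutive iterations are dealt out round-robin
    in thread-number order, following the quoted draft text.
    """
    owner = {}
    for i in range(1, n + 1):
        chunk_index = (i - 1) // chunk_size      # which chunk iteration i falls in
        owner[i] = chunk_index % num_threads     # round-robin over the team
    return owner


# Values from the toy program: n = 11, two threads, chunk_size = 6.
owner = static_chunked_owner(n=11, num_threads=2, chunk_size=6)
for thread in range(2):
    iterations = [i for i, t in owner.items() if t == thread]
    print(f"thread {thread}: iterations {iterations[0]}..{iterations[-1]}")
```

Because the mapping depends only on n, the number of threads and
chunk_size, both loops get the same distribution (thread 0: 1..6,
thread 1: 7..11), so each thread reads only the elements of a that it
wrote itself.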
But if the chunk_size is not specified, it might happen that
thread 0 has to execute
  a(1) = 1.0
  ...
  a(6) = 1.0 ! <=======
  b(1) = a(1) + 1.0
  ...
  b(5) = a(5) + 1.0
while thread 1 gets
  a(7) = 1.0
  ...
  a(11) = 1.0
  b(6) = a(6) + 1.0 ! <=======
  ...
  b(11) = a(11) + 1.0
If thread 1 is quicker than thread 0, it may read a(6) before thread 0
has written it and so use the old value, which is obviously not what is
intended.
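The scenario can be checked with a short simulation. This is a sketch
under the assumption that an implementation splits the first loop 6+5
and the second loop 5+6, as in the listing above; the helper name owners
is hypothetical, and no claim is made that any particular compiler
actually behaves this way.

```python
def owners(chunk_sizes):
    """Expand per-thread contiguous chunk sizes into an iteration -> thread map."""
    owner = {}
    i = 1  # iterations are numbered from 1, as in the Fortran loops
    for thread, size in enumerate(chunk_sizes):
        for _ in range(size):
            owner[i] = thread
            i += 1
    return owner


# Assumed splits of n = 11 iterations over two threads (both are
# "approximately equal" with at most one chunk per thread).
writers = owners([6, 5])   # first loop: who writes a(i)
readers = owners([5, 6])   # second loop: who reads a(i)

# An iteration is at risk when a(i) is read by a thread that did not
# write it, because nowait removes the barrier between the loops.
racy = [i for i in writers if writers[i] != readers[i]]
print(racy)  # [6]
```

Only iteration 6 changes owner between the two loops, which is exactly
the element marked with arrows above.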
I suggest that the specs be modified so that, for a fixed number of
threads and a fixed number of iterations, an identical schedule is
used even if the chunk_size is not specified explicitly.
I think that this difference between a static schedule with and without
an explicitly specified chunk_size is very misleading and dangerous.
One argument in favour of the current definition is the flexibility it
gives the compiler to determine the schedule depending on the locality
of the data used in the parallel loop. I think that for this kind of
affinity scheduling a schedule kind with a name other than static
should be added to the OpenMP specs in a future release.
In fact, I have been using the above construct heavily without any
problems so far on many OpenMP compilers. Also Assure, the famous OpenMP
verification tool, does not complain.
I would be curious to hear your opinion.
Dieter
--
--------------------------------------------------------------------
Dieter an Mey
High Performance Computing Hochleistungsrechnen
RWTH Aachen University Rechen- und Kommunikations-
Center for Computing and Communication zentrum der RWTH Aachen
phone: ++49-(0)241-80-24377 Seffenter Weg 23
fax: ++49-(0)241-80-22134 52074 Aachen, Germany
email: anmey at rz.rwth-aachen.de
--------------------------------------------------------------------