[Omp] schedule(static)

Dieter an Mey anmey at rz.rwth-aachen.de
Fri Jan 21 09:39:41 PST 2005


Hi,

Before the feedback phase for version 2.5 of the OpenMP specification 
ends on Jan. 31, I'd like to start (or continue) a discussion 
about schedule(static).

During EWOMP2004 Mark Bull pointed out that according to the OpenMP 
specifications the following toy program would contain a data race:

program example
         integer, parameter :: n=11
         real, dimension(n) :: a, b

!$omp parallel num_threads(2)
!$omp do schedule(static)
         do i = 1, n
                 a(i) = 1.0
         end do
!$omp end do nowait

!$omp do schedule(static)
         do i = 1, n
                 b(i) = a(i) + 1.0
         end do
!$omp end do
!$omp end parallel

         write(*,*) b
end program example

The problem is the nowait clause. The draft V2.5 specs say on page 34, 
lines 11-18:

"When schedule(static, chunk_size) is specified, iterations are divided
into chunks of size chunk_size, and the chunks are statically assigned 
to threads in the team in a round-robin fashion in the order of the 
thread number.
Note that the last chunk to be assigned may have a smaller number of
iterations.
When no chunk_size is specified, the iteration space is divided into 
chunks which are approximately equal in size, and such that each thread 
is assigned at most one chunk."
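
The round-robin rule quoted above for an explicit chunk_size can be written down as a small function. This is only a sketch of my reading of that paragraph, not code from any OpenMP runtime; the module name static_chunk_sketch and the function name chunked_owner are made up for illustration, and iterations are assumed to be numbered from 1 as in the example loop.

```fortran
module static_chunk_sketch
   implicit none
contains
   ! Thread number (0-based) that executes iteration i (1-based) when
   ! chunks of chunk_size iterations are assigned round-robin in the
   ! order of the thread number.
   pure integer function chunked_owner(i, chunk_size, nthreads) result(tid)
      integer, intent(in) :: i, chunk_size, nthreads
      ! (i - 1) / chunk_size is the 0-based index of the chunk holding i;
      ! integer division truncates, so the last, shorter chunk is covered too.
      tid = mod((i - 1) / chunk_size, nthreads)
   end function chunked_owner
end module static_chunk_sketch
```

With n = 11, chunk_size = 6 and two threads, chunked_owner gives thread 0 for i = 1..6 and thread 1 for i = 7..11. The result depends only on i, chunk_size and the number of threads, so two loops with the same bounds and the same chunk_size get the same distribution.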

The interpretation is that if the chunk_size is specified, the work 
distribution is unambiguously defined and the nowait clause is safe.

In the above example, specifying the same chunk size on both loops, e.g.
	!$omp do schedule(static,6)
would fix the problem.
But without a specified chunk_size it might happen that

Thread 0 would have to execute
	a(1) = 1.0
	...
	a(6) = 1.0    ! <=======
	b(1) = a(1) + 1.0
	...
	b(5) = a(5) + 1.0

while thread 1 would get

	a(7) = 1.0
	...
	a(11) = 1.0
	b(6) = a(6) + 1.0    ! <=======
	...
	b(11) = a(11) + 1.0

If thread 1 is quicker than thread 0, it would use the old value of 
a(6), which is obviously not what is intended.
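
The scenario above can be checked mechanically. The sketch below assumes two threads and that an implementation is free to split the 11 iterations as 6+5 in one loop and 5+6 in the other; the quoted text only requires chunks that are "approximately equal in size", so this is one possible reading, not a statement about any particular compiler. The names unchunked_race_sketch, split_owner and first_mismatch are invented for this illustration.

```fortran
module unchunked_race_sketch
   implicit none
contains
   ! Owner (0 or 1) of iteration i when thread 0 gets the first
   ! first_len iterations and thread 1 gets the remaining ones.
   pure integer function split_owner(i, first_len) result(tid)
      integer, intent(in) :: i, first_len
      tid = merge(0, 1, i <= first_len)
   end function split_owner

   ! First iteration whose owner differs between two loops over 1..n,
   ! or 0 if both loops use the same distribution.
   pure integer function first_mismatch(n, first_len_a, first_len_b) result(idx)
      integer, intent(in) :: n, first_len_a, first_len_b
      integer :: i
      idx = 0
      do i = 1, n
         if (split_owner(i, first_len_a) /= split_owner(i, first_len_b)) then
            idx = i
            return
         end if
      end do
   end function first_mismatch
end module unchunked_race_sketch
```

For the example, first_mismatch(11, 6, 5) returns 6: a(6) is written by thread 0 in the first loop but read by thread 1 in the second, and with nowait there is no barrier between the two. With identical splits, e.g. first_mismatch(11, 6, 6), the result is 0 and every thread only reads elements it has written itself.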

I suggest that the specs be modified so that for a fixed number 
of threads and a fixed number of iterations an identical schedule is 
used, even if the chunk_size is not specified explicitly.

I think that this difference between a static schedule with and without 
an explicitly specified chunk_size is very misleading and dangerous.

One argument in favour of the current definition is the flexibility for 
the compiler to determine the schedule depending on the locality of the data 
used in the parallel loop. I think that for this kind of affinity 
scheduling a schedule kind with a name other than static should be added 
to the OpenMP specs in a future release.

In fact, I have been using the above construct heavily with many OpenMP 
compilers without any problems so far. Also Assure, the famous OpenMP 
verification tool, does not complain.

I would be curious to hear your opinion.

Dieter

-- 
--------------------------------------------------------------------
Dieter an Mey
High Performance Computing               Hochleistungsrechnen
RWTH Aachen University                   Rechen- und Kommunikations-
Center for Computing and Communication   zentrum der RWTH Aachen
phone: ++49-(0)241-80-24377              Seffenter Weg 23
fax:   ++49-(0)241-80-22134              52074 Aachen, Germany
email: anmey at rz.rwth-aachen.de
--------------------------------------------------------------------





