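I've run into a case where a reduction directive sits inside a parallel region and the barrier after the initialization of the reduction variable is omitted. This produces a race on the reduction variable even when there is no nowait clause: a thread that reaches the initialization late can write to the variable while other threads are already inside the loop construct, and the barrier implied at the end of the loop comes too late to order that write. Here's an example: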
#include <unistd.h>
#include <stdio.h>
#include <omp.h>

int main(void)
{
    int a;
    int i, myid;

    #pragma omp parallel shared(a) private(i, myid)
    {
        myid = omp_get_thread_num();
        if (myid == 0)
            sleep(1); // Or load imbalance
        a = 0;
        // A correct program should have a barrier here
        #pragma omp for reduction(+:a)
        for (i = 0; i < 10; i++)
            a += i;
        #pragma omp single
        printf("Sum is %d\n", a);
    }
}
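For comparison, here is a minimal sketch of one way to close the race, assuming the initialization really has to stay inside the parallel region. The function name `sum_with_barrier` and its parameter `n` are made up for illustration and are not part of the original program. It relies on the barrier implied at the end of a `single` construct that has no nowait clause, so the store to `a` is complete before any thread enters the loop.

```c
/* Hypothetical helper (not from the original post): sums 0..n-1 with the
   reduction variable initialized inside the parallel region. */
int sum_with_barrier(int n)
{
    int a;
    int i;

    #pragma omp parallel shared(a) private(i)
    {
        /* Exactly one thread initializes the shared variable. The barrier
           implied at the end of single keeps every thread here until the
           store is done, which is the synchronization the original lacks. */
        #pragma omp single
        a = 0;

        /* Each thread accumulates into a private copy; the copies are
           combined into the original a at the end of the loop construct. */
        #pragma omp for reduction(+:a)
        for (i = 0; i < n; i++)
            a += i;
    }
    return a;
}
```

Calling `sum_with_barrier(10)` from `main` corresponds to the loop in the example above. I used `single` rather than having every thread execute `a = 0` followed by an explicit `#pragma omp barrier`, since that variant would still have all threads storing to the shared variable at the same time. Initializing `a` before the parallel region should work just as well when that is an option.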
Maybe this behaviour should be clarified in example A.35?
