I'm trying to build a Fortran program that computes some control sums over an array, where the trip count of the loops might be zero, although it is typically rather large. My implementation uses an OpenMP reduction clause on the do loops, which unfortunately gives incorrect sums for zero-trip loops.

I built the attached program with Intel ifort 12.1.0 20110811, PGI pgf95 11.8-0 (64-bit target on x86-64 Linux, -tp penryn) and gfortran. gfortran and pgf95 yield the expected result for sum_b from the second, zero-trip loop, whereas ifort gives some bogus value. Before I go through our support chain to file a bug report with Intel, I wanted to ask whether the attached program has a bug I'm not yet aware of.

If one changes n to 0 at the top of the program, the first loop becomes a zero-trip loop as well and also gives unpredictable results for sum_total.

I'd appreciate any comment.

The compilation commands I used were:

`$ ifort -O0 -g -openmp -o ompzerotripreduction ompzerotripreduction.f90`

Upon running the binary, I get the following output:

`$ OMP_NUM_THREADS=1 ./ompzerotripreduction`

sum_total= 0 sum_b= 0

n= 13 m= 0

sum_total= 13 sum_b= 140193825868784

That is, only one thread is started and the sum variables are initialized to 0, but after going through an OpenMP do loop with zero iterations and a reduction clause on the corresponding sum variable, the value of sum_b is unexpectedly different from 0.
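My understanding of the reduction clause (my own reading, not a quote from the specification) is that each thread gets a private copy initialized to 0 for the + operator, and that these copies are added to the original value at the end of the loop, so a zero-trip loop should leave the original untouched. A minimal sketch of that behaviour written by hand, with a private accumulator and an atomic update in place of the reduction clause, might look like the following. The program name manualreduction and the variable partial are made up for this illustration, and I have not verified this variant against ifort 12.1.

```fortran
PROGRAM manualreduction
  ! Hypothetical sketch: a hand-written equivalent of reduction(+: sum_b).
  ! The names manualreduction and partial are illustrative assumptions.
  IMPLICIT NONE
  INTEGER, PARAMETER :: i8 = SELECTED_INT_KIND(14)
  INTEGER(i8) :: sum_b, partial
  INTEGER, ALLOCATABLE :: a(:)
  INTEGER :: i, n, m

  n = 13
  m = MAX(0, n - 500)   ! evaluates to 0, so the loop below is zero-trip
  ALLOCATE(a(n))
  a = 1
  sum_b = 0_i8

  !$omp parallel shared(a, sum_b, m) private(i, partial)
  ! Per-thread accumulator, playing the role of the private reduction copy
  partial = 0_i8
  !$omp do
  DO i = 1, m
    partial = partial + a(i)
  END DO
  !$omp end do nowait
  ! Fold each thread's partial sum into the shared total
  !$omp atomic
  sum_b = sum_b + partial
  !$omp end parallel

  PRINT *, 'm=', m, 'sum_b=', sum_b
END PROGRAM manualreduction
```

With m equal to 0, every thread adds a partial sum of 0, so sum_b should still be 0 after the parallel region regardless of the number of threads.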

For pgf95 and gfortran I get the expected sum of 0:

`$ pgf95 -mp -g -O0 -o ompzerotripreduction ompzerotripreduction.f90 && ./ompzerotripreduction`

sum_total= 0 sum_b= 0

n= 13 m= 0

sum_total= 13 sum_b= 0

`$ gfortran -O0 -g -fopenmp -o ompzerotripreduction ompzerotripreduction.f90 && OMP_NUM_THREADS=1 ./ompzerotripreduction`

sum_total= 0 sum_b= 0

n= 13 m= 0

sum_total= 13 sum_b= 0

The program source is as follows:

```fortran
PROGRAM zerotripreduction
  INTEGER, PARAMETER :: i8=SELECTED_INT_KIND(14)
  INTEGER(i8) :: sum_total, sum_b
  INTEGER, ALLOCATABLE :: a(:)
  INTEGER :: n, m
  !$omp parallel shared(a, sum_total, sum_b, n, m)
  !$omp master
  n = 13
  ALLOCATE(a(n))
  m = MAX(0, n - 500)
  sum_total = 0_i8
  sum_b = 0_i8
  a = 1
  !$omp end master
  !$omp barrier
  PRINT *, 'sum_total=', sum_total, 'sum_b=', sum_b
  !$omp barrier
  !$omp do reduction(+: sum_total)
  DO i = 1, n
    sum_total = sum_total + a(i)
  END DO
  !$omp end do
  !$omp do reduction(+: sum_b)
  DO i = 1, m
    sum_b = sum_b + a(i)
  END DO
  !$omp end do
  !$omp master
  PRINT *, 'n=', n, 'm=', m
  PRINT *, 'sum_total=', sum_total, 'sum_b=', sum_b
  !$omp end master
  !$omp end parallel
END PROGRAM zerotripreduction
```
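To tell a lost original value apart from garbage in the private copies, a reduced variant of the program above can start sum_b at a nonzero sentinel and check it after the zero-trip loop. This is only a sketch under my own assumptions: the program name zerotripcheck and the constant sentinel are invented here, and the array is fixed-size to keep the example short.

```fortran
PROGRAM zerotripcheck
  ! Hypothetical self-checking variant; zerotripcheck and sentinel are
  ! illustrative names that do not appear in the original program.
  IMPLICIT NONE
  INTEGER, PARAMETER :: i8 = SELECTED_INT_KIND(14)
  INTEGER(i8), PARAMETER :: sentinel = 42_i8
  INTEGER(i8) :: sum_b
  INTEGER :: a(13)
  INTEGER :: i, m

  a = 1
  m = 0               ! forces a zero-trip loop
  sum_b = sentinel    ! nonzero start value that the reduction must preserve

  !$omp parallel shared(a, sum_b, m) private(i)
  !$omp do reduction(+: sum_b)
  DO i = 1, m
    sum_b = sum_b + a(i)
  END DO
  !$omp end do
  !$omp end parallel

  ! A zero-trip reduction should leave the original value untouched
  IF (sum_b == sentinel) THEN
    PRINT *, 'OK: sum_b unchanged =', sum_b
  ELSE
    PRINT *, 'MISMATCH: sum_b =', sum_b, 'expected', sentinel
  END IF
END PROGRAM zerotripcheck
```

It can be built with the same flags as above (-openmp for ifort, -mp for pgf95, -fopenmp for gfortran). A compiler that handles the zero-trip case as I expect should take the OK branch for any value of OMP_NUM_THREADS.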