[Omp] Still on the barrier
Shengyan Hong
shhong at cse.psu.edu
Sun Mar 11 11:52:59 PDT 2007
Dear OMP members,
Thank you for your explanation.
Now I would like to describe my question in detail, and I would
appreciate your help.
Suppose there are 8 processors, each with a different frequency and
L1 cache latency. For example, Cpu1 runs at 1.1 GHz and its L1 cache
latency is 2 cycles.
Suppose one benchmark runs 8 threads, there is one barrier for these
8 threads, and the 8 threads are distributed across the 8 processors.
I want to use Simics to measure the idle time at the barrier for each
of these 8 processors.
I chose the FT benchmark. I think the code I measure should be
parallelized, so I picked the code below:
!$omp parallel do default(shared) private(i,j,k)
      do k = 1, d3
         do j = 1, d2
            do i = 1, d1
               u0(i,j,k) = 0.d0
               u1(i,j,k) = 0.d0
               indexmap(i,j,k) = 0.d0
            end do
         end do
      end do
      return
      end
I then changed the code to:
!$omp parallel do default(shared) private(i,j,k)
      do k = 1, d3
C        TID = OMP_GET_THREAD_NUM()
C        PRINT *, 'thread = ', TID
C        print *, "March 9"
         CALL MAGIC_BRK_SIM_START()
         do j = 1, d2
            do i = 1, d1
               u1(i,j,k) = u0(i,j,k)*ex(t*indexmap(i,j,k))
            end do
         end do
C        print *, "Before barrier"
         CALL MAGIC_BRK_SIM_MIDDLE()
C !$OMP BARRIER
C        print *, "After barrier"
         CALL MAGIC_BRK_SIM_STOP()
      end do
I measure the idle time as the interval between MAGIC_BRK_SIM_MIDDLE()
and MAGIC_BRK_SIM_STOP(), but every processor reports the same idle
time of 6 cycles. I also measure the execution time as the interval
between MAGIC_BRK_SIM_START() and MAGIC_BRK_SIM_MIDDLE(). The
execution times differ between processors, but not by much: for
example, 1.7821*10^5 and 1.78345*10^5.
You told me that a barrier cannot be added inside the parallel
region. Can you tell me where I can add the barrier? I think it should
go at a place in the code that all 8 threads execute. Is that right?
Finally, how do I use a barrier in the Fortran code? Thank you very much.
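One placement that OpenMP permits can be sketched as follows. A barrier may not appear inside a work-sharing loop such as the body of a `parallel do`, but it may follow the loop inside an enclosing `parallel` region, where every thread of the team reaches it. The sketch is only an illustration under stated assumptions: it splits the combined directive into `parallel` plus `do`, ends the loop with `nowait` so that the loop's own implied barrier does not absorb the waiting time, and uses `omp_get_wtime()` as a stand-in for the MAGIC_BRK_SIM_* calls. The program name, array sizes, loop body, and the timing arrays `t0`, `t1`, `t2` are hypothetical.

```fortran
      program barrier_sketch
      use omp_lib
      implicit none
      integer d1, d2, d3
      parameter (d1 = 8, d2 = 8, d3 = 16)
      double precision u0(d1,d2,d3), u1(d1,d2,d3)
! One timing slot per thread; 8 threads are assumed, as in the post.
      double precision t0(0:7), t1(0:7), t2(0:7)
      integer i, j, k, tid, nthreads

      u0 = 1.d0
      nthreads = 1
!$omp parallel num_threads(8) default(shared) private(i,j,k,tid)
      tid = omp_get_thread_num()
!$omp single
      nthreads = omp_get_num_threads()
!$omp end single
!     Stand-in for MAGIC_BRK_SIM_START: once per thread.
      t0(tid) = omp_get_wtime()
!$omp do
      do k = 1, d3
         do j = 1, d2
            do i = 1, d1
               u1(i,j,k) = 2.d0*u0(i,j,k)
            end do
         end do
      end do
!$omp end do nowait
!     Stand-in for MAGIC_BRK_SIM_MIDDLE: this thread's work is done.
      t1(tid) = omp_get_wtime()
!     Legal here: outside the loop, reached by every thread.
!$omp barrier
!     Stand-in for MAGIC_BRK_SIM_STOP: all threads have arrived.
      t2(tid) = omp_get_wtime()
!$omp end parallel

      do tid = 0, nthreads-1
         print *, 'thread', tid, ' work', t1(tid)-t0(tid),
     &            ' wait', t2(tid)-t1(tid)
      end do
!     Checksum so the loop result is used; 2*8*8*16 = 2048.
      print *, 'sum(u1) =', sum(u1)
      end
```

Built with an OpenMP-enabled compiler (for example `gfortran -fopenmp`), each thread reports its own work interval and its wait interval at the barrier. In a Simics run, the three timestamps mark where MAGIC_BRK_SIM_START, MAGIC_BRK_SIM_MIDDLE, and MAGIC_BRK_SIM_STOP would go, so each is hit once per thread rather than once per k iteration. In the posted loop the barrier directive is commented out, so nothing executes between the MIDDLE and STOP calls, which may be why every processor shows the same small interval.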
Shengyan Hong