[Omp] OpenMP Parallel Do Loops

Breshears, Clay clay.breshears at intel.com
Thu Nov 17 08:47:38 PST 2005


Craig -

I expect there are some memory access/contention issues at work here.
By changing the schedule I was able to get better performance.  I'm
using a dual-processor system to run the code, and I've got some results
below for a 1000x1000 case:

                          Time (seconds)
1 thread                       26.7
2 threads                     109.2
2 threads (dynamic,8)          51.4
2 threads (static,8)           41.5
2 threads (static,16)          37.0

Not as good as serial execution, yet.  Changing the schedule may
alleviate some of the memory problems, but there can still be issues of
data size and granularity coming into play here.
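
As an illustration of what the chunk size in the schedule clause changes
(this is not the code that produced the timings above), here is a minimal
sketch in free-form Fortran. The program name, array size and chunk value
are made up for the example. It records which thread runs each iteration
under schedule(static,16) and checks that consecutive blocks of 16
iterations go to the threads in round-robin order. It assumes an
OpenMP-enabled build, e.g. gfortran -fopenmp.

```fortran
program sched_demo
  use omp_lib
  implicit none
  ! n and chunk are illustrative values, not taken from the thread
  integer, parameter :: n = 64, chunk = 16
  integer :: owner(n), j, nt

  nt = 1
  !$omp parallel
  ! One thread records the team size in the shared variable nt
  !$omp single
  nt = omp_get_num_threads()
  !$omp end single
  ! Each thread takes blocks of `chunk` consecutive iterations
  !$omp do schedule(static, chunk)
  do j = 1, n
     owner(j) = omp_get_thread_num()
  end do
  !$omp end do
  !$omp end parallel

  ! With a static schedule, block k (counting from 0) belongs to
  ! thread mod(k, nt)
  do j = 1, n
     if (owner(j) /= mod((j - 1) / chunk, nt)) error stop 'unexpected owner'
  end do
  print *, 'threads =', nt
  print *, 'owners of iterations 1, 17, 33, 49:', &
           owner(1), owner(17), owner(33), owner(49)
  print *, 'schedule check passed'
end program sched_demo
```

With a larger chunk each thread works on a longer run of neighbouring j
values before handing over, which is one plausible reason the static,16
case fares better than dynamic,1 in a loop like the one under discussion.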

						clay

-----Original Message-----
From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org] On Behalf
Of ta.cbra at maths.strath.ac.uk
Sent: Monday, November 14, 2005 4:52 PM
To: Neil Summers
Cc: Omp at openmp.org
Subject: Re: [Omp] OpenMP Parallel Do Loops


Neil,

Thank you for your reply. I have attached a more detailed version of my
code that actually applies the Givens rotations (this uses the BLAS
routines drotg and drot).
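
For readers who have not met these routines, the sketch below shows what
one drotg/drot pair does to two rows of a matrix. It is written in plain
free-form Fortran so that it runs without a BLAS library. The program
name and the 2x3 test data are made up for the example, and the
coefficient computation leaves out the sign and scaling safeguards that
the real drotg applies.

```fortran
program givens_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: w(2, n), c, s, r, t
  integer :: k

  ! Two rows of a small matrix; the goal is to zero w(2,1)
  w(1, :) = (/ 3d0, 1d0, 2d0 /)
  w(2, :) = (/ 4d0, 5d0, 6d0 /)

  ! Rotation coefficients (the job drotg does)
  r = sqrt(w(1, 1)**2 + w(2, 1)**2)
  c = w(1, 1) / r
  s = w(2, 1) / r

  ! Apply the rotation to both rows (the job drot does)
  do k = 1, n
     t       =  c * w(1, k) + s * w(2, k)
     w(2, k) = -s * w(1, k) + c * w(2, k)
     w(1, k) = t
  end do

  ! The first column is now (r, 0) up to rounding
  if (abs(w(2, 1)) > 1d-12) error stop 'element not annihilated'
  if (abs(w(1, 1) - 5d0) > 1d-12) error stop 'unexpected r'
  print *, 'row 1:', w(1, :)
  print *, 'row 2:', w(2, :)
  print *, 'rotation check passed'
end program givens_demo
```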

I have tried to implement your suggestions in this new code but I am
still unable to get any kind of speed up when I increase the number of
processors.

Can anyone see an obvious reason why? Any help is greatly appreciated!

      program CDGR
      include 'omp_lib.h'

c     Declare variable types
      integer :: i, j, x, m, n
      double precision, dimension(:,:), allocatable :: W
      double precision :: cc,ss,time
c     Work array required by the dtime intrinsic
      real :: timearray(2)
      integer np, me

c     Get the size of matrix to use
      write(*,*) 'What size of matrix do you wish to use?'
      write(*,*) 'Number of rows (m) ='
      read(*,*) m
      write(*,*) 'Number of columns (n) ='
      read(*,*) n
      write(*,*)'m= ',m,' and n = ',n,' thank you.'

      allocate(W(m,n))

      do i=1,m
         do j=1,n
            W(i,j)=1000*rand(i+j)
         end do
      end do

!      do i=1,m
!         write(*,*)(W(i,j),j=1,n)
!      end do

      time=dtime(timearray)

c     At time step i, annihilate every element that can be zeroed in
c     that step, leaving the matrix W upper triangular

!$OMP parallel private(x,cc,ss,me,i) shared(W,np,m,n)
c     np is shared, so only one thread writes it
!$OMP single
      np = omp_get_num_threads()
!$OMP end single
      me = omp_get_thread_num()

      do i=1,m+n-2
c     Every thread uses the same value of i but the j values
c     are shared out and can be performed at the same time

!$OMP do schedule(dynamic,1)
         do j=1,n
            x=m+2*j-i-1
c     make sure element W(x,j) is within the matrix W
            if (j .lt. x) then 
               if (x .le. m) then
                   call drotg(W(x-1,j), W(x,j), cc, ss)
                   W(x,j)=0d0
                   call drot(n-j,W(x-1,j+1:n),1,W(x,j+1:n),1,cc,ss)
!                   W(x,j)=i
               endif
            endif
         enddo
!$OMP end do
      enddo
!$OMP end parallel

      time=dtime(timearray)
      write(*,*)'CDGR with ',np,' threads'
      write(*,*)'m=',m,'n=',n,'time=',time

c  Print W if you want to see how it was annihilated
!      do i=1,m
!         write(*,*)(W(i,j),j=1,n)
!      end do

      deallocate(W)

      stop
      end


On Mon, 14 Nov 2005, Neil Summers wrote:

> Two things I noticed on a quick scan of your code.
> 
> 1) You should define the parallel region outside the i loop.
> Creating a parallel region inside a do loop causes excessive
> overhead, because the program forks and joins on every iteration.
> With the region outside the i loop, use omp do to split the work
> up between threads, i.e.
> 
> !$OMP parallel private(x,me,np)
>  do i=1,m+n-2
> !$OMP do
>   do j=1,n
>    ...
>   enddo
>  enddo
> !$OMP end parallel
> 
> 2) I'm surprised you get the right results.
> With firstprivate(m,n), these variables are undefined
> on exiting the parallel region, so I would guess the second
> iteration of i would not happen correctly.
> You don't need them private, so I'd leave them shared.
> 
> Neil
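
The structure Neil describes in point 1 can be tried out with a small
self-contained program. This is only a sketch: the program name, the
array and the loop body are made up for the example and stand in for the
real rotation work. The parallel region is opened once, every thread runs
its own copy of the outer i loop, and the omp do inside shares out the j
iterations. The implied barrier at the end of each omp do keeps the
threads in step from one i to the next. It is free-form Fortran and needs
an OpenMP-enabled build, e.g. gfortran -fopenmp.

```fortran
program hoisted_region
  use omp_lib
  implicit none
  ! n and nsteps are illustrative sizes, not taken from the thread
  integer, parameter :: n = 1000, nsteps = 50
  integer :: i, j
  double precision :: a(n)

  a = 0d0

  ! One fork/join for the whole computation; i is private so that
  ! each thread steps through the outer loop on its own counter
  !$omp parallel private(i)
  do i = 1, nsteps
     ! Work sharing: the j iterations are divided among the threads
     !$omp do schedule(static)
     do j = 1, n
        a(j) = a(j) + dble(i)
     end do
     !$omp end do
     ! Implied barrier here: step i is complete before step i+1 starts
  end do
  !$omp end parallel

  ! Every element should hold 1 + 2 + ... + nsteps
  if (any(a /= dble(nsteps * (nsteps + 1) / 2))) error stop 'wrong sum'
  print *, 'a(1) =', a(1)
  print *, 'hoisted region check passed'
end program hoisted_region
```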

-- 
  CB.

**************************************
*                                    *
* Craig Brand                        *
* University of Strathclyde          *
* Department of Mathematics          *
* e-mail  ta.cbra at maths.strath.ac.uk *
*                                    *
**************************************
