Problem in parallelizing the code


Postby ymharshe » Tue Jul 01, 2008 8:10 am

Dear all,

I tried using OpenMP for a simple parallelization; however, it did not work.

My computer, an Apple machine running Mac OS X 10.4 (Tiger), has 4 threads in total (2 processors, each a dual core). I wrote a simple Fortran code that inverts two matrices and compiled it with the Intel Fortran compiler. I set the number of threads to 2 with an external command (EXPORT NUM_THREDS = 2) at the terminal. I used the SECTIONS directive with the NOWAIT clause, with the inversion of each matrix in a different SECTION. However, I observed no performance improvement: the time required with and without the -openmp option remained the same. I was expecting each thread to invert one matrix independently, so that the overall simulation time would be almost halved.

For your information I have copied my code at the end.

Any kind of help is highly appreciated.
===============================================================================

c     program to see if the parallelization with OMP works!

      program parallel_try

      implicit none

      external minvert

      integer i,j,k,l,nx,ny,ithread

      real*8, ALLOCATABLE, DIMENSION(:,:) :: a,b
      real*8, ALLOCATABLE, DIMENSION(:,:) :: a1,b1
      real time_begin,time_end

      nx=1000
      ny=1000

      ALLOCATE(a(nx,ny),b(nx,ny))
      ALLOCATE(a1(nx,ny),b1(nx,ny))

c     fill the two input matrices
      do i=1,nx
         do j=1,ny
            a(i,j) = 2.0*i+0.5*j**2
            b(i,j) = 3.0*i*i+j
         end do
      end do

      call cpu_time (time_begin)

c     one matrix inversion per section
!$OMP SECTIONS
!$OMP SECTION
      call minvert(nx,a,a1)
      print*,'a invet found!'
!$OMP SECTION
      call minvert(nx,b,b1)
      print*,'b invet found!'
!$OMP END SECTIONS NOWAIT

      call cpu_time (time_end)

      print*,'time required for the simulation:'
     $ ,time_end-time_begin,'seconds'

      stop
      end
===============================================================================
Last edited by ymharshe on Tue Jul 01, 2008 11:05 am, edited 1 time in total.
ymharshe
 
Posts: 7
Joined: Tue Jul 01, 2008 8:03 am

Re: Problem in parallelizing the code

Postby ejd » Tue Jul 01, 2008 8:47 am

The code you have shown has several problems:
  • array c is not declared
  • the sections directive has to be in a parallel region to actually be run in parallel
  • the end statement is wrong (should be !$omp end sections)
ejd
 
Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

Re: Problem in parallelizing the code

Postby ymharshe » Tue Jul 01, 2008 9:01 am

OK, array c is not required; forget about it.
The end statement of the sections is corrected.
The only remaining problem is: how do I define the parallel region?
Should it be like
!$OMP PARALLEL SECTIONS
!$OMP SECTION
....
!$OMP END PARALLEL SECTIONS

OR

!$OMP PARALLEL
!$OMP SECTIONS
!$OMP SECTION
....
!$OMP END SECTIONS
!$OMP END PARALLEL

Re: Problem in parallelizing the code

Postby ymharshe » Tue Jul 01, 2008 9:04 am

I already tried the second way but, to my surprise, it required more time!

Re: Problem in parallelizing the code

Postby ejd » Tue Jul 01, 2008 9:13 am

Your second approach:

Code:
!$OMP PARALLEL
!$OMP SECTIONS
!$OMP SECTION
....
!$OMP END SECTIONS
!$OMP END PARALLEL

will take more time if the compiler doesn't combine the barriers for the 'end sections' and the 'end parallel'. If it does combine them, there is no difference. The other thing you can do, if there is no code between the 'end sections' and the 'end parallel', is to add nowait to the 'end sections':

Code:
!$OMP PARALLEL
!$OMP SECTIONS
!$OMP SECTION
....
!$OMP END SECTIONS nowait
!$OMP END PARALLEL
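
For reference, the combined form of the construct can be sketched as a complete program. This is only an illustration under stated assumptions, not the poster's code. It is written in free-form Fortran rather than the fixed form used in the thread, and busy_work is a hypothetical stand-in for the matrix inversion routine. It needs to be compiled with the compiler's OpenMP option (the thread uses ifort -openmp).

```fortran
! Minimal sketch (free-form Fortran, e.g. saved as sections_demo.f90).
! busy_work is a hypothetical placeholder for the real inversion routine.
program sections_demo
  use omp_lib
  implicit none
  integer, parameter :: n = 2000000
  real(kind=8) :: ra, rb

  ! PARALLEL creates the team of threads; SECTIONS hands each SECTION
  ! to one thread of that team. Without the enclosing parallel region
  ! both sections are executed by the single initial thread.
  !$omp parallel sections
  !$omp section
  call busy_work(n, 1.0d0, ra)
  print *, 'section a done on thread', omp_get_thread_num()
  !$omp section
  call busy_work(n, 2.0d0, rb)
  print *, 'section b done on thread', omp_get_thread_num()
  !$omp end parallel sections

  print *, 'results:', ra, rb

contains

  ! Sums scale/i for i = 1..m; it only exists to keep a thread busy.
  subroutine busy_work(m, scale, total)
    integer, intent(in) :: m
    real(kind=8), intent(in) :: scale
    real(kind=8), intent(out) :: total
    integer :: i
    total = 0.0d0
    do i = 1, m
      total = total + scale / real(i, kind=8)
    end do
  end subroutine busy_work

end program sections_demo
```

If the two messages report different thread numbers when OMP_NUM_THREADS is 2 or more, the sections were handed to different threads.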

Re: Problem in parallelizing the code

Postby ymharshe » Tue Jul 01, 2008 10:05 am

Of course, I already have the nowait clause. It does not help at all. The computation takes longer than it does without parallelization.

Any help, please!

Re: Problem in parallelizing the code

Postby ymharshe » Tue Jul 01, 2008 2:08 pm

Code:
icb-dhcp225-138:~/Documents/phd_project/model/parallelization harshey$ csh
[icb-dhcp225-138:phd_project/model/parallelization] harshey% ifort -openmp parallel.f InvertMatrix.f
parallel.f(35): (col. 7) remark: OpenMP DEFINED SECTION WAS PARALLELIZED.
InvertMatrix.f(346): (col. 10) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(303): (col. 10) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(115): (col. 18) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(125): (col. 21) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(234): (col. 18) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(240): (col. 21) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(252): (col. 16) remark: LOOP WAS VECTORIZED.
InvertMatrix.f(258): (col. 21) remark: LOOP WAS VECTORIZED.
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 1
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out
a invet found!
b invet found!
time required for the simulation:   3.895863     seconds
3.910u 0.041s 0:03.95 100.0%    0+0k 0+5io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 2
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out                                                     
a invet found!
b invet found!
time required for the simulation:   3.907183     seconds
3.921u 0.041s 0:03.96 100.0%    0+0k 0+5io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 3
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out                                                     
a invet found!
b invet found!
time required for the simulation:   3.921978     seconds
3.935u 0.043s 0:03.98 99.7%     0+0k 0+0io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 4
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out                                                     
a invet found!
b invet found!
time required for the simulation:   3.922734     seconds
3.936u 0.044s 0:03.98 99.7%     0+0k 0+5io 0pf+0w

What should I do? The computation time is the same for every thread count.

Re: Problem in parallelizing the code

Postby ejd » Tue Jul 01, 2008 10:24 pm

The compiler output shows that the code was compiled to use OpenMP. However, the output of time suggests that the program is not running in parallel: the CPU percentage (%P) stays at about 100%. Try "setenv OMP_DYNAMIC FALSE" and see whether the %P value rises above 100%.
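
A complementary check is to ask the OpenMP runtime directly how many threads it uses. The sketch below is an illustration and not part of the original program. It uses only standard omp_lib routines, is written in free-form Fortran, and has to be compiled with the compiler's OpenMP option.

```fortran
! Minimal sketch: report what the OpenMP runtime actually provides.
program team_check
  use omp_lib
  implicit none

  ! Upper bound on the team size, normally set by OMP_NUM_THREADS.
  print *, 'max threads available:', omp_get_max_threads()
  ! .true. means the runtime may shrink the team (see OMP_DYNAMIC).
  print *, 'dynamic adjustment on:', omp_get_dynamic()

  !$omp parallel
  ! SINGLE lets exactly one thread of the team print the team size.
  !$omp single
  print *, 'threads in this team :', omp_get_num_threads()
  !$omp end single
  !$omp end parallel
end program team_check
```

A team size of 1 despite a larger OMP_NUM_THREADS would indicate that the environment variable is not reaching the program or that OpenMP was not enabled at compile time.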

Re: Problem in parallelizing the code

Postby ymharshe » Wed Jul 02, 2008 1:23 am

Code:
icb-dhcp225-138:~/Documents/phd_project/model/parallelization harshey$ csh
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_DYNAMIC FALSE
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 1
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out
a invet found!
b invet found!
time required for the simulation:   3.935814     seconds
3.950u 0.043s 0:03.99 100.0%    0+0k 0+4io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 2
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out
a invet found!
b invet found!
time required for the simulation:   3.973833     seconds
3.987u 0.042s 0:04.03 99.7%     0+0k 0+5io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 3
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out
a invet found!
b invet found!
time required for the simulation:   3.926816     seconds
3.940u 0.043s 0:03.98 100.0%    0+0k 0+5io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey% setenv OMP_NUM_THREADS 4
[icb-dhcp225-138:phd_project/model/parallelization] harshey% time ./a.out
a invet found!
b invet found!
time required for the simulation:   3.942922     seconds
3.957u 0.042s 0:04.00 99.7%     0+0k 0+5io 0pf+0w
[icb-dhcp225-138:phd_project/model/parallelization] harshey%


What do you make of this?

Re: Problem in parallelizing the code

Postby ymharshe » Thu Jul 03, 2008 8:52 am

Hey guys,
I finally worked the whole thing out! It is working now.
The most important lesson is that I should not have measured the time with Fortran's cpu_time() routine, which reports CPU time rather than elapsed (wall-clock) time.
What ejd suggested in another post helped me a lot. I used csh and then ran time ./a.out. It shows the proper time, and I am happy to report that with 2 processors I got a 1.44 times speedup of the calculation!
The two matrices are now inverted in parallel by two processors.
Thanks a lot, ejd, for your quick responses.
Yogesh
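
The timing issue can be illustrated with a short sketch. It is not the poster's program: the loop is a hypothetical workload, and the code is free-form Fortran that must be compiled with the compiler's OpenMP option. On many implementations cpu_time returns processor time accumulated over all threads, so it typically does not shrink when more threads are used, whereas omp_get_wtime returns elapsed wall-clock time.

```fortran
! Minimal sketch: CPU time versus wall-clock time around a parallel loop.
program timing_demo
  use omp_lib
  implicit none
  integer, parameter :: n = 50000000
  integer :: i
  real :: cpu0, cpu1
  real(kind=8) :: wall0, wall1, total

  call cpu_time(cpu0)          ! processor time used so far
  wall0 = omp_get_wtime()      ! elapsed wall-clock time in seconds

  ! Hypothetical workload: a partial harmonic sum split across threads.
  total = 0.0d0
  !$omp parallel do reduction(+:total)
  do i = 1, n
    total = total + 1.0d0 / real(i, kind=8)
  end do
  !$omp end parallel do

  call cpu_time(cpu1)
  wall1 = omp_get_wtime()

  print *, 'sum       :', total
  print *, 'cpu_time  :', cpu1 - cpu0, 'seconds'
  print *, 'wall clock:', wall1 - wall0, 'seconds'
end program timing_demo
```

Running this with OMP_NUM_THREADS set to 1 and then to 2 and comparing the two printed times shows which clock reflects the speedup on a given system.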

