help with OMP fortran

Re: help with OMP fortran

Post by kon » Mon Jun 11, 2012 8:25 pm

Hi all,

Unfortunately, the problem persists. I’ve set the stack size as suggested (ulimit -s) and also set OMP_STACK_SIZE=200M. I also printed some intermediate results:

do k = 1, m
print*, ' k = ', k
Rhs = 0
Rhs(indx1(k)) = 1.0
Sol = 0
!
print*, " size of Sol : ", size(Sol), " size of Rhs : ", size(Rhs)

The output is:

4483144 animals in the pedigree
second reading : 4483144 animals in the pedigree
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 2388
k = 1
k = 479
k = 1912
k = 957
k = 1435
size of Sol : 0 size of Rhs : 4483144
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
collaeu 0000000000403C0D Unknown Unknown Unknown
libiomp5.so 00002AE8DBB88793 Unknown Unknown Unknown

The program always crashes here. I do not understand why Sol has size zero!?


Regards,

Kon

Re: help with OMP fortran

Post by ftinetti » Tue Jun 12, 2012 1:32 pm

Hi,

There is some strange behaviour here. I reduced the code to not much more than the data declarations, the allocations and a minimal DO loop; more specifically:
program Colleau_red                 ! "red" for "reduced" ...
  implicit none

  INTEGER :: i, j, k, l, m, n, io, irow, icol, numAnim, ierr, jerr, numOfThreads
  REAL(8), DIMENSION(:,:), ALLOCATABLE :: A, Ainv, T, Tinv, Asub
  REAL(8), DIMENSION(:), ALLOCATABLE :: D, Dinv, Sol, Rhs, r, v, s, w, y, x
  INTEGER, DIMENSION(:,:), ALLOCATABLE :: Ped
  INTEGER, DIMENSION(:), ALLOCATABLE :: indx, indx1
  real(8) :: a1
  real :: start_time, end_time

  numOfThreads = 8
  !CALL OMP_SET_NUM_THREADS(numOfThreads)

  call cpu_time(start_time)

  n = 10;

  ALLOCATE (D(n), Sol(n), Rhs(n), stat=ierr)
  if (ierr.ne.0) then
    print*," Allocation request for D, Sol, Rhs denied! "
    stop
  endif

  ALLOCATE (Ped(n,3), v(n), indx(n), indx1(n), stat=jerr )
  if (jerr.ne.0) then
    print*," Allocation request for Ped, v, indx denied! "
    stop
  endif

  PRINT*, " Commencing Colleau calculations for sub-matrix : "

  m = 10
  print*, " Number of animals in the subset matrix of A is : " , m

  ALLOCATE (Asub(m,m), stat=jerr )
  if (jerr.ne.0) then
    print*," Allocation request for Asub denied! "
    stop
  endif

!  !$OMP PRIVATE(i, j, k) SHARED(m, n, Ped, D, indx1, Sol, v)           ! Works
!  !$OMP PRIVATE(i, j, k, Rhs) SHARED(m, n, Ped, D, indx1, Sol, v)    ! Doesn't work
!  !$OMP PRIVATE(i, j, k, Rhs, Sol, v) SHARED(m, n, Ped, D, indx1)    ! Doesn't work

  !$OMP PARALLEL DO &
  !$OMP PRIVATE(i, j, k) SHARED(m, n, Ped, D, indx1, Sol, v)
  do k = 1, m
    if (k .eq. 1) print*, ' k = ', k
  enddo
  !$OMP END PARALLEL DO

  call cpu_time(end_time)

  print*, " Elapsed time taken to calculate a subset of A is : ", end_time - start_time

  print *, "Sol(n) = ", Sol(n)
end program


and I've played a little bit with three options for private and shared data:
! !$OMP PRIVATE(i, j, k) SHARED(m, n, Ped, D, indx1, Sol, v) ! Works
! !$OMP PRIVATE(i, j, k, Rhs) SHARED(m, n, Ped, D, indx1, Sol, v) ! Doesn't work
! !$OMP PRIVATE(i, j, k, Rhs, Sol, v) SHARED(m, n, Ped, D, indx1) ! Doesn't work

Using the first option, i.e.
!$OMP PARALLEL DO &
!$OMP PRIVATE(i, j, k) SHARED(m, n, Ped, D, indx1, Sol, v)

everything works as expected for the code (I know this is not what you need, I'm just playing around...):

$ ifort -openmp colleau-red.f90 -o colleau-redifort
$ ./colleau-redifort
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 10
k = 1
Elapsed time taken to calculate a subset of A is : 8.8987000E-02
Sol(n) = 0.000000000000000E+000

The second option, i.e. using
!$OMP PARALLEL DO &
!$OMP PRIVATE(i, j, k, Rhs) SHARED(m, n, Ped, D, indx1, Sol, v)

fails...

$ ifort -openmp colleau-red.f90 -o colleau-redifort
$ colleau-redifort
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 10
k = 1
*** glibc detected *** *** glibc detected *** *** glibc detected *** colleau-redifort*** glibc detected *** colleau-redifort: munmap_chunk(): invalid pointer: 0x00007fee2ad28610 ***
*** glibc detected *** colleau-redifort: *** glibc detected *** colleau-redifort: munmap_chunk(): invalid pointer: 0x00007fee2a125610 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3196e74576]
colleau-redifort[0x40ad39]
colleau-redifort[0x403eca]
======= Backtrace: =========
*** glibc detected *** colleau-redifort: munmap_chunk(): invalid pointer: 0x00007fee2a526610 ***
Abortado (`core' generado)

and I think it should not fail... Also, it doesn't fail with gfortran:

$ gfortran -fopenmp colleau-red.f90 -o colleau-redgfortran
$ colleau-redgfortran
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 10
k = 1
Elapsed time taken to calculate a subset of A is : 2.19959989E-02
Sol(n) = 0.0000000000000000

And the third option, the one suggested by MarkB, i.e. with
!$OMP PARALLEL DO &
!$OMP PRIVATE(i, j, k, Rhs, Sol, v) SHARED(m, n, Ped, D, indx1)

again does not work with ifort:
$ ifort -openmp colleau-red.f90 -o colleau-redifort
$ colleau-redifort
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 10
k = 1

(note the missing lines). And it works as expected (at least as I expect...) with gfortran:

$ gfortran -fopenmp colleau-red.f90 -o colleau-redgfortran
$ colleau-redgfortran
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 10
k = 1
Elapsed time taken to calculate a subset of A is : 2.19959989E-02
Sol(n) = 0.0000000000000000

Well, right now, I think you should ask the ifort people about this... feel free to use the code I've included, and please tell us about the result/s...
Just for completeness:
$ ifort -v
ifort version 12.1.3

$ gfortran -v
...
gcc version 4.6.3 20111216 (prerelease) (GCC)
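
One workaround that is sometimes used when PRIVATE allocatable arrays misbehave is to leave the work arrays unallocated before the parallel region and let each thread allocate its own copies inside it. The following is only a minimal sketch under that assumption, not the original program: the program name, the colsum array and the dummy "solve" (Sol = 2*Rhs) are hypothetical placeholders, and it has not been checked against ifort 12.1.3.

```fortran
program private_alloc_sketch          ! hypothetical name, not the original code
  implicit none
  integer :: k, m, n
  real(8), allocatable :: Rhs(:), Sol(:)   ! per-thread work arrays
  real(8), allocatable :: colsum(:)        ! hypothetical shared result array
  integer, allocatable :: indx1(:)

  n = 10
  m = 10
  allocate (indx1(m), colsum(m))
  indx1 = [(k, k = 1, m)]                  ! dummy index list

  ! Rhs and Sol are NOT allocated here, so each thread starts with
  ! unallocated private copies and allocates its own storage.
  !$OMP PARALLEL DEFAULT(NONE) PRIVATE(k, Rhs, Sol) SHARED(m, n, indx1, colsum)
  allocate (Rhs(n), Sol(n))
  !$OMP DO
  do k = 1, m
    Rhs = 0.0d0
    Rhs(indx1(k)) = 1.0d0
    Sol = 2.0d0 * Rhs                      ! placeholder for the real solve
    colsum(k) = sum(Sol)
  end do
  !$OMP END DO
  deallocate (Rhs, Sol)
  !$OMP END PARALLEL

  ! Every column of the dummy problem should sum to exactly 2.
  if (any(abs(colsum - 2.0d0) > 1.0d-12)) error stop "unexpected result"
  print *, "size of colsum : ", size(colsum), " all entries OK"
end program private_alloc_sketch
```

It can be built the same way as the reduced program above (gfortran -fopenmp or ifort -openmp). Splitting PARALLEL and DO is what makes room for the per-thread ALLOCATE before the loop starts.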

HTH,

Fernando.

Re: help with OMP fortran

Post by kon » Tue Jun 12, 2012 9:22 pm

Hi all,

First of all, I would like to thank you for the time you’ve spent on my little program.
I will ask the ifort people about the problem. However, I’ve recompiled the program with gfortran and !$OMP PARALLEL DO PRIVATE(i, j, k, Rhs, Sol, v) SHARED(m,n,Ped,D,indx1)

It worked, but it took longer than the version of the program that does not use OMP.
NO_OMP version time is : 577 sec.
OMP version time is : 1617 sec. and %CPU of 2196!

I give up, apparently this program cannot be parallelised.

Kon

Re: help with OMP fortran

Post by MarkB » Wed Jun 13, 2012 2:32 am

Don't give up!
How did you time the code? How many threads were you running and on what system?

Re: help with OMP fortran

Post by ftinetti » Wed Jun 13, 2012 4:54 am

Hi Kon,

I (and others) usually suggest using OMP_GET_WTIME() instead of cpu_time() and, as MarkB suggests:
1) Don't give up
2) Tell us about your system: CPUs, cores, number of threads, etc.
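
To illustrate why the choice of timer matters: on many systems cpu_time() reports CPU time accumulated over all threads of the process, whereas OMP_GET_WTIME() reports elapsed wall-clock time. If the 1617 sec. figure above came from cpu_time(), that could explain an apparent slowdown together with a %CPU above 2000. The following is a minimal sketch, assuming an arbitrary dummy workload; the program name and the loop body are hypothetical.

```fortran
program timing_sketch                  ! hypothetical name
  use omp_lib                          ! provides omp_get_wtime, omp_get_max_threads
  implicit none
  integer, parameter :: n = 50000000   ! arbitrary amount of dummy work
  integer :: i
  real :: c0, c1                       ! CPU time stamps
  real(8) :: w0, w1, total             ! wall-clock stamps, dummy result

  total = 0.0d0
  call cpu_time(c0)
  w0 = omp_get_wtime()

  !$OMP PARALLEL DO REDUCTION(+:total)
  do i = 1, n
    total = total + sqrt(real(i, 8))
  end do
  !$OMP END PARALLEL DO

  w1 = omp_get_wtime()
  call cpu_time(c1)

  print *, "threads available   : ", omp_get_max_threads()
  print *, "cpu_time difference : ", c1 - c0
  print *, "omp_get_wtime diff. : ", w1 - w0
  print *, "dummy result        : ", total
end program timing_sketch
```

With several threads the cpu_time difference will typically come out larger than the wall-clock difference, roughly by a factor of the number of busy threads, although the exact behaviour of cpu_time() is processor-dependent.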

Fernando.

Re: help with OMP fortran

Post by kon » Wed Jun 13, 2012 8:26 pm

Thanks guys,

Your support is greatly appreciated.
Here is what I’ve done lately:
I’ve used both cpu_time() and OMP_GET_WTIME() functions to measure the time.
The first call starts before the main loop:

call cpu_time(start_time)
t1 = OMP_GET_WTIME()
!$OMP PARALLEL DO PRIVATE(i, j, k, Rhs, Sol, v) SHARED(m,n,Ped,D,indx1)
do k = 1, m

and the second call is right at the end, so the timing includes writing the matrix to disk.

The results:


4483144 animals in the pedigree
second reading : 4483144 animals in the pedigree
Commencing Colleau calculations for sub-matrix :
Number of animals in the subset matrix of A is : 2388
Elapsed time taken to calculate a subset of A is : 1846.7913
Elapsed time taken to calculate a subset of A with OMP_GET_WTIME() is : 95.214734000037424

So it works! I am happy. Not really, because I need the MKL libraries from ifort for further use (inversion of large matrices). I posted an enquiry to the ifort people and am waiting for an answer. The specs of the current system are:

OS = CENTOS v5.5
RAM = 512GB @ 1066MHz
DISK = 4.5TB
CPU = 16 logical cores (8 physical and 8 hyper-threaded)

The ifort version keeps crashing. So frustrating. Thanks again.

Regards,

Kon

Re: help with OMP fortran

Post by ftinetti » Thu Jun 14, 2012 3:47 am

Hi Kon,

So it works! I am happy. Not really, because I need the MKL libraries from ifort for further use (inversion of large matrices). I posted an enquiry to the ifort people and am waiting for an answer.

It's good to know that at least this part is working. As far as I know, compiler people usually reply quickly. Please let us know what Intel replies (or, better, please post the URL of your post).

Just curious: would you post the code in which you do the inversion of large matrices?

Fernando.

Re: help with OMP fortran

Post by kon » Thu Jun 14, 2012 5:41 pm

Hi Fernando,

Thanks for the reply. I’ve posted my problem to ifort people but there is no reply so far. I am not sure however whether I posted it at the right place. I’ve posted it to Intel® Software Network: Forums>>2011 Apprentice (Entry) Level Problems>>problem with ifort v12.1.0 and OMP.

Below is the code that uses LAPACK routines to calculate the eigenvalues and eigenvectors of large matrices, and also the LU factorisation (GETRF) and inversion (GETRI) of large matrices. Note that all integers are 8 bytes. My experience with both routines on 50K matrices is: inversion 1.6 hours; eigenproblem 30 minutes! You can manage the number of threads used by specifying -par-num-threads=? (you probably know that).
Please advise if I’ve chosen the right place to ask about my problem.

Regards,

Kon


program test_lapack
! =============================================================================
!
! GETRI Example.
! ==============
! Program computes the inverse of a real symmetric matrix A:
! Original matrix A
! =================
!
! 5.00 7.00 6.00 5.00
! 7.00 10.00 8.00 7.00
! 6.00 8.00 10.00 9.00
! 5.00 7.00 9.00 10.00
! LU factorisation (with partial pivoting) using LAPACK GETRF routine
! 7.00 10.00 8.00 7.00
! 0.86 -0.57 3.14 3.00
! 0.71 0.25 2.50 4.25
! 0.71 0.25 -0.20 0.10
! Matrix inverse using LAPACK DGETRI routine
! 68.00 -41.00 -17.00 10.00
! -41.00 25.00 10.00 -6.00
! -17.00 10.00 5.00 -3.00
! 10.00 -6.00 -3.00 2.00
!
! DSYEV Example.
! ==============
!
! Program computes all eigenvalues and eigenvectors of a real symmetric
! matrix A:
!
! 1.96 -6.49 -0.47 -7.20 -0.65
! -6.49 3.80 -6.39 1.50 -6.34
! -0.47 -6.39 4.17 -1.51 2.67
! -7.20 1.50 -1.51 5.70 1.80
! -0.65 -6.34 2.67 1.80 -7.10
!
! Description.
! ============
!
! The routine computes all eigenvalues and, optionally, eigenvectors of an
! n-by-n real symmetric matrix A. The eigenvector v(j) of A satisfies
!
! A*v(j) = lambda(j)*v(j)
!
! where lambda(j) is its eigenvalue. The computed eigenvectors are
! orthonormal.
!
! Example Program Results.
! ========================
!
! DSYEV Example Program Results
!
! Eigenvalues
! -11.07 -6.23 0.86 8.87 16.09
!
! Eigenvectors (stored columnwise)
! -0.30 -0.61 0.40 -0.37 0.49
! -0.51 -0.29 -0.41 -0.36 -0.61
! -0.08 -0.38 -0.66 0.50 0.40
! 0.00 -0.45 0.46 0.62 -0.46
! -0.80 0.45 0.17 0.31 0.16
! =============================================================================
!
! .. Parameters ..


USE mkl95_LAPACK, ONLY: GETRF, GETRI, SYTRF, SYTRI, POTRF, POTRI, SYEV

implicit none
integer(8) :: i, j, k, l, m, n, m1, n1, LDA, INFO, LWORK, LWMAX
real(8), dimension(:,:), allocatable :: A
real(8), dimension(:), allocatable :: W, WORK
integer(8), dimension(:), allocatable :: IPIV, IWORK
!
N = 5
LDA = N
LWMAX = 1000
!
ALLOCATE(A( LDA, N ), W( N ), WORK( LWMAX ) )

A(1,1)= 1.96;A(1,2)= -6.49;A(1,3)= -0.47;A(1,4)=-7.20;A(1,5)= -0.65
A(2,1)= -6.49;A(2,2)= 3.80;A(2,3)= -6.39;A(2,4)= 1.50;A(2,5)= -6.34
A(3,1)= -0.47;A(3,2)= -6.39;A(3,3)= 4.17;A(3,4)=-1.51;A(3,5)= 2.67
A(4,1)= -7.20;A(4,2)= 1.50;A(4,3)= -1.51;A(4,4)= 5.70;A(4,5)= 1.80
A(5,1)= -0.65;A(5,2)= -6.34;A(5,3)= 2.67;A(5,4)= 1.80;A(5,5)= -7.10

!
! .. Executable Statements ..
WRITE(*,*)'DSYEV Example Program Results'
!
! Query the optimal workspace.
!
LWORK = -1
CALL DSYEV( 'V', 'L', N, A, N, W, WORK, LWORK, INFO )

LWORK = MIN( LWMAX, INT( WORK( 1 ), 8 ) )

!
! Solve eigenproblem.
!
CALL DSYEV( 'V', 'L', N, A, LDA, W, WORK, LWORK, INFO )

!
! Check for convergence.
!
IF( INFO.GT.0 ) THEN
WRITE(*,*)'The algorithm failed to compute eigenvalues.'
STOP
END IF
!
! Print eigenvalues.
!
print*, " Eigenvalues : "
write(*,fmt='(20f10.2)') W(1:N)
!
! Print eigenvectors.
!
print*, " Eigenvectors : "
do i = 1, n
write(*,fmt='(20f10.2)') A(i,:)
enddo

!
! End of DSYEV Example.
!
! Start GETRI Example.
DEALLOCATE(A, W, WORK )
n = 4; m = 4
LDA = n
LWORK = n*64

allocate(A(LDA,n), IPIV(n), WORK(LWORK), IWORK(n) )


! define lower half of matrix
A(1,1)= 5;
A(2,1)=7; A(2,2)=10
A(3,1)=6; A(3,2)=8; A(3,3)= 10
A(4,1)=5; A(4,2)=7; A(4,3)=9; A(4,4)= 10
! define upper half by symmetry
do i = 1, n
do j = i+1, n
A(i,j)=A(j,i)
end do
end do
print*, " Original matrix A "
do i = 1, n
write(*,fmt='(13f10.2)') A(i,:)
enddo


! note here that you can use both calls (getrf is the Fortran 95 interface; dgetrf is the Fortran 77 one)
! LU factorisation with partial pivoting (GETRF is not a Cholesky routine; that would be POTRF)
call getrf( A, IPIV, INFO )
! call dgetrf( m, n, a, lda, ipiv, info )
print*, " After LU factorisation parameter info is : ", info
do i = 1, n
write(*,fmt='(13f10.2)') A(i,:)
enddo

! inverse of A
! note here that you can use both calls (getri is fortran 95; dgetri is fortran 77)
call getri( A, IPIV, INFO )
! call dgetri( n, a, lda, ipiv, work, lwork, info )
IF( INFO.GT.0 ) THEN
WRITE(*,*)'The routine failed to compute matrix inverse! '
STOP
END IF
print*, " Matrix inverse using GETRI after LU factorisation using GETRF routine "
do i = 1, n
write(*,fmt='(13f10.2)') A(i,:)
enddo


! End GETRI Example.


end program test_lapack


The compilation:

/opt/intel/bin/ifort test_lapack.f90 -L/opt/intel/composerxe-2011.2.137/mkl/lib/intel64 -I/opt/intel/composerxe-2011.2.137/mkl/include/intel64/ilp64 -lmkl_blas95_ilp64 -lmkl_lapack95_ilp64 -Wl,--start-group /opt/intel/composerxe-2011.2.137/mkl/lib/intel64/libmkl_intel_ilp64.a /opt/intel/composerxe-2011.2.137/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/composerxe-2011.2.137/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -openmp -Bstatic -o test_lapack
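
Since GETRF/GETRI perform an LU-based inversion, a Cholesky-based alternative for symmetric positive definite matrices can be sketched with the Fortran 77 LAPACK routines DPOTRF and DPOTRI. This is only a sketch under stated assumptions: it uses default (4-byte) integers with an LP64 LAPACK build rather than the ILP64 setup above, the program name is hypothetical, and the expected inverse is the one listed in the header comments of test_lapack.

```fortran
program potri_sketch                   ! hypothetical name
  implicit none
  integer, parameter :: n = 4
  integer :: i, j, info
  real(8) :: A(n,n), Ainv_ref(n,n)
  external :: dpotrf, dpotri           ! Fortran 77 LAPACK routines

  ! The 4x4 example matrix from test_lapack (symmetric, so the
  ! column-major fill order does not matter).
  A = reshape([5d0,  7d0,  6d0,  5d0, &
               7d0, 10d0,  8d0,  7d0, &
               6d0,  8d0, 10d0,  9d0, &
               5d0,  7d0,  9d0, 10d0], [n, n])

  ! Reference inverse as listed in the test_lapack header comments.
  Ainv_ref = reshape([ 68d0, -41d0, -17d0, 10d0, &
                      -41d0,  25d0,  10d0, -6d0, &
                      -17d0,  10d0,   5d0, -3d0, &
                       10d0,  -6d0,  -3d0,  2d0], [n, n])

  ! Cholesky factorisation A = L*L**T, using the lower triangle only.
  call dpotrf('L', n, A, n, info)
  if (info /= 0) error stop "dpotrf failed (matrix not positive definite?)"

  ! Inverse from the Cholesky factor; only the lower triangle is returned.
  call dpotri('L', n, A, n, info)
  if (info /= 0) error stop "dpotri failed"

  ! Fill the upper triangle by symmetry.
  do j = 2, n
    do i = 1, j - 1
      A(i,j) = A(j,i)
    end do
  end do

  if (maxval(abs(A - Ainv_ref)) > 1.0d-8) error stop "inverse differs from reference"
  do i = 1, n
    write(*, fmt='(4f10.2)') A(i,:)
  end do
end program potri_sketch
```

Linking depends on the installation; with a reference LAPACK something like gfortran potri_sketch.f90 -llapack is a common form, but that is an assumption about the local setup. Whether the Cholesky route is applicable depends on the matrix being symmetric positive definite.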

Re: help with OMP fortran

Post by ftinetti » Fri Jun 15, 2012 5:44 am

Hi Kon,

I’ve posted my problem to ifort people but there is no reply so far. I am not sure however whether I posted it at the right place. I’ve posted it to Intel® Software Network: Forums>>2011 Apprentice (Entry) Level Problems>>problem with ifort v12.1.0 and OMP.

Hmmm... unfortunately I don't have any experience with Intel forums, but I've heard they reply rather quickly. Also, note that in the code you posted there are details too complicated and unrelated to the OpenMP problem, e.g. using files "A.hol" and "Asubset.hol". I had to exclude almost all of that stuff to see what happens with OpenMP on your code. I suggest you use the code I posted so that the Intel people do not have to do the same task I've already done.

Below is the code that uses LAPACK routines to calculate the eigenvalues and eigenvectors of large matrices, and also the LU factorisation (GETRF) and inversion (GETRI) of large matrices.

See whether you can:
1) Use MKL with gcc. Take a look at http://software.intel.com/en-us/article ... uirements/
2) Use the LAPACK reference implementation at http://www.netlib.org/lapack/. Since LAPACK heavily depends on BLAS for performance (and actual implementation too) take a look at the comments in the URL about installing along with ATLAS, for example.
3) Use ACML, which is similar to MKL but provided by AMD for AMD processors... yes, I know this does not sound right, but I think it is worth trying anyway.
4) Take a look at http://www.oracle.com/technetwork/serve ... index.html which is very good too, I think.

HTH,

Fernando.
