[Omp] Newbie in deep end >> Matrix multiplier > NeedtoaddOpenMPdirectives

Haab, Grant grant.haab at intel.com
Fri Apr 8 07:10:17 PDT 2005


Colm,

On closer examination of the code, you also need to make "temp" private.
That mainly affects correctness rather than performance, though.
I would first make sure the program gets correct answers, to verify
that your parallelization is correct, before worrying about performance
at all. On the Intel web site (www.intel.com) there is a software
product called "Intel (R) Thread Checker" that automatically finds races
and other coding errors in OpenMP programs.
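
To make the scoping concrete, here is a minimal sketch of the explicit
private clause. It assumes the non-transposed, column-major case only and
leaves out the beta scaling; the function name gemm_update and its
by-value parameters are hypothetical, not taken from your code.

```c
#include <assert.h>

/* Hypothetical helper (not the original routine):
 * C = alpha*A*B + C for column-major, non-transposed
 * A (m x k), B (k x n) and C (m x n). */
void gemm_update(int m, int n, int k, double alpha,
                 const double *A, int lda,
                 const double *B, int ldb,
                 double *C, int ldc)
{
    int i, j, l;
    double temp;

    /* Leading dimensions must cover the row counts. */
    assert(lda >= m && ldb >= k && ldc >= m);

    /* j is made private automatically because it is the parallel loop
     * index. i, l and temp are declared outside the loop, so they must
     * be listed explicitly or every thread would share them. */
    #pragma omp parallel for private(i, l, temp)
    for (j = 0; j < n; ++j) {
        for (l = 0; l < k; ++l) {
            temp = alpha * B[l + j * ldb];
            for (i = 0; i < m; ++i)
                C[i + j * ldc] += temp * A[i + l * lda];
        }
    }
}
```

Each thread writes only its own columns of C, so no further
synchronization should be needed in this sketch.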

If the answers are correct and it is still slow, then there may be an
overhead problem with loops that are too short, or that do not
parallelize well because of memory system bandwidth. For instance, if
beta is not 1 in the loop below, it is unlikely you will get much
scaling from matrix initialization or simple matrix scaling, because
of memory bandwidth limitations.  I would suggest parallelizing only the
third loop nest (three deep) in the example below, by moving the
if-then-else out of the j-loop.  But this is just a guess, since I don't
have the full code.  Additionally, you could try a bigger matrix or
adjust the loop scheduling type (see the "schedule" clause in the OpenMP
specifications) to see if that helps performance.
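
A rough sketch of that restructuring, under the same caveat that I don't
have the full code: the beta handling runs as a separate serial pass, and
only the three-deep nest carries the directive. The function name
gemm_split and its by-value parameters are hypothetical, only the
non-transposed column-major case is shown, and schedule(static) is just
one choice to experiment with.

```c
#include <assert.h>

/* Hypothetical sketch: C = alpha*A*B + beta*C, column-major,
 * no transposes. A is m x k, B is k x n, C is m x n. */
void gemm_split(int m, int n, int k, double alpha,
                const double *A, int lda,
                const double *B, int ldb,
                double beta, double *C, int ldc)
{
    int j;

    assert(lda >= m && ldb >= k && ldc >= m);

    /* Scaling pass hoisted out of the parallel loop and left serial,
     * since it is likely limited by memory bandwidth. */
    if (beta == 0.0) {
        for (j = 0; j < n; ++j) {
            int i;
            for (i = 0; i < m; ++i)
                C[i + j * ldc] = 0.0;
        }
    } else if (beta != 1.0) {
        for (j = 0; j < n; ++j) {
            int i;
            for (i = 0; i < m; ++i)
                C[i + j * ldc] *= beta;
        }
    }

    /* Only the three-deep nest is parallelized. The schedule clause is
     * the knob to vary when tuning. */
    #pragma omp parallel for schedule(static)
    for (j = 0; j < n; ++j) {
        int i, l;       /* declared inside the loop body, so private */
        double temp;
        for (l = 0; l < k; ++l) {
            if (B[l + j * ldb] != 0.0) {
                temp = alpha * B[l + j * ldb];
                for (i = 0; i < m; ++i)
                    C[i + j * ldc] += temp * A[i + l * lda];
            }
        }
    }
}
```

Whether the serial scaling pass actually helps would have to be measured.
Swapping schedule(static) for schedule(dynamic) or schedule(guided) is a
one-word change if you want to compare them.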

- Grant

-----Original Message-----
From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org] On Behalf
Of Bono Mitch
Sent: Thursday, April 07, 2005 6:30 PM
To: Yuan.Lin at Sun.COM
Cc: Omp at openmp.org
Subject: Re: [Omp] Newbie in deep end >> Matrix multiplier >
NeedtoaddOpenMPdirectives

Not on Sun, sorry - working on my own PC in a Linux env using the Intel
compiler.

I took your advice Grant and I understand why that is, but it's still
slower.

Can you tell me from the code I gave you if that is enough OpenMP - I
mean, is there much more I could do to it?

thanks,

Colly.


>
>Mitch,
>
>Grant is right. By default, 'i' and 'l' are shared among the threads
>and you will run into a data race condition.
>
>If you are compiling on the Sun Solaris platform, you can use the
>-xvpara option to watch for such errors.
>
>For example,
>
> >cat -n t.c
>      1	int main()
>      2	{
>      3	    int i,j;
>      4	    int a[100][100];
>      5
>      6	#pragma omp parallel for
>      7	    for (i=1; i<100; i++)
>      8	        for (j=1; j<100; j++)
>      9	            a[i][j] = 0;
>     10	}
> >cc -xopenmp -xO3 -xvpara -c t.c
>"t.c", line 6: Warning: inappropriate scoping
>	variable 'j' may be scoped inappropriately as 'shared'
>	. read at line 8 and write at line 8 may cause data race
>
>
>Regards,
>
>Yuan
>
>
>
>
>
>Haab, Grant wrote:
> > Mitch,
> >
> > You must declare i and l private for this to work:
> >
> >
> > #pragma omp parallel for private(i,l)
> > for (j = 0; j < *n_cols; ++j) //col
> > {
> > ...
> >
> >
> > That should fix the problem of slower execution.  As written below,
> > all the threads are using the same iteration variables for the inner
> > loops, causing incorrect results and bad performance.  The j index is
> > made private for you since it is the parallel loop index.  The rest
> > you must make private explicitly, OR declare inside the j-loop:
> >
> >
> > #pragma omp parallel for
> > for (j = 0; j < *n_cols; ++j) //col
> > {
> >     int i, l; 	// this makes i & l private
> >
> > ...
> >
> >
> >
> > - Grant
> >
> > -----Original Message-----
> > From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org] On Behalf
> > Of Bono Mitch
> > Sent: Thursday, April 07, 2005 4:52 PM
> > To: Meadows, Lawrence F; Omp at openmp.org
> > Subject: RE: [Omp] Newbie in deep end >> Matrix multiplier > Need to
> > addOpenMPdirectives
> >
> > Ok...
> > Here's what I've done, but I don't think it's right, simply because
> > it's slower when set to 2 threads than it is with one.
> > My project is based on the standard O(n^3) algorithm so I'm not
> > allowed to use the auto blocking or any of the other multiplication
> > methods. Sorry, I should have told you this already. Essentially I'm
> > using the DGEMM Fortran routine > http://www.netlib.org/blas/dgemm.f
> >
> >
> >
> >
> >
> > #pragma omp parallel for
> > for (j = 0; j < *n_cols; ++j) //col
> > {
> > 	if (*beta == 0.)
> > 	{
> > 		for(i = 0; i < *m_rows; ++i) //row
> > 		{
> > 			matrixC[i + j * *ldc] = 0.;
> > 		}
> > 	}
> > 	else if (*beta != 1.)
> > 	{
> > 		for(i =0; i < *m_rows; ++i)//row
> > 		{
> > 			matrixC[i + j * *ldc] = *beta * matrixC[i + j * *ldc];
> > 		}
> > 	}
> > 	for (l = 0; l < *k_common; ++l)
> > 	{
> > 		if (matrixB[l + j * *ldb] != 0.)
> > 		{
> > 			temp = *alpha * matrixB[l + j * *ldb];
> > 			for(i = 0; i < *m_rows; ++i)
> > 			{
> > 				matrixC[i + j * *ldc] += temp * matrixA[i + l * *lda];
> > 			}
> > 		}
> > 	}
> > }
> > ...
> > more similar loops depending on whether the matrices A or B are to be
> > transposed.
> >
> > What do you think?
> >
> > Thanks,
> >
> > Colm
> >
> > >
> > >Non-trivial.
> > >
> > >Here's matrix multiply in Fortran: B(n,l) * C(l,m) = A(n,m)
> > >(actually, this adds the product of B and C to A, but
> > >this is the important bit):
> > >
> > >do i = 1,n
> > >    do j = 1,m
> > >       do k = 1,l
> > >          a(i,j) = a(i,j) + b(i,k) * c(k,j)
> > >       enddo
> > >    enddo
> > >enddo
> > >
> > >You can parallelize any loop that doesn't carry a dependence.
> > >So that would be either outer loop (extra credit: what OpenMP
> > >directive would you use to parallelize the innermost loop?)
> > >The outermost loop is the best because of granularity, considering
> > >only parallelism. So you would do:
> > >
> > >!$omp parallel do
> > >do i = 1,n
> > >    do j = 1,m
> > >       do k = 1,l
> > >          a(i,j) = a(i,j) + b(i,k) * c(k,j)
> > >       enddo
> > >    enddo
> > >enddo
> > >!$omp end parallel do
> > >
> > >The last directive is not required.
> > >
> > >In C, you would say:
> > >
> > >#pragma omp parallel for
> > >
> > >and translate the Fortran to C, and probably put curlys around
> > >the whole mess for clarity.
> > >
> > >So the homework question is why is this a really bad way to
> > >do matrix multiply, even ignoring the OpenMP directives? Once
> > >you've figured that out, which I presume that you have, and
> > >presumably discovered tiling, then you have to decide which
> > >loop or loops in the tiled loop nest to parallelize. I suspect
> > >that the answer is the outermost block loop, but I have now
> > >exhausted the limits of my knowledge of matrix multiply for
> > >this semester.
> > >
> > >Regards,
> > >
> > >Larry Meadows
> > >
> > > >-----Original Message-----
> > > >From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org]
> > > >On Behalf Of Bono Mitch
> > > >Sent: Thursday, April 07, 2005 12:30 PM
> > > >To: Omp at openmp.org
> > > >Subject: [Omp] Newbie in deep end >> Matrix multiplier > Need
> > > >to add OpenMPdirectives
> > > >
> > > >Hi,
> > > >
> > > >I'm working on a matrix-matrix multiplier in C for my
> > > >dissertation. It is based on the DGEMM Fortran routine. The
> > > >multiplier works great but I need to add OpenMP directives and
> > > >I'm totally new to it so don't really have a clue where. I've
> > > >looked at the FAQs and the specification examples but still
> > > >don't know where to add them.
> > > >
> > > >Can any of you point me to some examples where OpenMP has been
> > > >used in this manner?
> > > >
> > > >Thanks for your help,
> > > >
> > > >Colly Mitch.
> > > >
> > > >_______________________________________________
> > > >Omp mailing list
> > > >Omp at openmp.org
> > > >http://openmp.org/mailman/listinfo/omp_openmp.org
> > > >
> >
>
