[Omp] Newbie in deep end >> Matrix multiplier > Need to add OpenMP directives

Haab, Grant grant.haab at intel.com
Thu Apr 7 15:23:29 PDT 2005


Mitch,

You must declare i and l private for this to work (and temp as well,
since every thread assigns it inside the l-loop):


#pragma omp parallel for private(i,l,temp)
for (j = 0; j < *n_cols; ++j) //col
{
...


That should fix the problem of slower execution.  As written below,
all the threads are using the same iteration variables for the inner
loops, causing incorrect results and bad performance.  The j index is
made private for you since it is the parallel loop index.  The rest you
must make private explicitly, OR declare inside the j-loop:


#pragma omp parallel for
for (j = 0; j < *n_cols; ++j) //col
{
    int i, l;     // this makes i & l private
    double temp;  // likewise for temp (assuming it is a double, as in DGEMM)

...
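
For reference, here is a minimal sketch of the whole no-transpose loop
with the inner variables declared inside the parallel loop. It is not
the original routine: the function names dgemm_nn and check_dgemm_nn
are hypothetical, the scalars are passed by value rather than through
pointers, and temp is assumed to be a double that also needs to be
per-thread. Built without OpenMP support the pragma is ignored and the
loop simply runs serially.

```c
#include <assert.h>

/* Sketch: C := alpha*A*B + beta*C, column-major, A and B not transposed.
   Hypothetical signature (scalars by value), not the original routine. */
void dgemm_nn(int m_rows, int n_cols, int k_common, double alpha,
              const double *matrixA, int lda,
              const double *matrixB, int ldb,
              double beta, double *matrixC, int ldc)
{
    int j;  /* parallel loop index: made private automatically */

    assert(lda >= m_rows && ldb >= k_common && ldc >= m_rows);

    #pragma omp parallel for
    for (j = 0; j < n_cols; ++j) {
        /* Declared inside the loop body, so each thread has its own copy. */
        int i, l;
        double temp;  /* assumed to be a double */

        if (beta == 0.) {
            for (i = 0; i < m_rows; ++i)
                matrixC[i + j * ldc] = 0.;
        } else if (beta != 1.) {
            for (i = 0; i < m_rows; ++i)
                matrixC[i + j * ldc] = beta * matrixC[i + j * ldc];
        }
        for (l = 0; l < k_common; ++l) {
            if (matrixB[l + j * ldb] != 0.) {
                temp = alpha * matrixB[l + j * ldb];
                for (i = 0; i < m_rows; ++i)
                    matrixC[i + j * ldc] += temp * matrixA[i + l * lda];
            }
        }
    }
}

/* Returns 1 if dgemm_nn agrees with a plain triple loop on a small case. */
int check_dgemm_nn(void)
{
    enum { M = 5, N = 4, K = 3 };
    double A[M * K], B[K * N], C[M * N], ref[M * N];
    int i, j, l;

    for (i = 0; i < M * K; ++i) A[i] = 0.5 * i - 1.0;
    for (i = 0; i < K * N; ++i) B[i] = 0.25 * i + 2.0;
    for (i = 0; i < M * N; ++i) C[i] = ref[i] = 1.0 + i;

    /* Reference: ref := 2*A*B + 3*ref */
    for (j = 0; j < N; ++j) {
        for (i = 0; i < M; ++i) {
            double sum = 0.;
            for (l = 0; l < K; ++l)
                sum += A[i + l * M] * B[l + j * K];
            ref[i + j * M] = 2.0 * sum + 3.0 * ref[i + j * M];
        }
    }

    dgemm_nn(M, N, K, 2.0, A, M, B, K, 3.0, C, M);

    for (i = 0; i < M * N; ++i) {
        double d = C[i] - ref[i];
        if (d < 0.) d = -d;
        if (d > 1e-9) return 0;
    }
    return 1;
}
```

Each thread works on its own columns of matrixC, so no two threads
write the same element; only the scratch variables had to be made
per-thread.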



- Grant

-----Original Message-----
From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org] On Behalf
Of Bono Mitch
Sent: Thursday, April 07, 2005 4:52 PM
To: Meadows, Lawrence F; Omp at openmp.org
Subject: RE: [Omp] Newbie in deep end >> Matrix multiplier > Need to
add OpenMP directives

Ok...
Here's what I've done, but I don't think it's right, simply because it's
slower when set to 2 threads than it is with one.
My project is based on the standard O(n^3) algorithm so I'm not allowed
to use the auto blocking or any of the other multiplication methods.
Sorry, I should have told you this already. Essentially I'm using the
DGEMM Fortran routine > http://www.netlib.org/blas/dgemm.f





#pragma omp parallel for
for (j = 0; j < *n_cols; ++j) //col
{
	if (*beta == 0.)
	{
		for(i = 0; i < *m_rows; ++i) //row
		{
			matrixC[i + j * *ldc] = 0.;
		}
	}
	else if (*beta != 1.)
	{
		for(i =0; i < *m_rows; ++i)//row
		{
			matrixC[i + j * *ldc] = *beta * matrixC[i + j * *ldc];
		}
	}
	for (l = 0; l < *k_common; ++l)
	{
		if (matrixB[l + j * *ldb] != 0.)
		{
			temp = *alpha * matrixB[l + j * *ldb];
			for(i = 0; i < *m_rows; ++i)
			{
				matrixC[i + j * *ldc] += temp * matrixA[i + l * *lda];
			}
		}
	}
}
...
more similar loops depending on whether the matrices A or B are to be 
transposed.

What do you think?

Thanks,

Colm

>
>Non-trivial.
>
>Here's matrix multiply in Fortran: B(n,l) * C(l,m) = A(n,m)
>(actually, this adds the product of B and C to A, but
>this is the important bit):
>
>do i = 1,n
>    do j = 1,m
>       do k = 1,l
>          a(i,j) = a(i,j) + b(i,k) * c(k,j)
>       enddo
>     enddo
>enddo
>
>You can parallelize any loop that doesn't carry a dependence.
>So that would be either outer loop (extra credit: what OpenMP
>directive would you use to parallelize the innermost loop?)
>The outermost loop is the best because of granularity, considering
>only parallelism. So you would do:
>
>!$omp parallel do
>do i = 1,n
>    do j = 1,m
>       do k = 1,l
>          a(i,j) = a(i,j) + b(i,k) * c(k,j)
>       enddo
>     enddo
>enddo
>!$omp end parallel do
>
>The last directive is not required.
>
>In C, you would say:
>
>#pragma omp parallel for
>
>and translate the fortran to C, and probably put curlys around
>the whole mess for clarity.
>
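
A rough C translation of the loop nest above might look like the
following sketch. The function names and the row-major flat-array
layout are assumptions made for illustration, not something from the
thread. The second function shows one possible answer to the extra
credit question: every k iteration updates the same a(i,j), so the
innermost loop needs a reduction clause to be run in parallel.

```c
#include <assert.h>

/* a (n x m) += b (n x l) * c (l x m); all matrices row-major (assumed). */
void matmul_outer(int n, int m, int l,
                  double *a, const double *b, const double *c)
{
    int i;  /* parallel loop index: private automatically */

    assert(n >= 0 && m >= 0 && l >= 0);

    #pragma omp parallel for
    for (i = 0; i < n; ++i) {
        /* j and k are declared inside the loop, so they are private too. */
        for (int j = 0; j < m; ++j) {
            for (int k = 0; k < l; ++k) {
                a[i * m + j] += b[i * l + k] * c[k * m + j];
            }
        }
    }
}

/* Same product, but parallelizing only the innermost loop.
   The k-loop carries a dependence on the running sum, hence reduction. */
void matmul_inner_reduction(int n, int m, int l,
                            double *a, const double *b, const double *c)
{
    assert(n >= 0 && m >= 0 && l >= 0);

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            double sum = 0.0;
            int k;

            #pragma omp parallel for reduction(+:sum)
            for (k = 0; k < l; ++k) {
                sum += b[i * l + k] * c[k * m + j];
            }
            a[i * m + j] += sum;
        }
    }
}

/* Returns 1 if both variants produce the same result on a small case. */
int check_matmul_variants(void)
{
    enum { N = 6, M = 5, L = 7 };
    double a1[N * M], a2[N * M], b[N * L], c[L * M];

    for (int i = 0; i < N * L; ++i) b[i] = (double)(i % 7) - 3.0;
    for (int i = 0; i < L * M; ++i) c[i] = (double)(i % 5) - 2.0;
    for (int i = 0; i < N * M; ++i) a1[i] = a2[i] = (double)i;

    matmul_outer(N, M, L, a1, b, c);
    matmul_inner_reduction(N, M, L, a2, b, c);

    for (int i = 0; i < N * M; ++i) {
        double d = a1[i] - a2[i];
        if (d < 0.0) d = -d;
        if (d > 1e-9) return 0;
    }
    return 1;
}
```

The reduction version starts a parallel loop for every element of a,
so it mainly illustrates the directive; the outer-loop version gives
each thread far more work per parallel region.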
>So the homework question is why is this a really bad way to
>do matrix multiply, even ignoring the OpenMP directives? Once
>you've figured that out, which I presume that you have, and
>presumably discovered tiling, then you have to decide which
>loop or loops in the tiled loop nest to parallelize. I suspect
>that the answer is the outermost block loop, but I have now
>exhausted the limits of my knowledge of matrix multiply for
>this semester.
>
>Regards,
>
>Larry Meadows
>
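
As an illustration of the tiling suggestion, here is a small sketch of
a blocked loop nest with the outermost block loop parallelized. The
tile size, the function names and the row-major layout are all
assumptions chosen for the example; a real tile size would have to be
tuned for the machine's caches.

```c
#include <assert.h>

#define TILE 32  /* hypothetical tile size, not tuned */

static int min_int(int x, int y) { return x < y ? x : y; }

/* a (n x m) += b (n x l) * c (l x m); row-major, blocked in all three loops. */
void matmul_tiled(int n, int m, int l,
                  double *a, const double *b, const double *c)
{
    int ii;  /* outermost block index: private automatically */

    assert(n >= 0 && m >= 0 && l >= 0);

    /* Each thread owns whole row blocks of a, so no two threads
       ever write the same element. */
    #pragma omp parallel for
    for (ii = 0; ii < n; ii += TILE) {
        for (int kk = 0; kk < l; kk += TILE) {
            for (int jj = 0; jj < m; jj += TILE) {
                int i_end = min_int(ii + TILE, n);
                int k_end = min_int(kk + TILE, l);
                int j_end = min_int(jj + TILE, m);

                for (int i = ii; i < i_end; ++i) {
                    for (int k = kk; k < k_end; ++k) {
                        double bik = b[i * l + k];
                        for (int j = jj; j < j_end; ++j) {
                            a[i * m + j] += bik * c[k * m + j];
                        }
                    }
                }
            }
        }
    }
}

/* Returns 1 if the tiled version matches a plain triple loop. Sizes are
   chosen so that the tiles do not divide the matrix dimensions evenly. */
int check_matmul_tiled(void)
{
    enum { N = 70, M = 45, L = 50 };
    static double a[N * M], ref[N * M], b[N * L], c[L * M];

    for (int i = 0; i < N * L; ++i) b[i] = (double)(i % 7) - 3.0;
    for (int i = 0; i < L * M; ++i) c[i] = (double)(i % 5) - 2.0;
    for (int i = 0; i < N * M; ++i) a[i] = ref[i] = (double)(i % 11);

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j)
            for (int k = 0; k < L; ++k)
                ref[i * M + j] += b[i * L + k] * c[k * M + j];

    matmul_tiled(N, M, L, a, b, c);

    for (int i = 0; i < N * M; ++i) {
        double d = a[i] - ref[i];
        if (d < 0.0) d = -d;
        if (d > 1e-9) return 0;
    }
    return 1;
}
```

Parallelizing over ii keeps the parallel region coarse, in line with
the granularity argument above, while the inner tiles are meant to
improve cache reuse.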
> >-----Original Message-----
> >From: Omp-bounces at openmp.org [mailto:Omp-bounces at openmp.org]
> >On Behalf Of Bono Mitch
> >Sent: Thursday, April 07, 2005 12:30 PM
> >To: Omp at openmp.org
> >Subject: [Omp] Newbie in deep end >> Matrix multiplier > Need
> >to add OpenMP directives
> >
> >Hi,
> >
> >I'm working on a matrix-matrix multiplier in C for my dissertation.
> >It is based on the DGEMM Fortran routine. The multiplier works great
> >but I need to add OpenMP directives and I'm totally new to it so
> >don't really have a clue where. I've looked at the FAQs and the
> >specification examples but still don't know where to add them.
> >
> >Can any of you point me to some examples where OpenMP has been used
> >in this manner?
> >
> >Thanks for your help,
> >
> >Colly Mitch.
> >
>
> >
> >_______________________________________________
> >Omp mailing list
> >Omp at openmp.org
> >http://openmp.org/mailman/listinfo/omp_openmp.org
> >
