[Omp] slow performance
andrew wang
mcwang88 at hotmail.com
Thu Dec 16 18:43:56 PST 2004
>From: Jose Luis Gordillo Ruiz <jlgr at super.unam.mx>
>To: andrew wang <mcwang88 at hotmail.com>
>CC: anmey at rz.rwth-aachen.de, Herbert.Fruchtl at uk.fujitsu.com,omp at openmp.org
>Subject: Re: [Omp] slow performance
>Date: Thu, 16 Dec 2004 19:59:05 -0600 (CST)
>
>
> >
> >
> > #pragma omp parallel private (id, jj, kk, x, sum)
> > {
> >
> > id = omp_get_thread_num();
> >
> > for (jj=id;jj<3; jj=jj+omp_threads )
> >
> > ...
> >
> > are equivalent to "omp parallel for"
> >
> but works "efficiently" only if you have 4 or fewer threads
Here I hard-coded the loop bound to 3. But if it is a variable (10, 20, 100,
etc.), shouldn't this piece of code also work "efficiently"? If the number of
OpenMP threads is set to 2, one thread handles iterations 0, 2, 4, ... and the
other handles 1, 3, 5, etc.
So in my understanding it is totally equivalent to "omp parallel for".
Please correct me if I am wrong!
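
To make the comparison concrete, here is a minimal sketch in C of the two
variants side by side. The function names fill_manual and fill_worksharing
and the owner array are made up for illustration; they are not from the
original program. As far as I know, a plain "omp parallel for" leaves the
schedule up to the implementation, so the explicit schedule(static, 1)
clause is what asks for the same round-robin assignment as the hand-written
loop. The serial fallbacks are only there so the sketch also builds without
OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hand-written cyclic distribution: thread id takes iterations
 * id, id + T, id + 2T, ... where T is the team size.
 * owner[jj] records which thread executed iteration jj. */
void fill_manual(int n, int *owner)
{
    #pragma omp parallel
    {
        /* Declared inside the region, so each thread has its own copy. */
        int id = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        for (int jj = id; jj < n; jj += nthreads)
            owner[jj] = id;
    }
}

/* The same iteration-to-thread mapping requested from the runtime:
 * chunks of one iteration handed out round-robin by thread number. */
void fill_worksharing(int n, int *owner)
{
    #pragma omp parallel for schedule(static, 1)
    for (int jj = 0; jj < n; jj++)
        owner[jj] = omp_get_thread_num();
}
```

With two threads, both functions should record owner 0 for the even
iterations and owner 1 for the odd ones. Every iteration is covered exactly
once for any n and any team size, but when n is smaller than the team size
some threads get no work at all.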
> > >>Also be careful with accumulating your result. The statement
> > >>
> > >> pi += sum;
> > >>
> > >>needs an "atomic" pragma. Or better still, specify pi as "accumulate"
> > >>in your omp pragma.
> >
> >
> > Yes, I should do that. But for this test program, I am simply ignoring
> > it for the moment.
> >
>
> That update of pi could be another source of slow performance, because
> of cache effects.
>
>
> regards,
> José Luis Gordillo
> Departamento de Supercómputo - UNAM
>
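
For the accumulation into pi, the suggestion above to specify pi as
"accumulate" most likely refers to the OpenMP reduction clause. A minimal
sketch of that, under assumptions: the function name estimate_pi and the
midpoint-rule integrand 4/(1+x^2) are chosen here for illustration and may
not match the original test program. With reduction(+:pi) each thread adds
into its own private copy and the copies are combined once at the end, so
the threads do not all write to the same shared variable inside the loop.

```c
/* Midpoint-rule estimate of pi as the integral of 4/(1+x^2) over [0,1].
 * The integrand is an assumption made for this sketch. */
double estimate_pi(int n)
{
    double pi = 0.0;
    double h = 1.0 / n;   /* width of one subinterval */

    /* Each thread accumulates into a private pi; OpenMP sums the
     * private copies into the shared pi when the loop finishes. */
    #pragma omp parallel for reduction(+:pi)
    for (int i = 0; i < n; i++) {
        double x = (i + 0.5) * h;   /* midpoint of subinterval i */
        pi += 4.0 / (1.0 + x * x);
    }

    return pi * h;
}
```

The alternative mentioned in the thread, putting "#pragma omp atomic"
directly before "pi += sum;", also removes the race, but every update then
goes to the one shared variable, which is the kind of cache traffic the
reply above warns about.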