openmp isn't improve the execution time


Postby nightwish » Mon May 20, 2013 9:31 pm

Hi guys,

Before asking anything, I want to apologize if this question sounds stupid. I only started learning OpenMP this morning. I am trying to improve the execution time of my code. This is the part of the code that can be parallelized:

Code:
#pragma omp for private(y,x,i,j)
        for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
        {
                float gx = 0.f, gy = 0.f;
                for (i = 0; i < 3; i++)
                for (j = 0; j < 3; j++)
                {
                        r = y+i-1; if(r<0)r=0; else if(r>=h)r=h-1;
                        c = x+j-1; if(c<0)c=0; else if(c>=w)c=w-1;

                        gx += src[r][c] * mx[i][j];
                        gy += src[r][c] * my[i][j];
                }
                g_mag[y][x] = hypotf(gy, gx);
                g_ang[y][x] = atan2f(gy, gx);
        }


However, OpenMP doesn't improve the execution time at all.
Execution time before using OpenMP: 0.32111 seconds
Execution time after using OpenMP: 0.32291 seconds

Am I doing something wrong? I also noticed that the inner loop contains these two lines:

Code:
gx += src[r][c] * mx[i][j];
gy += src[r][c] * my[i][j];


Do I have to use a reduction, so that the OpenMP directive becomes: #pragma omp for private(y,x,i,j) reduction(+:gx,+:gy)

Thank you
nightwish
 
Posts: 3
Joined: Mon May 20, 2013 8:20 pm

Re: openmp isn't improve the execution time

Postby rchrd » Mon May 20, 2013 10:05 pm

It would help to know what compiler and compiler options you are using, also the host platform.
Richard Friedman rchrd -at- rchrd -dot- com
openmp.org webmaster
rchrd
 
Posts: 41
Joined: Tue Apr 01, 2008 10:35 pm
Location: Oakland, California

Re: openmp isn't improve the execution time

Postby nightwish » Mon May 20, 2013 10:11 pm

rchrd wrote:It would help to know what compiler and compiler options you are using, also the host platform.


I'm using gcc 4.7.3

processor: i7-2670qm

gcc -fopenmp [filename] -o [outputFile]

Re: openmp isn't improve the execution time

Postby rchrd » Mon May 20, 2013 10:18 pm

Try setting the environment variable OMP_NUM_THREADS to some number (say 4) and try again.
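
To confirm that the setting actually takes effect, the program can report its own team size. This is only a minimal sketch: team_size and report_team_size are hypothetical helper names, not part of the original code, and the #ifdef _OPENMP guard makes the helper return 1 when the file is built without -fopenmp.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of threads a parallel region actually gets.
   Returns 1 when the file is compiled without OpenMP support. */
int team_size(void)
{
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        #pragma omp single
        n = omp_get_num_threads();   /* written by exactly one thread */
    }
#endif
    return n;
}

/* Print the team size so a run can be checked at a glance. */
void report_team_size(void)
{
    printf("parallel regions run with %d thread(s)\n", team_size());
}
```

Calling report_team_size() at the start of main and launching the program with OMP_NUM_THREADS=4 set in the environment should show whether the runtime honours the request. A reported value of 1 would suggest that OpenMP was not enabled at compile time or that the variable was not picked up.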

Re: openmp isn't improve the execution time

Postby MarkB » Tue May 21, 2013 2:15 am

Hi there,

Once you have convinced yourself you are setting the number of threads correctly....

You don't need to make gx and gy reduction variables: because they are declared within the scope of the parallel region, each thread has its own private copy of these variables.
However, r and c are shared by default, which is a bug: these need to be added to the private clause.

How are you measuring the execution time? You need to be careful that you are measuring wall clock time and not CPU time: the best way is to use the omp_get_wtime() function.
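
The difference between the two kinds of time can be made visible by measuring both around the same piece of work. The following is a sketch under stated assumptions: busy_sum is a made-up workload standing in for the real loop, wall_seconds and time_busy_sum are hypothetical helper names, and the fallback to C11 timespec_get() is only there so the file still builds without -fopenmp.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds: omp_get_wtime() with OpenMP, C11 timespec_get() without. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Made-up workload standing in for the real loop. */
double busy_sum(long n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (long i = 0; i < n; i++)
        s += (double)i * 0.5;
    return s;
}

typedef struct { double wall; double cpu; double result; } timing;

/* Measure the same call with both clocks. */
timing time_busy_sum(long n)
{
    timing t;
    clock_t c0 = clock();            /* CPU time used by the process */
    double  w0 = wall_seconds();     /* elapsed real time */
    t.result = busy_sum(n);
    t.wall = wall_seconds() - w0;
    t.cpu  = (double)(clock() - c0) / CLOCKS_PER_SEC;
    return t;
}
```

On many systems clock() adds up the CPU time of all threads in the process, so a loop that really does run four times faster can still show roughly the same clock() figure as the serial version. Only the wall-clock number reflects the speedup.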

Hope that helps,
Mark.
MarkB
 
Posts: 456
Joined: Thu Jan 08, 2009 10:12 am
Location: EPCC, University of Edinburgh

Re: openmp isn't improve the execution time

Postby nightwish » Tue May 21, 2013 12:21 pm

MarkB wrote:Hi there,

Once you have convinced yourself you are setting the number of threads correctly....

You don't need to make gx and gy reduction variables: because they are declared within the scope of the parallel region, each thread has its own private copy of these variables.
However, r and c are shared by default, which is a bug: these need to be added to the private clause.

How are you measuring the execution time? You need to be careful that you are measuring wall clock time and not CPU time: the best way is to use the omp_get_wtime() function.

Hope that helps,
Mark.


I just wonder if there's any specific reason why I don't have to use a reduction? Although gx and gy are within the scope of the parallel region, I think it's still possible for a race condition to occur. Am I right? Thanks.

Re: openmp isn't improve the execution time

Postby MarkB » Tue May 21, 2013 12:42 pm

nightwish wrote:I just wonder if there's any specific reason why I don't have to use a reduction? Although gx and gy are within the scope of the parallel region, I think it's still possible for a race condition to occur. Am I right? Thanks.


There can be no race: variables declared inside the scope of a parallel region are private to each thread. There is no problem with declaring private variables, adding some values to them and then using the result all within one loop iteration. In fact, you cannot declare them as reduction variables since they are not in scope when the master thread encounters the parallel construct.

The normal pattern for a reduction is

Code:
foo=....
#pragma omp parallel for reduction(+:foo)
for (i=0;i<n;i++) {
   foo += .....
}
... = foo


which doesn't match your code, because in your case the variables are initialised and the values consumed inside the parallel loop, instead of outside it.
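
The two cases can be put side by side in a small sketch. The function names sum_array and row_sums and the flat row-major matrix layout are assumptions made for illustration; neither comes from the code under discussion.

```c
/* Reduction pattern: total is initialised before the loop and read after it,
   so the per-thread partial sums have to be combined at the end. */
double sum_array(const double *a, int n)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Per-iteration accumulator: acc is created, filled and consumed inside one
   iteration of the parallel loop, so it is private and needs no reduction. */
void row_sums(const double *m, int rows, int cols, double *out)
{
    #pragma omp parallel for
    for (int r = 0; r < rows; r++) {
        double acc = 0.0;
        for (int c = 0; c < cols; c++)
            acc += m[r * cols + c];
        out[r] = acc;
    }
}
```

In row_sums the variable acc plays the same role as gx and gy in the gradient loop: each iteration of the parallel loop owns its accumulator from start to finish, and the inner loop that adds to it runs entirely on one thread.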

