using if clause to select level of parallelisation


Postby drorc » Sun Aug 05, 2012 11:54 pm

Hello all,

I have encountered a problem when trying to selectively parallelise at one of two different levels.
The code looks something like this:

Code:
functionHigh(bool para_high, bool para_low, args1){
...
do some things
...
#pragma omp parallel for if(para_high)
for(i=0; i<end; i++){

functionLow(i, para_low, args2); // do some time-consuming calculation based on 'i'

}
} // end of functionHigh

functionLow(i, bool para_low, args2){
...
do some things
...
#pragma omp parallel for if(para_low)
for(j=0; j<end; j++){
...
do some things
...
}
} // end of functionLow




That is, I am either parallelising in the low level (fine grained?) or in the high level (coarse) but I do not do both (though I would like to in the future). As far as I can tell the parallelisations work when used separately.

The problem is that when I set para_high to false and para_low to true, my performance is much worse than when I comment out the line '#pragma omp parallel for if(para_high)' (keeping para_low true). How can that be?

Since the if clause is similar to setting the num of threads to 1 I thought it may be a nesting related problem...

Thanks in advance
drorc
 
Posts: 4
Joined: Sun Aug 05, 2012 10:43 pm

Re: using if clause to select level of parallelisation

Postby MarkB » Mon Aug 06, 2012 2:05 am

drorc wrote: The problem is that when I set para_high to false and para_low to true, my performance is much worse than when I comment out the line '#pragma omp parallel for if(para_high)' (keeping para_low true). How can that be?


Hi there,

When you set para_high to false and para_low to true are you getting the number of threads that you expect in the inner parallel region?

Are you setting OMP_DYNAMIC=TRUE? (this should not be necessary to make this case work, but it might be worth a try if you are not using it).

What compiler (and version) are you using?
MarkB
 
Posts: 422
Joined: Thu Jan 08, 2009 10:12 am

Re: using if clause to select level of parallelisation

Postby ftinetti » Mon Aug 06, 2012 6:01 am

Hi,

I was rather confused by
Since the if clause is similar to setting the num of threads to 1 I thought it may be a nesting related problem...

because I thought that when the if clause contained a false value the corresponding region would be "plain" sequential, i.e. equivalent to 1 thread in the sense of "plain" sequential, not a parallel region with 1 thread. So, I experimented with the code below:

Code:
$ cat ifclause.c
#include <stdio.h>
#include <omp.h>

int main(int argc, char* argv[])
{
  int v1[1000], v2[1000], i;
  int para_low;

  para_low = 0;
  #pragma omp parallel for if(para_low)
  for (i = 0; i < 1000; i++)
  {
    if (i == 0) printf("omp_in_parallel() return value: %d\n", omp_in_parallel());
    v1[i]= 0;
  }

  para_low = 1;
  #pragma omp parallel for if(para_low)
  for (i = 0; i < 1000; i++)
  {
    if (i == 0) printf("omp_in_parallel() return value: %d\n", omp_in_parallel());
    v2[i]= 1;
  }
}


$ gcc -v
...
gcc version 4.6.3 20111216 (prerelease) (GCC)

$ gcc -fopenmp ifclause.c -o ifclause
$ ./ifclause
omp_in_parallel() return value: 0
omp_in_parallel() return value: 1

And this experimentally confirms my initial understanding of the if clause. However, checking the v3.1 spec (pages 36-37), it seems that when the if clause expression evaluates to false there is a parallel region with just one thread...

if (IfClauseValue = false)
then number of threads = 1;


Now I'm confused... ?

Fernando.
ftinetti
 
Posts: 567
Joined: Wed Feb 10, 2010 2:44 pm

Re: using if clause to select level of parallelisation

Postby MarkB » Mon Aug 06, 2012 6:21 am

Hi Fernando,

A parallel construct will always generate a parallel region: there is no such thing as "plain sequential" execution. If the parallel region only contains one thread (either because an if clause evaluates to false, or for any other reason), then it is called an inactive parallel region. However, omp_in_parallel() only returns true if it is called from inside an active parallel region (i.e. one which is executing with more than one thread), hence the behaviour you observe in your example.

(Note that the definition of active/inactive parallel regions was changed in version 3.0: this may be a source of confusion!)

Hope that helps,
Mark.
MarkB
 
Posts: 422
Joined: Thu Jan 08, 2009 10:12 am

Re: using if clause to select level of parallelisation

Postby ftinetti » Mon Aug 06, 2012 6:58 am

Thank you very much, Mark. I got lost reading sections 1.2.2 and 1.2.3, e.g.
task: A specific instance of executable code and its data environment, generated when a thread encounters a task construct or a parallel construct.
task region: A region consisting of all code encountered during the execution of a task.
COMMENT: A parallel region consists of one or more implicit task regions.

taking into account other definitions... but I do not have a specific question...

And thank you very much again,

Fernando.
ftinetti
 
Posts: 567
Joined: Wed Feb 10, 2010 2:44 pm

Re: using if clause to select level of parallelisation

Postby drorc » Mon Aug 06, 2012 10:23 pm

Thanks for your replies.

The compiler is gcc, on Ubuntu 12.04 if that makes a difference. This was observed both on my laptop (dual core) and on a different quad-core machine (also running Ubuntu).

Haven't tried playing with OMP_DYNAMIC and will do so tonight.

MarkB, I get the correct number of threads in the inner parallel region when para_high is set to false. Looking at my system monitor I know it works, as I can see both processors' utilisation is higher than with no parallelisation; it's just not as good as with the outer pragma commented out.

Interestingly, if I set #pragma omp parallel for num_threads(0) then the performance improves - I was expecting an error for this.

Is the call OMP_SET_MAX_ACTIVE_LEVELS related??

Cheers,

Dror
drorc
 
Posts: 4
Joined: Sun Aug 05, 2012 10:43 pm

Re: using if clause to select level of parallelisation

Postby ftinetti » Tue Aug 07, 2012 5:42 am

Hi again,

I've included a call to omp_get_active_level() in the code as shown below. The Spec 3.1 explains for omp_get_active_level():
The omp_get_active_level routine returns the number of nested, active parallel
regions enclosing the task that contains the call. The routine always returns a nonnegative
integer, and returns 0 if it is called from the sequential part of the program.

and the "returns 0 if it is called from the sequential part of the program" case is specifically what I liked to call "plain sequential", but I'll just call it the "sequential part of the program", as the spec defines it.

Code:
$ cat ifclause.c
#include <stdio.h>
#include <omp.h>

int main(int argc, char* argv[])
{
  int v1[1000], v2[1000], i;
  int para_low;

  para_low = 0;
  #pragma omp parallel for if(para_low)
  for (i = 0; i < 1000; i++)
  {
    if (i == 0) printf("omp_in_parallel() return value: %d, and omp_get_active_level() returns: %d\n", omp_in_parallel(), omp_get_active_level());
    v1[i]= 0;
  }

  para_low = 1;
  #pragma omp parallel for if(para_low)
  for (i = 0; i < 1000; i++)
  {
    if (i == 0) printf("omp_in_parallel() return value: %d, and omp_get_active_level() returns: %d\n", omp_in_parallel(), omp_get_active_level());
    v2[i]= 1;
  }
}


now the output is (as expected, I think):

omp_in_parallel() return value: 0, and omp_get_active_level() returns: 0
omp_in_parallel() return value: 1, and omp_get_active_level() returns: 1

Now I do have a specific question. When the if clause evaluates to false:
is it a sequential part? If so, the problem in the original code is not related to dynamic adjustment of threads (OMP_DYNAMIC);
is it a parallel part with just one thread? If so, either the spec should be fixed/made clearer or the gcc implementation I'm using has a bug;

or I'm missing something else, which is what I'm guessing right now.

Thanks in advance,

Fernando.
ftinetti
 
Posts: 567
Joined: Wed Feb 10, 2010 2:44 pm

Re: using if clause to select level of parallelisation

Postby drorc » Wed Aug 08, 2012 1:27 am

Hello,

Tried with OMP_DYNAMIC=TRUE and no difference.

Some more detail regarding my system:

Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

A couple of questions for my general understanding:

Firstly if I set both para_low and para_high to true then I would expect omp_get_level() to return 2 only if I have more than 2 threads?

Using #pragma omp parallel for num_threads(1) should be functionally equivalent to a sequential section?

What does #pragma omp parallel for num_threads(0) do? shouldn't that give an error?
drorc
 
Posts: 4
Joined: Sun Aug 05, 2012 10:43 pm

Re: using if clause to select level of parallelisation

Postby ftinetti » Wed Aug 08, 2012 4:31 am

Hi drorc,

I think many of your questions could be answered (at least experimentally, for the implementation you are using) by playing around with the code and reading the OpenMP specification. On
What does #pragma omp parallel for num_threads(0) do? shouldn't that give an error?


I've found in the spec:

At most one num_threads clause can appear on the directive. The num_threads
expression must evaluate to a positive integer value.

and, on the closely related omp_set_num_threads() function
The value of the argument passed to this routine must evaluate to a positive integer, or
else the behavior of this routine is implementation defined.

which means, I think, that "#pragma omp parallel for num_threads(0)" is not OpenMP compliant and the spec does not define what happens. As always, it would be very good if the compiler/runtime checked and reported the anomalous situation, but the spec does not require it to do so.

Fernando.
ftinetti
 
Posts: 567
Joined: Wed Feb 10, 2010 2:44 pm

Re: using if clause to select level of parallelisation

Postby MarkB » Wed Aug 08, 2012 7:40 am

Hi Dror,

I think you may have to put this down to a quirk of the gcc runtime. It would be interesting to experiment with a different compiler....

One last thing you might try is changing the settings of OMP_NESTED and OMP_PROC_BIND.

drorc wrote: if I set both para_low and para_high to true then I would expect omp_get_level() to return 2 only if I have more than 2 threads?


omp_get_level() should always return 2 if called from inside the inner parallel, regardless of the number of threads or the value of the if clause expression.
omp_get_active_level() will only return 2 if both levels of parallel region are executed by more than one thread.

drorc wrote: Using #pragma omp parallel for num_threads(1) should be functionally equivalent to a sequential section?


Not quite: the runtime will still create new copies of private variables with a lifetime of the parallel region, so it is possible to write code that behaves differently with and without the OpenMP pragma present.

drorc wrote: What does #pragma omp parallel for num_threads(0) do? shouldn't that give an error?


As Fernando says, this is not permitted. Whether you get an error is a quality-of-implementation issue.
Last edited by MarkB on Wed Aug 08, 2012 7:45 am, edited 1 time in total.
MarkB
 
Posts: 422
Joined: Thu Jan 08, 2009 10:12 am
