## recursive function call and nested parallelism


### recursive function call and nested parallelism

Hello

I have a question about calling functions inside parallel loops.
I have a for-loop, and inside it I call a function that contains another for-loop. How can I know the number of threads used to execute this code? For example:

```c
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    function();
}

void function()
{
    #pragma omp parallel for
    for (int i = 0; i < 10; i++)
    {
        /* do anything */
    }
}
```

If I call omp_set_num_threads(10), is the total number of threads 10*10? That is, does each of the 10 threads in the main for-loop create 10 other threads when it enters function()?
And what if I call omp_set_num_threads(10) in main() and change it inside the function, like this:

```c
omp_set_num_threads(10);
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    function();
}

void function()
{
    omp_set_num_threads(3);
    #pragma omp parallel for
    for (int i = 0; i < 10; i++)
    {
        /* do anything */
    }
}
```

In this case, is the total number of threads used 10*3?

Now, suppose that I have a recursive call inside the function (which is my case), like this:

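```c
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    int k = 0;
    function(k);
}

void function(int k)
{
    if (k == 5)
    {
        /* do the stopping test */
    }
    else
    {
        omp_set_num_threads(3);
        #pragma omp parallel for
        for (int i = 0; i < 10; i++)
        {
            /* next recursion level, passed by value to avoid racing on k */
            function(k + 1);
        }
    }
}
```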

So the main loop has 10 iterations, and each iteration calls function(), which contains a parallelized for-loop; each iteration of that inner for-loop calls the same function again.
If I use 10 threads to parallelize the main for-loop and 3 threads to parallelize the for-loop in function(), can I calculate the total number of threads like this?:

Is there a function in the OpenMP V2.5 spec that calculates the total number of threads used? I know that the OpenMP V3.0 spec adds new ICVs for this, but I can't find it, so I use version 2.5.

In my code I use this to build a pipeline: the recursive calls form the line (each iteration is a stage of the pipeline). How do I fix the number of threads in each stage? In this example the first stage has 10*3 threads, the second stage 10*3*3 threads, the third 10*3*3*3 threads, and so on. What do I have to do to get the same number of threads in every stage?

Any ideas are welcome.
lamoincyloj

Posts: 7
Joined: Wed Aug 20, 2008 10:28 am

### Re: recursive function call and nested parallelism

lamoincyloj wrote: I have a question about calling functions inside parallel loops.
I have a for-loop, and inside it I call a function that contains another for-loop. How can I know the number of threads used to execute this code? For example:
```c
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    function();
}

void function()
{
    #pragma omp parallel for
    for (int i = 0; i < 10; i++)
    {
        /* ... do anything */
    }
}
```

If I call omp_set_num_threads(10), is the total number of threads 10*10? That is, does each of the 10 threads in the main for-loop create 10 other threads when it enters function()?

Depending on several things, such as whether the value of the nest-var ICV is "true" (meaning that you allow nested parallelism) and whether your system will let you create 100 threads, then yes, you would get 100 threads. However, an implementation doesn't actually have to create 100 threads. In a one-to-one mapping implementation you would get 100 threads. In a one-to-many mapping implementation, the work could be done by fewer threads that act like 100 threads.

Also note that the last part of your comment is wrong. If each thread in the main loop created 10 other threads when it entered function(), you would have 110 threads in total. Each thread that "enters" a new parallel region becomes the master of a team that consists of a total of 10 threads, meaning that it "creates" 9 other threads.
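To make the counting concrete, here is a minimal sketch (not from the original posts) that turns nested parallelism on and records the largest team size seen at each level. The names `measure_nested_teams` and `struct team_sizes` are made up for illustration, and the serial stand-ins exist only so the file also builds when OpenMP is switched off. An implementation may hand out fewer threads than requested, so the results are observations, not guarantees.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption: only used when OpenMP is disabled). */
static void omp_set_nested(int on) { (void)on; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper type: largest team observed at each nesting level. */
struct team_sizes { int outer; int inner; };

struct team_sizes measure_nested_teams(int outer_req, int inner_req)
{
    struct team_sizes seen = { 0, 0 };
    assert(outer_req > 0 && inner_req > 0);

    omp_set_nested(1);  /* nest-var = true, so inner regions may fork */

    #pragma omp parallel num_threads(outer_req)
    {
        int outer_n = omp_get_num_threads();
        #pragma omp critical(record_sizes)
        if (outer_n > seen.outer) seen.outer = outer_n;

        /* Each outer thread becomes the master of its own inner team,
           so it adds at most inner_req - 1 new threads. */
        #pragma omp parallel num_threads(inner_req)
        {
            int inner_n = omp_get_num_threads();
            #pragma omp critical(record_sizes)
            if (inner_n > seen.inner) seen.inner = inner_n;
        }
    }
    return seen;
}
```

If every request is granted, `measure_nested_teams(10, 10)` would report 10 and 10: ten teams of ten, so at most 100 threads, of which 90 are newly created by the inner regions.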

lamoincyloj wrote:... and if I put omp_set_num_threads(10) in the main(), and I change it inside the function, like this:
```c
omp_set_num_threads(10);
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    function();
}

void function()
{
    omp_set_num_threads(3);
    #pragma omp parallel for
    for (int i = 0; i < 10; i++)
    {
        /* ... do anything */
    }
}
```

In this case, is the total number of threads used 10*3?

I believe this is correct.

lamoincyloj wrote:Now, suppose that I have a recursive call inside the function (it's my case), like this:
```c
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    int k = 0;
    function(k);
}

void function(int k)
{
    if (k == 5)
    {
        /* do the stopping test */
    }
    else
    {
        omp_set_num_threads(3);
        #pragma omp parallel for
        for (int i = 0; i < 10; i++)
        {
            /* next recursion level, passed by value to avoid racing on k */
            function(k + 1);
        }
    }
}
```

So the main loop has 10 iterations, and each iteration calls function(), which contains a parallelized for-loop; each iteration of that inner for-loop calls the same function again.
If I use 10 threads to parallelize the main for-loop and 3 threads to parallelize the for-loop in function(), can I calculate the total number of threads like this?:

Is there a function in the OpenMP V2.5 spec that calculates the total number of threads used? I know that the OpenMP V3.0 spec adds new ICVs for this, but I can't find it, so I use version 2.5.

There is nothing in either the V2.5 or the V3.0 spec that actually calculates the number of threads you are using. In the V3.0 spec there is a new ICV (thread-limit-var) that controls the maximum number of threads participating in an OpenMP program. However, this is quite different from calculating the maximum number used. For example, with nested parallelism, if you have a parallel region using 2 threads that has a nested parallel region using 2 threads, then you could require either 4 threads (if both threads of the first region are in the second region at the same time) or only 3 threads (if only one of the threads from the first region is in the second region at any time).
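Since neither spec gives you a call for this, one workaround is to count by hand. The sketch below is only an illustration under that assumption: `measure_peak_threads`, `thread_enters`, `thread_leaves` and the two counters are invented names, not OpenMP routines, and the serial stand-ins are there so it also builds without OpenMP. It counts team members that are inside the measured code at the same moment, which is not necessarily the number of OS threads the runtime created.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption: only used when OpenMP is disabled). */
static void omp_set_nested(int on) { (void)on; }
static int omp_get_thread_num(void) { return 0; }
#endif

static int live_threads;  /* threads currently inside the measured code */
static int peak_threads;  /* highest value live_threads has reached */

static void thread_enters(void)
{
    #pragma omp critical(thread_count)
    {
        live_threads++;
        if (live_threads > peak_threads) peak_threads = live_threads;
    }
}

static void thread_leaves(void)
{
    #pragma omp critical(thread_count)
    live_threads--;
}

int measure_peak_threads(int outer_req, int inner_req)
{
    assert(outer_req > 0 && inner_req > 0);
    live_threads = 0;
    peak_threads = 0;
    omp_set_nested(1);

    #pragma omp parallel num_threads(outer_req)
    {
        thread_enters();
        #pragma omp parallel num_threads(inner_req)
        {
            /* Thread 0 of the inner team is the outer thread itself,
               which was already counted above. */
            int is_new = (omp_get_thread_num() != 0);
            if (is_new) thread_enters();
            /* ... real work would go here ... */
            if (is_new) thread_leaves();
        }
        thread_leaves();
    }
    return peak_threads;
}
```

For the 2-by-2 case, `measure_peak_threads(2, 2)` can return anything from 1 up to 4 depending on timing and on how many threads the runtime grants, which is exactly why a fixed formula for "threads used" is hard to give.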

lamoincyloj wrote: In my code I use this to build a pipeline: the recursive calls form the line (each iteration is a stage of the pipeline). How do I fix the number of threads in each stage? In this example the first stage has 10*3 threads, the second stage 10*3*3 threads, the third 10*3*3*3 threads, and so on. What do I have to do to get the same number of threads in every stage?

In your example, you have a value (k) that is essentially keeping track of the recursion level. In this case, it would be easy to add something like:
```c
if (k == 0)
    omp_set_num_threads(3);
else
    omp_set_num_threads(1);
```
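Put into the recursive function, that idea might look like the sketch below. Note two assumptions: it uses the `num_threads` clause instead of omp_set_num_threads(), so the setting applies to that one region only, and `run_stage`, `max_depth` and `fanout` are made-up names standing in for the function and constants from the question. The return value (number of leaf calls) is just there so the result can be checked.

```c
#include <assert.h>

/* Hypothetical stand-in for the recursive function in the question.
   Returns the number of calls that reach the stopping test. */
long run_stage(int k, int max_depth, int fanout)
{
    long leaves = 0;
    assert(fanout > 0);

    if (k == max_depth)
        return 1;  /* stopping test reached */

    /* Only level 0 forks a team of 3; deeper levels run with a team
       of one, so the thread count does not grow from stage to stage. */
    #pragma omp parallel for num_threads(k == 0 ? 3 : 1) reduction(+:leaves)
    for (int i = 0; i < fanout; i++)
        leaves += run_stage(k + 1, max_depth, fanout);

    return leaves;
}
```

Called from the 10-thread main loop as `run_stage(0, 5, 10)`, each outer thread would get at most 3 threads at the first level and no additional ones below it.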
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

### Re: recursive function call and nested parallelism

Thank you very much ejd, especially for the last idea.
But do you have any ideas on how to build the pipeline using OpenMP? By pipeline I mean a line of operators in which the output of one stage is the input of the next stage.
lamoincyloj


### Re: recursive function call and nested parallelism

OpenMP was originally designed for doing things like loops where each thread had an independent task to work on. You can see this in the area of synchronization, where it doesn't really have a good way to synchronize multiple threads (except by means of a barrier). With OpenMP V3.0, the concept of tasks has been added, which depending on how you have to handle your pipeline, could be used. However, there is still no good synchronization or prioritization scheme available between tasks. There are some things that OpenMP may not be the best solution for.
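As a rough illustration of the task idea, here is a minimal sketch that assumes a V3.0 compiler and a pipeline whose stages can be chained per item. The three stage functions and `run_pipeline` are invented for the example. It runs the stages in order inside one task per item, so different items can be processed at the same time, but it is not a true overlapping pipeline and gives no ordering between items.

```c
#include <assert.h>

/* Hypothetical three-stage pipeline; each stage consumes the previous
   stage's output for the same item. */
static int stage1(int x) { return x + 1; }
static int stage2(int x) { return x * 2; }
static int stage3(int x) { return x - 3; }

void run_pipeline(const int *in, int *out, int n)
{
    assert(n >= 0);

    #pragma omp parallel
    {
        #pragma omp single
        {
            /* One task per item: the stages run in order inside the
               task, while different items may run on different threads. */
            for (int i = 0; i < n; i++) {
                #pragma omp task firstprivate(i)
                out[i] = stage3(stage2(stage1(in[i])));
            }
        }   /* implicit barrier: all tasks have finished past this point */
    }
}
```

Without OpenMP enabled the pragmas are ignored and the loop simply runs serially with the same results.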
ejd


### Re: recursive function call and nested parallelism

Thank you ejd,
do you know where and how I can get OpenMP 3.0?
I have Fedora Core 9.
lamoincyloj


### Re: recursive function call and nested parallelism

Currently I only know of three compilers that support some or all of the Version 3 OpenMP spec - gcc, Intel, and Sun. The gcc code is in the current development version and can be downloaded. I believe that it supports all of the OpenMP version 3.0 features. The Intel compiler (V11.0) is currently in Beta test - which I believe they have opened to everyone. The Sun compiler (Sun Studio 13.0) is available as an express (beta) and supports most (but not all) of the new features. You will have to check to see which of these support your OS. OpenMP version 3.0 is just now starting to be implemented by the compiler vendors and this list (of compilers and operating systems they support) will be changing quite a bit in the near future (in my opinion).
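Whichever compiler you end up with, you can check which spec it claims to implement through the `_OPENMP` macro, which holds the spec release date as yyyymm (200505 for V2.5, 200805 for V3.0). A small sketch, with `openmp_spec_date` and `has_openmp_tasks` being made-up helper names:

```c
/* Returns the release date (yyyymm) of the OpenMP spec the compiler
   claims to implement, or 0 when OpenMP is not enabled. */
long openmp_spec_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0L;
#endif
}

/* 200805 is the date of the V3.0 spec; tasks need at least that. */
int has_openmp_tasks(void)
{
    return openmp_spec_date() >= 200805L;
}
```

A compiler that only accepts some V3.0 features may still report the older date, so treat this as a hint and not as proof.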
ejd
