Nested for loops

Post by Guest » Fri Jul 04, 2008 4:24 am

Hi!
I have a program whose main loop runs about 4 iterations, each with a very different execution time.
Inside it, I call a function containing a triple nested loop. Each level has between 1 and 10 iterations, but we cannot know in advance which one is the longest, so we need to parallelize all three.

I parallelized the three-dimensional loop:
Code:
#pragma omp parallel for
   for (int i = 0; i < nx(); i++)
   {
#pragma omp parallel for
      for (int j = 0; j < ny(); j++)
      {
#pragma omp parallel for
         for (int k = 0; k < nz(); k++)
         {
            .....
         }
      }
   }


On an 8-CPU computer, this makes my program about 5 times faster.

The problem is that when I also parallelize the main loop, the triple loop no longer seems to run in parallel: I get a speedup between 1 and 2, the same as when I parallelize only the main loop.

I tried setting: setenv OMP_NESTED TRUE
But then I get a segmentation fault.

Any idea how to get all the loops to run in parallel?
Thanks
Guest
 

Re: Nested for loops

Post by Arnaud » Fri Jul 04, 2008 7:08 am

It seems I had not correctly parallelized the code inside my triple loop, which caused the segmentation fault when setting OMP_NESTED.
Now all 4 loops run with OpenMP without errors.

It's strange that it worked correctly with only the triple loop parallelized...

I recommend that everybody test their program with OMP_NESTED enabled, even if it is supposed to work without it, as it may reveal bugs :)
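
A note on why enabling nesting can expose such a bug: while nested parallelism is disabled, every inner parallel region runs with a team of one thread, so a data race between inner iterations cannot occur. Once OMP_NESTED is enabled, it can. The sketch below shows one common cause, assumed here for illustration rather than taken from the actual program: a temporary declared inside the outer loop is shared by the threads of the inner region unless it is marked private. `cell_value` and `fill_rows` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-cell computation, standing in for the real loop body. */
static double cell_value(int i, int j)
{
    return i * 10.0 + j;
}

/* Fills out[i * ny + j] for an nx-by-ny grid. */
void fill_rows(double *out, int nx, int ny)
{
    #pragma omp parallel for
    for (int i = 0; i < nx; i++) {
        /* Declared inside the outer loop body: private to each outer
           thread, but shared by default inside the inner region below. */
        double tmp;

        /* private(tmp) gives every inner thread its own copy. Without
           it, inner threads would race on tmp once nesting is enabled. */
        #pragma omp parallel for private(tmp)
        for (int j = 0; j < ny; j++) {
            tmp = cell_value(i, j);
            out[(size_t)i * ny + j] = 2.0 * tmp;
        }
    }
}
```

Declaring `tmp` inside the inner loop body instead would have the same effect as the private clause.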
Arnaud
 

Re: Nested for loops

Post by ejd » Wed Aug 13, 2008 11:20 pm

Setting nested to true when you have no nested constructs should have no effect. Also, nested parallelism often does not speed up the code: the barriers at the end of the parallel regions quite often slow the program down. In your case, you could use an if clause and parallelize only the largest loop. Something like:

Code:
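    const int NX = nx();
    const int NY = ny();
    const int NZ = nz();
    // Exactly one of the three conditions is true, even when sizes tie
    // (ties go to the outermost loop); the other regions run serially.
    #pragma omp parallel for if ((NX >= NY) && (NX >= NZ))
       for (int i = 0; i < NX; i++)
       {
    #pragma omp parallel for if ((NY > NX) && (NY >= NZ))
          for (int j = 0; j < NY; j++)
          {
    #pragma omp parallel for if ((NZ > NX) && (NZ > NY))
             for (int k = 0; k < NZ; k++)
             {
                .....
             }
          }
       }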
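
As an alternative to picking one loop with if clauses, OpenMP 3.0 also has a collapse clause that merges perfectly nested loops into a single iteration space, so the work is shared out over all NX*NY*NZ iterations in one parallel region. The sketch below is only an illustration under assumptions: it needs a compiler that supports OpenMP 3.0, the loop bounds must not depend on the outer indices, and `cell_value` and `fill_grid_collapsed` are hypothetical names standing in for the real loop body.

```c
#include <stddef.h>

/* Hypothetical stand-in for the elided loop body. */
static double cell_value(int i, int j, int k)
{
    return i * 100.0 + j * 10.0 + k;
}

/* Fills out[(i * ny + j) * nz + k] for an nx-by-ny-by-nz grid. */
void fill_grid_collapsed(double *out, int nx, int ny, int nz)
{
    /* collapse(3): the three loops form one iteration space of
       nx * ny * nz iterations, divided among a single team. There is
       one parallel region and one barrier, and no nesting is needed. */
    #pragma omp parallel for collapse(3)
    for (int i = 0; i < nx; i++)
        for (int j = 0; j < ny; j++)
            for (int k = 0; k < nz; k++)
                out[((size_t)i * ny + j) * nz + k] = cell_value(i, j, k);
}
```

With loops of 1 to 10 iterations each, this avoids having to know which loop is the longest, at the cost of requiring nothing but the innermost body between the loop headers.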

On the other hand, you said you are seeing a fairly healthy speedup, so maybe this is a case where nesting actually buys you something. Also note that many OpenMP implementations limit the number of nesting levels they support unless the user takes extra steps, so check your vendor documentation. OpenMP 3.0 adds an environment variable (OMP_MAX_ACTIVE_LEVELS) and runtime calls to standardize this. If you are interested, look for the max-active-levels-var ICV in the version 3.0 spec.
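
For reference, the OpenMP 3.0 calls tied to that ICV are omp_set_max_active_levels and omp_get_max_active_levels, and omp_get_active_level reports how many enclosing regions are really running in parallel. The following is a minimal sketch of how one might check what a given runtime grants; `request_nesting_levels` and `observed_active_depth` are hypothetical helper names, and the `_OPENMP` guards are only there so the code still builds serially without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Ask for up to 'levels' nested active parallel regions and return the
   limit the runtime reports afterwards (an implementation may cap it).
   Built without OpenMP, everything is serial, so 0 is returned. */
int request_nesting_levels(int levels)
{
#ifdef _OPENMP
    omp_set_max_active_levels(levels);
    return omp_get_max_active_levels();
#else
    (void)levels;
    return 0;
#endif
}

/* Open two nested parallel regions and return the deepest active level
   any thread observed: 2 if both levels ran in parallel, less if the
   runtime serialized one of them. */
int observed_active_depth(void)
{
    int depth = 0;   /* shared by both regions; updated under critical */
#ifdef _OPENMP
    #pragma omp parallel num_threads(2)
    {
        #pragma omp parallel num_threads(2)
        {
            int d = omp_get_active_level();
            #pragma omp critical
            {
                if (d > depth)
                    depth = d;
            }
        }
    }
#endif
    return depth;
}
```

Calling `request_nesting_levels(2)` before the main loop, or setting OMP_MAX_ACTIVE_LEVELS=2 in the environment, asks for two active levels. Under OpenMP 3.0, nesting itself still has to be enabled (OMP_NESTED), so a result below 2 from `observed_active_depth()` suggests the inner level is being serialized.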
ejd
 