Sections

Postby lancer6238 » Tue Sep 02, 2008 9:16 pm

Hi,
I'm trying to understand how the sections construct works. I've read in the OpenMP tutorial that "it specifies that the enclosed section(s) of code are to be divided among the threads in the team." I would like to know what is meant by "team". What I understand now is that in the following code, each section is run by one thread, in parallel. Tasks A and B may be run by the same thread if either A or B is very fast and finishes quickly, allowing that thread to move on to the other task.

Code:
#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        {
            TaskA();
        }
        #pragma omp section
        {
            TaskB();
        }
        #pragma omp section
        {
            TaskC();
        }
    }
}


Say I have a quad-core computer, and I'm running MPI locally using the option "-np 4". So usually, 4 processes are created, each running on 1 CPU. Then I run the above code on each of the CPUs. What I would like to know is how are the threads created and allocated when the code is reached. Is the default number of threads in this case 1 (due to the process running only on one core) or 4 (due to it being a quad-core system)?

Thank you.

Regards,
Rayne

Re: Sections

Postby ejd » Tue Sep 02, 2008 10:20 pm

It will depend on the implementation defaults or what you have set elsewhere (either with a call to omp_set_num_threads() or with the environment variable OMP_NUM_THREADS). The parallel pragma says that a team of threads is to be used to run the code within the parallel region. The sections pragma says that each section will be run once, by one thread of that team. Note that if you haven't set the number of threads to be used, it will be whatever the OpenMP implementation's default is. If you have set it, then that will normally be the number of threads created.

For example, if you say to use 2 threads, then two of the sections will most likely be executed by one thread and one section by the other (though it is possible that one thread could execute all three). In any case, if you are using MPI as well (from your example), then each of the 4 MPI processes that run this parallel region will create 2 threads - giving you a total of 4 MPI processes and 8 OpenMP threads all running on your system.
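To see how a particular implementation actually hands the sections out, a small sketch like the following can record which thread ran each section. The function name run_sections and the owner array are made up for illustration, and the serial fallbacks are only there so the code still builds when the compiler has no OpenMP support.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Runs three sections with a requested team size and records which
   thread executed each one. Returns the team size actually granted. */
int run_sections(int requested, int owner[3])
{
    int team = 1;
    #pragma omp parallel num_threads(requested)
    {
        /* One thread records the team size; the others wait at the
           implicit barrier that ends the single construct. */
        #pragma omp single
        team = omp_get_num_threads();

        #pragma omp sections
        {
            #pragma omp section
            owner[0] = omp_get_thread_num();
            #pragma omp section
            owner[1] = omp_get_thread_num();
            #pragma omp section
            owner[2] = omp_get_thread_num();
        }
    }
    /* Every section must have been run by some member of the team. */
    for (int i = 0; i < 3; i++)
        assert(owner[i] >= 0 && owner[i] < team);
    return team;
}
```

Calling run_sections(2, owner) and printing owner shows the mapping for that run; with three sections and at most two threads, at least two entries must be equal, but which ones is up to the implementation.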

Re: Sections

Postby lancer6238 » Wed Sep 03, 2008 12:02 am

Right now, I'm trying to do the following using MPI and OpenMP:

Code:
if (rank == 0)
{
    #pragma omp parallel sections num_threads(3)
    {
        #pragma omp section
        {
            do_master();
        }
        #pragma omp section
        {
            do_workerA();
        }
        #pragma omp section
        {
            do_workerB();
        }
    }
}
else
{
    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        {
            do_workerA();
        }
        #pragma omp section
        {
            do_workerB();
        }
    }
}


The do_master() function needs to receive the state of the worker processes (i.e. whether they're free or not) and then send jobs to the free worker processes. The do_workerA() sends the process' state to the master process (Rank 0) and receives jobs. The do_workerB() function checks for any change in a list that do_workerA() accesses and tells all processes if there is a change.

The program runs in such a way that each section is "bounded" by a while-loop that runs for almost the entire run of the program, so it is almost certain that each section would be handled by a distinct thread. However, near the end of the program, certain sections, such as do_master() and do_workerB() may finish earlier. Is there a way to make sure that when a thread exits its section, it does not go into the other sections? I don't think this is a problem for the worker processes as there are only 2 sections - if one thread is done while the other is still busy, the free thread won't enter the other section (right?). However, if do_master() and do_workerB() in Rank 0 finish earlier, will the 2 threads "swap" and try to execute the other section? Or is it that as long as 2 threads don't finish at exactly the same time, the thread that first becomes free goes to the end of the parallel sections, and the thread that next becomes free also goes to the end of the parallel sections, and thus everything is fine?

For future reference, if the sections are not bounded by while-loops, is there a way to make sure that each section is executed by a distinct thread? Otherwise, the sending and receiving of messages may not be carried out correctly.

Thank you.

Regards,
Rayne

Re: Sections

Postby ejd » Wed Sep 03, 2008 7:16 am

The part of the OpenMP spec that applies to your question (I'm using Version 3.0, though V2.5 is the same) is section 2.5.2, "sections Construct", which states:
Summary
The sections construct is a noniterative worksharing construct that contains a set of structured blocks that are to be distributed among and executed by the threads in a team. Each structured block is executed once by one of the threads in the team in the context of its implicit task.
...
Description
...
The method of scheduling the structured blocks among the threads in the team is implementation defined.

In most cases, your example code will have one thread execute each section. However, the OpenMP spec doesn't guarantee that this will happen. The spec says that the scheduling is implementation defined and only guarantees that each section will be executed once by one of the threads in the team. Most implementations would pick up a section and assign it to a thread in the team, then move to the next section and assign it to an available thread. However, if the first section finishes and its thread is placed back into the available thread queue before the next section is assigned, it is possible (again depending on the implementation) that the same thread would be used for both structured blocks (that is, both sections). Once the assignments are done and the sections finish, if no other sections are available then the threads go to the end of the sections region and wait (unless a nowait has been specified) until all threads of the team are finished.

Not that I would recommend this, but I have seen code that does something like what you want. It looks something like:
Code:
int a = 0;
int b = 0;
#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        {
            a = 1;                  /* announce that this section has started */
            #pragma omp flush
            /* ... some work ... */
            while (b != 1) {        /* spin until the other section has started */
                #pragma omp flush
            }
        }
        #pragma omp section
        {
            b = 1;                  /* announce that this section has started */
            #pragma omp flush
            /* ... some work ... */
            while (a != 1) {        /* spin until the other section has started */
                #pragma omp flush
            }
        }
    }
}

The idea being that you only loop as long as you have to to make sure that each section is being run by a separate thread.

Another approach would be to use a different worksharing construct, like this:
Code:
#pragma omp parallel num_threads(2)
{
    #pragma omp for schedule(static, 1)
    for (int i = 0; i < 2; i++)
    {
        if (omp_get_thread_num() == 0)
            TaskA();
        else
            TaskB();
    }
}

The idea here is that each thread will be given one iteration of the loop and you assign the thread to do certain work.

Maybe someone else might have a better suggestion.

Re: Sections

Postby lancer6238 » Wed Sep 03, 2008 6:51 pm

Thanks a lot for your replies.

I'm also thinking of implementing the master/worker model using just threads, and not MPI. The idea would be one master thread that reads and distributes text files to the worker threads, and also one other thread that "listens" for a change to a list that the worker threads use. This other thread would update the list when any one of the worker threads indicates that there is a change by setting a global flag to 1.

However, I'm not really familiar with implementing the master/worker model using threads. With MPI, I could just send the text files to the worker processes from the master process, and each worker would definitely get different files as the memory is localized. However, with threads, all worker threads can access the same memory the master thread accesses, so I'm not sure how the distribution would work. Also, I believe the text files are accessible by any thread, so how do I make sure the worker threads work on different text files? Are there any examples anywhere where I can look at how the master/worker model works using threads?

Another concern I have is with the updating of the list using a flag. I know that the worker threads should set the flag to 1 using the atomic directive, so as to avoid race conditions. So what about the thread that listens for changes? Should it read the flag in a critical region? Also, as this is the only thread that does the updating of the list, I thought it is unnecessary to put the updating part in a critical region, but it should prevent other threads from accessing the list before it has finished updating. How can I do that?

One more question: how do I tell which thread is the master thread? Is the thread with ID 0 the master thread?

Thank you.

Regards,
Rayne

Re: Sections

Postby ejd » Thu Sep 04, 2008 5:22 am

Yesterday was not one of my better days. A simple solution to the code so that a separate thread was running each "section" of code would be the following:
Code:
#pragma omp parallel num_threads(2)
{
    if (omp_get_thread_num() == 0)
        TaskA();
    else
        TaskB();
}

As for your other questions:
lancer6238 wrote:
Also, I believe the text files are accessible by any thread, so how do I make sure the worker threads work on different text files? Are there any examples anywhere where I can look at how the master/worker model works using threads?

Since OpenMP uses a shared memory model, the data is available to all threads unless you protect it somehow. There are a lot of ways to do this. Do a search for producer/consumer models using OpenMP on the web and I am sure you will find some examples.
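One of those ways, sketched here under assumptions rather than taken from any particular example: keep a shared index of the next unclaimed file and let each worker take the next index inside a named critical region, so that no two workers can claim the same file. The names claim_files, claimed_by and file_queue are hypothetical, and the actual file processing is left out.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Each worker repeatedly claims the next unclaimed file index.
   claimed_by[i] records which thread took file i.
   Returns the total number of files claimed by all workers. */
int claim_files(int nfiles, int claimed_by[], int nthreads)
{
    int next = 0;   /* shared: index of the next unclaimed file */
    int total = 0;  /* shared: number of files claimed so far */

    #pragma omp parallel num_threads(nthreads)
    {
        int mine = 0;  /* private: files claimed by this thread */
        for (;;) {
            int i;
            /* Only one thread at a time may read and advance next. */
            #pragma omp critical(file_queue)
            {
                i = (next < nfiles) ? next++ : -1;
            }
            if (i < 0)
                break;  /* nothing left to claim */
            assert(i < nfiles);
            claimed_by[i] = omp_get_thread_num();  /* process file i here */
            mine++;
        }
        #pragma omp atomic
        total += mine;
    }
    return total;
}
```

The critical region covers only the claim itself, so the real work on each file still runs in parallel.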

lancer6238 wrote:
Another concern I have is with the updating of the list using a flag. I know that the worker threads should set the flag to 1 using the atomic directive, so as to avoid race conditions. So what about the thread that listens for changes? Should it read the flag in a critical region? Also, as this is the only thread that does the updating of the list, I thought it is unnecessary to put the updating part in a critical region, but it should prevent other threads from accessing the list before it has finished updating. How can I do that?

The problem here is that OpenMP doesn't have a way to wait for a flag to change without using CPU time. One thread has to write the flag and flush it so that all the other threads can see the value. This can be done using an atomic. However, other threads have to repeatedly flush this flag until they see that it has changed. Using the flush directive correctly is also one of the hardest things about OpenMP. The producer/consumer model seems to be a constant problem users are trying to solve and maybe OpenMP needs to be extended to help more with this problem.
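A minimal sketch of that write-then-flush and flush-then-read pattern, with some assumptions stated up front: it uses the atomic write and atomic read forms, which need OpenMP 3.1 or later (newer than the 3.0 spec discussed in this thread), the name handoff is made up, and if the runtime grants only one thread the sketch skips the waiting altogether because a lone thread would spin forever.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Thread 0 publishes a value and raises a flag; thread 1 spins on the
   flag and then reads the value. Returns what the reader saw. */
int handoff(int value)
{
    int data = 0;   /* shared payload */
    int flag = 0;   /* shared: 0 = not ready, 1 = ready */
    int seen = -1;  /* shared: what the reader observed */

    #pragma omp parallel num_threads(2)
    {
        int team = omp_get_num_threads();
        int id = omp_get_thread_num();

        if (team < 2) {
            /* Only one thread was granted, so just run the steps in order. */
            data = value;
            seen = data;
        } else if (id == 0) {
            data = value;
            /* Make the payload visible before the flag is raised. */
            #pragma omp flush
            #pragma omp atomic write
            flag = 1;
            #pragma omp flush
        } else if (id == 1) {
            int ready = 0;
            while (!ready) {
                /* Re-read the flag from memory on every pass. */
                #pragma omp flush
                #pragma omp atomic read
                ready = flag;
            }
            /* Flag seen: flush again so the payload read is up to date. */
            #pragma omp flush
            seen = data;
        }
    }
    return seen;
}
```

The reader burns CPU time for as long as it waits, which is exactly the cost described above.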

lancer6238 wrote:
One more question: how do I tell which thread is the master thread? Is the thread with ID 0 the master thread?

Yes - the master thread of each team always has the thread id of 0.
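This can be checked with a few lines, shown here as a sketch: the master construct restricts a block to the master thread, and that thread then reports its own number. The function name master_thread_id is made up for illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Returns the thread number reported from inside a master construct. */
int master_thread_id(void)
{
    int id = -1;
    #pragma omp parallel
    {
        /* Only the master thread of the team executes this statement. */
        #pragma omp master
        id = omp_get_thread_num();
    }
    return id;
}
```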

Re: Sections

Postby lancer6238 » Fri Sep 05, 2008 1:04 am

ejd wrote:
Another concern I have is with the updating of the list using a flag. I know that the worker threads should set the flag to 1 using the atomic directive, so as to avoid race conditions. So what about the thread that listens for changes? Should it read the flag in a critical region? Also, as this is the only thread that does the updating of the list, I thought it is unnecessary to put the updating part in a critical region, but it should prevent other threads from accessing the list before it has finished updating. How can I do that?

The problem here is that OpenMP doesn't have a way to wait for a flag to change without using CPU time. One thread has to write the flag and flush it so that all the other threads can see the value. This can be done using an atomic. However, other threads have to repeatedly flush this flag until they see that it has changed. Using the flush directive correctly is also one of the hardest things about OpenMP. The producer/consumer model seems to be a constant problem users are trying to solve and maybe OpenMP needs to be extended to help more with this problem.


So I think there should be 2 flags, one "change_list" to indicate any changes to the list, and one "done_updating" to indicate if the thread that updates the list has finished doing so.

The updating thread will have

Code:
do {
   #pragma omp flush (change_list)
} while (change_list == 0);
done_updating = 0;
// change list
done_updating = 1;


and the working threads will have

Code:
do {
   #pragma omp flush (done_updating)
} while (done_updating == 0);
// do work and access list


Is my logic correct? The threads do seem to spend a lot of time flushing the variables.

Another question: am I right to say that locks can only lock the execution of a block of code, but cannot lock a variable itself, i.e. prevent other threads from accessing that variable while it's being updated?

Re: Sections

Postby ejd » Fri Sep 05, 2008 6:34 am

lancer6238 wrote:
Is my logic correct? The threads do seem to spend a lot of time flushing the variables.


Your logic needs some work. In any case, you are going to have to flush the flag a lot - more than you have indicated. For every thread that updates the flag, you have to make sure that it has sole access to the flag for the update, that it sees the correct "version" of the flag, and that the flag is flushed (either implicitly or explicitly) after the update. For the read of the flag, you have to make sure that you see the correct "version" of the flag, which means you have to flush it before each read.

lancer6238 wrote:
Another question: am I right to say that locks can only lock the execution of a block of code, but cannot lock a variable itself, i.e. prevent other threads from accessing that variable while it's being updated?

Correct.
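In practice this means a lock protects data only by convention: every piece of code that touches the data has to take the same lock first. A rough sketch of that convention follows; the struct guarded_list and the functions list_append and fill_list are hypothetical names, and the stand-in lock functions exist only so the code builds without OpenMP support.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch still builds without OpenMP support. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

#define LIST_CAPACITY 64

/* The lock sits next to the data, but nothing ties the two together
   except the rule that all access goes through list_append. */
struct guarded_list {
    omp_lock_t lock;
    int items[LIST_CAPACITY];
    int count;
};

/* Appends a value while holding the lock; silently drops it if full. */
static void list_append(struct guarded_list *list, int value)
{
    omp_set_lock(&list->lock);
    if (list->count < LIST_CAPACITY)
        list->items[list->count++] = value;
    assert(list->count <= LIST_CAPACITY);
    omp_unset_lock(&list->lock);
}

/* Each thread in the team appends per_thread values.
   Returns the final number of items in the list. */
int fill_list(int nthreads, int per_thread)
{
    struct guarded_list list;
    list.count = 0;
    omp_init_lock(&list.lock);

    #pragma omp parallel num_threads(nthreads)
    {
        for (int i = 0; i < per_thread; i++)
            list_append(&list, i);
    }

    omp_destroy_lock(&list.lock);
    return list.count;
}
```

Any code that wrote to list.items without calling omp_set_lock first would bypass the protection entirely, and the lock would not stop it.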