OpenMP performance unstable

Post by linuxyeung » Thu Feb 14, 2008 3:51 pm

I run the following C++ code multiple times:
// Slow and fluctuating

#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
#include <omp.h>

uint8_t add(int);

int main()
{
   const int width=480, height=720;

   srand(time(NULL));
   uint8_t *buf = new uint8_t [width*height];

   //omp_set_num_threads(4);
   for(int k=0; k<5656; k++)
   {
      #pragma omp parallel for shared(buf)
      for(int j=0; j<height; j++)
      {
         const int n = j * width;
         for(int i=0; i<width; i++)
         {
            int nn = n + i;
            buf[nn] = add(nn);
         }
      }
   }

   // rand() % N stays in [0, N); the float scaling could yield N itself (out of bounds)
   int index = rand() % (width * height);
   printf("buf[%d] = %d\n", index, buf[index]);
   delete[] buf;
   return 0;
}

uint8_t add(int idx)
{
   return ((idx * 0x12) & 0xff);
}


The executable is called "test".

When I run the following script:
#!/bin/bash

for (( COUNT = 0; COUNT < 10; COUNT++ )) ; do
   time ./test
done


In one trial I get the following result:
buf[76004] = 8

real    0m19.783s
user    0m34.088s
sys     0m0.148s
buf[306546] = 4

real    0m7.188s
user    0m13.719s
sys     0m0.112s
buf[309763] = 54

real    0m18.698s
user    0m32.077s
sys     0m0.131s
buf[243026] = 196

real    0m14.954s
user    0m26.089s
sys     0m0.129s
buf[200081] = 50

real    0m19.726s
user    0m33.954s
sys     0m0.140s
buf[308258] = 100

real    0m14.938s
user    0m25.965s
sys     0m0.119s
buf[264942] = 188

real    0m18.405s
user    0m31.424s
sys     0m0.129s
buf[198473] = 34

real    0m19.943s
user    0m34.639s
sys     0m0.132s
buf[82886] = 236

real    0m20.026s
user    0m35.074s
sys     0m0.114s
buf[314979] = 246

real    0m6.983s
user    0m13.598s
sys     0m0.130s

The run times are inconsistent from one run to the next. Any idea why?

Re: OpenMP performance unstable

Post by ejd » Wed Feb 20, 2008 10:22 am

I am not seeing the same sort of large variation that you are from run to run. When I run your code I see:

real    0m21.836s  +/-0.023
user    0m21.814s  +/-0.023
sys     0m0.020s   +/-0.003

What compiler are you using (version and options), what OS, and how many processors?

Re: OpenMP performance unstable

Post by linuxyeung » Wed Feb 20, 2008 12:32 pm

Hardware: Pentium D 3.2 GHz
OS: Fedora Core 6, kernel 2.6.22.9-61.fc6 SMP
Compiler: g++ 4.1.2 20070626
Compiler options: -g -fopenmp -Wall

Re: OpenMP performance unstable

Post by linuxyeung » Wed Feb 20, 2008 12:33 pm

2 CPUs

Re: OpenMP performance unstable

Post by ejd » Wed Feb 20, 2008 1:39 pm

Unfortunately I don't have anything close to your configuration available. I tried it on a 2-processor (not 2-core) Linux box running Red Hat with the Intel 9.1 compiler, and I didn't see anything close to the variation you did:

real    0m2.043s  +/-0.001
user    0m3.654s  +/-0.078
sys     0m0.141s  +/-0.052

I wouldn't expect the compiler and runtime to cause as much variance as you are seeing. It could be the OS and the way it schedules threads, though I doubt it given how large the ranges are. Maybe something else was running on the system and taking some of the resources? My best guess would be the chip. I am not that familiar with the Pentium D. I think the 3.2 GHz part came in two variants: one with two separate L2 caches and one that shared its L2 cache between the processors. This looks like what might occur with a shared L2 cache.

Maybe someone else has a better idea?
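
One way to narrow down the guesses above might be to time each pass of the outer loop inside the program, rather than timing the whole run with `time`. The following is only a sketch under assumptions: `fill_once`, `time_passes` and `report` are hypothetical helper names that are not in the original code, and `std::chrono::steady_clock` is used for timing, which needs a C++11 compiler (newer than the g++ 4.1.2 mentioned in this thread).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

// Same per-element computation as add() in the original post.
static inline uint8_t add(int idx)
{
   return static_cast<uint8_t>((idx * 0x12) & 0xff);
}

// One pass over the buffer, parallelised over rows as in the original loop.
// Without -fopenmp the pragma is ignored and the pass runs serially.
void fill_once(uint8_t *buf, int width, int height)
{
   #pragma omp parallel for
   for (int j = 0; j < height; j++)
   {
      const int n = j * width;
      for (int i = 0; i < width; i++)
         buf[n + i] = add(n + i);
   }
}

// Hypothetical helper: time each of `passes` passes separately, in seconds.
std::vector<double> time_passes(uint8_t *buf, int width, int height, int passes)
{
   assert(buf != nullptr && passes >= 0);
   typedef std::chrono::steady_clock Clock;
   std::vector<double> seconds;
   seconds.reserve(passes);
   for (int k = 0; k < passes; k++)
   {
      const Clock::time_point t0 = Clock::now();
      fill_once(buf, width, height);
      const Clock::time_point t1 = Clock::now();
      seconds.push_back(std::chrono::duration<double>(t1 - t0).count());
   }
   return seconds;
}

// Hypothetical helper: print the fastest, slowest and total pass time.
void report(const std::vector<double> &seconds)
{
   if (seconds.empty())
      return;
   double total = 0.0;
   for (double s : seconds)
      total += s;
   printf("min %.6f s  max %.6f s  total %.3f s\n",
          *std::min_element(seconds.begin(), seconds.end()),
          *std::max_element(seconds.begin(), seconds.end()),
          total);
}
```

Calling `time_passes(buf, 480, 720, 5656)` on the `new[]` buffer and passing the result to `report` would show the spread between passes. A handful of very slow passes might point towards interference from other processes, while uniformly slower passes might point towards something systematic, but that interpretation is a guess rather than a diagnosis.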

Re: OpenMP performance unstable

Post by linuxyeung » Wed Feb 20, 2008 3:38 pm

Thanks for the information.
I did some more experiments.
1. I ran the same code on a Core 2 machine with kernel 2.6.22.14-72 and the same g++. The run times fluctuate in the same way.
2. When I run the following code, which keeps the buffer in a local array instead of allocating it with new, the fluctuation goes away:
// Fast and stable

#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
#include <omp.h>

uint8_t add(int);

int main()
{
   const int width=480, height=720;

   srand(time(NULL));
   uint8_t buf[width*height];

   //omp_set_num_threads(4);
   for(int k=0; k<5656; k++)
   {
      #pragma omp parallel for shared(buf)
      for(int j=0; j<height; j++)
      {
         const int n = j * width;
         for(int i=0; i<width; i++)
         {
            int nn = n + i;
            buf[nn] = add(nn);
         }
      }
   }

   // rand() % N stays in [0, N); the float scaling could yield N itself (out of bounds)
   int index = rand() % (width * height);
   printf("buf[%d] = %d\n", index, buf[index]);
   return 0;
}

uint8_t add(int idx)
{
   return ((idx * 0x12) & 0xff);
}
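
To compare the two versions under identical conditions, both buffers could be timed within a single process. This is a rough sketch and not a definitive benchmark: `time_fill` and `compare_storage` are hypothetical names, and `std::chrono::steady_clock` assumes a C++11 compiler, which the g++ 4.1.2 used here is not.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>

// Same per-element computation as add() in the post above.
static inline uint8_t add(int idx)
{
   return static_cast<uint8_t>((idx * 0x12) & 0xff);
}

// Hypothetical helper: run `passes` passes over buf and return elapsed seconds.
// Without -fopenmp the pragma is ignored and the loop runs serially.
double time_fill(uint8_t *buf, int width, int height, int passes)
{
   assert(buf != nullptr && width > 0 && height > 0);
   typedef std::chrono::steady_clock Clock;
   const Clock::time_point t0 = Clock::now();
   for (int k = 0; k < passes; k++)
   {
      #pragma omp parallel for
      for (int j = 0; j < height; j++)
      {
         const int n = j * width;
         for (int i = 0; i < width; i++)
            buf[n + i] = add(n + i);
      }
   }
   return std::chrono::duration<double>(Clock::now() - t0).count();
}

// Hypothetical driver: same workload on a new[] buffer and on a local array.
void compare_storage(int passes)
{
   const int width = 480, height = 720;
   uint8_t *heap_buf = new uint8_t[width * height];   // as in the slow version
   uint8_t stack_buf[width * height];                 // about 338 KiB on the stack

   const double heap_s = time_fill(heap_buf, width, height, passes);
   const double stack_s = time_fill(stack_buf, width, height, passes);
   printf("heap  buffer: %.3f s\n", heap_s);
   printf("stack buffer: %.3f s\n", stack_s);
   delete[] heap_buf;
}
```

Running `compare_storage(5656)` several times would show whether the heap timing still fluctuates while the stack timing stays steady when both share the same process, thread team and system load.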

