[Omp] Slower when it must be faster
Greg Bronevetsky
greg at bronevetsky.com
Thu May 10 16:10:52 PDT 2007
One quick thing that I noticed: Although you've declared your xsubi buffer
private, the private copies do not carry over the data you stored in it
before the parallel region. This will break your random seed generation
since the threads will not start from the seeds you set.
Instead, you can declare xsubi inside the parallel region and initialize
it there. Another OpenMP thing: the code for xsubi wouldn't work even if
it were a scalar because you used the private clause rather than
firstprivate. The former simply creates an uninitialized copy for each
thread, while the latter also ensures that each copy contains the same
value as what the master thread had.
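
A minimal sketch of both suggestions, not a drop-in replacement for the
posted program. The helper names estimate_pi and firstprivate_copies_seed
are hypothetical, the seed constants are the ones from the posted code, and
the _OPENMP guards are my own addition so the file also builds without
OpenMP enabled. The first function declares xsubi inside the parallel
region; the second checks that firstprivate hands every thread the
master's values (for an array, each element is copied).

```c
#define _XOPEN_SOURCE 600   /* asks <stdlib.h> to declare erand48() */
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Seed constants taken from the posted parallel program. */
#define SEED1 123
#define SEED2 456

/* Hypothetical helper: Monte Carlo estimate of pi from `samples` points. */
double estimate_pi(long samples)
{
    long global_count = 0;   /* hits inside the quarter circle, all threads */

    assert(samples > 0);

    #pragma omp parallel
    {
        /* Declared inside the region: each thread owns its own buffer,
           and every element is set before erand48() reads it. */
        unsigned short xsubi[3];
        long local_count = 0;
        long i;
        int thread_num = 0;    /* serial fallback values */
        int num_threads = 1;
#ifdef _OPENMP
        thread_num = omp_get_thread_num();
        num_threads = omp_get_num_threads();
#endif
        xsubi[0] = SEED1;
        xsubi[1] = SEED2;
        xsubi[2] = (unsigned short)thread_num;   /* differs per thread */

        for (i = thread_num; i < samples; i += num_threads) {
            double x = erand48(xsubi);
            double y = erand48(xsubi);
            if (x * x + y * y <= 1.0)
                local_count++;
        }

        /* One update per thread, so the critical section is cheap. */
        #pragma omp critical
        global_count += local_count;
    }
    return 4.0 * (double)global_count / (double)samples;
}

/* Hypothetical check: returns 1 if every thread's firstprivate copy of
   xsubi starts with the values the master stored before the region. */
int firstprivate_copies_seed(void)
{
    unsigned short xsubi[3] = { SEED1, SEED2, 0 };
    int all_ok = 1;

    #pragma omp parallel firstprivate(xsubi)
    {
        int ok = (xsubi[0] == SEED1 && xsubi[1] == SEED2);

        #pragma omp critical
        all_ok = all_ok && ok;
    }
    return all_ok;
}
```

Note that thread_num is also declared inside the region here, so it is
private to each thread as well. Build with your compiler's OpenMP option
(the posted command line uses -xopenmp=parallel) and call
estimate_pi(10000000) from main to match the original sample count.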
Greg Bronevetsky
On Thu, 10 May 2007, Javier Escudero Infante wrote:
> I need some help because I can't understand what's happening.
> I have no trouble understanding what OpenMP does, but the final
> results on my computer
> are driving me crazy. My parallelized programs are four times slower
> than the serial ones when they should be four times faster.
> The problem happens with three or four programs that I found on the
> internet and one that I use myself.
>
> I'll show you the code because perhaps the problem is something
> silly and I need a book like "OpenMP for Dummies":
>
> The serial code pi_01.c calculates pi by the Monte Carlo method:
>
>
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
>
> #define SAMPLES 10000000
> #define SEED0 123
> #define SEED1 456
> #define SEED2 789
>
> int main (int argc, char *argv[]){
>
> int global_count; // points inside circle
> int i; // loop counter
> int n; // number of samples
> double pi; // estimate of pi
> unsigned short xsubi[3]; // random number seed
> double x, y; // point's coordinates
> double start_time,end_time;
> start_time=clock();
>
>
> xsubi[0] = SEED0;
> xsubi[1] = SEED1;
> xsubi[2] = SEED2;
>
>
> global_count = 0;
> for(i=0; i < SAMPLES; i++){
> x = erand48(xsubi);
> y = erand48(xsubi);
> if(x*x+y*y <= 1.0)
> global_count++;
> }
> pi = 4.0 * (double)global_count / (double)SAMPLES;
>
>
> end_time=clock();
> printf("Estimate of pi: %2.10lf en %.2f segundos\n",
> pi,(end_time-start_time)/CLOCKS_PER_SEC);
> }
>
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////
>
> The parallel program also calculates pi by the same method:
>
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////
> #include<stdio.h>
> #include<stdlib.h>
> #include<omp.h>
> #include<time.h>
>
>
>
> #define SAMPLES 10000000
> #define NUM_THREADS 2
> #define SEED1 123
> #define SEED2 456
>
> int main(int argc, char *argv[]) {
> int global_count = 0, local_count, thread_num, i;
> unsigned short xsubi[3];
> double x, y;
> double start_time,end_time;
> start_time=clock();
>
> xsubi[0] = SEED1;
> xsubi[1] = SEED2;
> omp_set_num_threads(NUM_THREADS);
>
> #pragma omp parallel private(xsubi, i, x, y, local_count)
> {
> local_count = 0;
> thread_num = omp_get_thread_num();
> xsubi[2] = thread_num;
>
> for (i = thread_num; i < SAMPLES; i += NUM_THREADS) {
> x = erand48(xsubi);
> y = erand48(xsubi);
> if (x*x + y*y <= 1.0)
> local_count++;
> }
> #pragma omp critical
> global_count += local_count;
> }
> end_time=clock();
> printf("%d pi: %2.10lf en %.2f segundos\n",SAMPLES,
> 4.0*global_count/SAMPLES,(end_time-start_time)/CLOCKS_PER_SEC);
> }
>
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////
>
> And when I compile and run the programs:
>
>
>
> root at dumpty # cc pi_01.c -fast -xtarget=ultra3 -xarch=v8plusb
> root at dumpty # mv a.out serie
> root at dumpty # ./serie
> Estimate of pi: 3.1415228000 en 4.65 segundos
> root at dumpty # ptime ./serie
> Estimate of pi: 3.1415228000 en 4.65 segundos
>
> real 4.661
> user 4.656
> sys 0.003
> root at dumpty # cc pi_02.c -fast -xtarget=ultra3 -xarch=v8plusb
> -xopenmp=parallel
> root at dumpty # mv a.out parallel
> root at dumpty # ptime ./parallel
> 10000000 pi: 3.1413256000 en 32.00 segundos
>
> real 16.058
> user 31.971
> sys 0.038
>
>
> What am I doing wrong? What do I have to do to see OpenMP run faster?
>
> My computer is a Sun Fire V440 and I'm using the Sun Studio 11 compilers.
>
> Thanks