[Omp] slow performance

andrew wang mcwang88 at hotmail.com
Thu Dec 16 17:53:54 PST 2004


I simply read the system time before and after the run and subtract. Is there a better way to do it?

(void) time(&t2);
printf("Total time = %d seconds\n", (int) (t2-t1));
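
Since time() only resolves whole seconds, a finer-grained wall-clock timer might look like the sketch below. The helper names wall_seconds and cpu_seconds are made up for illustration. The sketch assumes omp_get_wtime() is present when the compiler reports OpenMP 2.0 or later (_OPENMP >= 200203) and otherwise falls back to C11 timespec_get(), which an older compiler may not have. Note that time() is already a wall-clock measurement, so it should not add up per-thread time the way clock() can on many Unix systems.

```c
#include <time.h>
#if defined(_OPENMP) && _OPENMP >= 200203
#include <omp.h>
#define HAVE_OMP_WTIME 1
#endif

/* Wall-clock seconds since some arbitrary starting point.
   Only differences between two calls are meaningful. */
double wall_seconds(void)
{
#ifdef HAVE_OMP_WTIME
    /* Assumption: the OpenMP runtime provides omp_get_wtime(). */
    return omp_get_wtime();
#else
    /* Fallback: C11 timespec_get(), nanosecond fields. */
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
#endif
}

/* Processor time used by the whole process, as reported by clock().
   On many Unix systems this is summed over all threads. */
double cpu_seconds(void)
{
    return (double) clock() / CLOCKS_PER_SEC;
}
```

Calling wall_seconds() before and after the region and printing the difference with %f gives the elapsed time; comparing that with the cpu_seconds() difference shows whether a timer is summing thread time.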

>From: Thomas.L.Clune at nasa.gov
>Reply-To: Thomas.L.Clune at nasa.gov
>To: "andrew wang" <mcwang88 at hotmail.com>
>CC: smeds at pdc.kth.se, omp at openmp.org
>Subject: Re: [Omp] slow performance
>Date: Thu, 16 Dec 2004 08:45:40 -0500
>
>
>Just being paranoid, but how are you measuring the time?  Many
>timers total the time among all child threads - giving results just
>like yours even though things are actually running substantially
>faster.
>
>- Tom
>
>
>
>andrew wang writes:
>  > Hi All,
>  >
>  > Sorry, I forgot to give you the system info:
>  >
>  > Compaq AlphaServer SC45 with 44 nodes, each node comprising four 1GHz
>  > Alpha processors with 1GB memory. I am using only one node with
>  > different thread counts (1-3). The Compaq C compiler supports OpenMP
>  > spec 1.0. The OS should be Tru64 UNIX.
>  >
>  >
>  > I also tried compiling the same program with the Intel C compiler 8.0
>  > and running it on a two-processor Win2k server. Here is the result:
>  >
>  > D:\omp\test>try 2
>  > omp_get_num_procs=2
>  > Parallel region time=12 seconds
>  > Total time = 14 seconds
>  > D:\omp\test>try 1
>  > omp_get_num_procs=2
>  > Parallel region time=12 seconds
>  > Total time = 14 seconds
>  >
>  > There does not seem to be much difference; it is the same problem.
>  >
>  >
>  > As somebody pointed out, my program does not actually do much inside
>  > the parallel region, so I increased the inner loop from 50 to 500:
>  >
>  > ....
>  >   for (kk = 0; kk < 500; kk++) {
>  >       x = (kk + 0.5) * step;
>  >       sum += 4.0 / (1.0 + x * x);   // more complicated calculation here.
>  >   }
>  > ....
>  >
>  >
>  > here is the result:
>  >
>  >
>  > d:\omp\test>try 1
>  > omp_get_num_procs=2
>  > Parallel region time=83 seconds
>  > Total time = 87 seconds
>  >
>  > D:\omp\test>try 2
>  > omp_get_num_procs=2
>  > Parallel region time=66 seconds
>  > Total time = 66 seconds
>  >
>  > So the performance improved with 2 threads. If this is the case, how
>  > should I parallelize such a program? In my real program, I can
>  > parallelize only this particular region.
>  >
>  >
>  > Thanks
>  > Andrew
>  >
>  > >From: Nils Smeds <smeds at pdc.kth.se>
>  > >Reply-To: smeds at pdc.kth.se
>  > >To: "andrew wang" <mcwang88 at hotmail.com>
>  > >CC: omp at openmp.org
>  > >Subject: Re: [Omp] slow performance
>  > >Date: Wed, 15 Dec 2004 17:21:50 +0100
>  > >
>  > >
>  > >mcwang88 at hotmail.com said:
>  > > > But to my big surprise, I see that the result is quite different from
>  > > > what I can imagine. The more threads I have, the slower the
>  > > > calculation is.
>  > >
>  > >You need to tell us more about the platform you are running on. How many
>  > >processors are available? How many processors are in use? Are there any
>  > >other processes running that may interfere with your application? What
>  > >kind of processors? Operating system?
>  > >
>  > >You enter and exit a parallel region 16200*50 times. The 39 second
>  > >overhead then divides into 39s/(16200*50) = 48µs per fork-join, which
>  > >sounds a little high on a modern system, but it is not outrageously high.
>  > >
>  > >/Nils
>  > >
>  >
>  >
>  >
>  > _______________________________________________
>  > Omp mailing list
>  > Omp at openmp.org
>  > http://openmp.org/mailman/listinfo/omp_openmp.org
>  >
>
>--
>
>--
>Thomas Clune, Ph.D.				301-286-4635 (W)
>Advanced Software Technology Group		301-286-1634 (F)
>Science Computing Branch, Code 931		<Thomas.L.Clune at nasa.gov>
>NASA GSFC
>
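
On the fork-join cost Nils estimates in the quoted message (39s/(16200*50), about 48µs per region): one way to avoid paying it on every pass might be to open the parallel region once, outside the outer loop, and use only a work-sharing "omp for" inside. The sketch below is an assumption about the program's shape, not the real code. The function name sum_passes and its parameters are hypothetical, and the loop body is the pi example from the fragment above.

```c
/* Hypothetical stand-in for the real computation: n_outer passes,
   each summing n_inner midpoint-rule terms of 4/(1+x*x) on [0,1]. */
double sum_passes(int n_outer, int n_inner)
{
    double step = 1.0 / n_inner;
    double sum = 0.0;            /* shared; combined by the reduction */

    /* One fork-join for the whole run instead of one per pass. */
    #pragma omp parallel
    {
        int i, kk;               /* private: declared inside the region */

        /* Every thread runs the outer loop; only the inner loop's
           iterations are divided among the threads. */
        for (i = 0; i < n_outer; i++) {
            /* The implied barrier at the end of the work-sharing
               loop keeps the threads in step between passes. */
            #pragma omp for reduction(+:sum)
            for (kk = 0; kk < n_inner; kk++) {
                double x = (kk + 0.5) * step;
                sum += 4.0 / (1.0 + x * x);
            }
        }
    }
    return sum * step;           /* roughly n_outer * pi */
}
```

Each pass still pays for a barrier, so this only helps if a barrier is cheaper than a full fork-join on the system in question. Compiled without OpenMP, the pragmas are ignored and the function runs serially with the same result.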





