[Omp] slow performance
andrew wang
mcwang88 at hotmail.com
Thu Dec 16 17:53:54 PST 2004
I simply read the system time before and after and subtract. Is there a better way to do it?
(void) time(&t2);
printf("Total time = %d seconds\n", (int) (t2-t1));
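
One possible finer-grained alternative to the time() difference above, as a sketch only. It assumes an OpenMP 2.0 or later compiler for omp_get_wtime() (a 1.0 compiler may not provide it) and falls back to time() otherwise. Both report wall-clock time. That is different from CPU-time timers such as clock(), which on many Unix systems add up the time of all threads. The names wall_seconds, integrate_pi and timed_pi are made up for this illustration, and the loop is only a stand-in for the real computation.

```c
#include <assert.h>
#include <time.h>
#if defined(_OPENMP) && _OPENMP >= 200203
#include <omp.h>
#endif

/* Wall-clock time in seconds.  Assumption: omp_get_wtime() exists only when
   the compiler reports OpenMP 2.0 (200203) or later; otherwise fall back to
   time(), which is still wall-clock time but has one-second resolution. */
double wall_seconds(void)
{
#if defined(_OPENMP) && _OPENMP >= 200203
    return omp_get_wtime();
#else
    return (double) time(NULL);
#endif
}

/* Midpoint-rule estimate of pi, using the same integrand as in the thread.
   Hypothetical stand-in for the real work being timed. */
double integrate_pi(long n_steps)
{
    double step, sum = 0.0;
    long i;

    assert(n_steps > 0);
    step = 1.0 / (double) n_steps;
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n_steps; i++) {
        double x = (i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}

/* Runs integrate_pi() and stores the elapsed wall-clock seconds in *elapsed. */
double timed_pi(long n_steps, double *elapsed)
{
    double t1 = wall_seconds();
    double pi = integrate_pi(n_steps);
    double t2 = wall_seconds();

    *elapsed = t2 - t1;
    return pi;
}
```

A caller would print the result with something like printf("Total time = %.3f seconds\n", elapsed); the thread count can be varied through the OMP_NUM_THREADS environment variable.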
>From: Thomas.L.Clune at nasa.gov
>Reply-To: Thomas.L.Clune at nasa.gov
>To: "andrew wang" <mcwang88 at hotmail.com>
>CC: smeds at pdc.kth.se, omp at openmp.org
>Subject: Re: [Omp] slow performance
>Date: Thu, 16 Dec 2004 08:45:40 -0500
>
>
>Just being paranoid, but how are you measuring the time? Many
>timers total the time among all child threads - giving results just
>like yours even though things are actually running substantially
>faster.
>
>- Tom
>
>
>
>andrew wang writes:
> > Hi All,
> >
> > Sorry, I forgot to tell you the system info:
> >
> > Compaq AlphaServer SC45 with 44 nodes, each node comprising four 1GHz
> > Alpha processors with 1GB memory. I am using only one node with
> > different thread counts (1-3). The Compaq C compiler supports the OpenMP
> > spec 1.0. The OS should be Tru64 UNIX.
> >
> >
> > I also tried compiling the same program with Intel C compiler 8.0 and
> > running it on a two-processor Win2k server. Here are the results:
> >
> > D:\omp\test>try 2
> > omp_get_num_procs=2
> > Parallel region time=12 seconds
> > Total time = 14 seconds
> > D:\omp\test>try 1
> > omp_get_num_procs=2
> > Parallel region time=12 seconds
> > Total time = 14 seconds
> >
> > There seems to be little difference; it is the same problem.
> >
> >
> > As somebody pointed out, my program actually does not do much inside the
> > parallel region, so I increased the inner loop count from 50 to 500:
> >
> > ....
> > for (kk = 0; kk < 500; kk++) {
> >     x = (kk + 0.5) * step;
> >     sum += 4.0 / (1.0 + x * x); // more complicated calculation here.
> > }
> > ....
> >
> >
> > here is the result:
> >
> >
> > d:\omp\test>try 1
> > omp_get_num_procs=2
> > Parallel region time=83 seconds
> > Total time = 87 seconds
> >
> > D:\omp\test>try 2
> > omp_get_num_procs=2
> > Parallel region time=66 seconds
> > Total time = 66 seconds
> >
> > So the performance improved with 2 threads. If this is the case, how
> > should I parallelize such a program? In my real program, I can
> > parallelize only this particular region.
> >
> >
> > Thanks
> > Andrew
> >
> > >From: Nils Smeds <smeds at pdc.kth.se>
> > >Reply-To: smeds at pdc.kth.se
> > >To: "andrew wang" <mcwang88 at hotmail.com>
> > >CC: omp at openmp.org
> > >Subject: Re: [Omp] slow performance
> > >Date: Wed, 15 Dec 2004 17:21:50 +0100
> > >
> > >
> > >mcwang88 at hotmail.com said:
> > > > But to my big surprise, the result is quite different from what I
> > > > imagined. The more threads I have, the slower the calculation is.
> > >
> > >You need to tell us more about the platform you are running on. How many
> > >processors are available? How many processors are in use? Are there any
> > >other processes running that may interfere with your application? What
> > >kind of processors? Operating system?
> > >
> > >You enter and exit a parallel region 16200*50 times. The 39 second
> > >overhead then divides into 39s/(16200*50) = 48µs per fork-join, which
> > >sounds a little high on a modern system, but it is not outrageously high.
> > >
> > >/Nils
> > >
> >
> >
> >
> > _______________________________________________
> > Omp mailing list
> > Omp at openmp.org
> > http://openmp.org/mailman/listinfo/omp_openmp.org
> >
>
>--
>
>--
>Thomas Clune, Ph.D. 301-286-4635 (W)
>Advanced Software Technology Group 301-286-1634 (F)
>Science Computing Branch, Code 931 <Thomas.L.Clune at nasa.gov>
>NASA GSFC
>
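
One way to attack the fork-join overhead estimated in the quoted message (39s/(16200*50) = 48µs per region) is to open the parallel region once and keep only the work-sharing loop inside the repeated part. The sketch below contrasts the two layouts. It is an assumption about the loop structure, since the real code is elided in the thread; the function names and the n_outer/n_inner parameters are hypothetical. Whether this helps in practice depends on the compiler and on how much work each inner loop does.

```c
#include <assert.h>

/* Variant A: a new parallel region (one fork-join) on every outer iteration. */
double sum_region_per_iteration(int n_outer, int n_inner)
{
    double step, total = 0.0;
    int j, kk;

    assert(n_outer > 0 && n_inner > 0);
    step = 1.0 / n_inner;
    for (j = 0; j < n_outer; j++) {
        double sum = 0.0;
#pragma omp parallel for reduction(+:sum)
        for (kk = 0; kk < n_inner; kk++) {
            double x = (kk + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }
        total += sum * step;
    }
    return total;
}

/* Variant B: a single parallel region.  The thread team is created once;
   every thread runs the outer loop, and the inner work-sharing loop splits
   the iterations among the team on each pass.  The implicit barrier at the
   end of each "omp for" completes the reduction before the next pass. */
double sum_single_region(int n_outer, int n_inner)
{
    double step, total = 0.0;
    int j, kk;

    assert(n_outer > 0 && n_inner > 0);
    step = 1.0 / n_inner;
#pragma omp parallel private(j)
    {
        for (j = 0; j < n_outer; j++) {
#pragma omp for reduction(+:total)
            for (kk = 0; kk < n_inner; kk++) {
                double x = (kk + 0.5) * step;
                total += 4.0 / (1.0 + x * x) * step;
            }
        }
    }
    return total;
}
```

Both variants compute the same value (n_outer times a midpoint estimate of pi), so timing one against the other with the same arguments gives a rough measure of what the repeated fork-joins cost on a given system.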