Re: [Omp] Could you help me about this problem?

宋刚 hevensun at 126.com
Wed Jan 3 23:41:48 PST 2007


Hi Ruud,
 
-----Original Message-----
From: Ruud.Vanderpas at Sun.COM [mailto:Ruud.Vanderpas at Sun.COM]
Sent: January 4, 2007 15:13
To: 宋刚
Cc: omp at openmp.org
Subject: Re: [Omp] Could you help me about this problem?
 
Hi David,
 
>    We recently used OpenMP to parallelize a subroutine (just one loop)
> of a program. With different thread counts, the run times were as
> follows:
>
>    1 thread:  20s
>    2 threads: 11s
>    3 threads: 10s
>    4 threads:  7s
>    5 threads: 11s
>    6 threads: 19s
>    8 threads: 23s
>
>   The machine we used is an SMP node (dual core) of a cluster. What I
> can't understand is why the difference between 2 threads and 3
> threads is so small.
 
I'm afraid we need more information to say something sensible
about this, so I'm going to ask some questions first.
 
- I assume the above timings are the _elapsed_ time for the
   application (or just the subroutine), not the CPU time? In
   other words, how did you get these numbers?

I timed only the subroutine, like this:

 ......
 real e,etime,t(2)
 e = etime(t)
 do .....
 ........
 end do
 e = etime(t)
 print *,'Elapsed time is :',e,
 ......

- Was your job the only one running, or could there have been
   interference from other jobs?

The time I measured covers only the subroutine.
 
- Can you reproduce it? That is, if you run the same experiments
   again, do you get comparable timings?

Yes, the results are reproducible. I ran the test three more times.

- To me, the surprising number is the one on 4 threads. If I
   understand it correctly, you only have 2 cores to run your
   program on. Based on that, it seems you get good scalability
   (but keep my previous question in mind) going from 1 to 2
   threads. I would expect the 4 thread number to be in the ~10s
   range (at best) though, not 7s

Is there something wrong with the way I measured the time? Previously, I
also wrote an OpenMP program to compress data. On an SMP platform with
2 cores, performance was best with 4 threads, and the speedup sometimes
reached 5.89; I don't know why. On another platform, an SMP server
(4-way dual core, 8 cores in total), the best performance came with
12 threads, and the speedup also reached about 12. I got the times like
this:

time gzip data1
time ./ompgzip data1

Is that right?

Thank you very much!!

Yours sincerely,
David

Kind regards,
Ruud
----------------------------------------------------------------
Senior Staff Engineer             Email: ruud.vanderpas at sun.com
Systems Group                     Phone: +31-33-4515000 (x15920)
Sun Microsystems                  Fax  : +31-33-4515001
----------------------------------------------------------------
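
The question above about elapsed versus CPU time can be checked directly. Depending on the compiler, etime may report CPU time (user plus system) accumulated since the program started rather than wall-clock time, and its behavior in multithreaded runs varies between implementations. Note also that the snippet in this thread prints the value of the second etime call, not the difference between the two calls. The following is a minimal sketch, not the poster's actual code: it times a placeholder loop with the standard Fortran intrinsics system_clock (wall clock) and cpu_time (processor time) so the two can be compared. The program name, variable names, and workload are assumptions made for illustration.

```fortran
program wallclock_demo
  implicit none
  integer :: count_start, count_end, count_rate
  integer :: i
  real :: cpu_start, cpu_end
  double precision :: s, elapsed

  ! Read the wall clock (in ticks) and the processor time before the loop.
  call system_clock(count_start, count_rate)
  call cpu_time(cpu_start)

  ! Placeholder workload standing in for the real loop (hypothetical).
  s = 0.0d0
  do i = 1, 50000000
     s = s + sqrt(dble(i))
  end do

  ! Read both clocks again after the loop.
  call cpu_time(cpu_end)
  call system_clock(count_end)

  ! Report differences, not raw readings.
  elapsed = dble(count_end - count_start) / dble(count_rate)
  print *, 'Checksum        :', s
  print *, 'Elapsed time (s):', elapsed
  print *, 'CPU time (s)    :', cpu_end - cpu_start
end program wallclock_demo
```

If the two numbers diverge as the thread count changes, the original measurement was probably not pure elapsed time. In an OpenMP program, omp_get_wtime() from the OpenMP runtime library is another common way to read wall-clock time.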