Strange parallelization behaviour!?


Post by hildy » Tue Jul 01, 2008 6:30 am

I have run into strange behaviour in a program I am parallelizing. To benchmark the parallelized function, I duplicated it four times (one copy for each core in my PC). At the point where the function is called, I call the four copies one after another, changing omp_set_num_threads() from 1 to 4 between calls and checking each time that the thread count has actually changed. I then profile the run with gprof. The strange result is that the most time-consuming call is the one with 2 threads, followed by 1 thread and then 4 threads, while the fastest is the one with 3 threads. This seems very strange, at least to me. So I wonder whether gprof is unreliable for this (and whether there is a better way), and whether there could be any reasonable explanation for this result.

I have tried with omp_set_dynamic both enabled and disabled. I have even tried changing the order of the function calls, in case some kind of CPU or motherboard mechanism makes later calls progressively faster. The functions do not share any inputs: each one gets its own copy, made before any of the functions is called, so there should be no difference in memory allocation.
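For anyone reproducing this kind of measurement, here is a minimal sketch of timing a call with the OpenMP wall clock, omp_get_wtime(), instead of gprof. gprof was not designed with multithreaded programs in mind, so wall-clock timing may be easier to interpret. The kernel() function is a hypothetical stand-in for the function being benchmarked, and time_kernel() is an illustrative name, not something from the original program. The serial fallbacks are only there so the sketch still builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch still builds without OpenMP support. */
#include <time.h>
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
static int omp_get_num_threads(void) { return 1; }
static void omp_set_num_threads(int n) { (void)n; }
#endif

/* Hypothetical stand-in for the benchmarked function: sum of squares. */
static double kernel(const double *x, int n, int *team_size)
{
    double sum = 0.0;
    *team_size = 0;
    #pragma omp parallel
    {
        /* Record the team size the runtime actually granted. */
        #pragma omp single
        *team_size = omp_get_num_threads();

        #pragma omp for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += x[i] * x[i];
    }
    return sum;
}

/* Wall-clock seconds for one call of kernel() after requesting nthreads. */
double time_kernel(const double *x, int n, int nthreads,
                   int *team_size, double *result)
{
    omp_set_num_threads(nthreads);
    double t0 = omp_get_wtime();
    *result = kernel(x, n, team_size);
    return omp_get_wtime() - t0;
}
```

Calling time_kernel() in a loop with nthreads going from 1 to 4, and printing the returned time next to team_size, shows both the elapsed time and whether the requested thread count was really used. With dynamic adjustment enabled, the runtime may grant fewer threads than requested.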

If anybody has a clue, or just wants to speculate a little, it would be welcome. :)
Thanks in advance
/Hildy

Re: Strange parallelization behaviour!?

Post by ejd » Tue Jul 01, 2008 8:35 am

Unless you are binding threads to processors, I would expect the timings to be even more variable than that. I don't know what OS you are using, and even if I did, which thread gets which piece of work and which processor it runs on would still depend on the workload on the system and on the scheduling algorithm. Binding threads to processors would at least reduce some of the variability. The rest depends on the hardware: whether you are running on a multiprocessor or a multicore machine, how the cache is shared (or not) between processors, and whether it is a "true" SMP or a NUMA machine all contribute to the variability. This is all part of the "fun" of parallelism.
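One way to see how large this run-to-run variability is: repeat the same call several times and compare the fastest, slowest and mean wall-clock times. The sketch below does that. The names measure_spread(), struct spread, struct job and sum_job() are all hypothetical, and sum_job() is only a toy workload. Binding itself is not shown, since it depends on the OS and the OpenMP runtime. Later OpenMP versions (3.1 onward) define an OMP_PROC_BIND environment variable for it, and whether that is available depends on the compiler in use.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-in so the sketch still builds without OpenMP support. */
#include <time.h>
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Summary of repeated wall-clock timings, in seconds. */
struct spread { double min, max, mean; };

/* Hypothetical toy workload: sums the integers 1..n. */
struct job { int n; double result; };

void sum_job(void *p)
{
    struct job *j = p;
    int n = j->n;
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += (double)i;
    j->result = sum;
}

/* Call work(arg) reps times (reps >= 1) and summarise how the timings vary. */
struct spread measure_spread(void (*work)(void *), void *arg, int reps)
{
    struct spread s = {0.0, 0.0, 0.0};
    for (int r = 0; r < reps; r++) {
        double t0 = omp_get_wtime();
        work(arg);
        double dt = omp_get_wtime() - t0;
        if (r == 0 || dt < s.min) s.min = dt;  /* fastest run so far */
        if (r == 0 || dt > s.max) s.max = dt;  /* slowest run so far */
        s.mean += dt / reps;
    }
    return s;
}
```

If the gap between min and max is comparable to the differences seen between thread counts, a single measurement per thread count says little. Comparing the minimum over many runs is usually a steadier basis.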

