Simulating exit times of random walks in parallel: Slow down

General OpenMP discussion

Simulating exit times of random walks in parallel: Slow down

Postby PaulePanter » Fri Dec 09, 2011 4:15 pm

Dear OpenMP folks,


I am simulating how long random walks (or their area) stay negative. Each random walk has to be simulated independently so doing that in parallel sounds like a reasonable thing to do.

Adapting my current implementation was easy enough since I only wanted to run the outer for loop in parallel.

Code: Select all
#include <omp.h>
#include <stdio.h>

int main() {
  int n = 1000000;
  int rep = 1000000;

  double x = 0;

  double count[n + 2];

  /* to be comparable, but probably useless since samples
     added together in different order when run in parallel */
  srand(42);

#pragma omp parallel for
  for (int i = 0; i < rep; i++) {
    int len = 0;
    double s = 0;
    double a = 0;

    do {
      x = ((rand() % 2 ) == 0) ? 1 : -1;

      s = s + x;
      a += s;
    } while ((a < 0) && (len++ < n));

    count[len]++;
  }
  /* simple check for comparison */
  printf("%f", count[0]);
}


I was surprised though that this did not give any speed up at all but even took longer.

Code: Select all
$ gcc -std=gnu99 -fopenmp -o for-loop test.c # removed pragma line, i. e. without OpenMP
test.c: In function ‘main’:
test.c:12:3: warning: implicit declaration of function ‘srand’ [-Wimplicit-function-declaration]
test.c:20:7: warning: implicit declaration of function ‘rand’ [-Wimplicit-function-declaration]
$ time(./for-loop)
507677.000000
real    24m8.464s
user    23m56.666s
sys     0m0.664s


Code: Select all
$ gcc -std=gnu99 -fopenmp -o openmp-for-loop test.c # with OpenMP
test.c: In function ‘main’:
test.c:12:3: warning: implicit declaration of function ‘srand’ [-Wimplicit-function-declaration]
test.c:21:7: warning: implicit declaration of function ‘rand’ [-Wimplicit-function-declaration]
$ date; time(./openmp-for-loop); date
Fr 9. Dez 22:30:54 CET 2011
505596.000000
real   93m55.677s
user   97m43.782s
sys   65m19.833s
Sa 10. Dez 00:04:50 CET 2011


I read [url=http://www.viva64.com/en/a/0054/#ID0EXSAE]several things on the Web[\url] but could not figure out a possible
improvement.

[url=http://stackoverflow.com/questions/7135840/openmp-c-parallelizing-nested-for-loop-slow]One thread on Stack Overflow[\url] mentions that “small” loops could be a problem. If that is true that could be indeed a reason since half of the time the loop is quit after the first sample of rand() because half of the time it gets positive right away.

Can someone confirm this? Does someone have an idea to speed up that simulation? Or should I just run to instances of the program to utilize both cores?

Please note that the test programs from the WWW work and therefore this should not be a configuration problem. `top` also shows 167.7% CPU usage which means two cores should be used.

I also could not attach my source file and the resulting binaries for comparison since the extensions were not allowed.


Thanks,

Paul

Code: Select all
$ more /proc/version
Linux version 3.1.0-1-amd64 (Debian 3.1.4-1) (waldi@debian.org) (gcc version 4.6.2 (Debian 4.6.2-4) ) #1 SMP Tue Nov 29 19:43:43 UTC 2011

$ dpkg --print-architecture
i386

$ gcc --version # same for GOMP
gcc (Debian 4.6.2-5) 4.6.2

$ more /proc/meminfo
MemTotal:        1929980 kB
MemFree:           81224 kB
Buffers:          108524 kB
Cached:           356276 kB
SwapCached:        35016 kB
Active:          1187840 kB
Inactive:         522988 kB
[…]

$ more /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 107
model name      : AMD Athlon(tm) X2 Dual Core Processor BE-2350
stepping        : 2
cpu MHz         : 2100.000
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dno
wext 3dnow rep_good nopl extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8
_legacy 3dnowprefetch lbrv
bogomips        : 4202.43
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 107
model name      : AMD Athlon(tm) X2 Dual Core Processor BE-2350
stepping        : 2
cpu MHz         : 2100.000
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dno
wext 3dnow rep_good nopl extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8
_legacy 3dnowprefetch lbrv
bogomips        : 4202.43
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps
PaulePanter
 
Posts: 1
Joined: Fri Dec 09, 2011 1:09 pm

Re: Simulating exit times of random walks in parallel: Slow

Postby ftinetti » Tue Dec 13, 2011 1:47 pm

Hi,

Function rand() is known to be a problem for threaded parallelism (including, of course, OpenMP threads). Please take a look at several other threads in this forum from some time ago. Unfortunately, I do not know any "threaded-parallel" rand() with good performance. The Linux man pages recommend the rand_r() function as opposed to the "rand() is not reentrant or thread-safe", and I was playing around a little bit and the performance is rather poor, unfortunately.

HTH.
ftinetti
 
Posts: 575
Joined: Wed Feb 10, 2010 2:44 pm


Return to Using OpenMP

Who is online

Users browsing this forum: No registered users and 10 guests