Questions on using OMP in LDL matrix factorization

Postby VladimirProkopov » Mon Dec 03, 2007 10:55 am

Hello
I tried to use OpenMP to speed up my LDL factorization algorithm, but I gained only about 8% in speed (I have an Intel Core2Duo processor and I have only about 75%). I'm solving the matrix equation [A]*{x}={f}, where the matrix [A] is positive definite, banded and symmetric (N is the matrix dimension, r is the band size). My code looks like this:

for (int i=0; i<N; i++)
{
    .......
    #pragma omp parallel for
    for (int j=max(0,i-r); j<i; j++)
    { ... }
    .....
    #pragma omp parallel for
    for (int j=i+1; j<min(N,i+1+r); j++)
    {
        ......
        for (int k=max(0,j-r); k<j; k++)
        { ..... }
    } // end for j
    ......
} // end for i

The main problem is that I cannot parallelize the outer loop, because iteration i+1 uses the results of iteration i, and furthermore r<<N (N can range from 1e6 to 1e10, and N/r from 10000 down to 100). I suspect that the fork-join overhead is the bottleneck in this case, because the parallel regions are entered so often (twice per outer iteration).
Can anybody help me with this problem?
And is there some kind of "approach" to such problems?
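One way to test the fork-join suspicion is to open a single parallel region around the whole outer loop and use worksharing constructs inside it. The following is only a minimal sketch under stated assumptions, not the code from this post: the function name ldl_one_region, the scratch vector v, and the dense row-major storage are all hypothetical and chosen for brevity. Every thread runs the i loop, and the implicit barriers at the end of each "omp for" and "omp single" keep the iterations in order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not the poster's code. In-place banded LDL^T on dense
// row-major storage (dense only to keep the sketch short). Only the lower
// triangle of `a` is used: on exit a[i*N + k] (k < i) holds L(i,k), d holds D.
void ldl_one_region(std::vector<double>& a, std::vector<double>& d, int N, int r)
{
    assert(N > 0 && r >= 0);
    std::vector<double> v(N, 0.0);    // v[k] = L(i,k) * d[k] for the current row i

    #pragma omp parallel              // threads are created once, not once per i
    for (int i = 0; i < N; ++i) {     // every thread executes the full i loop
        const int lo = std::max(0, i - r);
        const int hi = std::min(N, i + 1 + r);

        #pragma omp for               // implicit barrier at the end
        for (int k = lo; k < i; ++k)
            v[k] = a[i * N + k] * d[k];

        #pragma omp single            // one thread computes d[i]; implicit barrier
        {
            double s = a[i * N + i];
            for (int k = lo; k < i; ++k)
                s -= a[i * N + k] * v[k];
            d[i] = s;
        }

        #pragma omp for               // rows below the pivot are independent
        for (int j = i + 1; j < hi; ++j) {
            double s = a[j * N + i];
            for (int k = std::max(0, j - r); k < i; ++k)
                s -= a[j * N + k] * v[k];
            a[j * N + i] = s / d[i];
        }
    }
}
```

This removes the repeated thread start-up, but three barriers per outer iteration remain, so for small r the gain may still be modest. Without OpenMP enabled the pragmas are ignored and the function runs serially with the same result.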
VladimirProkopov
 

Re: Questions on using OMP in LDL matrix factorization

Postby lfm » Mon Dec 03, 2007 2:27 pm

You may also be running into memory bandwidth problems. Normally you would use some kind of blocking algorithm to maximize cache reuse, which might also allow you to parallelize at a higher level. For something like this you could consider using prepackaged solvers like Intel MKL or maybe ATLAS. http://www.intel.com/cd/software/products/asmo-na/eng/266858.htm has some information.
-- Larry
lfm
 
Posts: 135
Joined: Sun Oct 21, 2007 4:58 pm
Location: OpenMP ARB

Re: Questions on using OMP in LDL matrix factorization

Postby VladimirProkopov » Tue Dec 04, 2007 3:14 am

Thank you, Larry.

Unfortunately, I cannot use MKL or a similar package because of the matrix size. I'm limited to an ordinary computer (e.g. an Intel Core2Duo with about 2 GB of RAM), and with N x r = 1e7 x 1e3 the matrix [A] takes 1e10 doubles, about 75 GB, which can only be stored on the HDD. So I'm forced to use my own algorithms, which work on matrix blocks loaded from the HDD into RAM. (I have already contacted Intel, and they said they cannot offer me anything useful for this problem.)
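The storage estimate can be checked with a few lines of code. This is a small sketch assuming 8-byte doubles and N x r stored values; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to store N rows of r doubles each.
std::uint64_t band_storage_bytes(std::uint64_t N, std::uint64_t r)
{
    // Guard against 64-bit overflow of N * r * sizeof(double).
    assert(r == 0 || N <= UINT64_MAX / sizeof(double) / r);
    return N * r * sizeof(double);
}

// The same figure expressed in GiB (2^30 bytes).
double band_storage_gib(std::uint64_t N, std::uint64_t r)
{
    return static_cast<double>(band_storage_bytes(N, r)) / (1024.0 * 1024.0 * 1024.0);
}
```

For N = 1e7 and r = 1e3 this gives 8e10 bytes, roughly 74.5 GiB, which is far more than 2 GB of RAM.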
Could you please explain what you meant by "some kind of blocking algorithm to maximize cache reuse, which might also allow you to parallelize at a higher level"?
VladimirProkopov
 

Re: Questions on using OMP in LDL matrix factorization

Postby lfm » Sat Dec 29, 2007 10:20 am

You might find this helpful; it illustrates something like what I was thinking about:
http://developers.sun.com/solaris/articles/FAST/lu_content.html
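To make the blocking idea concrete for LDL, here is a minimal sketch of a right-looking blocked factorization. It is an illustration under assumptions, not the method from the linked article (which covers LU): the name ldl_blocked, the block size nb, and the dense row-major storage are hypothetical, and a real out-of-core code would use band storage and load one panel at a time. The point is that a panel of nb columns is factored at once, so the parallel region is entered once per panel instead of once or twice per column, and the trailing update reuses the panel data while it is still in cache.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper. In-place blocked LDL^T; only the lower triangle of the
// dense row-major matrix `a` is used. On exit the strict lower triangle holds
// L (unit diagonal implied) and d holds D. r is the band size, nb the panel width.
void ldl_blocked(std::vector<double>& a, std::vector<double>& d,
                 int N, int r, int nb)
{
    assert(N > 0 && r >= 0 && nb > 0);
    for (int p = 0; p < N; p += nb) {
        const int pe = std::min(N, p + nb);   // panel is columns [p, pe)
        const int re = std::min(N, pe + r);   // rows >= re lie outside the band

        // 1. Factor the small diagonal block serially.
        for (int c = p; c < pe; ++c) {
            double s = a[c * N + c];
            for (int k = p; k < c; ++k)
                s -= a[c * N + k] * a[c * N + k] * d[k];
            d[c] = s;
            for (int i = c + 1; i < pe; ++i) {
                double t = a[i * N + c];
                for (int k = p; k < c; ++k)
                    t -= a[i * N + k] * a[c * N + k] * d[k];
                a[i * N + c] = t / s;
            }
        }

        #pragma omp parallel                  // one fork-join per panel
        {
            // 2. Rows below the block: each row of L is solved independently.
            #pragma omp for
            for (int i = pe; i < re; ++i)
                for (int c = p; c < pe; ++c) {
                    double t = a[i * N + c];
                    for (int k = p; k < c; ++k)
                        t -= a[i * N + k] * a[c * N + k] * d[k];
                    a[i * N + c] = t / d[c];
                }

            // 3. Trailing update: subtract the panel's contribution from the
            //    not yet factored part. Each thread writes only its own rows.
            #pragma omp for schedule(dynamic)
            for (int i = pe; i < re; ++i)
                for (int j = pe; j <= i; ++j) {
                    double t = 0.0;
                    for (int k = p; k < pe; ++k)
                        t += a[i * N + k] * d[k] * a[j * N + k];
                    a[i * N + j] -= t;
                }
        }
    }
}
```

With nb = 1 this degenerates to the column-by-column scheme; larger nb means fewer synchronizations and more work per parallel loop. A good value of nb depends on the cache size and would need to be measured.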


lfm