[Omp] sparsematrix X vector multiplication performance decreases 30%
dkouroun at cc.uoi.gr
dkouroun at cc.uoi.gr
Sun Mar 27 14:30:55 PST 2005
Dear list,
I have found a serious openmp performance bug!
I do not know how to overcome and I do not know
why it behaves that way!
The problem concerns a sparse-matrix vector multiplication.
When executed sequentially it needs 4.5 seconds! In parallel
though it needs almost 6.0 seconds running in 4 CPUs.
Here is the code:
// suppose you have a sparse matrix M stored in CSR format
// pdata is the array containing the data
// prows is the array of rows of pdata such that row i
// of the corresponding full matrix starts from pdata[prows[i]]
// and ends at pdata[prows[i+1]-1].
// pcols is the array whose elements are the columns of the
// corresponding full matrix, e.g. pdata[i] has column pcols[i]
// we implement the multiplication of CSR sparse matrix M with vector b.
void sparsematrixXvector(double* pdata,
double* b,
double* result,
int* prows,
int* pcols,
int dim)
{
int i, j, index;
// since there are not i-dependences it seems natural to
// parallelize that i-loop like:
#pragma omp parallel for
for (i = 0; i < dim; i++)
{
int from = prows[i];
int to = prows[i+1];
double sum = 0.0;
for (index = from; index < to; index++)
{
j = pcols[index];
sum += pdata[index]*b[j];
}
result[i] = sum;
}
}
And that turns to be a big mistake since the code in 4CPUs executes
30% slower than it executes sequentially!
1:) Can anybody please let me know what is wrong
with the above implementation?
2:) Can it be fixed someway?
3:) Is this an OpenMP bug or something like that?
Thanks in advance!
Drosos.
More information about the Omp
mailing list