[Omp] sparsematrix X vector multiplication performance decreases 30%

dkouroun at cc.uoi.gr dkouroun at cc.uoi.gr
Sun Mar 27 14:30:55 PST 2005


Dear list,

I have found a serious openmp performance bug!
I do not know how to overcome and I do not know
why it behaves that way! 

The problem concerns a sparse-matrix vector multiplication.
When executed sequentially it needs 4.5 seconds! In parallel
though it needs almost 6.0 seconds running in 4 CPUs.

Here is the code:

// suppose you have a sparse matrix M stored in CSR format
// pdata is the array containing the data
// prows is the array of rows of pdata such that row i
// of the corresponding full matrix starts from pdata[prows[i]]
// and ends at pdata[prows[i+1]-1].
// pcols is the array whose elements are the columns of the 
// corresponding full matrix, e.g. pdata[i] has column pcols[i]

// we implement the multiplication of CSR sparse matrix M with vector b.

void sparsematrixXvector(double* pdata, 
                         double* b,
                         double* result, 
                         int* prows, 
                         int* pcols,
                         int dim)

{
  int i, j, index;
  // since there are not i-dependences it seems natural to 
  // parallelize that i-loop like:
#pragma omp parallel for
  for (i = 0; i < dim; i++)
  {
     int from = prows[i];
     int to = prows[i+1];
     double sum = 0.0;
     for (index = from; index < to; index++)
     {
       j = pcols[index];
       sum += pdata[index]*b[j];
     }
     result[i] = sum;
  }
}
    
And that turns to be a big mistake since the code in 4CPUs executes
30% slower than it executes sequentially!

1:) Can anybody please let me know what is wrong 
    with the above implementation?
2:) Can it be fixed someway?
3:) Is this an OpenMP bug or something like that?

Thanks in advance!
Drosos.





More information about the Omp mailing list