## Newbie: question about OpenMP, Visual Studio and performance


Good morning all.
Recently I tried writing some parallel code with OpenMP as supported by Microsoft Visual Studio 2008.
My first test was a matrix product, but the performance I measured puzzles me.
I timed the computation and obtained the following results:

A. Product of square matrices (700x700)
1. [Debug] Sequential_time_spent = 8600 ms (mean value) (the relative difference does not change noticeably with matrix size)
Parallel_time_spent = 7300 ms (mean value)

2. [Release] Sequential_time_spent = 3900 ms (mean value) (the relative difference does not change noticeably with matrix size)
Parallel_time_spent = 5000 ms (mean value)

Strange! Maybe I'm using the OpenMP parallel directives incorrectly.

Every matrix is allocated on the heap.
The code is really simple:

```c
typedef struct _Matrix
{
    int r;
    int c;
    double **m;
} Matrix;

/////////////////////////////
//Sequential product
/////////////////////////////
void product(Matrix* m1, Matrix* m2, Matrix* m3)
{
    for(int i=0; i<m1->r; i++)
    {
        for(int j=0; j<m2->c; j++)
        {
            m3->m[i][j] = 0.0;
            for(int k=0; k<m1->c; k++)
            {
                //Sleep(1);
                m3->m[i][j] += m1->m[i][k] * m2->m[k][j];
            }
        }
    }
}

/////////////////////////////
//Parallel product
/////////////////////////////
void product_p(Matrix* m1, Matrix* m2, Matrix* m3)
{
    #pragma omp parallel for
    for(int i=0; i<m1->r; i++)
    {
        for(int j=0; j<m2->c; j++)
        {
            m3->m[i][j] = 0.0;
            for(int k=0; k<m1->c; k++)
            {
                //Sleep(1);
                m3->m[i][j] += m1->m[i][k] * m2->m[k][j];
            }
        }
    }//end parallel for
}
```
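
For anyone who wants to reproduce the timings, here is a minimal sketch of a harness around the parallel product. The post does not show the allocation or timing code, so everything beyond `Matrix` and `product_p` is an assumption: the helper names `matrix_alloc`, `matrix_free`, `wall_seconds`, `best_time` and the `Kernel` typedef are hypothetical, and the matrix is stored as a row-pointer array over one contiguous block. `omp_get_wtime()` is the standard OpenMP wall-clock call; the `_OPENMP` guard lets the file still build when OpenMP is switched off.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>   /* only present when OpenMP is enabled (/openmp, -fopenmp) */
#endif

typedef struct _Matrix
{
    int r;
    int c;
    double **m;
} Matrix;

/* Hypothetical helper: allocate a zeroed r x c matrix on the heap.
   Returns 0 on success, -1 on failure. */
int matrix_alloc(Matrix *a, int r, int c)
{
    a->r = r;
    a->c = c;
    a->m = malloc((size_t)r * sizeof *a->m);
    if (a->m == NULL) return -1;
    a->m[0] = calloc((size_t)r * (size_t)c, sizeof **a->m); /* one contiguous block */
    if (a->m[0] == NULL) { free(a->m); a->m = NULL; return -1; }
    for (int i = 1; i < r; i++)
        a->m[i] = a->m[0] + (size_t)i * (size_t)c;           /* row i starts here */
    return 0;
}

/* Hypothetical helper: release a matrix created by matrix_alloc. */
void matrix_free(Matrix *a)
{
    if (a->m != NULL) { free(a->m[0]); free(a->m); a->m = NULL; }
}

/* Wall-clock seconds. Without OpenMP this falls back to clock(), which
   reports CPU time and is only meaningful for a single-threaded build. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Same kernel as in the post: rows of the result are split across threads. */
void product_p(Matrix *m1, Matrix *m2, Matrix *m3)
{
    #pragma omp parallel for
    for (int i = 0; i < m1->r; i++)
        for (int j = 0; j < m2->c; j++) {
            m3->m[i][j] = 0.0;
            for (int k = 0; k < m1->c; k++)
                m3->m[i][j] += m1->m[i][k] * m2->m[k][j];
        }
}

typedef void (*Kernel)(Matrix *, Matrix *, Matrix *);

/* Run the kernel reps times and return the best wall-clock time in seconds. */
double best_time(Kernel f, Matrix *a, Matrix *b, Matrix *out, int reps)
{
    assert(reps > 0);
    double best = -1.0;
    for (int n = 0; n < reps; n++) {
        double t0 = wall_seconds();
        f(a, b, out);
        double t = wall_seconds() - t0;
        if (best < 0.0 || t < best) best = t;
    }
    return best;
}
```

Calling `best_time(product, ...)` and `best_time(product_p, ...)` on the same three matrices gives two numbers that can be compared directly. Taking the best of several repetitions is one common choice; the mean, as used in the post, works too.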
Southerlies

Posts: 2
Joined: Sat Jun 21, 2008 3:21 am

### Re: Newbie: question about OpenMP, Visual Studio and performance

I don't have a copy of Microsoft Visual Studio 2008 to use. But just for fun, I ran your program on an Intel chipset running Solaris using the Sun Studio compiler. For a 700x700 matrix allocated on the heap, the sequential run took 3.658442 seconds and the parallel run using 4 threads took 0.952108 seconds. A totally apples and oranges comparison, but at least it shows that your code works and that, on at least one hardware/software combination, it gives a nice speedup.
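
Expressed as speedup $S = T_{\text{seq}} / T_{\text{par}}$ and parallel efficiency $E = S / p$, the numbers posted in this thread work out as below. This is only arithmetic on the quoted timings; $p = 4$ is the thread count of the Sun Studio run, and no efficiency is given for the Visual Studio runs because their thread count is not stated in the first post.

$$
\begin{aligned}
S_{\text{Sun Studio}} &= \frac{3.658442}{0.952108} \approx 3.84, &\quad E &= \frac{3.84}{4} \approx 0.96 \\
S_{\text{VS2008 Debug}} &= \frac{8600}{7300} \approx 1.18 \\
S_{\text{VS2008 Release}} &= \frac{3900}{5000} = 0.78
\end{aligned}
$$

A value below 1, as in the Release build, means the parallel run was slower than the sequential one.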
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

### Re: Newbie: question about OpenMP, Visual Studio and performance

> ejd wrote: I don't have a copy of Microsoft Visual Studio 2008 to use. But just for fun, I ran your program on an Intel chipset running Solaris using the Sun Studio compiler. For a 700x700 matrix allocated on the heap, the sequential run took 3.658442 seconds and the parallel run using 4 threads took 0.952108 seconds. A totally apples and oranges comparison, but at least it shows that your code works and that, on at least one hardware/software combination, it gives a nice speedup.

Many thanks for your test on those apples and oranges.
I failed to give any information about the architecture I tested on, and that was probably my mistake.

In fact, although OpenMP detected two processors, I only have a Pentium 4 with hyperthreading, and I think this could be the reason for the 'speed-down' I measured.
Running the same code on a Core 2 Duo, i.e. a real two-processor machine, I got the 'expected' results.

Nevertheless, another question arises, about how OpenMP behaves on hyperthreaded machines. I have read that a speedup is possible even when the processor is only hyperthreaded, but this experiment showed me that this is not always true. That would not be a problem if I could detect, using OpenMP calls, whether the architecture is hyperthreaded or truly multiprocessor. Is it possible to do that?
Southerlies

Posts: 2
Joined: Sat Jun 21, 2008 3:21 am

### Re: Newbie: question about OpenMP, Visual Studio and performance

You can see a speedup on a hyperthreaded machine, in some cases. It is the usual thing: "your mileage may vary depending on ...". Unfortunately, there is no way using OpenMP to distinguish between dual cores, dual chips, or .... You have to remember that the OpenMP spec is just that: a specification, not a standard. It is put together by a committee made up of representatives from the different vendors (and, more recently, representatives from user groups). As such, what goes into the spec is what the majority agrees with. While Intel chips are out there in great quantity, Intel has only one vote. Most other vendors do not have hyperthreading, so they see no reason to support OpenMP calls that determine whether a chip is hyperthreaded. That said, there has been a great deal of discussion about formalizing a way for a user to determine the configuration he is running on, so that a program can try to "map" to it and perform well.
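
Since OpenMP itself cannot tell the cases apart, one workaround is to let the user say how many threads to use and fall back to what OpenMP reports. The sketch below is a hedged illustration, not an OpenMP feature: the environment variable `APP_PHYSICAL_CORES` and the functions `logical_processors`, `choose_thread_count` and `configure_threads` are hypothetical names. Only `omp_get_num_procs()` and `omp_set_num_threads()` are real OpenMP calls, and the former counts hyperthreads as processors, as seen on the Pentium 4 above.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>   /* only present when OpenMP is enabled (/openmp, -fopenmp) */
#endif

/* Processors as OpenMP sees them (hyperthreads included);
   1 when the program is built without OpenMP. */
int logical_processors(void)
{
#ifdef _OPENMP
    return omp_get_num_procs();
#else
    return 1;
#endif
}

/* Use the user's hint if it is a whole number in [1, logical];
   otherwise use every logical processor. */
int choose_thread_count(int logical, const char *hint)
{
    assert(logical >= 1);
    if (hint != NULL) {
        char *end;
        long n = strtol(hint, &end, 10);
        if (end != hint && *end == '\0' && n >= 1 && n <= logical)
            return (int)n;
    }
    return logical;
}

/* Read the hypothetical APP_PHYSICAL_CORES variable and apply the result.
   Returns the thread count that was requested from OpenMP. */
int configure_threads(void)
{
    int n = choose_thread_count(logical_processors(),
                                getenv("APP_PHYSICAL_CORES"));
#ifdef _OPENMP
    omp_set_num_threads(n);
#endif
    return n;
}
```

On a hyperthreaded single-core machine, setting `APP_PHYSICAL_CORES=1` before the run would make `configure_threads()` request one thread. The standard `OMP_NUM_THREADS` environment variable achieves the same effect without any code.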
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am