[Omp] aligned memory erratic behaviour when using parallel for,
Help!
Kang Su Gatlin
kanggatl at microsoft.com
Wed May 11 08:22:00 PDT 2005
Can't tell for sure what the problem is, but can you blit CDemo (copy it byte for byte) and still get the correct semantics? In the second example you're blitting CDemo, whereas in the first example you're constructing it. Does your code work fine when compiled without /openmp?
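To make the construct-versus-blit distinction concrete, here is a minimal sketch, not the poster's actual code. The CDemo below is a hypothetical stand-in that owns heap memory, and alloc_aligned, free_aligned, make_demos and destroy_demos are names invented for this illustration. It assumes a C++17 compiler for std::aligned_alloc on non-MSVC toolchains. The point is only that placement new runs the copy constructor inside the aligned block, which a memcpy does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>
#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical stand-in for CDemo: it owns heap memory, so a byte-for-byte
// copy would leave two objects sharing (and later double-freeing) one buffer.
struct CDemo {
    std::vector<float> particles;
    CDemo() : particles(4, 1.0f) {}
    float calculate() const {
        float sum = 0.0f;
        for (float p : particles) sum += p;
        return sum;
    }
};

// Illustrative wrappers: _aligned_malloc on MSVC, std::aligned_alloc elsewhere.
inline void* alloc_aligned(std::size_t bytes, std::size_t alignment) {
#if defined(_MSC_VER)
    return _aligned_malloc(bytes, alignment);
#else
    // std::aligned_alloc expects the size to be a multiple of the alignment.
    std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
#endif
}

inline void free_aligned(void* p) {
#if defined(_MSC_VER)
    _aligned_free(p);
#else
    std::free(p);
#endif
}

// Copy-construct `count` objects from `proto` into 16-byte aligned raw memory.
inline CDemo* make_demos(const CDemo& proto, int count) {
    assert(count > 0);
    void* raw = alloc_aligned(count * sizeof(CDemo), 16);
    if (!raw) throw std::bad_alloc();
    CDemo* demo = static_cast<CDemo*>(raw);
    for (int i = 0; i < count; ++i)
        new (&demo[i]) CDemo(proto);   // placement new: runs the copy constructor
    return demo;
}

// Objects built with placement new must be destroyed explicitly before the
// raw block is released.
inline void destroy_demos(CDemo* demo, int count) {
    for (int i = count - 1; i >= 0; --i)
        demo[i].~CDemo();
    free_aligned(demo);
}
```

Calling make_demos(tempDemo, 2) and later destroy_demos(demo, 2) would take the place of the _aligned_malloc plus memcpy loop. If CDemo turns out to be trivially copyable, the memcpy version is equivalent and the cause lies elsewhere.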
Thanks,
Kang Su Gatlin
Visual C++ Program Manager
________________________________
From: Omp-bounces at openmp.org on behalf of Alchemist
Sent: Wed 5/11/2005 7:35 AM
To: Omp at openmp.org
Subject: [Omp] aligned memory erratic behaviour when using parallel for,Help!
Hi ,
I would like to ask about a problem I am seeing. I am using OpenMP to distribute work across threads in my program. I use the parallel for directive on a for loop with 2 iterations (I could easily use the sections directive here). Each of those two iterations contains a big chunk of calculations: collision detection and response, a particle system, and other work. Each iteration has its own data. I have found that when I define and initialise the objects using an STL vector, the program runs predictably, but not when I use aligned malloc with 16-byte alignment.
ex. (for the STL version)
//Definition, where CDemo is our class
vector< CDemo > demo;
//first free the vector if it has something
//Initialisation
for(int i = 0; i < 2; i++)
    demo.push_back(tempDemo); //where tempDemo has the initial
                              //values of the demo class, defined
                              //as "CDemo tempDemo;"
//RunTime
#pragma omp parallel for
for(int i = 0; i < 2; i++)
    demo[i].calculate();
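For reference, a self-contained sketch of the vector version follows. The CDemo here is a hypothetical placeholder (the real class is not shown in this message), and run_demos is a name invented for the example. Each iteration touches only its own element, so the loop has no shared writes. If the compiler is run without OpenMP support the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical placeholder for CDemo; calculate() stands in for the
// collision and particle work described above.
struct CDemo {
    int steps;
    CDemo() : steps(0) {}
    void calculate() {
        for (int k = 0; k < 1000; ++k)
            ++steps;
    }
};

// Builds `count` copies of a default CDemo and runs calculate() on each,
// one loop iteration per object.
inline std::vector<CDemo> run_demos(int count) {
    assert(count >= 0);
    CDemo tempDemo;                 // holds the initial values
    std::vector<CDemo> demo;
    demo.reserve(count);
    for (int i = 0; i < count; ++i)
        demo.push_back(tempDemo);   // copy-constructs each element

    // A signed int loop index keeps this valid for OpenMP 2.0 compilers.
    #pragma omp parallel for
    for (int i = 0; i < count; ++i)
        demo[i].calculate();

    return demo;
}
```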
But when I define the objects through a pointer and allocate the memory with "_aligned_malloc" (released with "_aligned_free") to get aligned memory blocks, I get more erratic behaviour from my program.
ex. (for the aligned memory version)
//Definition
CDemo *demo;
//free demo with "_aligned_free(demo)" if it is not NULL
//allocate memory
demo = (CDemo*)_aligned_malloc(2*sizeof(CDemo), 16);
//using the CRT-optimised memcpy ("#pragma function(memcpy)") for
//aligned memory chunks
for(int i = 0; i < 2; i++)
    memcpy(&demo[i], &tempDemo, sizeof(CDemo));
//RunTime is the same as in the vector example
//------------------------------------------------------------//
Am I doing something wrong? Do I have to change my schedule chunk to 16 or something for the aligned memory part?
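On the schedule question, a small sketch may help separate the two ideas. The chunk size in a schedule clause counts loop iterations, not bytes, so it has no bearing on memory alignment. What does matter for alignment is that only the start of the block is guaranteed to be 16-byte aligned; the second element is aligned only if sizeof(CDemo) is a multiple of 16. The CDemo below is a hypothetical stand-in declared with alignas(16) (a C++11 feature, assumed here for illustration), and is_aligned, all_elements_aligned and run are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CDemo. alignas(16) makes the compiler pad
// sizeof(CDemo) up to a multiple of 16, so every array element is aligned.
struct alignas(16) CDemo {
    float position[3];
    int hits;
    CDemo() : position{0.0f, 0.0f, 0.0f}, hits(0) {}
    void calculate() { ++hits; }
};

static_assert(sizeof(CDemo) % 16 == 0, "array elements stay 16-byte aligned");

// True if the address p is a multiple of `alignment`.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Checks every element of the array, not just the first one.
inline bool all_elements_aligned(const CDemo* demo, int count,
                                 std::size_t alignment) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i)
        if (!is_aligned(&demo[i], alignment))
            return false;
    return true;
}

inline void run(CDemo* demo, int count) {
    // schedule(static, 1): chunks of one iteration each, handed out to the
    // threads in turn. The "1" is a number of iterations, not a byte count.
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < count; ++i)
        demo[i].calculate();
}
```

With only two iterations, the default schedule and schedule(static, 1) would normally distribute the work the same way on two threads, so changing the chunk size is unlikely to affect the behaviour described.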
I am using the OpenMP implementation provided with the VS2005 beta, but I have also compiled the program with the Intel compiler on WinXP Pro and I am getting similar results.
I would appreciate any help.
--
Best regards,
Emmanouil Hatjissavvas mailto:my at daedalus.plus.com
_______________________________________________
Omp mailing list
Omp at openmp.org
http://openmp.org/mailman/listinfo/omp_openmp.org