I suspect that there is some overhead in creating and managing the threads, and this exceeds the time saving for the simple problem.
Did I use OpenMP correctly, is the overhead issue known and real, and is there any way to speed things up even more? Is this the best structure for this kind of problem?
Well, I thought it would be simple. I added to my C++ code
retint = setenv( OMP_WAIT_POLICY, passive, 1 ); // try to speed up OpenMP stuff
and got the error message
Error 1 error C2065: 'OMP_WAIT_POLICY' : undeclared identifier c:\synopsysv14\synopsys.cpp 725 1 SYNOPSYS200
My headers include
What did I do wrong? Can I set this from a Fortran code instead of from C++?
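On the C2065 error: the compiler reads OMP_WAIT_POLICY and passive as C++ identifiers because they are not quoted. Both arguments need to be string literals. Separately, setenv() is a POSIX function and the Microsoft C runtime does not provide it; _putenv_s() is the usual equivalent there. Here is a minimal sketch, where set_env_var and get_env_var are hypothetical helper names, not anything from the original code. Whether the OpenMP runtime honors a value set from inside the program depends on when it reads its environment, so treat this as something to try, not a guaranteed fix.

```cpp
#include <cstdlib>   // std::getenv, setenv (POSIX), _putenv_s (MSVC)
#include <string>

// Hypothetical helper: set an environment variable for the current process.
// Returns 0 on success on both platforms.
int set_env_var(const char* name, const char* value) {
#ifdef _WIN32
    return _putenv_s(name, value);   // the MSVC runtime has no setenv()
#else
    return setenv(name, value, 1);   // third argument 1 = overwrite existing
#endif
}

// Hypothetical helper: read the variable back; empty string if it is unset.
std::string get_env_var(const char* name) {
    const char* v = std::getenv(name);
    return v ? std::string(v) : std::string();
}
```

The call would then be retint = set_env_var("OMP_WAIT_POLICY", "passive"); placed as early as possible, before any OpenMP parallel region runs.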
AMD FX-8120 Eight-Core Processor
CORE   0     1     2     3     4     6     8
Time  10.0  5.41  2.88  2.47  2.49  2.51  2.54
dondilworth wrote: I guess that going through the parallel loops is very fast, but when I come in from the top with another batch of data maybe the system has to set up all the threads all over again. This is only a guess, since I don't know what's going on in the background -- and that's why I submitted the question. Or does it only have to expend the overhead once?
I suggest setting the environment variable before starting the program, i.e. on the command line or in the shell, because the OpenMP environment and threads may already be set up by the time the first line of the program's code executes.
This is helpful, but being a dummy I don't know how to do that. Where, exactly, in the Property Pages do I change what to set that environment variable?
It might be useful to use OMP_GET_WTIME() to figure out how much time is spent in the parallel region versus the rest of the code.
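As a rough illustration of that timing idea, here is a sketch in C++, where the routine is spelled omp_get_wtime(). It times the same parallel loop on several consecutive entries, so you can see whether the first entry costs more than the later ones. The function name time_region_calls and the loop body are made up for the example; the real work would go in place of the summation. The fallback to std::chrono is only there so the sketch still builds when OpenMP is not enabled.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
static double wall_time() { return omp_get_wtime(); }   // seconds, wall clock
#else
#include <chrono>
// Fallback when the compiler is not run with OpenMP enabled.
static double wall_time() {
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
#endif

// Hypothetical example: enter the same parallel loop `repeats` times and
// record the wall-clock seconds spent in each entry. `sum_out` receives the
// result of the last pass so the work cannot be optimized away.
std::vector<double> time_region_calls(int n, int repeats, double& sum_out) {
    std::vector<double> data(n);
    for (int i = 0; i < n; ++i) data[i] = static_cast<double>(i);

    std::vector<double> elapsed;
    for (int r = 0; r < repeats; ++r) {
        double sum = 0.0;
        const double t0 = wall_time();
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; ++i) {
            sum += data[i];          // stand-in for the real per-item work
        }
        const double t1 = wall_time();
        elapsed.push_back(t1 - t0);
        sum_out = sum;
    }
    return elapsed;
}
```

If the first entry in the returned list is much larger than the rest, that points to a one-time thread start-up cost. If every entry is similar, the overhead is being paid on each pass through the region. Comparing the total against a timer wrapped around the whole program shows how much of the run is spent outside the parallel region.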