Why doesn't OpenMP improve the performance of a large-array program?


Postby suyingych » Tue Aug 26, 2008 2:13 am

Hi all,
On Windows, I wrote a C++ program that initializes large arrays. The stack reserve size is set to 5000000. When I add OpenMP, the initialization takes almost the same time as without it.
What could be the problem?

The program is given below. The sections do not share any destination variables.

#include <stdio.h>
#include <string.h>   // memcpy
#include <iostream>
#include <windows.h>  // DWORD, GetTickCount
#include <omp.h>

typedef struct _SCPLX
{
    float re; // real part
    float im; // imaginary part
} SCplx;

struct _SIGNAL_SAMPLES
{
    SCplx SumReturnWave[1000000]; // sum channel samples
    SCplx AziReturnWave[1000000]; // azimuth difference channel samples
    SCplx EleReturnWave[1000000]; // elevation difference channel samples
};

SCplx m_pMatchWave[1000000];
SCplx m_pMatchWaveSum[1000000];
SCplx m_pMatchWaveAzi[1000000];
SCplx m_pMatchWaveEle[1000000];

SCplx m_pTempWave[3000000]; //su note
SCplx m_pTempWaveSum[1000000]; //su add
SCplx m_pTempWaveAzi[1000000]; //su add
SCplx m_pTempWaveEle[1000000]; //su add

DWORD dwStartmemeset =GetTickCount();
#pragma omp parallel sections
{
    #pragma omp section
    {
        memcpy(m_pMatchWaveSum, m_pMatchWave, MAXSAMPLES*sizeof(SCplx));
        memcpy(m_pTempWaveSum, m_pPCOutput.SumReturnWave, nTempCareAreaSampleCountSum*sizeof(SCplx));
    }

    #pragma omp section
    {
        memcpy(m_pMatchWaveAzi, m_pMatchWave, MAXSAMPLES*sizeof(SCplx));
        memcpy(m_pTempWaveAzi, m_pPCOutput.AziReturnWave, nTempCareAreaSampleCountAzi*sizeof(SCplx));
    }

    #pragma omp section
    {
        memcpy(m_pMatchWaveEle, m_pMatchWave, MAXSAMPLES*sizeof(SCplx));
        memcpy(m_pTempWaveEle, m_pPCOutput.EleReturnWave, nTempCareAreaSampleCountEle*sizeof(SCplx));
    }
}

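DWORD t_memeset = GetTickCount() - dwStartmemeset;
FILE* pFiletimesuomptest1 = fopen("E:\\080707ompmemmemfft_3duan204040.txt" , "a+");
if (pFiletimesuomptest1 != NULL)
{
    // DWORD is an unsigned long, so print it with %lu
    fprintf(pFiletimesuomptest1, "%lu,%d\n", (unsigned long)t_memeset, a111);
    fclose(pFiletimesuomptest1);
}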

Thank You,

Re: Why doesn't OpenMP improve the performance of a large-array program?

Postby lfm » Thu Aug 28, 2008 12:03 pm

Two likely reasons:
1. It doesn't take very long to initialize 1,000,000 words.
2. memcpy is probably already using the machine's entire memory bandwidth, so running multiple threads doesn't help and probably hurts.
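
To put rough numbers on the first point: SCplx holds two floats, which is 8 bytes on common platforms, so one memcpy of 1,000,000 samples moves about 8 MB. GetTickCount typically has a resolution of around 10 to 16 ms, so a handful of such copies can finish within one or two timer ticks. The sketch below is one way to measure a single copy more precisely, by repeating it and using a steady clock. The helper names copy_bytes and average_copy_ms are made up for this illustration, and the count of 1,000,000 is an assumption, since MAXSAMPLES is not shown in the post.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Same layout as the SCplx struct in the question: two floats.
struct SCplx {
    float re;
    float im;
};

// Bytes written by one memcpy of `count` samples.
std::size_t copy_bytes(std::size_t count) {
    return count * sizeof(SCplx);
}

// Hypothetical helper: average time in milliseconds of one memcpy of
// `count` samples, measured over `reps` repetitions with a steady clock.
double average_copy_ms(std::size_t count, int reps) {
    assert(count > 0 && reps > 0);
    std::vector<SCplx> src(count, SCplx{1.0f, -1.0f});
    std::vector<SCplx> dst(count);

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        std::memcpy(dst.data(), src.data(), copy_bytes(count));
    }
    const auto stop = std::chrono::steady_clock::now();

    // Sanity check that the last element was actually copied.
    assert(dst[count - 1].re == 1.0f && dst[count - 1].im == -1.0f);

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

Calling average_copy_ms(1000000, 100) gives a per-copy time that can be compared against the timer resolution. If the whole timed region is only a few ticks long, the serial and OpenMP versions will look the same regardless of what the threads do.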

Look at benchmarks like STREAM to understand this better. The result depends heavily on the hardware and its memory bandwidth. Some machines might see a speedup. However, the parallel overhead might be too high even if the memory bandwidth is there.
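
As a rough illustration of what a STREAM-style measurement looks like, here is a minimal sketch of a "copy" kernel that can run serially or with an OpenMP parallel loop. It is not the STREAM benchmark itself; the function name copy_rate_gbs and the use_threads flag are made up for this example. Build with OpenMP enabled (for example /openmp with MSVC or -fopenmp with GCC); without it the pragma is ignored and both calls run serially.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical STREAM-style "copy" kernel: dst[i] = src[i].
// When use_threads is true, the loop is split across OpenMP threads.
// Returns the achieved rate in GB/s, counting bytes read plus bytes written.
double copy_rate_gbs(const std::vector<float>& src, std::vector<float>& dst,
                     bool use_threads) {
    assert(src.size() == dst.size() && !src.empty());
    const int n = static_cast<int>(src.size());
    const float* in = src.data();
    float* out = dst.data();

    const auto start = std::chrono::steady_clock::now();
    #pragma omp parallel for if(use_threads)
    for (int i = 0; i < n; ++i) {
        out[i] = in[i];
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double> seconds = stop - start;
    const double bytes = 2.0 * sizeof(float) * static_cast<double>(n);
    return bytes / seconds.count() / 1e9;
}
```

Running it once with use_threads set to false and once with true, on arrays much larger than the cache, shows whether a second thread adds any bandwidth on a given machine. If the two rates are close, the copy is already limited by memory rather than by the CPU, which is the situation described above.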
