[Omp] First Touch initialization
Francisco Jesús Martínez Serrano
franjesus at gmail.com
Wed Mar 8 03:06:13 PST 2006
How does first-touch initialization work?
According to
http://www.ncsa.uiuc.edu/UserInfo/Consulting/Tips/Memory_Placement.html
initializing shared arrays in parallel at the very beginning of the program
distributes the contents of each array across nodes according to the access
pattern. Hence, on NUMA machines, later accesses are much faster because they
are node-local.
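
To make the tip concrete, here is a minimal C sketch of the pattern (our real code is Fortran). The names alloc_first_touch and sum_array are made up for this example, and schedule(static) is an assumption, chosen so that the initialization loop and the later compute loop split the index range among threads in the same way.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate n doubles and write them for the first time inside a parallel
 * loop. The idea described in the tip is that each part of the array ends
 * up on the node of the thread that first wrote it.
 * Hypothetical helper, not taken from the NCSA page. */
double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* Assumption: static schedule, so thread t always gets the same
     * contiguous chunk of indices in every loop over this array. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        a[i] = 0.0;   /* first write to this element */

    return a;
}

/* Later work uses the same schedule, so each thread mostly reads the
 * chunk it initialized itself. */
double sum_array(const double *a, size_t n)
{
    double s = 0.0;

    #pragma omp parallel for schedule(static) reduction(+:s)
    for (long i = 0; i < (long)n; i++)
        s += a[i];

    return s;
}
```

Built without OpenMP support, the pragmas are ignored and the code runs serially, so the whole array is first written by a single thread.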
We have tried it and it does work (Intel Fortran compiler v9 on a 4-way
Opteron), but I don't understand why.
In my C experience, pointer arithmetic such as elem1000=*(mat+1000); works
because an array is allocated as a single contiguous block. However,
according to the kernel's memory map, each node owns its own contiguous block
of addresses, as in:
0·····node0·····1GB|·····node1·····2GB|·····node2·····3GB|·····node3·····4GB
I guess I'm missing something fundamental about NUMA here. Otherwise
an array couldn't be partitioned among nodes and still be accessible by
the simple pointer method (which I believe is what Fortran does).
Any clues?
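
For reference, this is the kind of access I mean, as a minimal C sketch. The names elem_at and pages_spanned are made up for this example. The second function only counts how many pages an array would cover. It rests on the assumption that placement is decided per page rather than per array, which I have not verified and which may or may not be the piece I am missing.

```c
#include <stddef.h>
#include <unistd.h>

/* Read element k through plain pointer arithmetic,
 * as in elem1000=*(mat+1000); */
double elem_at(const double *mat, size_t k)
{
    return *(mat + k);   /* same element as mat[k] */
}

/* Number of pages needed to hold n doubles, rounded up, assuming the
 * array starts on a page boundary. Hypothetical helper. */
size_t pages_spanned(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);   /* page size in bytes */
    size_t bytes = n * sizeof(double);

    return (bytes + page - 1) / page;
}
```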