The OpenMP 4.0 API Specification is released with Significant New Standard Features
The OpenMP 4.0 API supports the programming of accelerators, SIMD programming, and better optimization using thread affinity
The OpenMP Consortium has released OpenMP API 4.0, a major upgrade of the OpenMP API standard language specifications. Among several major enhancements, this release provides a new mechanism to describe regions of code where data and/or computation should be moved to another computing device.
Bronis R. de Supinski, Chair of the OpenMP Language Committee, stated that “OpenMP 4.0 API is a major advance that adds two new forms of parallelism in the form of device constructs and SIMD constructs. It also includes several significant extensions for the loop-based and task-based forms of parallelism already supported in the OpenMP 3.1 API.”
The 4.0 specification is now available on the »OpenMP Specifications page.
Standard for parallel programming extends its reach
With this release, the OpenMP API specification, the de-facto standard for parallel programming on shared memory systems, continues to extend its reach beyond pure HPC to include DSPs, real time systems, and accelerators. The OpenMP API aims to provide high-level parallel language support for a wide range of applications, from automotive and aeronautics to biotech, automation, robotics and financial analysis.
New features in the OpenMP 4.0 API include:
· Support for accelerators. The OpenMP 4.0 API specification effort included significant participation by all the major vendors in order to support a wide variety of compute devices. The OpenMP API provides mechanisms to describe regions of code where data and/or computation should be moved to another computing device. Several prototypes of the accelerator proposal have already been implemented.
· SIMD constructs to vectorize both serial and parallelized loops. With the advent of SIMD units in all major processor chips, portable support for accessing them is essential. The OpenMP 4.0 API provides mechanisms to describe when multiple iterations of a loop can be executed concurrently using SIMD instructions, and to describe how to create versions of functions that can be invoked across SIMD lanes.
· Error handling. OpenMP 4.0 API defines error handling capabilities to improve the resiliency and stability of OpenMP applications in the presence of system-level, runtime-level, and user-defined errors. Features to abort parallel OpenMP execution cleanly have been defined, based on conditional cancellation and user-defined cancellation points.
· Thread affinity. OpenMP 4.0 API provides mechanisms to define where to execute OpenMP threads. Platform-specific data and algorithm-specific properties are separated, offering a deterministic behavior and simplicity in use. The advantages for the user are better locality, less false sharing and more memory bandwidth.
· Tasking extensions. The OpenMP 4.0 API provides several extensions to its task-based parallelism support. Tasks can be grouped to support deep task synchronization, and task groups can be aborted to reflect completion of cooperative tasking activities such as search. Task-to-task synchronization is now supported through the specification of task dependencies.
· Support for Fortran 2003. The Fortran 2003 standard adds many modern computer language features. Having these features in the specification allows users to parallelize Fortran 2003 compliant programs. This includes interoperability of Fortran and C, which is one of the most popular features in Fortran 2003.
· User-defined reductions. Previously, OpenMP API only supported reductions with base language operators and intrinsic procedures. With OpenMP 4.0 API, user-defined reductions are now also supported.
· Sequentially consistent atomics. A clause has been added to allow a programmer to enforce sequential consistency when a specific storage location is accessed atomically.
This represents collaborative work by many of the brightest minds in industry, research, and academia, building on the consensus of 26 members. We strive to deliver high-level parallelism that is portable across three widely implemented general-purpose languages, is productive for HPC and consumers, and delivers highly competitive performance. I want to congratulate all the members for coming together to create such a momentous advancement in parallel programming, under such tight constraints and industry challenges.
With this release, the OpenMP API will move immediately forward to the next release to bring even more usable parallelism to everyone. – Michael Wong, CEO OpenMP ARB.
- Neural Network Implementation Using CUDA and OpenMP
- New video on OpenMP: Dr. Clay Breshears, Intel
- OpenMP Programming on Intel® Xeon Phi™ Coprocessors: An Early Performance Comparison
Tim Cramer, Dirk Schmidl, Michael Klemm, Dieter an Mey
Recent publications of interest regarding OpenMP on the web:
- New release of the GraphicsMagick Image Processing System
- Portuguese tutorial about OpenMP from Paulo Penteado, Post doctoral researcher at Departamento de Astronomia, Instituto de Astronomia, Geofísica e Ciências Atmosféricas, Universidade de São Paulo (IAG/USP).
- Debbie Greenstreet blogs about OpenMP and multicore systems on Texas Instruments Engineer to Engineer Multicore Mix.
24 vendors and research organizations now collaborating on developing shared-memory parallel programming model
Champaign, Illinois — May 2, 2013 — The Barcelona Supercomputing Center (BSC) has joined the OpenMP ARB, a group of leading hardware and software vendors and research organizations creating the standard for the most popular shared-memory parallel programming model in use today.
“We are proud to share our 15 years’ experience developing support for parallel programming models within the OpenMP community,” says Mateo Valero, director of BSC. “Our researchers have been involved in OpenMP since the beginning, through cOMPunity. BSC has participated in the definition of the tasking model, most recently with the inclusion of task dependences.”
“I look forward to BSC continuing their excellent technical contributions from the past into the future,” says Michael Wong, OpenMP CEO.
Barcelona Supercomputing Center is an HPC research center that hosts a significant group of computer science researchers and collaborates closely with the IT industry. Its computer science research covers all levels, from computer architecture to parallel applications.
The OpenMP Architecture Review Board (ARB) now has 13 permanent members and 11 auxiliary members. Permanent members are vendors creating products for OpenMP. These are AMD, CAPS-Enterprise, Convey Computer, Cray, Fujitsu, HP, IBM, Intel, NEC, NVIDIA, Oracle Corporation, The Portland Group, Inc., and Texas Instruments. Auxiliary members are organizations with an interest in the standard but that do not sell OpenMP products. They are ANL, ASC/LLNL, BSC, cOMPunity, EPCC, LANL, NASA, ORNL, RWTH Aachen University, Sandia National Lab and the Texas Advanced Computing Center.
Michael Wong, CEO of the OpenMP Architecture Review Board (ARB), comments in his blog about the forthcoming OpenMP 4.0 specifications:
So much has been happening in OpenMP since SC12 that I hope to capture it all in this post while flying back from a C++ Standard meeting.
When we last spoke, you heard that OpenMP has introduced a Technical Report process to improve its agility at issuing interim specifications, and more importantly to obtain user feedback. We used that process to introduce TR1 for accelerator support. We also released Release Candidate 1 which had 31 feature/defect fixes.
Since then, we had the Houston F2F meeting in January 2013, where we gathered to complete the work of:
- Incorporating feedback on accelerators and strengthening NVIDIA support, where synchronization between teams is not implicit
- Completing work on cancellation
- Improving taskgroup support
- Improving Fortran 2003 support
- Fully specifying affinity
- Improving SIMD
- Generalizing tooling and debugger support
You can read it all »here.
9th International Workshop on OpenMP — September 16-18, 2013
The »International Workshop on OpenMP (IWOMP) is an annual workshop dedicated to the promotion and advancement of all aspects of parallel programming with OpenMP. It is the premier forum to present and discuss issues, trends, recent research ideas and results related to parallel programming with OpenMP. The workshop affords an opportunity for OpenMP users as well as developers to come together for discussion and to share new ideas and information on this topic.
The deadline for paper submissions has been extended to May 10th.
IWOMP 2013 will be a three-day event. The first day will consist of tutorials focusing on topics of interest to current and prospective OpenMP developers, suitable both for beginners and for those interested in learning about recent developments in the evolving OpenMP standard. The second and third days will consist of technical papers and panel sessions during which research ideas and results will be presented and discussed.
Go to »iwomp.org for more information.
Here are three recent publications and announcements regarding OpenMP:
- A proposal was announced in the C++ community for leveraging the OpenMP infrastructure for language-level parallelisation.
- A Paper on “Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation” has been submitted to arXiv.
- A portable OpenMP runtime library based on MCAPI/MRAPI has been announced.
Release Candidate 2 of the OpenMP 4.0 API specifications currently under development is now available for public discussion.
In addition to a number of corrections and clarifications to the specifications, Release Candidate 2 includes the following major enhancements:
- Initial accelerator support: Device Data Environments (p16); target constructs (p68): target, target data, target update, declare target, teams, and distribute; the map clause (p151); and associated runtime routines (p191).
- Task dependency support through the new depend clause. (p91)
- Initial error model support through the cancel and cancellation point constructs, which request cancellation of specified region types and declare a user-defined cancellation point to check for cancellation requests. (Section 2.13, p116: Cancellation Constructs)
- Support for array sections in C and C++, and additional sectioning support for Fortran. (Section 2.4, p36: Array Sections)
- Extends declare simd to allow multiple declarations. (p64)
- New environment variable OMP_DISPLAY_ENV instructs the runtime to display the OpenMP version number and initial values during initialization. (p219)
- Additional enhancements to support Fortran 2003.
These are in addition to enhancements introduced in RC1: thread affinity, initial support for Fortran 2003, SIMD constructs to vectorize both serial and parallelized loops, TASKGROUP, user-defined reductions, and sequentially consistent atomics.
The OpenMP ARB plans to follow this public discussion period with the finalized full 4.0 API specifications later this year.
The 4.0 Release Candidate API specifications (4.0 RC2, 4.0 RC1) and the Technical Report (TR1) PDFs can be downloaded from the »OpenMP Specifications Page.