Industry feedback drives new generation open standard for cross-platform parallel programming with increased flexibility, functionality and performance
November 18th 2013 – SC13 - Denver, CO –-The Khronos™ Group today announced the ratification and public release of the finalized OpenCL™ 2.0 specification. OpenCL 2.0 is a significant evolution of the open, royalty-free standard that simplifies cross-platform, parallel programming. With an enhanced execution model and a subset of the C11 and C++11 memory model, synchronization and atomic operations, OpenCL now enables a significantly richer range of algorithms and programming patterns to be easily accelerated with improved performance. Significant feedback from the developer community was incorporated into the final specification, following its provisional release in July. The OpenCL 2.0 specifications are available at www.khronos.org/opencl/.
“Khronos received significant and thoughtful developer feedback from the provisional release of OpenCL 2.0, much of which has been adopted, or will be merged with emerging hardware capabilities as this state-of–the-art parallel programming platform continues to evolve,” said Neil Trevett, chair of the OpenCL working group, president of the Khronos Group and vice president of mobile content at NVIDIA. “OpenCL continues to gather momentum on desktop, mobile and embedded devices, including providing a unified programming environment for dynamically balancing diverse CPU, GPU, DSP and hardware resources in mobile SOCs for advanced use cases ranging from vision processing for Augmented Reality to physics simulation for mobile gaming.”
OpenCL 2.0 updates and additions include:
Shared Virtual Memory
Host and device kernels can directly share complex, pointer-containing data structures such as trees and linked lists, providing significant programming flexibility and eliminating costly data transfers between host and devices.
Nested Parallelism
Device kernels can enqueue kernels to the same device with no host interaction, enabling flexible work scheduling paradigms and avoiding the need to transfer execution control and data between the device and host, often significantly offloading host processor bottlenecks.
Generic Address Space
Functions can be written without specifying a named address space for arguments, especially useful for those arguments that are declared to be a pointer to a type, eliminating the need for multiple functions to be written for each named address space used in an application.
Images
Improved image support including sRGB images and 3D image writes, the ability for kernels to read from and write to the same image, and the creation of OpenCL images from a mip-mapped or a multi-sampled OpenGL® texture for improved OpenGL interop.
C11 Atomics
A subset of C11 atomics and synchronization operations to enable assignments in one work-item to be visible to other work-items in a work-group, across work-groups executing on a device or for sharing data between the OpenCL device and host.
Pipes
Pipes are memory objects that store data organized as a FIFO and OpenCL 2.0 provides built-in functions for kernels to read from or write to a pipe, providing straightforward programming of pipe data structures that can be highly optimized by OpenCL implementers.
Android Installable Client Driver Extension
Enables OpenCL implementations to be discovered and loaded as a shared object on Android systems.
Industry Support
“Premiere Pro’s support for OpenCL has proved to be a massive hit with our customers; providing dramatic performance improvements while allowing for real-time editing and creativity. We’re excited about the technological developments in OpenCL 2.0 and look forward to discovering how it will enable us to push the performance envelope even further,” said Al Mooney, senior product manager, editing workflows at Adobe.
"AMD has played a significant role in the evolution of OpenCL and in the development of OpenCL 2.0. OpenCL 2.0 has made significant improvements in programmability and is also very well aligned with the hardware features that related industry standards bodies such as the HSA Foundation are developing" said Manju Hegde, CVP HAS Group AMD. "AMD strongly believes in and promotes OpenCL as one of the standards for programming compute on its GPUs and APUs and looks forward to increased adoption because of OpenCL 2.0"
“The Khronos Group’s OpenCL 2.0 is the first key, foundational, programming language to truly support the core capabilities of HSA enabled hardware. It is going to be exciting to see where developers take this much richer programming platform,” said Gregory Stoner, managing director and vice president of HSA Foundation.
“It is impressive that OpenCL is supporting an increasingly diverse range of heterogeneous computing units and accelerators,” said Zhenya Li, vice president of 2012 Lab, Huawei Technologies. “We expect the OpenCL standard to be widely adopted by the information and communications technology (ICT) sector, and to be a key software standard used in Network Function Virtualization (NFV) accelerators. Huawei will actively participate in and contribute to OpenCL, and help it to provide an easy-to-use development platform for future ICT virtualized applications.”
“As a long-time member of Khronos and a leading contributor to OpenCL standards efforts, Imagination is delighted that Khronos continues to create standards which make GPU compute programming easier for developers. With our broad range of IP including PowerVR processors and MIPS CPUs, our customers are creating innovative designs for mobile, consumer, automotive and more. GPU compute is key to creating new applications within the power envelope of these next generation devices,” said Peter McGuinness, director of multimedia technology marketing, Imagination Technologies.
“We are very excited about the user benefits of OpenCL 2.0’s new features”, said Simon McIntosh-Smith, Head of the Microelectronics Research Group at the University of Bristol. “These latest evolutions in OpenCL will enable us to efficiently solve a much wider range of parallel processing problems than ever before, and across a growing range of embedded and HPC hardware platforms. The new shared virtual memory (SVM) feature will make it easier for programmers to develop heterogeneous parallel programs, while support for dynamic parallelism will enable more efficient solutions for a much wider range of applications.”
Contact:
Jonathan Hirshon,
Principal,
Horizon PR
Email Contact
Phone: +1 (415) 952-3001