NVIDIA has always been looking for new ways to make parallel programming easier and more accessible.
In addition to enhancing the CUDA Toolkit with new performance and usability features, NVIDIA also looks for opportunities to give developers in-depth information and training.
In the run-up to today’s release of the production version of CUDA 6 (available now as a free download on the CUDA website at developer.nvidia.com/cuda-downloads), NVIDIA offered a wide-ranging collection of sessions and tutorials at our 2014 GPU Technology Conference (GTC). The goal: to help developers get the most out of programming on GPUs.
Here’s a sampling of some of the best CUDA sessions from last month’s GTC. Click through to replays of each session to review all of the material:
- “CUDA 6 and Beyond” – NVIDIA’s chief technologist for GPU Computing, Mark Harris, provided a detailed look at the top new features of CUDA 6, including a deep-dive review of Unified Memory, which makes GPU programming easier by automatically migrating data between the CPU and GPU. He also examined the new drop-in BLAS and FFTW libraries, which can accelerate applications by up to 8x by letting developers replace CPU-only versions of the BLAS or FFTW libraries with the new GPU-accelerated equivalents. Watch the replay here.
- “An Introduction to CUDA Programming” – Chris Mason, from Acceleware, a firm focused on parallel computing training and consulting for the oil and gas and CAD industries, covered the basics of using CUDA. He provided a brief overview of the platform and data-parallelism, and then covered the fundamentals of GPU kernels, host and device responsibilities, CUDA C/C++ syntax and more. Watch the replay here.
- “Fast, Parallel Algorithms for Computer Vision and Machine Learning with GPUs” – Accelerating next-gen computer vision and machine learning applications is the new frontier for CUDA and GPU accelerators. To help developers get a jump start, ArrayFire’s Umar Arshad armed attendees with best practices and examples for implementing parallel versions of popular computer vision and machine learning algorithms on GPUs. Watch the replay here.
- “Mobile GPU Compute with Tegra K1” – NVIDIA’s Amit Rao and Mark Ebersole provided an in-depth session on building sophisticated mobile applications that harness the power of NVIDIA’s high-performance Tegra K1 mobile processor. This comprehensive session included a review of the various GPU-accelerated APIs for programming the Tegra SoC with CUDA, and a number of coding examples. Watch the replay here.
This is just a taste of the range of informative CUDA-related sessions at GTC. If you missed them the first time around, click on the links above for full session replays. Or log in with your GTC 2014 credentials at www.gputechconf.com and view these and many other sessions here.
If you’d like a deeper technical dive into the new features of CUDA 6, check out “5 Powerful New Features in CUDA 6” on the Parallel Forall blog.
Don’t forget to download version 6 of the CUDA Toolkit to get access to Unified Memory, drop-in libraries, and the other cool new features.
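For a sense of what Unified Memory looks like in code, here is a minimal sketch rather than an excerpt from any of the sessions above. It assumes a GPU and driver that support managed memory and a build with `nvcc` from CUDA 6 or later; the kernel name `increment` and the array size are purely illustrative. A single `cudaMallocManaged` allocation is written by the CPU, updated by the GPU, and read back by the CPU, with no explicit `cudaMemcpy` calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel (hypothetical name): add 1.0f to every element.
__global__ void increment(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1.0f;
    }
}

int main()
{
    const int n = 1 << 20;
    float *data = NULL;

    // One allocation that both the CPU and the GPU can access.
    cudaError_t err = cudaMallocManaged(&data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The CPU initializes the data through the same pointer.
    for (int i = 0; i < n; ++i) {
        data[i] = static_cast<float>(i);
    }

    // The GPU updates the data in place; no explicit host-to-device copy.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    increment<<<blocks, threads>>>(data, n);

    // Wait for the kernel to finish before the CPU touches the data again.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(data);
        return 1;
    }

    printf("data[0] = %f, data[n-1] = %f\n", data[0], data[n - 1]);

    cudaFree(data);
    return 0;
}
```

The one step that is easy to miss is the call to `cudaDeviceSynchronize()`: the kernel launch is asynchronous, so the CPU should not read the managed array until the GPU has finished with it.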