GPU Accelerated NAMD
NAMD, recipient of a 2002 Gordon Bell Award and a 2012 Sidney Fernbach Award, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR. NAMD is distributed free of charge with source code. You can build NAMD yourself or download binaries for a wide variety of platforms. Our tutorials show you how to use NAMD and VMD for biomolecular modeling.
GPU Acceleration of Molecular Modeling Applications
Modern graphics processing units (GPUs) contain hundreds of arithmetic units and can be harnessed to provide tremendous acceleration for numerically intensive scientific applications such as molecular modeling. The increased capabilities and flexibility of recent GPU hardware combined with high level GPU programming languages such as CUDA and OpenCL has unlocked this computational power and made it accessible to computational scientists. The key to effective GPU computing is the design and implementation of data-parallel algorithms that scale to hundreds of tightly coupled processing units. Many molecular modeling applications are well suited to GPUs, due to their extensive computational requirements, and because they lend themselves to data-parallel implementations.
Multi-GPU Coulomb Summation
Just as scientific computing can be done on clusters composed of a large number of CPU cores, in some cases problems can be decomposed and run in parallel on multiple GPUs within a single host machine, achieving correspondingly higher levels of performance. One of the drawbacks to the use of multi-core CPUs for scientific computing has been the limited amount of memory bandwidth available to each CPU socket, often severely limiting the performance of bandwidth-intensive scientific codes. Recently this problem has been further exacerbated since the memory bandwidth available to each CPU socket hasn’t kept pace with the increasing number of cores in current CPUs. Since GPUs contain their own on-board high performance memory, the available memory bandwidth available for computational kernels scales as the number of GPUs is increased. This property can allow single-system multi-GPU codes to scale much better than their multi-core CPU based counterparts. Highly data-parallel and memory bandwidth intensive problems are often excellent candidates for such multi-GPU performance scaling.