ScholarGate
Asisten

GPU and Accelerator Computing

GPU and accelerator computing harnesses massively parallel many-core processors to accelerate data-parallel workloads far beyond what general-purpose CPUs achieve.

Temukan Topik dengan PaperMindSegeraFind papers & topics
Tools & resources
Unduh salindia
Learn & explore
VideoSegera

Definition

GPU and accelerator computing is the use of specialized many-core processors, optimized for high-throughput data-parallel execution, to offload and speed up the parallelizable portions of a computation under a host-device programming model.

Scope

This topic covers the architecture of graphics processing units and other accelerators as throughput-oriented, many-core SIMD/SIMT machines; the programming models that target them (CUDA, OpenCL, and directive-based offloading); the thread-hierarchy and memory-hierarchy abstractions (threads, warps, blocks, grids; global, shared, and register memory); and the performance considerations—occupancy, memory coalescing, and divergence—that govern achievable throughput.

Core questions

  • How does the throughput-oriented, many-core accelerator model differ from a general-purpose CPU?
  • How are computations expressed as massively parallel kernels over a thread hierarchy?
  • What memory and execution behaviors—coalescing, divergence, occupancy—limit achievable performance?

Key theories

SIMT execution model
GPUs run thousands of lightweight threads grouped into warps that execute in lockstep (single-instruction, multiple-thread); performance depends on keeping warps busy and avoiding control-flow divergence within a warp.
Hierarchical thread and memory model
CUDA organizes threads into blocks and grids and exposes a memory hierarchy of registers, fast shared memory, and large global memory; mapping data and computation onto this hierarchy is the central performance task.
General-purpose GPU computing
The evolution from fixed-function graphics pipelines to programmable, general-purpose accelerators turned GPUs into a mainstream platform for scientific and data-intensive computing.

Clinical relevance

Accelerators are the workhorses of modern computing-intensive applications: deep-learning training and inference, scientific simulation, image and signal processing, and cryptography all rely on GPUs for order-of-magnitude speedups over CPUs.

History

GPUs evolved from fixed-function graphics hardware into programmable parallel processors; the 2007 release of CUDA, described by Nickolls and colleagues in 2008, made general-purpose GPU computing accessible, and accelerators subsequently became central to high-performance and machine-learning computing.

Key figures

  • John Nickolls
  • Wen-mei Hwu
  • David Kirk
  • John Owens

Related topics

Seminal works

  • nickolls2008
  • kirk2016
  • owens2008

Frequently asked questions

Why are GPUs so much faster than CPUs for some workloads?
GPUs devote far more of their transistors to arithmetic units and run thousands of threads to hide memory latency, trading single-thread speed for aggregate throughput. This makes them excellent for regular, highly data-parallel work, though poorly suited to branchy, latency-sensitive code.

Methods for this concept

Related concepts