Cloud and Cluster Computing

Cloud and cluster computing organize large numbers of commodity machines into on-demand, scalable platforms that deliver computation and storage as a utility.

Definition

Cluster computing networks many independent computers to work as a single system; cloud computing delivers such pooled, virtualized computing and storage resources to users on demand over the network, with elastic scaling and usage-based pricing.

Scope

This area covers the evolution from clusters and grids to warehouse-scale data centers and the cloud; the virtualization and containerization that enable elastic, multi-tenant resource sharing; large-scale data-processing frameworks (MapReduce and its successors); and scalable distributed storage and file systems. It is where the theory of distributed and parallel computing is realized at internet scale.

Core questions

  • How are thousands of commodity machines organized to behave like one elastic computer?
  • How does virtualization enable elastic, multi-tenant resource sharing?
  • How can data sets too large for one machine be processed and stored reliably across a cluster?

Key theories

Utility and elastic computing
Cloud computing turns computation into a metered utility, providing the illusion of infinite, elastic resources available on demand and shifting capital cost to operating cost, a shift analyzed by Armbrust and colleagues.
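The trade-off between capital and operating cost can be illustrated with a small calculation. This is a minimal sketch, not a pricing model: the demand trace, both hourly rates, and the function names are hypothetical values chosen for illustration. It compares owning enough servers for peak demand with renting only the server-hours actually used.

```python
def provisioned_cost(demand: list[int], owned_rate: float) -> float:
    """Cost of owning enough servers for peak demand, paid in every hour."""
    return max(demand) * len(demand) * owned_rate


def elastic_cost(demand: list[int], cloud_rate: float) -> float:
    """Cost of renting only the server-hours that are actually used."""
    return sum(demand) * cloud_rate


# Hypothetical demand trace: servers needed in each of six hours.
hourly_demand = [10, 10, 40, 100, 40, 10]
OWNED_RATE = 0.5  # hypothetical cost per owned server-hour
CLOUD_RATE = 1.0  # hypothetical (higher) price per rented server-hour

fixed = provisioned_cost(hourly_demand, OWNED_RATE)
elastic = elastic_cost(hourly_demand, CLOUD_RATE)

# Average utilization of the peak-provisioned fleet.
utilization = sum(hourly_demand) / (max(hourly_demand) * len(hourly_demand))

print(f"provisioned: {fixed}, elastic: {elastic}, utilization: {utilization:.2f}")
```

Under these assumptions, renting is cheaper even at twice the hourly rate, because the peak-provisioned fleet is busy only about a third of the time. In this simple model, elastic provisioning wins whenever average utilization falls below the ratio of the owned rate to the cloud rate.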
Warehouse-scale computing
Treating an entire data center as a single computer, and designing for the cost, energy, and failure characteristics of tens of thousands of servers, makes the data center rather than the individual server the unit of design and deployment.
Data-parallel cluster processing
Frameworks like MapReduce let programmers process massive data sets across a cluster by expressing computation as map and reduce functions, with the runtime handling parallelization, data distribution, and fault tolerance.
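The map and reduce programming model can be sketched on a single machine. The example below is a minimal, in-memory illustration of the idea, not the API of any real framework: `run_mapreduce`, `map_words`, and `reduce_counts` are hypothetical names, and the parallelization, data distribution, and fault tolerance that a real runtime provides are omitted.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> int:
    """Reduce: combine all values that were emitted for one key."""
    return sum(counts)


def run_mapreduce(
    records: Iterable[tuple[str, str]],
    map_fn: Callable[[str, str], Iterator[tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> dict[str, int]:
    """Run map, then group by key (the shuffle), then reduce, sequentially."""
    groups: defaultdict[str, list[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical input: two tiny documents keyed by an identifier.
docs = [("d1", "the cloud scales"), ("d2", "the cluster scales")]
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts)
```

Because each call to the map function depends on a single record and each call to the reduce function depends on a single key, a cluster runtime could in principle run these calls on different machines and rerun any that fail, which is the property the model is built around.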

Practical relevance

Cloud and cluster platforms host most large-scale internet services, along with much scientific and enterprise computing and many machine-learning pipelines; their design largely determines the cost, scalability, and reliability of modern computing infrastructure.

History

Cluster computing grew out of networks of workstations in the 1990s and extended into grid computing, which shared scientific infrastructure across institutions (Foster and colleagues, 2001). Google's file system and MapReduce work (2003-2008) then demonstrated warehouse-scale data processing, and the rise of public cloud platforms in the late 2000s, analyzed by Armbrust and colleagues, made elastic utility computing mainstream.

Debates

Grid versus cloud as the model for shared computing
Grid computing emphasized federation across administrative domains for scientific collaboration, while cloud computing centralized resources under a provider with elastic, on-demand pricing; the cloud model largely prevailed commercially, though grid ideas persist in scientific computing.

Key figures

  • Jeffrey Dean
  • Sanjay Ghemawat
  • Luiz André Barroso
  • Ian Foster
  • Michael Armbrust

Seminal works

  • armbrust2010
  • dean2008
  • barroso2018

Frequently asked questions

What is the difference between cluster computing and cloud computing?
A cluster is a set of networked machines acting as one system, typically owned and operated by its users. Cloud computing delivers pooled, virtualized resources (often built on clusters) to many tenants on demand over the network, with elastic scaling and pay-as-you-go pricing.
