NoSQL Data Stores

NoSQL data stores are non-relational databases (key-value, document, wide-column, and graph) that use flexible data models and distributed architectures to scale horizontally and stay available, at the cost of some relational guarantees.

Definition

A NoSQL data store is a database that departs from the relational model, organizing data as key-value pairs, documents, wide sparse columns, or graphs, and typically distributing it across a cluster with replication and relaxed consistency to achieve scalability and availability.

Scope

This topic covers the main categories of NoSQL systems and their data models: key-value stores for simple lookups, document stores for nested records, wide-column stores for large, sparse tables, and graph databases for highly connected data. It treats the design choices common to these systems (sharding, replication, and tunable consistency) and the access patterns each model suits. It excludes general consistency theory (the CAP theorem and formal consistency models) and data-processing frameworks, which are adjacent topics.
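As a rough illustration of how sharding and replication fit together, the sketch below places keys on a consistent-hash ring, one common approach in Dynamo-style systems. The class name `HashRing`, the node names, the virtual-node count, and the use of MD5 as the ring hash are all assumptions made for the example, not details of any particular product.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: a key is owned by the next nodes clockwise."""

    def __init__(self, nodes, vnodes=8):
        # Each physical node gets several virtual points to even out the load.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._points]
        self._node_count = len(set(nodes))

    def replicas_for(self, key: str, n: int = 3):
        """Return the n distinct nodes that should hold copies of `key`."""
        n = min(n, self._node_count)
        i = bisect.bisect_right(self._positions, _hash(key))
        owners = []
        while len(owners) < n:
            node = self._points[i % len(self._points)][1]
            if node not in owners:  # skip further virtual points of the same node
                owners.append(node)
            i += 1
        return owners


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
placement = ring.replicas_for("user:42", n=3)
print(placement)
```

The key's hash decides the shard (the first owner), and walking further around the ring yields the replicas, so adding or removing a node only moves the keys adjacent to it.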

Core questions

  • What data model does each NoSQL category (key-value, document, wide-column, graph) provide?
  • What access patterns and workloads suit each category?
  • How do NoSQL stores shard and replicate data for scale and availability?
  • What relational features (joins, transactions, schemas) do they relax, and why?
  • How do tunable consistency settings let applications balance latency and freshness?

Key concepts

  • key-value store
  • document store
  • wide-column store
  • graph database
  • sharding and replication
  • tunable consistency
  • schema flexibility
  • denormalized access patterns

Key theories

Key-value and wide-column models
Key-value stores map opaque keys to values for simple, fast lookups, while wide-column stores organize data into rows with flexible, sparse column families. Both models, exemplified by Dynamo and Bigtable respectively, scale to very large clusters through sharding and replication.
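The difference between the two models can be sketched with plain in-memory structures. This is only a minimal illustration: the names `kv_store`, `table`, `put`, and `scan`, and the sample sensor data, are hypothetical and do not correspond to any real store's API.

```python
from collections import defaultdict

# Key-value model: the store sees only an opaque key and an opaque value.
kv_store = {}
kv_store["session:9f2c"] = b'{"user": 42, "cart": [7, 19]}'

# Wide-column model: row key -> column family -> column qualifier -> value.
# Rows are sparse: each row stores only the columns it actually has.
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, qualifier, value):
    """Set one cell; missing rows and families are created on demand."""
    table[row_key][family][qualifier] = value


def scan(prefix):
    """Yield rows whose key starts with `prefix`, in sorted row-key order."""
    for row_key in sorted(table):
        if row_key.startswith(prefix):
            yield row_key, {fam: dict(cols) for fam, cols in table[row_key].items()}


put("sensor-1#2024-01-01", "reading", "temp_c", 21.5)
put("sensor-1#2024-01-02", "reading", "temp_c", 22.1)
put("sensor-1#2024-01-02", "reading", "humidity", 0.41)  # extra column, this row only
put("sensor-2#2024-01-01", "meta", "site", "roof")

sensor_1_rows = list(scan("sensor-1#"))
for row_key, families in sensor_1_rows:
    print(row_key, families)
```

Designing the row key so that related rows sort together (here, sensor id followed by date) is what makes the prefix scan cheap in this model.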
Document and graph models
Document stores hold self-describing nested records (often JSON) and support queries over their structure, while graph databases model entities and relationships as nodes and edges optimized for traversal of highly connected data.
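A minimal sketch of the two access styles, using only Python built-ins: a query over a nested field for the document model, and a bounded breadth-first traversal for the graph model. The helper names (`get_path`, `find`, `within_hops`) and the sample data are invented for illustration; real stores expose their own query languages and indexes.

```python
from collections import deque


def get_path(doc, dotted_path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, dotted_path, expected):
    """Return documents whose field at `dotted_path` equals `expected`."""
    return [doc for doc in collection if get_path(doc, dotted_path) == expected]


users = [
    {"_id": 1, "name": "Ada", "address": {"city": "London"}},
    {"_id": 2, "name": "Lin", "address": {"city": "Taipei"}, "tags": ["admin"]},
    {"_id": 3, "name": "Sam"},  # no address at all: documents need not share a schema
]


def within_hops(edges, start, max_hops):
    """Breadth-first traversal: nodes reachable from `start` in <= max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in edges.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance


follows = {"ada": ["lin"], "lin": ["sam", "ada"], "sam": ["kim"], "kim": []}

londoners = find(users, "address.city", "London")
nearby = within_hops(follows, "ada", 2)
print(londoners, nearby)
```

The document query inspects each record's internal structure, whereas the traversal never looks inside a node and only follows edges, which is the workload graph databases are built to make fast.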
Relaxed guarantees for scale
To scale horizontally and remain available, many NoSQL stores relax schemas, drop multi-row transactions and joins, and offer tunable or eventual consistency, shifting some responsibility for integrity to the application.
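Tunable consistency is often expressed through quorum sizes: with N replicas, a write acknowledged by W of them and a read that contacts R of them are guaranteed to overlap whenever R + W > N. The simulation below is a simplified sketch of that rule under strong assumptions (no failures, no concurrent writers, a single global version counter); the class and function names are hypothetical.

```python
import random


class ReplicatedRegister:
    """One key stored on N replicas, each holding a (version, value) pair."""

    def __init__(self, n):
        self.replicas = [(0, None)] * n
        self.version = 0

    def write(self, value, w, rng):
        """Store a new version on w randomly chosen replicas."""
        self.version += 1
        for i in rng.sample(range(len(self.replicas)), w):
            self.replicas[i] = (self.version, value)

    def read(self, r, rng):
        """Ask r randomly chosen replicas and return the newest value seen."""
        contacted = rng.sample(range(len(self.replicas)), r)
        newest = max((self.replicas[i] for i in contacted), key=lambda pair: pair[0])
        return newest[1]


def stale_reads(n, w, r, trials=2000, seed=0):
    """Count reads that miss the latest write for a given (N, W, R) setting."""
    rng = random.Random(seed)
    stale = 0
    for _ in range(trials):
        reg = ReplicatedRegister(n)
        reg.write("old", n, rng)  # every replica starts with the old value
        reg.write("new", w, rng)  # the new value reaches only w replicas
        if reg.read(r, rng) != "new":
            stale += 1
    return stale


strict = stale_reads(n=3, w=2, r=2)   # R + W > N: read and write sets must overlap
relaxed = stale_reads(n=3, w=1, r=1)  # R + W <= N: a read can miss the write
print(strict, relaxed)
```

Lowering R or W reduces the number of replicas an operation must wait for, which cuts latency but allows stale reads; this is the latency versus freshness trade-off that the application is asked to choose.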

Practical relevance

NoSQL stores are widely used building blocks of internet services. Key-value and wide-column stores back session state, catalogs, and time-series data at massive scale; document stores fit flexible application data; and graph databases power recommendation and fraud-detection systems. Knowledge of their models is therefore essential for data engineering.

History

The NoSQL movement grew from internet companies' need to scale beyond single-node relational databases. Google's Bigtable (described in a 2006 paper, with a journal version in 2008) introduced the wide-column model, and Amazon's Dynamo (2007) introduced the highly available, eventually consistent key-value model. These influential designs spawned a generation of open-source key-value, document, wide-column, and graph databases in the late 2000s and 2010s.

Key figures

  • Werner Vogels
  • Jeffrey Dean
  • Sanjay Ghemawat

Seminal works

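  • decandia2007: DeCandia et al., "Dynamo: Amazon's Highly Available Key-value Store," SOSP 2007
  • chang2008: Chang et al., "Bigtable: A Distributed Storage System for Structured Data," ACM Transactions on Computer Systems, 2008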

Frequently asked questions

How do I choose among key-value, document, wide-column, and graph stores?
Match the model to the access pattern: key-value for simple lookups by a known key; document for self-contained, nested records queried by their fields; wide-column for very large, sparse tables with predictable row-key access; and graph for data dominated by relationships and traversals, such as social networks or recommendations.
Do NoSQL stores support transactions?
Historically many NoSQL stores offered only single-key atomic operations and no multi-record transactions, trading them for scalability. That has changed: a number of modern NoSQL and 'NewSQL' systems now provide multi-document or even distributed transactions, so transactional support varies widely and should be checked per system.
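To make "single-key atomic operations" concrete, the sketch below shows a versioned compare-and-set on one key and an application-level retry loop built on top of it. `VersionedStore` is a hypothetical in-memory stand-in, with a lock playing the role of the store's per-key atomicity; it is not the API of any real database.

```python
import threading


class VersionedStore:
    """In-memory stand-in for a store offering atomic single-key updates."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key):
        """Return (version, value); a missing key reads as (0, None)."""
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, value):
        """Write only if the key is still at `expected_version`."""
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False
            self._data[key] = (version + 1, value)
            return True


def increment(store, key):
    """Optimistic read-modify-write: retry when another writer got in first."""
    while True:
        version, value = store.get(key)
        if store.compare_and_set(key, version, (value or 0) + 1):
            return


def worker(store, key, times):
    for _ in range(times):
        increment(store, key)


store = VersionedStore()
threads = [
    threading.Thread(target=worker, args=(store, "page:views", 100)) for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_version, final_count = store.get("page:views")
print(final_version, final_count)
```

This pattern keeps one key correct under concurrent writers, but it gives no way to update two keys together atomically, which is the gap that multi-document and distributed transactions fill.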
