ScholarGate

Data Visualization

Data visualization is the graphical display of data so that its patterns, distributions, and relationships can be perceived directly. Well-chosen displays — histograms, box plots, scatter plots, and others — reveal features such as skew, clustering, and outliers that numerical summaries alone can conceal, making graphics an integral part of describing and exploring data.


Definition

Data visualization is the practice of representing data and statistical summaries graphically — through plots such as histograms, box plots, and scatter plots — to make distributional shape, comparison, and relationship visually apparent.

Scope

This entry covers the role of graphical display in summarising data, the principal chart types used in the health sciences, and the principles of graphical perception that make some displays more readable than others. It is a methodological reference and does not provide clinical guidance.

Core questions

  • Which display best reveals the feature of the data in question — distribution, comparison, or relationship?
  • How do the principles of graphical perception affect which encodings are read accurately?
  • How can a chart mislead, and how is that avoided?

Key concepts

  • Histogram
  • Box plot
  • Scatter plot
  • Bar chart and frequency display
  • Graphical perception and encoding accuracy
  • Exploratory data analysis
  • Misleading graphics

Key theories

Graphical perception
Cleveland and McGill's theory of graphical perception ranks the visual encodings (position, length, angle, area, colour) by how accurately people decode them, providing an empirical basis for preferring position-based displays such as dot and scatter plots over area- or angle-based ones such as pie charts.

Mechanisms

Different displays expose different features. A histogram shows the shape of a single distribution: its centre, spread, skew, and modality. A box plot compactly summarises the median, quartiles, and outliers, which makes it efficient for comparing the distribution of a variable across groups. A scatter plot reveals the relationship between two continuous variables. The effectiveness of any display rests on graphical perception: empirical studies show that readers decode some encodings (position along a common scale) far more accurately than others (angle, area, colour saturation). This is why position-based plots are generally preferred and why pie charts and three-dimensional effects are discouraged. Sound design also avoids distortions, such as truncated or inconsistent axes and excessive ornamentation, that can leave the reader with a false impression.
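As a rough illustration of what a box plot encodes, the sketch below computes the quantities it usually draws, using only the Python standard library. Two choices here are assumptions rather than anything stated in this entry: the quartiles use the "inclusive" interpolation method, and points more than 1.5 times the interquartile range beyond the box are flagged as outliers, which is a common convention. The function name and the length-of-stay values are hypothetical.

```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Return the quantities a box plot typically draws.

    Assumptions (not from the entry): quartiles use the 'inclusive'
    interpolation method, and points further than whisker * IQR from
    the box are flagged as outliers (a common convention).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers extend to the most extreme points still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical lengths of stay in days; 21 is one extreme value.
length_of_stay = [2, 3, 3, 4, 4, 5, 5, 6, 7, 21]
summary = box_plot_summary(length_of_stay)
print(summary)
```

With these made-up values the single long stay falls outside the upper fence and would be drawn as a separate point, while the box and whiskers describe the remaining observations.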

Clinical relevance

Figures carry much of the message in clinical papers and presentations, and the ability to read them critically — and to recognise misleading ones — is part of appraising evidence. This entry describes principles of graphical display for that purpose and is not a basis for individual diagnostic or treatment decisions.
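One distortion a critical reader can check for is a truncated axis. The sketch below is a minimal arithmetic illustration, not a formal measure: it compares the ratio of two drawn bar lengths when the axis starts at zero with the ratio when it starts at a higher baseline. The function name and the response rates are hypothetical.

```python
def drawn_length_ratio(smaller, larger, baseline=0.0):
    """Ratio of the drawn bar lengths when the axis starts at `baseline`."""
    if baseline >= smaller:
        raise ValueError("baseline must lie below both values")
    return (larger - baseline) / (smaller - baseline)


# Hypothetical response rates (%) in two groups.
control, treated = 50.0, 55.0

# Axis starting at zero: bar lengths are proportional to the values.
full_axis_ratio = drawn_length_ratio(control, treated)
# Axis starting at 45: the same values are drawn as bars of length 5 and 10.
truncated_ratio = drawn_length_ratio(control, treated, baseline=45.0)

print(f"axis from 0:  treated bar is {full_axis_ratio:.2f}x the control bar")
print(f"axis from 45: treated bar is {truncated_ratio:.2f}x the control bar")
```

In this made-up case the treated value is one tenth larger than the control value, but on the truncated axis its bar is drawn twice as long.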

Epidemiology

Graphical display is used at every stage of health research, from exploring raw data and checking distributional assumptions to communicating findings to clinical and public audiences. The choice and honesty of displays directly affect how clearly and accurately study results are understood.

History

Statistical graphics date from the late eighteenth and nineteenth centuries, in the work of William Playfair, who introduced the line chart, bar chart, and pie chart, and of Florence Nightingale, who used graphics to argue for sanitary reform. The modern era was shaped by three contributions: John Tukey's exploratory data analysis (1977), which introduced and popularised displays such as the box plot; Cleveland and McGill's empirical study of graphical perception; and Edward Tufte's principles for the honest and efficient display of quantitative information.

Debates

Which displays should be preferred for accurate reading?
Research on graphical perception shows that quantities encoded by position along a scale are judged more accurately than those encoded by angle or area, which underpins long-standing advice to favour dot, bar, and scatter plots and to avoid pie charts and three-dimensional decoration.

Key figures

  • John W. Tukey
  • William S. Cleveland
  • Edward R. Tufte

Seminal works

  • tukey-1977
  • cleveland-1984
  • tufte-2001
  • mcgill-1978

Frequently asked questions

Why use a graph when summary statistics are already reported?
Graphs reveal features — skew, multiple peaks, outliers, and relationships between variables — that single numbers such as the mean and standard deviation can hide, so they complement numerical summaries rather than replacing them.
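A small sketch can make this concrete. The two samples below are hypothetical and were constructed so that their mean and standard deviation are identical; a simple text histogram (a stand-in for a plotted one) shows that one has a single central peak and the other has two separate peaks. The helper name `text_histogram` is an assumption for illustration.

```python
import statistics
from collections import Counter

# Two hypothetical samples built to share the same mean and standard deviation.
peaked = [8, 9, 10, 10, 10, 10, 10, 10, 11, 12]    # one central peak
two_peaks = [9, 9, 9, 9, 9, 11, 11, 11, 11, 11]    # two separate peaks


def text_histogram(values):
    """Draw one row of '#' marks per integer value in the observed range."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} | {'#' * counts[value]}")
    return "\n".join(rows)


for name, sample in [("peaked", peaked), ("two_peaks", two_peaks)]:
    mean = statistics.mean(sample)
    sd = statistics.pstdev(sample)  # population standard deviation
    print(f"{name}: mean={mean}, sd={sd}")
    print(text_histogram(sample))
```

The printed summaries are the same for both samples, while the rows of marks differ, which is the kind of feature a histogram exposes and a pair of summary numbers does not.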
What makes one chart easier to read accurately than another?
People decode position along a common scale more accurately than angle, area, or colour. Displays that rely on position, such as dot and scatter plots, are therefore generally read more reliably than pie charts or three-dimensional graphics.
