What does it mean for a static analysis to be sound?

A sound analysis never misses a real occurrence of the property it checks: if it reports no error of a given kind, that kind of error genuinely cannot happen, though it may also raise false alarms.

Why is pointer analysis difficult?

Pointers and references introduce aliasing, where multiple names refer to the same memory, so the analysis must approximate which locations each pointer may target, a problem that trades precision against scalability.

Static Program Analysis

Static program analysis examines source or intermediate code, without executing it, to infer properties such as which values a variable may hold or whether an error can occur.

PaperMindでテーマを探す近日公開Find papers & topics

Tools & resources

スライドをダウンロード

Learn & explore

動画近日公開

Definition

Static program analysis is the computation of approximate but sound information about a program's possible behaviors directly from its code, without running it, typically by solving dataflow equations over a lattice of abstract states.

Scope

This topic covers dataflow analysis over control-flow graphs, lattice-based fixpoint computation, interprocedural analysis, pointer and alias analysis, and call-graph and control-flow analysis. It addresses the precision dimensions (flow-, context-, and path-sensitivity), the soundness of analyses, and their use in optimization and bug detection.

Core questions

How are program properties computed by propagating facts over a control-flow graph?
How do flow-, context-, and path-sensitivity affect precision and cost?
How is sound analysis carried across procedure calls?
How does pointer analysis cope with aliasing and indirection?

Key theories

Dataflow analysis as fixpoint over a lattice: Kildall unified global program optimizations as the computation of a fixpoint of dataflow equations over a lattice, providing the general framework for static analyses.
Interprocedural analysis via graph reachability: Reps, Horwitz, and Sagiv reduced a large class of precise interprocedural dataflow problems to graph reachability, enabling efficient context-sensitive analysis across procedure boundaries.
Principles and dimensions of analysis: Nielson, Nielson, and Hankin systematize dataflow, constraint-based, and type-based analyses and the precision-versus-cost trade-offs that govern their design.

Clinical relevance

Static analysis drives compiler optimizations, bug- and vulnerability-finding tools, linters, and IDE features. Sound analyses can certify the absence of certain errors, while scalable heuristic analyses are widely deployed to catch defects early in development.

History

Dataflow analysis grew out of optimizing compilers, with Kildall's 1973 lattice framework and the Allen-Cocke analyses providing the theoretical base. Interprocedural and pointer analyses advanced through the 1980s and 1990s, including the Reps-Horwitz-Sagiv reachability formulation, and static analysis later scaled to industrial bug-finding on very large codebases.

Debates

Precision versus scalability: Analysis designers continually trade precision dimensions such as context- and path-sensitivity, which reduce false positives, against the computational cost that limits how large a program can be analyzed.

Key figures

Gary Kildall
Thomas Reps
Susan Horwitz
Flemming Nielson
Hanne Riis Nielson

Seminal works

kildall1973
reps1995
nielson1999

Frequently asked questions

What does it mean for a static analysis to be sound?: A sound analysis never misses a real occurrence of the property it checks: if it reports no error of a given kind, that kind of error genuinely cannot happen, though it may also raise false alarms.
Why is pointer analysis difficult?: Pointers and references introduce aliasing, where multiple names refer to the same memory, so the analysis must approximate which locations each pointer may target, a problem that trades precision against scalability.