How do you show a language is not regular?

The most common tool is the pumping lemma, which says every long enough string in a regular language contains a substring that can be repeated any number of times while staying in the language. Finding a string that cannot be pumped this way proves the language is not regular; the Myhill–Nerode theorem gives an alternative argument by exhibiting infinitely many distinguishable prefixes.

If nondeterministic automata are no more powerful, why use them?

They are often far smaller and easier to design or to derive from a regular expression. The subset construction can convert them to deterministic form when needed, at a possible exponential cost in the number of states, so nondeterminism is a convenient specification tool rather than an increase in recognizing power.

Finite Automata and Regular Languages

Finite automata are the simplest computing machines, reading input one symbol at a time using only a bounded number of internal states, and the languages they recognize are precisely the regular languages.

Find Topic with PaperMindSoonFind papers & topics

Tools & resources

Download slides

Learn & explore

VideoSoon

Definition

A finite automaton is a machine with a finite set of states, a transition function on input symbols, a start state, and accepting states; it accepts a string if reading the string drives it from the start state into an accepting state, and the set of accepted strings is its language.

Scope

This topic covers deterministic and nondeterministic finite automata, their equivalence via the subset construction, regular expressions and Kleene's theorem, the Myhill–Nerode theorem and state minimization, closure properties of regular languages, and the pumping lemma used to prove that certain languages are not regular.

Core questions

What languages can be recognized using only a fixed, finite amount of memory?
Why are deterministic and nondeterministic finite automata equally powerful?
How can one prove that a specific language is not regular?
What is the smallest automaton recognizing a given regular language?

Key theories

Subset construction: Any nondeterministic finite automaton can be simulated by a deterministic one whose states are sets of the original states, proving the two models recognize exactly the same languages.
Kleene's theorem: A language is recognized by a finite automaton if and only if it is described by a regular expression, establishing the equivalence of the machine and algebraic descriptions of regular languages.
Myhill–Nerode theorem: A language is regular exactly when its relation of string indistinguishability has finitely many classes, and these classes determine the unique minimal deterministic automaton for the language.

Clinical relevance

Finite automata are the engine behind regular-expression matching, scanners and lexical analyzers in compilers, search-and-replace in text editors, and pattern detection in network intrusion systems, where bounded-memory recognition makes processing fast and predictable.

History

Building on McCulloch and Pitts's 1943 model of nerve nets, Kleene characterized the events that finite automata can represent around 1951, and Rabin and Scott formalized nondeterministic automata and their decision problems in 1959, work later recognized with the Turing Award.

Key figures

Stephen Kleene
Michael Rabin
Dana Scott
Anil Nerode

Seminal works

rabinScott1959
sipser2013

Frequently asked questions

How do you show a language is not regular?: The most common tool is the pumping lemma, which says every long enough string in a regular language contains a substring that can be repeated any number of times while staying in the language. Finding a string that cannot be pumped this way proves the language is not regular; the Myhill–Nerode theorem gives an alternative argument by exhibiting infinitely many distinguishable prefixes.
If nondeterministic automata are no more powerful, why use them?: They are often far smaller and easier to design or to derive from a regular expression. The subset construction can convert them to deterministic form when needed, at a possible exponential cost in the number of states, so nondeterminism is a convenient specification tool rather than an increase in recognizing power.