Context-free languages (CFLs) occupy the second level of the Chomsky hierarchy and represent the class of languages generated by context-free grammars (CFGs) or equivalently recognized by nondeterministic pushdown automata. CFGs are the most widely used formalism for describing the syntactic structure of both programming languages and natural languages, since they naturally capture the recursive, hierarchical phrase structure that is the hallmark of human language.
Context-Free Grammars
A context-free grammar G = (V, Σ, R, S) generates strings by repeatedly replacing non-terminal symbols with sequences of terminals and non-terminals, starting from the start symbol S. The key restriction that makes these grammars "context-free" is that each production rule has exactly one non-terminal on its left-hand side — the replacement does not depend on the surrounding context. This restriction permits efficient parsing algorithms while still capturing the recursive embedding characteristic of natural language.
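The rewriting process can be sketched concretely. Below, a toy grammar (invented here purely for illustration, not taken from the text) is stored as a dictionary of productions, and a leftmost derivation repeatedly replaces the leftmost non-terminal until only terminals remain:

```python
# A toy CFG, G = (V, Σ, R, S), with R stored as a dict of productions.
# Non-terminals are the dict keys; everything else is a terminal.
# This grammar is a hypothetical illustration for the derivation mechanics.
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["saw"]],
}

def leftmost_derivation(start, choose):
    """Rewrite the leftmost non-terminal at each step until none remain.
    `choose(nt)` picks which alternative of non-terminal `nt` to use."""
    steps = [list(start)]
    while True:
        form = steps[-1]
        idx = next((i for i, sym in enumerate(form) if sym in RULES), None)
        if idx is None:          # sentential form is all terminals: done
            return steps
        replacement = RULES[form[idx]][choose(form[idx])]
        steps.append(form[:idx] + replacement + form[idx + 1:])

# Derive "the dog saw the cat": take the first alternative everywhere,
# except the second occurrence of N, which picks "cat".
n_seen = [0]
def choose(nt):
    if nt == "N":
        n_seen[0] += 1
        return 0 if n_seen[0] == 1 else 1
    return 0

steps = leftmost_derivation(["S"], choose)
print(" -> ".join(" ".join(s) for s in steps))
```

Each line of the printed derivation rewrites exactly one non-terminal, independently of its context, which is precisely the "context-free" restriction described above.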
For example, the grammar with rules S → aSb and S → ε generates {aⁿbⁿ | n ≥ 0}, demonstrating that CFGs can model the nested dependencies (e.g., subject-verb agreement across embedded clauses) that regular grammars cannot express.
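This claim can be checked empirically by enumerating the terminal strings the grammar derives within a bounded number of expansions and confirming each has the form aⁿbⁿ (a minimal sketch; the enumeration helper is ours, written for this example):

```python
# Enumerate terminal strings derivable from S -> aSb | ε within `depth`
# expansions of S, and confirm every one has the form a^n b^n.
def derive(depth):
    """Set of terminal strings S yields using at most `depth` expansions."""
    if depth == 0:
        return set()
    shorter = derive(depth - 1)
    return {""} | {"a" + s + "b" for s in shorter}

for s in sorted(derive(5), key=len):
    n = len(s) // 2
    assert s == "a" * n + "b" * n   # every derived string is a^n b^n
    print(repr(s))
```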
Parse Trees and Ambiguity
Every derivation in a CFG corresponds to a parse tree that reveals the hierarchical constituent structure of the string. A grammar is ambiguous if some string has two or more distinct parse trees. Ambiguity is pervasive in natural language — the sentence "I saw the man with the telescope" has two parse trees depending on whether the prepositional phrase attaches to the verb or the noun. Resolving such ambiguity, whether through grammar rewriting, probabilistic models, or semantic constraints, is a central challenge in natural language parsing.
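The two attachments can be made concrete by counting parse trees with a CYK-style dynamic program over a small grammar in Chomsky normal form (the grammar below is our own toy fragment, written for this sentence, not a standard resource):

```python
from collections import defaultdict

# Toy CNF grammar for the PP-attachment example (hypothetical fragment).
LEX = {  # A -> terminal
    "NP": {"I"}, "V": {"saw"}, "Det": {"the"},
    "N": {"man", "telescope"}, "P": {"with"},
}
BIN = [  # A -> B C
    ("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
    ("NP", "Det", "N"), ("NP", "NP", "PP"), ("PP", "P", "NP"),
]

def count_parses(words):
    """chart[(i, j)][A] = number of parse trees for words[i:j] rooted at A."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(words):
        for a, terms in LEX.items():
            if w in terms:
                chart[(i, i + 1)][a] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):           # try every split point
                for a, b, c in BIN:
                    chart[(i, j)][a] += chart[(i, k)][b] * chart[(k, j)][c]
    return chart[(0, n)]["S"]

print(count_parses("I saw the man with the telescope".split()))  # prints 2
```

One parse uses VP → VP PP (the seeing is done with the telescope); the other uses NP → NP PP (the man has the telescope).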
The Pumping Lemma for CFLs
Just as regular languages have a pumping lemma, context-free languages have their own: for any CFL there is a constant p such that every string w of length at least p can be decomposed as w = uvxyz with |vxy| ≤ p, |vy| > 0, and uv^ixy^iz in the language for all i ≥ 0. This is used to prove that languages like {aⁿbⁿcⁿ | n ≥ 0} are not context-free, a result with direct linguistic implications, since cross-serial dependencies in Swiss German exhibit precisely this pattern.
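The shape of such a proof can be checked mechanically for a small instance: take s = aᵖbᵖcᵖ and verify that every decomposition satisfying the lemma's conditions fails to pump. The brute-force sketch below uses p = 3; it demonstrates the argument for one witness string rather than proving the general theorem:

```python
def in_lang(w):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

p = 3
s = "a" * p + "b" * p + "c" * p

# Enumerate every decomposition s = u v x y z with |vxy| <= p and |vy| >= 1,
# and check that pumping with some i in {0, 2} leaves the language.
for start in range(len(s)):                       # where the vxy window begins
    for width in range(1, p + 1):                 # |vxy| <= p
        end = start + width
        if end > len(s):
            break
        for v_end in range(start, end + 1):       # split window into v, x, y
            for y_start in range(v_end, end + 1):
                u, v = s[:start], s[start:v_end]
                x, y, z = s[v_end:y_start], s[y_start:end], s[end:]
                if not v and not y:               # lemma requires |vy| >= 1
                    continue
                pumped = [u + v * i + x + y * i + z for i in (0, 2)]
                assert any(not in_lang(w) for w in pumped), (u, v, x, y, z)

print("every legal decomposition of", repr(s), "fails to pump")
```

Intuitively, the window vxy of length at most p can touch at most two of the three letter blocks, so pumping unbalances the counts; if v or y mixes letters, pumping breaks the a*b*c* order instead.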
Parsing Algorithms
The Cocke-Younger-Kasami (CYK) algorithm and the Earley parser are the two classic algorithms for context-free recognition and parsing. CYK operates in O(n³|G|) time on grammars in Chomsky normal form, using dynamic programming to fill a triangular parse table. Earley's algorithm handles arbitrary CFGs in O(n³) worst case but runs in O(n²) for unambiguous grammars and O(n) for many practical grammars. These algorithms form the basis for the probabilistic parsing methods — particularly probabilistic context-free grammars (PCFGs) — that dominated statistical NLP for decades.
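The CYK table-filling step can be sketched compactly. The recognizer below uses a CNF conversion of S → aSb | ε restricted to n ≥ 1 (our own small test grammar, chosen so the result is easy to check by hand):

```python
from itertools import product

# CNF grammar for {a^n b^n | n >= 1}:
#   S -> A B | A X,  X -> S B,  A -> "a",  B -> "b"
TERM = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}

def cyk(word):
    """Return True iff `word` is derivable from S (standard O(n^3) CYK)."""
    n = len(word)
    if n == 0:
        return False          # this CNF grammar has no ε-production
    # table[i][j] = set of non-terminals deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(TERM.get(ch, set()))
    for span in range(2, n + 1):              # fill longer spans bottom-up
        for i in range(n - span + 1):
            for k in range(1, span):          # split into [i, i+k) and [i+k, i+span)
                left, right = table[i][k - 1], table[i + k][span - k - 1]
                for b, c in product(left, right):
                    table[i][span - 1] |= BINARY.get((b, c), set())
    return "S" in table[0][n - 1]

for w in ["ab", "aabb", "aab", "ba"]:
    print(f"{w!r}: {cyk(w)}")
```

The triangular table mentioned above is `table` itself: cell (i, j) holds the non-terminals spanning the substring of length j + 1 starting at position i, and the triple loop over span, start, and split point is the source of the cubic running time.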
While CFGs are the workhorse of syntactic analysis, their limitations for natural language are well documented. The inability to model cross-serial dependencies, the difficulty of capturing long-distance movement and agreement, and the lack of a lexicalized perspective motivated the development of more powerful formalisms such as tree-adjoining grammars and head-driven phrase structure grammars, which extend context-free power in controlled ways.