Chart parsing is a general framework for parsing context-free grammars that stores intermediate results in a chart (also called a well-formed substring table) to avoid redundant computation. The chart records all constituents that have been discovered for each substring of the input, ensuring that no constituent is derived more than once. Both the CYK and Earley algorithms can be viewed as specific chart parsing strategies.
The Chart Data Structure
Active edge: [A → α • β, i, j] — partially recognized rule
Fundamental rule of chart parsing:
Active [A → α • B β, i, k] + Passive [B, k, j]
⇒ [A → α B • β, i, j]
The chart contains two types of entries: passive edges, which represent completed constituents (a nonterminal A spanning from position i to position j), and active edges, which represent partially recognized rules (a rule whose right-hand side has been partially matched). The fundamental rule of chart parsing combines an active edge expecting nonterminal B with a passive edge for B to produce a new (possibly still active) edge.
Parsing Strategies
Different chart parsing strategies differ in the order in which edges are added to the chart. Bottom-up strategies (like CYK) start from the words and build larger constituents. Top-down strategies start from the start symbol and predict what constituents are needed. Left-corner strategies combine both: they proceed bottom-up from the left corner of each rule but top-down for the rest. The agenda-based architecture maintains a priority queue of edges to process, enabling best-first search guided by probabilities or heuristics.
Efficiency and Applications
The key efficiency property of chart parsing is that each substring is analyzed at most once for each nonterminal, regardless of how many different parse trees use that constituent. This is critical for natural language, where ambiguity causes the number of parse trees to grow exponentially with sentence length. Chart parsing reduces this to polynomial-time recognition while still implicitly representing all parses. Modern statistical and neural parsers often use chart-based decoding to search over the space of possible trees efficiently.