Peter F. Brown was a researcher at IBM's Thomas J. Watson Research Center whose work in the late 1980s and early 1990s helped launch the field of statistical machine translation. Together with colleagues Stephen Della Pietra, Vincent Della Pietra, Robert Mercer, and others, he developed the IBM alignment models that demonstrated machine translation could be treated as a statistical estimation problem rather than a linguistic rule-engineering task.
Early Life and Education
Born in 1958, Brown studied mathematics and computer science before joining IBM Research. There, he became part of Frederick Jelinek's speech and language group, which applied information-theoretic methods to speech recognition and natural language problems.
Co-authored "A Statistical Approach to Machine Translation" with colleagues at IBM
Co-authored "Class-Based n-gram Models of Natural Language"
Published "The Mathematics of Statistical Machine Translation" defining IBM Models 1–5
Left IBM for Renaissance Technologies
Key Contributions
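The landmark 1993 paper "The Mathematics of Statistical Machine Translation: Parameter Estimation" defined five increasingly sophisticated IBM alignment models. Model 1 assumes uniform alignment probabilities and learns only word translation probabilities. Model 2 adds position-dependent alignment probabilities, Model 3 introduces fertility (the number of foreign words each English word generates), and Models 4 and 5 refine the distortion parameters that govern word reordering. These models formalised the noisy channel approach to MT: given a foreign sentence f, find the English sentence e that maximises P(e|f). By Bayes' rule, P(e|f) = P(f|e)P(e)/P(f), and because P(f) does not depend on e, this is equivalent to maximising P(f|e)P(e), where P(f|e) is the translation model and P(e) is the language model.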
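The estimation idea behind Model 1 can be illustrated with a short expectation-maximisation loop. The following is a minimal sketch, not the authors' implementation: the function name train_ibm_model1, the NULL token string, the iteration count, and the three-sentence toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t(f|e) with EM.

    pairs: list of (foreign_tokens, english_tokens) sentence pairs.
    Returns a dict-like object mapping (f, e) to t(f|e).
    """
    # Prepend an empty "NULL" word to each English sentence so that
    # foreign words with no English counterpart can align to it.
    pairs = [(f_sent, ["NULL"] + e_sent) for f_sent, e_sent in pairs]
    f_vocab = {f for f_sent, _ in pairs for f in f_sent}

    # Start from a uniform distribution over the foreign vocabulary.
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e

        # E-step: share each foreign word across the English words in
        # its sentence, in proportion to the current t(f|e).
        for f_sent, e_sent in pairs:
            for f in f_sent:
                norm = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: renormalise expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


# Hypothetical toy corpus (foreign side first, English side second).
corpus = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
    (["une", "fleur"], ["a", "flower"]),
]
t = train_ibm_model1(corpus)
print(t[("maison", "house")], t[("la", "house")])
```

On this toy corpus the probability mass for "house" should shift toward "maison" and away from "la", because "la" is better explained by "the", which co-occurs with it in two sentences. Models 2 to 5 would add further parameters on top of this table and are not covered by the sketch.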
Brown also co-authored influential work on class-based n-gram models, which group words into clusters based on distributional similarity and use these clusters to improve language model estimates when word-level counts are sparse. The clustering procedure from this work is now commonly known as Brown clustering. It anticipated modern approaches to word representation learning and remains relevant to language model smoothing.
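A class-based bigram model factors the probability of a word given its predecessor as P(w2|w1) = P(c2|c1) P(w2|c2), where c1 and c2 are the classes of the two words. The sketch below assumes the word-to-class mapping is supplied by hand, whereas the original work learns the classes automatically from data. The function name train_class_bigram, the class labels, and the two-sentence corpus are hypothetical.

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Return prob(word, prev_word) under a class-based bigram model.

    Uses unsmoothed maximum-likelihood estimates of
    P(class | previous class) and P(word | class).
    """
    class_bigrams = Counter()  # counts of (previous class, class)
    class_context = Counter()  # counts of a class as left context
    word_counts = Counter()
    class_counts = Counter()

    for sent in sentences:
        classes = [word_to_class[w] for w in sent]
        word_counts.update(sent)
        class_counts.update(classes)
        for prev_c, cur_c in zip(classes, classes[1:]):
            class_bigrams[(prev_c, cur_c)] += 1
            class_context[prev_c] += 1

    def prob(word, prev_word):
        c_prev = word_to_class[prev_word]
        c_cur = word_to_class[word]
        # Without smoothing, unseen contexts or classes get probability 0.
        if class_context[c_prev] == 0 or class_counts[c_cur] == 0:
            return 0.0
        p_class = class_bigrams[(c_prev, c_cur)] / class_context[c_prev]
        p_word = word_counts[word] / class_counts[c_cur]
        return p_class * p_word

    return prob


# Hypothetical hand-assigned classes and toy training sentences.
word_to_class = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "runs": "VERB", "sleeps": "VERB",
}
sentences = [["the", "dog", "runs"], ["a", "cat", "sleeps"]]
prob = train_class_bigram(sentences, word_to_class)
print(prob("cat", "the"))  # "the cat" never occurs, yet gets 0.5
```

The word pair "the cat" does not appear in the training sentences, so a plain word bigram model would assign it zero probability. The class model gives it 0.5 because the class sequence DET NOUN is well attested, which is the kind of generalisation the class-based approach is meant to provide.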
"The mathematics of statistical machine translation is surprisingly elegant, but the real challenge lies in parameter estimation from parallel corpora." — Peter Brown et al., paraphrased from "The Mathematics of Statistical Machine Translation" (1993)
Legacy
The IBM alignment models served as the foundation of statistical machine translation for roughly two decades. The GIZA++ toolkit, which implements these models, became a standard word-alignment tool in SMT research. Brown's class-based language models influenced later work on distributional semantics and word clustering. After leaving IBM, Brown joined Renaissance Technologies, where he applied statistical methods to finance, but his NLP contributions remain among the most cited in the field.