Percy Liang is a computer scientist at Stanford University who directs the Center for Research on Foundation Models (CRFM). His research spans semantic parsing, language grounding, and the systematic evaluation of large language models. He has been a leading voice in developing transparent, holistic evaluation methodologies for foundation models.
Early Life and Education
Born in 1983, Liang completed his undergraduate studies at MIT and earned his PhD from UC Berkeley in 2011. His doctoral work on semantic parsing and learning from natural language supervision demonstrated how NLP systems could map natural language utterances to executable logical forms. He joined the Stanford faculty and rapidly established a research program bridging theoretical machine learning with practical NLP systems.
- Born in the United States
- Completed PhD at UC Berkeley (2011)
- Joined the Stanford University faculty
- Developed the SQuAD reading comprehension benchmark
- Co-founded the Stanford Center for Research on Foundation Models (CRFM)
- Released HELM (Holistic Evaluation of Language Models)
Key Contributions
Liang's work on semantic parsing advanced methods for mapping natural language to formal representations. His approach to learning semantic parsers from question-answer pairs (rather than from annotated logical forms) reduced the annotation burden and enabled broader application of semantic parsing to question answering over databases and knowledge bases.
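The learning-from-answers idea can be sketched in a few lines: enumerate candidate logical forms for a question and keep only those whose execution against a database produces the observed answer, so no annotated logical forms are ever needed. The tiny database, the `(relation, entity)` logical-form representation, and all function names below are hypothetical illustrations for exposition, not Liang's actual system.

```python
# Toy sketch of weakly supervised semantic parsing: learn from
# question-answer pairs by filtering candidate logical forms on
# whether their execution matches the answer. All names and the
# database here are illustrative assumptions.

DATABASE = {
    "capital": {"France": "Paris", "Japan": "Tokyo"},
}

def execute(logical_form):
    """Execute a (relation, entity) logical form against the database."""
    relation, entity = logical_form
    return DATABASE.get(relation, {}).get(entity)

def candidate_logical_forms(question):
    """Enumerate all (relation, entity) pairs as candidates.
    A real parser would score candidates with a learned grammar and
    features of the question; here we simply brute-force the space."""
    return [(relation, entity)
            for relation in DATABASE
            for entity in DATABASE[relation]]

def learn_from_qa_pairs(qa_pairs):
    """Keep the candidates whose execution yields the observed answer --
    the key idea: answers alone supervise the choice of logical form."""
    consistent = {}
    for question, answer in qa_pairs:
        consistent[question] = [lf for lf in candidate_logical_forms(question)
                                if execute(lf) == answer]
    return consistent

parses = learn_from_qa_pairs([
    ("What is the capital of France?", "Paris"),
])
# Only ("capital", "France") executes to "Paris", so it survives.
```

In a learned system, the surviving logical forms would serve as positive training signal for a statistical model rather than being kept verbatim, but the filtering step captures the weak-supervision idea.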
He co-created SQuAD (the Stanford Question Answering Dataset), one of the most widely used benchmarks for reading comprehension, which spurred rapid progress in neural question answering. His HELM (Holistic Evaluation of Language Models) framework evaluates large language models along multiple dimensions, including accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency. This multi-metric design addresses the need for more nuanced assessment than single-number benchmarks provide.
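In that spirit, a multi-dimensional evaluation report can be sketched as below, pairing accuracy with a calibration metric (expected calibration error). The example predictions, confidence scores, and metric implementations are hypothetical stand-ins; HELM itself covers many more scenarios and metrics.

```python
# Minimal sketch of multi-metric evaluation in the spirit of HELM.
# The data and metric set are illustrative assumptions, not HELM's API.

def accuracy(examples):
    """Fraction of examples where the prediction matches the gold label."""
    return sum(e["pred"] == e["gold"] for e in examples) / len(examples)

def expected_calibration_error(examples, num_bins=5):
    """Bin examples by model confidence, then average the gap between
    mean confidence and empirical accuracy, weighted by bin size."""
    bins = [[] for _ in range(num_bins)]
    for e in examples:
        idx = min(int(e["conf"] * num_bins), num_bins - 1)
        bins[idx].append(e)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(e["conf"] for e in b) / len(b)
        acc = sum(e["pred"] == e["gold"] for e in b) / len(b)
        ece += (len(b) / len(examples)) * abs(avg_conf - acc)
    return ece

# Hypothetical model outputs on four examples.
examples = [
    {"pred": "A", "gold": "A", "conf": 0.9},
    {"pred": "B", "gold": "A", "conf": 0.6},
    {"pred": "C", "gold": "C", "conf": 0.8},
    {"pred": "D", "gold": "D", "conf": 0.7},
]

report = {
    "accuracy": accuracy(examples),
    "calibration_error": expected_calibration_error(examples),
}
```

Reporting several such numbers side by side, rather than collapsing them into one score, is the core design choice the quote below motivates.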
"We need to evaluate language models not just on accuracy, but on a broad set of metrics that reflect the diverse ways these models impact people." — Percy Liang, on the motivation for HELM
Legacy
Liang's contributions to semantic parsing, benchmark development, and foundation model evaluation have shaped how the field measures progress and identifies limitations. SQuAD became a standard benchmark used by researchers worldwide. HELM and the CRFM have established new standards for transparency and comprehensiveness in language model evaluation. His work connects technical NLP research with broader questions of AI governance and accountability.