Speech Acts

Speech act theory analyzes utterances as actions — asserting, requesting, promising, and commanding — and computational models classify these illocutionary forces to enable systems that understand what speakers do with their words.

SA(u) = ⟨F, P⟩ where F = illocutionary force, P = propositional content

Speech act theory, originating with J. L. Austin's (1962) How to Do Things with Words and formalized by John Searle (1969), holds that producing an utterance is not merely conveying information but performing an action. When a speaker says "I promise to be there," they are not describing a promise but making one. This insight, that language is a form of action, fundamentally reshaped pragmatics and has had a deep influence on computational linguistics, particularly in dialogue systems, email classification, and social media analysis, where understanding communicative intent is essential.

Austin-Searle Taxonomy

Speech Act Components
Locutionary act: the literal content of the utterance
Illocutionary act: the intended force (assert, request, promise, …)
Perlocutionary act: the effect on the hearer

Searle's taxonomy of illocutionary acts:
Representatives (asserting), Directives (requesting),
Commissives (promising), Expressives (thanking),
Declarations (pronouncing)

Austin distinguished three levels at which an utterance functions. The locutionary act is the production of a meaningful expression. The illocutionary act is what the speaker does in making the utterance: asserting, questioning, commanding, promising, and so forth. The perlocutionary act is the effect achieved on the hearer, such as convincing, frightening, or amusing them. Searle refined this framework by systematizing the felicity conditions for each speech act type: the propositional content, preparatory, sincerity, and essential conditions that must hold for the act to succeed. He also developed the concept of indirect speech acts, where the literal force differs from the intended force, as when "Can you pass the salt?" functions as a request despite its interrogative form.
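The representation SA(u) = ⟨F, P⟩ and Searle's five classes map naturally onto a small data structure. The following Python sketch is only illustrative: the names Force and SpeechAct, and the string encoding of propositional content, are assumptions made here rather than part of any standard library or annotation scheme.

```python
from dataclasses import dataclass
from enum import Enum


class Force(Enum):
    """Searle's five classes of illocutionary acts."""
    REPRESENTATIVE = "representative"  # e.g. asserting
    DIRECTIVE = "directive"            # e.g. requesting
    COMMISSIVE = "commissive"          # e.g. promising
    EXPRESSIVE = "expressive"          # e.g. thanking
    DECLARATION = "declaration"        # e.g. pronouncing


@dataclass(frozen=True)
class SpeechAct:
    """SA(u) = <F, P>: illocutionary force F paired with propositional content P."""
    force: Force
    content: str  # hypothetical encoding: a plain-text paraphrase of P


# "I promise to be there" performs a commissive whose content is the future action.
promise = SpeechAct(Force.COMMISSIVE, "the speaker will be there")

# "Can you pass the salt?" is understood as a directive despite its interrogative form.
salt = SpeechAct(Force.DIRECTIVE, "the hearer passes the salt")
```

Keeping force and content as separate fields mirrors the point that one proposition can be asserted, requested, or promised without changing what it is about.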

Computational Speech Act Recognition

Automatic speech act classification assigns illocutionary force labels to utterances in text or dialogue. Early work on email speech act detection (Cohen, Carvalho, and Mitchell, 2004) classified messages as requests, commitments, proposals, or informational statements to support task management. In dialogue systems, speech act recognition (also called dialogue act tagging) is typically performed using sequence labeling models that consider both the current utterance and the dialogue history (Stolcke et al., 2000). The Switchboard Dialog Act Corpus (SWDA) defines 42 dialogue act types and has served as a standard benchmark, with modern neural models achieving above 80% accuracy.
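A minimal sketch of why dialogue history matters is given below. It uses hidden-Markov-style Viterbi decoding over a toy three-tag set. The tag inventory, the transition table TRANS, and the cue scores in emission are all invented for illustration; they are not SWDA's 42 tags and not estimated from any corpus.

```python
import math

TAGS = ["question", "answer", "backchannel"]

# Toy transition probabilities P(tag_t | tag_{t-1}); values are assumptions.
TRANS = {
    "<start>":     {"question": 0.5, "answer": 0.3, "backchannel": 0.2},
    "question":    {"question": 0.1, "answer": 0.8, "backchannel": 0.1},
    "answer":      {"question": 0.3, "answer": 0.3, "backchannel": 0.4},
    "backchannel": {"question": 0.4, "answer": 0.5, "backchannel": 0.1},
}


def emission(utterance, tag):
    """Toy surface-cue score standing in for P(utterance | tag)."""
    text = utterance.lower().strip()
    if text.endswith("?"):
        return {"question": 0.8, "answer": 0.1, "backchannel": 0.1}[tag]
    if text in {"uh-huh", "yeah", "right", "okay"}:
        return {"question": 0.05, "answer": 0.35, "backchannel": 0.6}[tag]
    return {"question": 0.1, "answer": 0.7, "backchannel": 0.2}[tag]


def viterbi(utterances):
    """Return the highest-scoring tag sequence for a list of utterances."""
    if not utterances:
        return []
    # Each column maps tag -> (log score of best path ending in tag, previous tag).
    first = utterances[0]
    columns = [{
        t: (math.log(TRANS["<start>"][t]) + math.log(emission(first, t)), None)
        for t in TAGS
    }]
    for utt in utterances[1:]:
        prev = columns[-1]
        col = {}
        for t in TAGS:
            score, back = max((prev[p][0] + math.log(TRANS[p][t]), p) for p in TAGS)
            col[t] = (score + math.log(emission(utt, t)), back)
        columns.append(col)
    # Follow back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for col in reversed(columns[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]


dialogue = ["Do you have a dog?", "Yeah", "What kind?", "A beagle."]
print(viterbi(dialogue))   # ['question', 'answer', 'question', 'answer']
print(viterbi(["Yeah"]))   # ['backchannel']
```

With these toy numbers, "Yeah" is tagged as a backchannel in isolation but as an answer when it follows a question, which is the kind of context effect that sequence models are meant to capture.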

Indirect Speech Acts in NLU

Indirect speech acts pose a fundamental challenge for natural language understanding. The utterance "It would be great if someone could take notes" is literally a statement about a hypothetical situation but functions as a request. Computational systems must learn to map surface forms to intended illocutionary forces through contextual reasoning. The Rational Speech Act framework provides one principled approach: a pragmatic listener can infer that the speaker chose an indirect form because the direct form ("Take notes") would violate social norms, and the cooperative assumption licenses the intended interpretation.
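The reasoning above can be sketched with a two-utterance, two-meaning Rational Speech Act model. The sketch follows the usual recursion of literal listener, speaker, and pragmatic listener. Treating the utterance cost as a face-threat penalty on the direct form is an assumption made here for illustration, as are the prior, the cost of 3.0, and all names in the code.

```python
import math

MEANINGS = ["request", "comment"]
UTTERANCES = ["direct", "indirect"]

# Literal semantics: "Take notes" can only be a request, while
# "It would be great if someone could take notes" fits either meaning.
TRUTH = {
    "direct":   {"request": 1.0, "comment": 0.0},
    "indirect": {"request": 1.0, "comment": 1.0},
}


def normalize(scores):
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


def literal_listener(utt, prior):
    """L0(m | u): prior restricted to meanings the utterance is literally true of."""
    return normalize({m: TRUTH[utt][m] * prior[m] for m in MEANINGS})


def speaker(meaning, prior, cost, alpha=1.0):
    """S1(u | m), proportional to exp(alpha * (log L0(m | u) - cost(u)))."""
    # Written as L0 ** alpha * exp(-alpha * cost) to avoid taking log(0).
    return normalize({
        u: literal_listener(u, prior)[meaning] ** alpha * math.exp(-alpha * cost[u])
        for u in UTTERANCES
    })


def pragmatic_listener(utt, prior, cost, alpha=1.0):
    """L1(m | u), proportional to S1(u | m) * P(m)."""
    return normalize({
        m: speaker(m, prior, cost, alpha)[utt] * prior[m] for m in MEANINGS
    })


prior = {"request": 0.7, "comment": 0.3}         # assumed context: notes are needed
no_cost = {"direct": 0.0, "indirect": 0.0}
face_cost = {"direct": 3.0, "indirect": 0.0}     # assumed penalty for a bare imperative

print(pragmatic_listener("indirect", prior, no_cost)["request"])    # about 0.49
print(pragmatic_listener("indirect", prior, face_cost)["request"])  # about 0.69
```

Without a cost, the listener reasons that a speaker who wanted notes taken would have said so directly, so the indirect form leans toward a mere comment. Once the direct form is penalized, the indirect form becomes the expected way to make the request and the request reading dominates.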

Applications in Modern NLP

Speech act theory has found renewed relevance in the era of conversational AI and social media analysis. Intent detection in task-oriented dialogue systems — classifying whether a user is making a reservation, asking for information, or expressing a complaint — is essentially speech act recognition applied to specific domains. On social media, speech act analysis helps distinguish between opinions, questions, recommendations, and complaints, supporting applications from customer service automation to public health surveillance.

The relationship between speech acts and politeness theory has motivated computational work on social meaning. Brown and Levinson's (1987) framework analyzes politeness strategies as modifications of speech acts to manage face threats. Computational politeness models (Danescu-Niculescu-Mizil et al., 2013) predict the politeness level of requests and identify the linguistic strategies — hedging, indirectness, gratitude markers — that modulate illocutionary force. These models reveal systematic patterns in how speakers balance communicative efficiency against social considerations, connecting speech act theory to broader questions about language and social cognition.
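As a rough illustration of strategy identification, the sketch below flags hedging, indirectness, and gratitude markers in a request using small regular-expression lexicons. The lexicons and the function name politeness_strategies are hypothetical and far simpler than the feature sets used in published politeness models; the sketch does not reproduce any of them.

```python
import re

# Hypothetical mini-lexicons, one list of patterns per strategy.
STRATEGIES = {
    "hedging": [r"\bmaybe\b", r"\bperhaps\b", r"\bi think\b", r"\bsort of\b"],
    "indirectness": [r"\bcould you\b", r"\bwould you\b", r"\bit would be great if\b"],
    "gratitude": [r"\bthanks\b", r"\bthank you\b", r"\bappreciate\b"],
}


def politeness_strategies(request):
    """Return the set of strategy names whose patterns occur in the request."""
    text = request.lower()
    return {
        name
        for name, patterns in STRATEGIES.items()
        if any(re.search(pattern, text) for pattern in patterns)
    }


print(sorted(politeness_strategies("Could you maybe send the file? Thanks!")))
# ['gratitude', 'hedging', 'indirectness']
print(politeness_strategies("Send the file."))
# set()
```

Counts or indicators of this kind could serve as input features to a classifier that predicts a politeness score, although a lexicon alone misses context-dependent uses of the same words.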

References

  1. Searle, J. R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press. doi:10.1017/CBO9781139173438
  2. Cohen, W. W., Carvalho, V. R., & Mitchell, T. M. (2004). Learning to classify email into "speech acts." Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), 309–316.
  3. Stolcke, A., Ries, K., Coccaro, N., Shriberg, E., Bates, R., Jurafsky, D., … & Meteer, M. (2000). Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics, 26(3), 339–373. doi:10.1162/089120100561737