Coherence Relations

Coherence relations are the semantic and pragmatic links between discourse segments that make texts interpretable as unified wholes rather than arbitrary sequences of sentences.

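Coherence(T) = Π_i P(Rᵢ | uᵢ, uᵢ₊₁, context)

Here T is a text segmented into units u₁, …, uₙ, Rᵢ is the relation holding between the adjacent units uᵢ and uᵢ₊₁, and the product runs over all adjacent pairs.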

A text is coherent when its parts fit together in a way that allows readers to construct a unified mental representation of the content. Coherence relations — also called discourse relations or rhetorical relations — are the specific semantic and pragmatic connections between text segments that create this unity. Relations such as Cause-Effect, Contrast, Elaboration, and Temporal Sequence provide instructions for how to integrate the meaning of one segment with another. The study of coherence relations lies at the intersection of linguistics, psychology, and computational linguistics, connecting theories of text structure to models of human comprehension.
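The scoring formula given near the top of this article can be illustrated numerically. The sketch below assumes that some relation classifier has already supplied one probability per adjacent segment pair; the function names and the example probabilities are hypothetical. The length-normalized variant is an added convenience, not part of the formula itself.

```python
import math
from typing import Sequence


def coherence_score(relation_probs: Sequence[float]) -> float:
    """Product of P(R_i | u_i, u_{i+1}, context) over adjacent segment pairs."""
    if any(p < 0.0 or p > 1.0 for p in relation_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    if any(p == 0.0 for p in relation_probs):
        return 0.0
    # Sum logs instead of multiplying directly to avoid underflow on long texts.
    return math.exp(sum(math.log(p) for p in relation_probs))


def per_pair_score(relation_probs: Sequence[float]) -> float:
    """Geometric mean of the pair probabilities (an assumed normalization)."""
    if not relation_probs:
        return 1.0
    return coherence_score(relation_probs) ** (1.0 / len(relation_probs))


# Hypothetical probabilities for the three adjacent pairs of a four-segment text.
probs = [0.9, 0.8, 0.5]
score = coherence_score(probs)
normalized = per_pair_score(probs)
```

Because every factor is at most 1, the raw product shrinks as a text grows longer, which is why a per-pair normalization of this kind can make scores for texts of different lengths easier to compare.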

Taxonomies of Coherence Relations

Major Relation Classes

Additive: Conjunction, Elaboration, Restatement
Temporal: Sequence, Synchrony
Causal: Cause, Result, Purpose
Contrastive: Contrast, Concession, Correction

Cross-cutting dimension:
Informational (semantic) vs. Intentional (pragmatic)

Numerous taxonomies of coherence relations have been proposed. Hobbs (1979) identified relations such as Elaboration, Parallel, Contrast, and Explanation, grounding them in the inferential processes readers use. Mann and Thompson's Rhetorical Structure Theory (RST, 1988) defined approximately 25 relations organized by nuclearity, that is, by whether one segment (the nucleus) is more central to the writer's purpose than the other (the satellite). Sanders, Spooren, and Noordman (1992) proposed a cognitive taxonomy based on four primitives: basic operation (causal vs. additive), polarity (positive vs. negative), source of coherence (semantic vs. pragmatic), and order of segments. The Penn Discourse Treebank (PDTB) hierarchy uses four top-level classes (Temporal, Contingency, Comparison, and Expansion) with finer subdivisions. Despite differences in terminology, these frameworks show substantial convergence on a core set of approximately 10–15 fundamental relation types.
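The four primitives of Sanders, Spooren, and Noordman (1992) lend themselves to a feature-bundle representation, in which each relation is a combination of four values. The sketch below is one possible encoding under stated assumptions: the class names, the NOT_APPLICABLE value for relations whose segment order is not distinctive, and the three example profiles are illustrative choices, not a reproduction of the published taxonomy.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List


class Operation(Enum):
    CAUSAL = "causal"
    ADDITIVE = "additive"


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


class Source(Enum):
    SEMANTIC = "semantic"
    PRAGMATIC = "pragmatic"


class Order(Enum):
    BASIC = "basic"
    NON_BASIC = "non-basic"
    NOT_APPLICABLE = "n/a"  # assumption: used when order is not distinctive


@dataclass(frozen=True)
class RelationProfile:
    operation: Operation
    polarity: Polarity
    source: Source
    order: Order


def differing_primitives(a: RelationProfile, b: RelationProfile) -> List[str]:
    """Names of the primitives on which two relation profiles disagree."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Illustrative profiles; the labels are hypothetical examples.
cause_consequence = RelationProfile(
    Operation.CAUSAL, Polarity.POSITIVE, Source.SEMANTIC, Order.BASIC
)
list_relation = RelationProfile(
    Operation.ADDITIVE, Polarity.POSITIVE, Source.SEMANTIC, Order.NOT_APPLICABLE
)
opposition = RelationProfile(
    Operation.ADDITIVE, Polarity.NEGATIVE, Source.SEMANTIC, Order.NOT_APPLICABLE
)

diff = differing_primitives(cause_consequence, opposition)
```

Representing relations this way makes the similarity between two relations explicit: the fewer primitives on which they differ, the closer they are in the taxonomy.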

Signaling and Recognition

Coherence relations can be signaled explicitly through discourse connectives ("because," "however," "then"), signaled through other linguistic cues such as tense shifts, lexical overlap, or syntactic parallelism, or left entirely implicit. Das and Taboada (2018), in their work on the RST Signalling Corpus, report that over 90% of relations are signaled by at least one linguistic device, though the signals are often subtle and redundant. For computational systems, explicit connectives provide the strongest cues for relation identification; implicit relation recognition, which requires deeper semantic understanding, remains substantially harder.
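Connective-based recognition can be sketched with a simple lexicon lookup. The example below assumes a tiny hand-written lexicon that maps the three connectives quoted above to broad relation classes; the mapping, the lexicon name, and the function name are illustrative and not drawn from any annotated corpus.

```python
import re
from typing import Optional

# Hypothetical connective lexicon; a realistic one would hold many more
# entries and would need to handle ambiguous connectives.
CONNECTIVE_CLASS = {
    "because": "Causal",
    "however": "Contrastive",
    "then": "Temporal",
}


def explicit_relation(segment: str) -> Optional[str]:
    """Return the class of the first known connective, or None if none is found."""
    for token in re.findall(r"[a-z]+", segment.lower()):
        if token in CONNECTIVE_CLASS:
            return CONNECTIVE_CLASS[token]
    return None


explicit = explicit_relation("However, the results did not replicate.")
implicit = explicit_relation("The streets were wet.")
```

A lookup of this kind returns None for the second segment, which is exactly the implicit case: no connective is present, so the relation has to be inferred from meaning. It also ignores ambiguity (for example, "then" can mark a conditional consequence as well as temporal sequence), which a real system would need to resolve.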

Coherence and Comprehension

Psycholinguistic research has demonstrated that coherence relations directly influence reading behavior. Sanders and Noordman (2000) showed that segments linked by causal relations are read faster than segments linked by additive relations, even when content is held constant. Readers slow down at discourse boundaries and when coherence breaks occur (Zwaan and Radvansky, 1998). Eye-tracking studies reveal that connectives are processed rapidly and immediately influence the integration of upcoming content, indicating that coherence relations are computed incrementally during comprehension rather than retrospectively.

Computational Modeling

Computational approaches to coherence relations address both relation classification and coherence assessment. Relation classifiers take pairs of text spans and predict the connecting relation, using features ranging from word pairs and production rules to neural representations from pre-trained models. Coherence assessment models evaluate whether a text is well-organized, often by scoring entity transitions (the entity grid model) or relation sequences. These models are applied in essay scoring, text generation evaluation, and readability assessment.
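The entity-transition idea can be illustrated with a toy grid. In the sketch below, each entity has one role per sentence (S for subject, O for object, X for other, and - for absent), and the function estimates how often each role-to-role transition occurs between adjacent sentences. The grid contents and the function name are hypothetical, and a full entity grid model would feed such transition frequencies into a trained ranker or classifier rather than inspect them directly.

```python
from collections import Counter
from typing import Dict, List, Tuple

Transition = Tuple[str, str]


def transition_probs(grid: Dict[str, List[str]]) -> Dict[Transition, float]:
    """Relative frequency of each role transition between adjacent sentences."""
    counts: Counter = Counter()
    for roles in grid.values():
        # Pair each sentence's role with the role in the following sentence.
        counts.update(zip(roles, roles[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {transition: count / total for transition, count in counts.items()}


# Hand-built toy grid for a three-sentence text (illustrative values only).
grid = {
    "storm": ["S", "O", "S"],
    "power": ["X", "-", "-"],
}
probs = transition_probs(grid)
```

In this toy text the entity "storm" stays in a prominent role across all three sentences, while "power" is mentioned once and dropped. Texts in which salient entities persist across sentences tend to yield more transitions between filled roles, which is the signal the entity grid approach relies on.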

The interaction between coherence relations and other linguistic phenomena creates rich computational challenges. Coreference resolution depends on discourse structure: the accessibility of an antecedent is influenced by the rhetorical relation in which it appears. Information structure (the distinction between given and new information) interacts with coherence relations to determine appropriate sentence ordering. Text generation systems must plan coherence relations to produce well-organized output, a task that remains difficult for even the largest language models when producing extended argumentative or expository text.

References

  1. Hobbs, J. R. (1979). Coherence and coreference. Cognitive Science, 3(1), 67–90. doi:10.1207/s15516709cog0301_4
  2. Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1992). Toward a taxonomy of coherence relations. Discourse Processes, 15(1), 1–35. doi:10.1080/01638539209544800
  3. Das, D., & Taboada, M. (2018). RST Signalling Corpus: A corpus of signals of coherence relations. Language Resources and Evaluation, 52(1), 149–184. doi:10.1007/s10579-017-9383-x
