Text Encoding, Representation and Interpretation

Claus Huitfeldt
Wittgenstein Archives
University of Bergen
Allegaten 27
Bergen, N-5007

This paper argues that representation and interpretation should be regarded as relationships between concrete texts and their audiences, not as separate entities in some realm of mental or abstract objects. This perspective proves especially fruitful for discussions of the role of interpretation in text encoding.

Current views on how text encoding relates to matters of representation and interpretation seem confused. On the one hand, many encoding projects in the humanities find their raison d'être in a claim to represent some body of texts in an accurate and reliable form. So-called "descriptive" text encoding is an essential tool in such efforts. On the other hand, difficulties in these attempts to create objectively accurate representations are often blamed on a seemingly inescapable "interpretive" element in transcription and encoding.

But if text encoding is itself at bottom interpretive, how can it be used to represent texts at all? And if everything about this supposedly representational device is at bottom interpretive, doesn't the distinction between representation and interpretation become rather empty?

I propose two steps to resolve this dilemma. As a first step, we may imagine representation and interpretation as different parts of one continuum. Our task is to find a suitable location for a demarcation line between the two areas somewhere along this continuum. Then, by stating that something is a representation, we are not excluding the possibility that it might from some other perspective be seen as an interpretation, i.e. that the demarcation line might legitimately have been drawn elsewhere. Nor are we denying that representations and interpretations are, at some deeper level, of the same kind. But we do deny the usefulness of a demarcation line placed at one extreme of the continuum.

The second step is this: "Representation" and "interpretation" are best seen as names for relations between texts. Derivatively, the terms may then also refer to individual texts which have these relationships to other texts. The move can be brought one step further by extending the relationships to include the texts' audiences: Texts are not representations or interpretations of each other in abstracto, but only in relation to certain audiences of human, i.e. social, historical and cultural beings.

On the basis of these two steps, we can clarify the role of text encoding in representation and interpretation. The word "interpret", as used in relation to text, has at least two broad senses:

  1. To identify the meaning of a text (reading, listening, deciphering)
  2. To explain the meaning of a text (paraphrasing, analysing, discussing)

The first sense concerns the identification of meaning, the second its explanation. On the relational view suggested above, the meaning of a text can only be given by another text: identification as well as explanation must themselves take the form of text.

My proposal is that we reserve the word "representation" for the identification of meaning. Just as identification normally precedes explanation, representation normally precedes interpretation. To represent a text, then, is to identify and reproduce its meaning in the form of another text. This meaning is the linguistic content, roughly what representatives of the Text Encoding Initiative seem to mean when they speak of "the text itself", or what Nelson Goodman has called "sequences of letters, spaces and punctuation marks" [Goodman 1976, p. 115]. For some purposes we may legitimately want to extend this definition to include compositional or typographic features such as paragraphs, chapters, font shifts etc., but as an absolute minimum it must include linguistic content in the sense just indicated. This is as close as we can come to a "rock bottom" of objective facts in textual scholarship.

Neither the faculties and skills nor the arguments and methods we employ in identifying text in this sense are essentially different from those we employ in interpretive activities. This observation is simply a restatement of the principle of the "hermeneutic circle", which also accounts for the fact that just as interpretations are made on the basis of representations, representations themselves may be revised in the light of interpretations.

According to what we have said so far, to interpret a text is either

  1. to put forth another text which rephrases or comments upon the first, or
  2. to add something to the original text which contributes to a more or less specific understanding of it.

We may call the first kind of interpretation "stand-alone", and the second "in-line".

For in-line interpretation, which is the main focus of the rest of this paper, text encoding provides exciting possibilities. Some examples are:

A detailed discussion, including an account of how such devices have been judged by one specific encoding project (the Wittgenstein Archives), will provide examples of the kinds of considerations which may be relevant for the application of interpretive encoding in philosophy as well as other disciplines.

The discussion suggests that the proposed understanding of encoding practices agrees with the following common-sense view of representation and interpretation:

  1. A text is a representation of another text if the first has the same linguistic content, i.e. the same wording, as the other.
  2. A text is an interpretation of another text if the first is not a representation of the other, but explains, discusses, or gives an alternative account of the meaning of the other in other words.
  3. A text may contain both a representation and an interpretation of another text, provided that the representation and interpretation are clearly distinguished from each other.
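
The third point can be illustrated with a minimal sketch in Python. It is an illustration under stated assumptions, not the encoding scheme of the Wittgenstein Archives or of the TEI: the element name `xinterp`, the sample sentence and all function names are invented for the example. The sketch assumes that in-line interpretation is marked off by one dedicated element, so that removing that element leaves the wording of the source, and it takes "the same wording" to mean identity of character sequences after whitespace normalization.

```python
import re

# Hypothetical in-line element, invented for this sketch.
# It is not a TEI element and not part of any project's actual encoding.
XINTERP = re.compile(r"<xinterp>(.*?)</xinterp>", re.DOTALL)


def normalize(text):
    """Collapse runs of whitespace so that only the wording is compared."""
    return " ".join(text.split())


def split_layers(encoded):
    """Separate an encoded text into its wording and its in-line interpretations."""
    interpretations = XINTERP.findall(encoded)
    wording = normalize(XINTERP.sub("", encoded))
    return wording, interpretations


def is_representation(encoded, source):
    """True if the encoded text, minus in-line interpretation, has the source's wording."""
    wording, _ = split_layers(encoded)
    return wording == normalize(source)


source = "The cat sat on the mat."
encoded = "The cat <xinterp>presumably a domestic cat</xinterp>sat on the mat."

wording, interpretations = split_layers(encoded)
print(wording)
print(interpretations)
print(is_representation(encoded, source))
```

Because the interpretive layer is clearly distinguished from the wording, the encoded text can be said to contain both a representation and an interpretation of the source. A text that alters the wording itself, for example by substituting "A cat" for "The cat", would fail the check and would at most count as an interpretation.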


Goodman, Nelson 1976: Languages of Art: An Approach to a Theory of Symbols. (1969) 2nd ed. Bobbs-Merrill, Indianapolis.

Næss, Arne 1966: Communication and Argument - Elements of Applied Semantics. Universitetsforlaget, Oslo. 1981

Sperberg-McQueen, C. Michael & Burnard, Lou (eds.) 1994: Guidelines for the Encoding and Interchange of Machine-Readable Texts (TEI P3). Chicago & Oxford.

Wittgenstein, Ludwig 1998: Wittgenstein's Nachlass. The Bergen Electronic Edition. CD-ROM Edition in four volumes, 1998-99 Oxford University Press.