From text analysis to reception analysis: A change of paradigms in Literary Computing?

Jan Christoph Meister
Literaturwissenschaftliches Seminar
University of Hamburg
Brahmsallee 39
Hamburg, 20146

I. Computing the text, computing the mind

Most computational studies of literary texts use one of two methodologies: quantitative or qualitative analysis. Though in both cases the literary text is our point of departure, quantitative and qualitative analysis effectively deal with two distinct types of preprocessed textual database. The quantitative analysis of "raw" textual data relies on a descriptive mark-up of elements (words) and base structures (sentences) in terms of grammatical categories and functions. Qualitative analysis, on the other hand, necessitates a degree of higher-level hermeneutic mark-up. Theoretical models of the human interpretive faculty developed in cognitive science offer some fascinating insight into the complexity of this activity, but they also make quite clear that, for the time being, practical models or algorithms attempting to simulate the human understanding of texts presuppose a strict delimitation of pragmatic context. For technological and methodological reasons, computer-based research into qualitative textual aspects cannot shy away from being reductionist: if you want to compute, you first have to translate the analogue into the digital. But it is naïve to believe that digitization would eradicate or neutralize meaning -- a raised finger will always point at something: a potential object as well as a logical subject of signification. The fact therefore remains that any mark-up -- and particularly one designed to capture higher-level semantic structures -- will eventually transgress deductive categorical classification. At some stage in Literary Computing, we are no longer computing texts -- we are computing our own mind. A dilemma? A chance! Rather than enforcing ever stricter, less ambiguous mark-up conventions, we should perhaps consider advocating flexibility, shifting our focus from text analysis to reception analysis.

II. "Parsing for the theme" -- an experiment

I would like to illustrate the above thesis by way of a recently conducted experiment. It was motivated by an attempt to sketch out a functionalist definition of the concept of "theme". Three alternatives for this definition have been discussed in recent contributions toward a new theory of "theme": the generative ("theme" as a pre-script and frame of reference for the production of a text); the cognitive ("theme" as a cognitive construct enabling readers to integrate various aspects of informational content in a text); and finally the intertextual ("theme" as a universal shared among various texts).

Can these aspects be integrated? I tried to look at how a "theme" construct takes shape in the course of reception, and whether any correlative to the thematic hypothesis produced by a reader can be identified in the textual material by way of computational text analysis. This involved a two-stage experiment based on an excerpt from Carroll's "Alice in Wonderland". A comparison of the findings gathered from a sample of reading protocols (stage 1) with those from a basic statistical analysis of the textual word material (stage 2) did indeed suggest a significant correlation between peak readings of thematic categories in reader protocols and high-frequency words and word clusters (collocates and maximal phrases) in the textual database.
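The kind of "basic statistical analysis" meant in stage 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the experiment: the stop list, the window size, the function names and the sample sentence are all hypothetical, and the sample sentence is invented placeholder text rather than the excerpt given to the readers.

```python
import re
from collections import Counter

# Hypothetical stop list; a real study would have to justify its own.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "it", "was"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def content_word_frequencies(tokens):
    """Count every token that is not on the stop list."""
    return Counter(t for t in tokens if t not in STOPWORDS)


def collocates(tokens, node, window=2):
    """Count content words found within `window` tokens of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in STOPWORDS:
                counts[tokens[j]] += 1
    return counts


def theme_overlap(reader_terms, frequencies, top_n=5):
    """Return the reader-supplied terms that are also among the
    `top_n` most frequent content words of the text."""
    top_words = {w for w, _ in frequencies.most_common(top_n)}
    return sorted(top_words & set(reader_terms))


# Invented placeholder text and invented reader terms (assumptions).
SAMPLE = ("The rabbit ran past Alice. Alice followed the rabbit "
          "down the hole, and the rabbit vanished.")
READER_TERMS = ["rabbit", "pursuit", "alice"]

if __name__ == "__main__":
    toks = tokenize(SAMPLE)
    freqs = content_word_frequencies(toks)
    print(freqs.most_common(3))
    print(collocates(toks, "rabbit"))
    print(theme_overlap(READER_TERMS, freqs, top_n=2))
```

A plain set overlap is of course far cruder than a test of statistical significance; it is used here only to make the comparison of reception data with textual data concrete.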

III. In search of the improbable

The example shows that comparative studies of text and reception data may indeed offer an interesting methodological alternative to purely quantitative or qualitative analyses. But the suggestion to re-focus our attention from text analysis to reception analysis would have far-reaching consequences for Literary Computing. To quote Karl R. Popper (1984:29; my translation): "The provability of a theory goes hand in hand with its information content, that is, with its improbability. ... Therefore, the better or more preferable hypothesis is often the more improbable one."

Surely, if there is any point in advocating a change of paradigms in Literary Computing, it must be for the sake of the more meaningful, more enlightening results one hopes to achieve. Yet, paradoxically enough, it might well be the logically -- and numerically -- more "improbable" that we need to consider more carefully. Our pretense to objectivity would have to emancipate itself from the empirical and re-incorporate the speculative dynamics of reading and interpretation once again. Can Literary Computing -- or Humanities Computing for that matter -- accept this risk?


Claude Bremond, Joshua Landy, Thomas Pavel (eds.): Thematics. New Approaches, Albany (SUNY Press) 1995.

Karl R. Popper: Objektive Erkenntnis. Ein evolutionärer Entwurf, Gütersloh (Bertelsmann) 1984. The original English version was published under the title Objective Knowledge, Oxford (Clarendon Press) 1972.