Pamphlets of the Stanford Literary Lab
Refine
Document Type
- Working Paper (17)
Language
- English (17)
Has Fulltext
- yes (17)
Is part of the Bibliography
- no (17)
Keywords
- Digital Humanities (15)
- Quantitative Literaturwissenschaft (5)
- Englische Literatur (4)
- Roman (4)
- Romantheorie (3)
- Dialoganalyse (2)
- Literaturgeschichte (2)
- Literaturkanon (2)
- Literaturtheorie (2)
- Literaturwissenschaft (2)
Institute
- Extern (1)
1
This paper is the report of a study conducted by five people – four at Stanford, and one at the University of Wisconsin – which tried to establish whether computer-generated algorithms could "recognize" literary genres. You take 'David Copperfield', run it through a program without any human input – "unsupervised", as the expression goes – and ... can the program figure out whether it's a gothic novel or a 'Bildungsroman'? The answer is, fundamentally, Yes: but a Yes with so many complications that it is necessary to look at the entire process of our study. These are new methods we are using, and with new methods the process is almost as important as the results.
2
In the last few years, literary studies have experienced what we could call the rise of quantitative evidence. This had happened before of course, without producing lasting effects, but this time it’s probably going to be different, because this time we have digital databases, and automated data retrieval. As Michel’s and Lieberman’s recent article on "Culturomics" made clear, the width of the corpus and the speed of the search have increased beyond all expectations: today, we can replicate in a few minutes investigations that took a giant like Leo Spitzer months and years of work. When it comes to phenomena of language and style, we can do things that previous generations could only dream of.
When it comes to language and style. But if you work on novels or plays, style is only part of the picture. What about plot – how can that be quantified? This paper is the beginning of an answer, and the beginning of the beginning is network theory. This is a theory that studies connections within large groups of objects: the objects can be just about anything – banks, neurons, film actors, research papers, friends... – and are usually called nodes or vertices; their connections are usually called edges; and the analysis of how vertices are linked by edges has revealed many unexpected features of large systems, the most famous one being the so-called "small-world" property, or "six degrees of separation": the uncanny rapidity with which one can reach any vertex in the network from any other vertex. The theory proper requires a level of mathematical intelligence which I unfortunately lack; and it typically uses vast quantities of data which will also be missing from my paper. But this is only the first in a series of studies we’re doing at the Stanford Literary Lab; and then, even at this early stage, a few things emerge.
3
If there is one thing to be learned from David Foster Wallace, it is that cultural transmission is a tricky game. This was a problem Wallace confronted as a literary professional, a university-based writer during what Mark McGurl has called the Program Era. But it was also a philosophical issue he grappled with on a deep level as he struggled to combat his own loneliness through writing. This fundamental concern with literature as a social, collaborative enterprise has also gained some popularity among scholars of contemporary American literature, particularly McGurl and James English: both critics explore the rules by which prestige or cultural distinction is awarded to authors (English; McGurl). Their approach requires a certain amount of empirical work, since these claims move beyond the individual experience of the text into forms of collective reading and cultural exchange influenced by social class, geographical location, education, ethnicity, and other factors. Yet McGurl and English's groundbreaking work is limited by the very forms of exclusivity they analyze: the protective bubble of creative writing programs in the academy and the elite economy of prestige surrounding literary prizes, respectively. To really study the problem of cultural transmission, we need to look beyond the symbolic markets of prestige to the real market, the site of mass literary consumption, where authors succeed or fail based on their ability to speak to that most diverse and complicated of readerships: the general public. Unless we study what I call the social lives of books, we make the mistake of keeping literature in the same ascetic laboratory that Wallace tried to break out of with his intense authorial focus on popular culture, mass media, and everyday life.
4
The nineteenth century in Britain saw tumultuous changes that reshaped the fabric of society and altered the course of modernization. It also saw the rise of the novel to the height of its cultural power as the most important literary form of the period. This paper reports on a long-term experiment in tracing such macroscopic changes in the novel during this crucial period. Specifically, we present findings on two interrelated transformations in novelistic language that reveal a systemic concretization in language and fundamental change in the social spaces of the novel. We show how these shifts have consequences for setting, characterization, and narration as well as implications for the responsiveness of the novel to the dramatic changes in British society.
This paper has a second strand as well. This project was simultaneously an experiment in developing quantitative and computational methods for tracing changes in literary language. We wanted to see how far quantifiable features such as word usage could be pushed toward the investigation of literary history. Could we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? To this end, we present a second set of results, the techniques and methodological lessons gained in the course of designing and running this project.
5
We would study not style as such, but style 'at the scale of the sentence': the lowest level, it seemed, at which style as a distinct phenomenon became visible. Implicitly, we were defining style as a combination of smaller linguistic units, which made it, in consequence, particularly sensitive to changes in scale—from words to clauses to whole sentences.
6
The concept of length, the concept is synonymous, the concept is nothing more than, the proper definition of a concept ... Forget programs and visions; the operational approach refers specifically to concepts, and in a very specific way: it describes the process whereby concepts are transformed into a series of operations—which, in their turn, allow to measure all sorts of objects. Operationalizing means building a bridge from concepts to measurement, and then to the world. In our case: from the concepts of literary theory, through some form of quantification, to literary texts.
7
Loudness in the novel
(2014)
The novel is composed entirely of voices: the most prominent among them is typically that of the narrator, which is regularly intermixed with those of the various characters. In reading through a novel, the reader "hears" these heterogeneous voices as they occur in the text. When the novel is read out loud, the voices are audibly heard. They are also heard, however, when the novel is read silently: in this la!er case, the voices are not verbalized for others to hear, but acoustically created and perceived in the mind of the reader. Simply put: sound, in the context of the novel, is fundamentally a product of the novel’s voices. This conception of sound mechanics may at first seem unintuitive—sound seems to be the product of oral reading—but it is only by starting with the voice that one can fully appreciate sound’s function in the novel. Moreover, such a conception of sound mechanics finds affirmation in the works of both Mikhail Bakhtin and Elaine Scarry: "In the novel," writes Bakhtin, "we can always hear voices (even while reading silently to ourselves)."
8
10
Different scales, different features. It’s the main difference between the thesis we have presented here, and the one that has so far dominated the study of the paragraph. By defining it as "a sentence writ large", or, symmetrically, as "a short discourse", previous research was implicitly asserting the irrelevance of scale: sentence, paragraph, and discourse were all equally involved in the "development of one topic". We have found the exact opposite: 'scale is directly correlated to the differentiation of textual functions'. By this, we don't simply mean that the scale of sentences or paragraphs allows us to "see" style or themes more clearly. This is true, but secondary. Paragraphs allows us to "see" themes, because themes fully "exist" only at the scale of the paragraph. Ours is not just an epistemological claim, but an ontological one: if style and themes and episodes exist in the form they do, it's because writers work at different scales – and do different things according to the level at which they are operating.