One of the initial project goals was to provide a means of examining the text by keyword, or more specifically through a variation on the tools designed for “keyword in context” (KWIC) analysis. This technique involves generating a concordance: a list of the words used within a text, each shown with the words immediately surrounding it to provide context. Traditionally, generating a concordance was a labor-intensive process reserved for only the most important books, often foundational religious texts such as the Judeo-Christian Bible.
Examining the collocation of terms and their respective uses across a corpus can provide valuable research insights, and with the advent of digital text analytics applications, generating these annotated corpora has become significantly faster and more accessible.
It was our hope that, when applied to our collection of Marxist writings, the results could also have pedagogical value: by searching for a keyword of personal interest, perhaps drawn from current events or an unrelated research area, a scholar unfamiliar with Marxism could gain insight into how the term was used, resulting in something of a “Marxist perspective” on the topic.
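The concordance technique described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by any tool named in this article: the function name `kwic`, the `width` parameter, the tokenizing pattern, and the sample sentence are all assumptions made for the example, as is the choice to treat a trailing `*` as a prefix wildcard.

```python
import re


def kwic(text, keyword, width=5):
    """Return (left, match, right) tuples for every occurrence of keyword.

    Matching is case-insensitive. A trailing "*" on the keyword is treated
    as a prefix wildcard (an assumption of this sketch), so "fool*" matches
    "fool", "fools", "foolish", and so on.
    """
    # Naive tokenizer: runs of word characters, allowing internal
    # apostrophes and hyphens.
    tokens = re.findall(r"\w+(?:['’-]\w+)*", text)
    key = keyword.lower()
    if key.endswith("*"):
        prefix = key[:-1]

        def matches(word):
            return word.startswith(prefix)
    else:

        def matches(word):
            return word == key

    rows = []
    for i, token in enumerate(tokens):
        if matches(token.lower()):
            # Up to `width` tokens on either side form the context.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


# Made-up sample text, used only to exercise the function.
SAMPLE = (
    "A fool and his money are soon parted. "
    "Only the foolish mistake folly for wisdom."
)

for left, match, right in kwic(SAMPLE, "fool*", width=2):
    # Right-align the left context so the keyword sits in a center column.
    print(f"{left:>20}  [{match}]  {right}")
```

Centering the keyword in its own column is what makes a concordance easy to scan. Widening `width` trades that compactness for more surrounding context.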
Unfortunately, the open source tools available for building a custom KWIC web application are somewhat limited. There are several desktop applications, such as AntConc and KH Coder, but the server-side applications appear to be built primarily in Java and to require significant customization. Although this doesn’t rule out building one, for a near-term solution we used Voyant Tools, a collection of research applications developed to enhance humanities reading through lightweight text analytics, with particular emphasis on HTML documents. Because our corpus was already available as valid XHTML, we uploaded it and examined it using the Voyant “Contexts” tool.
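For anyone who wants to experiment with a corpus like this outside a hosted tool, the plain text first has to be pulled out of the XHTML. The following is a minimal sketch using only the Python standard library, and it assumes well-formed XHTML in the standard XHTML namespace. The function name `xhtml_to_text` and the sample document are hypothetical, and the sketch does not handle named entities such as `&nbsp;`, which the standard XML parser rejects unless they are declared.

```python
import xml.etree.ElementTree as ET

# Namespace used by XHTML documents, in ElementTree's "{uri}tag" notation.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def xhtml_to_text(source):
    """Return the text inside <body> of a well-formed XHTML string."""
    root = ET.fromstring(source)
    body = root.find(f"{XHTML_NS}body")
    if body is None:
        # Fall back to the whole document if there is no namespaced <body>.
        body = root
    # Concatenate all text nodes, then collapse runs of whitespace.
    return " ".join("".join(body.itertext()).split())


# Hypothetical sample document, not taken from the corpus.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample document</title></head>
  <body>
    <h1>Sample heading</h1>
    <p>A <em>short</em> paragraph of text.</p>
  </body>
</html>"""

print(xhtml_to_text(SAMPLE_XHTML))
```

Because the text nodes are joined before whitespace is collapsed, inline markup such as `<em>` does not split words. Block elements with no whitespace between them would run together, however, so real documents may need extra handling.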
The initial results were interesting. Contexts identified the top ten keywords (“class” being first), and generated concordances for them across the entire corpus.
The selections are listed alphabetically by author, with the document title in the first column and the selected keyword displayed in the center. The “context” and “expand” sliders at the bottom of the screen increase the amount of text displayed on either side of the keyword, and clicking the “plus” icon to the left of the title reveals the expanded passage.
Particularly interesting are the results from a random keyword. Provided the keyword occurs in the corpus, Contexts applies wildcard stemming, revealing unexpected instances. Consider the search for “fool” and its variants:
Applying a keyword relevant to a particular interest, such as the term “suffrage,” is also revelatory:
Although it is possible to limit the results to specific texts, the Voyant Contexts tool is a web application hosted on an external server, and it can be slow and prone to stalling, particularly when non-indexed keywords are searched. Adding or removing materials requires generating a new corpus, and the service limits how many documents a corpus may contain. Because Voyant Tools is an open source project, however, it is possible to host a dedicated instance. Should the project generate sufficient interest, such an approach is worthy of consideration.
In the meantime, access to the corpus via Contexts is available at http://adiuva.me/marx/contexts.html.