Selection of Digitales articles

“Even learning very basic programming, even if you’re not very good at it, I think it totally transforms the idea of what you’re doing with that computer, or what a computer should be able to do for you.”

For the past year or so, Dr Michael Radich, Senior Lecturer in VUW’s Religious Studies Programme, has set aside an hour a day to learn Python, a widely used, high-level programming language. It’s not common for someone in his field (he specialises in the study of Buddhist texts) to delve this far into the world of programming, but it’s the kind of disciplined dedication that’s enabled him to gain a degree in music composition, earn a PhD from Harvard, and teach himself no fewer than ten natural languages.

During an hour-and-a-half-long discussion over coffee at Milk and Honey, I learn that getting to grips with his first programming language has been one of the most frustrating endeavours of his career, but also one of the most rewarding.

“I was worried the learning curve would be too long and steep. Now I’ve reached that good point where I figure the process is so intrinsically worthwhile that the calculus of input to endpoint has become immaterial.”

In other words, the payoff has proved to be well worth the time invested, even though “it drives me up the wall quite regularly, and I spit the dummy fairly regularly too.”

The reason I’ve pinned him down for an interview is that Michael belongs to a new generation of humanities researchers embracing computational methods and quantitative analysis to challenge assumptions, find new ways to answer time-honoured questions, and to ask new ones.

His story, like many in this space of digital research, begins with a particular problem; in his case it’s how to accurately determine authorship of select parts of the 3000 canonical Chinese Buddhist texts. To get a sense of the scale Michael’s dealing with, the Chinese Buddhist canon is 243 times larger than all the biblical texts combined. Imagine a set of those Encyclopaedia Britannicas you see forlornly collecting dust at your local Sally Army store — you’d need over four of those to match the canon’s 188 million words.

During a kind of cultural ‘arms race’ in the 90s, however, Japan, Korea and Taiwan all competed to be the first to digitise the full Buddhist canon. Researchers in the field now have access to not one but three digitised versions of the entire corpus. Seeing an opportunity to dig deeper, Michael teamed up with local developer Jamie Norrish to create an open-source tool called TACL (Textual Analysis for Corpus Linguistics), which helps sift through millions of Chinese character strings and returns an avalanche of raw data to explore further.

If the Chinese Buddhist canon is the proverbial haystack, think of TACL as a well-calibrated metal detector capable of pointing the way to more needles than anyone realised even existed. But as Michael contends, to fruitfully sort the desired needles from misleading shrapnel takes the disciplinary knowledge of a human expert. TACL allows for increased exploratory power, but Michael’s extensive and hard-earned expertise provides the explanatory power needed to frame this wealth of evidence in a way which maximises impact and minimises misunderstanding.
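Michael’s metal-detector analogy can be made concrete. TACL works over strings of Chinese characters, surfacing strings shared between texts or unique to one. The sketch below is a minimal, illustrative version of that general character n-gram difference/intersection idea; it is not TACL’s actual interface, and the function names are my own:

```python
def char_ngrams(text, n=2):
    """All n-character substrings (n-grams) of a text, as a set."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def difference(a, b, n=2):
    """N-grams found in text a but nowhere in text b --
    candidate 'fingerprints' distinguishing a's author or source."""
    return char_ngrams(a, n) - char_ngrams(b, n)

def intersection(a, b, n=2):
    """N-grams shared by both texts -- candidate evidence of
    common authorship, borrowing, or shared source material."""
    return char_ngrams(a, n) & char_ngrams(b, n)

# Toy example, with Latin characters standing in for Chinese ones:
print(sorted(difference("abcd", "bcde")))    # ['ab']
print(sorted(intersection("abcd", "bcde")))  # ['bc', 'cd']
```

At the scale of a 188-million-word canon, such queries return the “avalanche of raw data” the article describes; the crucial step remains the human one Michael insists on, deciding which of the returned strings are needles and which are shrapnel.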

It’s a great example of the kind of rich analysis and argument made possible when the minds of machine and man work in tandem. As Michael points out, “Computers are just weird collaborators.”

Over the course of our conversation I’m reminded that even when a topic is esoteric and arcane, a researcher’s passion can be infectious, bridging the gap between the minute details of a disciplinary rabbit hole and the general interest of a layperson such as myself. Michael seems curious and critical in equal measure, and I love how his calm, considered manner gives way to a boyish enthusiasm when discussing the subjects that excite him: not only Buddhist texts, but also language learning and educational philosophy.

On the benefits of learning a second language, for example, he opines: “If you don’t know a second language, just learn one. Not having a second language is practically like missing a limb . . . one of the most important lessons [from learning another language] is that things can be done differently — even apparently simple or obvious things can be said differently, or thought differently.”

Keeping an open mind, embracing the uncertainty that comes with true learning, and always remaining up for a new challenge — these are the qualities that characterise the best academic minds, minds like Michael’s that never seem to settle for the status quo. And increasingly, even in the humanities, these minds are turning to computational methods as another way of realising things can be thought — and done — differently.

Sinologie Heidelberg Alumni Netzwerk (SHAN) Interview with Professor Michael Radich

“I find it impossible to imagine that in a few decades . . . we will not be looking back at this period before computational humanities the same way that we now look back at the natural sciences before mechanical aids to human perception. I think that computational tools in the Humanities resemble tools like microscopes or telescopes in other domains, which have vastly expanded the range of human perception, and make it possible for us to see very far into interstellar space or very small microscopic space.”

Where do you think computational aids can lead Sinology, Buddhology and the Humanities in general?

I think it is very dangerous to try and predict the future: most people who try will obviously fail. But very broadly speaking, I find it impossible to imagine that in a few decades, or some appropriate period of time, we will not be looking back at this period before computational humanities the same way that we now look back at the natural sciences before mechanical aids to human perception. I think that computational tools in the Humanities resemble tools like microscopes or telescopes in other domains, which have vastly expanded the range of human perception, and make it possible for us to see very far into interstellar space or very small microscopic space. We also have other tools that allow us to see parts of the light spectrum that we were not previously able to detect. Through those kinds of expanded perception, it’s been possible for us to empirically assess all kinds of hypotheses about the natural world around us. Exactly the same is true of the textual world. When we study the textual world, we are very severely limited by the range, types and patterns of attention that we normally bring to texts, in several respects.

First, we cannot read or process more than a very small amount of text in a single human lifetime. The most prodigious humanists have always been limited in the range of texts that they could consider, be aware of, or hold in their memories. Computers, in the last analysis, do not have such limitations, or at least, quantitatively speaking, they have a vastly greater capacity than any human.

Secondly, even the sharpest human can only perceive a certain level of detail in the texts. The tools that I use have already shown me that there is a vast amount of detail that humans typically overlook, which can be evidentially meaningful if only we know where to look.

The third thing is this: While I don’t mean to say that computers will replace humans, we also have large blind spots. We only see the things that we are conditioned to find meaningful or interesting and significant. Those patterns of attention are determined by a complex range of factors, including our psychology, our education, our acculturation, and the history of problems that have been regarded as meaningful in a scholarly field. A computer does not have entirely the same biases. Some of these biases can follow through into the way a computer is programmed, but at the same time, precisely because it is a very crude and blunt instrument, the computer also finds a whole lot of stuff we weren’t looking for. Sometimes in the middle of a large amount of garbage we find gold, things that are extremely meaningful.

So I think in those respects it’s reasonable to anticipate that over time computational tools—and here I mean not just tools that digitise the texts, but rather, processes or tools that radically expand or transform our definition of what it is to read or what it is to examine a text—such tools are likely to significantly expand the range of questions we can ask and answer, when we study a large amount of text in various constellations. This is a very vague kind of answer, because it is very difficult to anticipate exactly how that will happen. But I feel reasonably certain that it eventually will be the norm, as it is now in the natural sciences, for scholars in some sense to be cyborgs, that is, humans who operate using the extended powers of perception and computation that machines can give us.

Matt Plummer
Senior Research Partner

Digital interpreter with a background in the arts and technology.