
Visualization of the frequency of the words 'socialism' (orange) and 'capitalism' (green) in New York Times articles since 1981. (by Jer Thorp)
Jo Guldi of Inscape has a provocative post up describing how she used available web-based tools to produce a rather sophisticated analysis of the use of the word “pseudoscience” in Wikipedia entries. Her hypothesis, to paraphrase, is that “pseudoscience” is less a rigorous, ‘scientific’ term than a discursive ‘marker’ for attempts to delegitimize opposing arguments.
I started on a hunch, and that hunch involved one of my favorite twentieth-century words for starting a fight. The word “pseudoscience” was one of the twentieth century’s most powerful tools for turning a pleasant argument into an all-out jam-throwing mud-in-your-eye punch fest. Establish something as a “pseudoscience” and you dismiss the evidence offered out of hand. The processes to which pseudoscience refers are very modern: expert communities, institutional truth, public discourse, and the body of scientific thinking. [...] I suppose I’m attracted to things labeled pseudoscience in part because they represent communities whose thinking is out-of-time or out-of-place, and twenty-first-century interlocutors frequently have problems determining which and why. Take acupuncture. The elemental metaphors relating bodily organs to the seasons appear to be whacked out-of-time from the perspective of twentieth-century western science. But acupuncture’s reliance on communities of practice, responding to the experience of the patient, frequently surpasses western medicine in its skills of communicating with the patient, treating the physical ailment, the psychological, the psychosomatic, and the patient-doctor relationship as parts of an entire whole. If anything, acupuncture isn’t out-of-time, it’s out-of-place, the creature of a tradition external to the west and frequently misunderstood by western practitioners. That doesn’t stop western practitioners from labeling acupuncture a “pseudoscience,” or cultural anthropologists from accusing the category “pseudoscience” of incorporating western bias. Pseudoscience is a great term for pointing to fissures.
Guldi’s methods for identifying rhetorical instances of the word could just as well be used to identify other ‘fissures’. To be sure, a more robust (and affordable) means of sifting through and refining the large data sets to which humanities scholars now have access is long overdue. (The whole ‘tag cloud’ phenomenon shows just how desperate we are for a handy, user-friendly data visualizer.) Here’s what she did:
I asked DevonAgent (a personalizable search agent for automatically pulling big sets of text from the web without cut and paste, ~$25 education rate) to find me all the Wikipedia articles that reference pseudoscience, either in the class heading, the article itself, or the backboard discussion. Those articles range from articles explaining Popper’s definition of pseudoscience to clearly discredited theories (phrenology, flat-earth theory) to controversial subjects from the borderlands of western institutional knowledge (e.g. acupuncture).
The resulting database of articles is a collection of knowledge fissures: places where one group of researchers has attempted to tell another to shush.
So I asked the machines who was involved in the fights. ManyEyes, a free collection of visualization tools from IBM, will let anyone look for recurring grammatical relationships between certain words in a given piece of text. When ManyEyes looked at the Wikipedia Pseudoscience database, it quickly recognized certain names coming up regularly. It knew that they were names only from the grammatical construction of a personal relationship, i.e., the word was followed by an apostrophe-s, e.g. Freud’s book, Sheldrake’s claim, Popper’s hypothesis. So I asked ManyEyes for a list of the players who appear in the fights over pseudoscience. Here they are, to the best of the machine’s knowledge.
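The apostrophe-s heuristic Guldi describes is easy to approximate. What follows is only a minimal sketch in Python, not ManyEyes’ actual method and not Guldi’s workflow: the pattern, the `possessive_names` function, the small `NOT_NAMES` stoplist, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a "name" is a capitalized word directly followed by
# a straight or curly apostrophe and the letter "s".
POSSESSIVE = re.compile(r"\b([A-Z][a-z]+)['’]s\b")

# Hypothetical stoplist: sentence-initial contractions such as "It's"
# match the pattern but are not names.
NOT_NAMES = {"It", "That", "There", "What", "Here", "Let", "He", "She"}


def possessive_names(text):
    """Count capitalized words that occur in possessive form."""
    candidates = POSSESSIVE.findall(text)
    return Counter(word for word in candidates if word not in NOT_NAMES)


# Invented sample text, modeled on the examples quoted above.
sample = (
    "Freud's book, Sheldrake’s claim, and Popper's hypothesis. "
    "It's Freud's critics who object."
)
counts = possessive_names(sample)
for name, n in counts.most_common():
    print(name, n)
```

A heuristic this crude will miss names that never appear in the possessive and will admit capitalized non-names (“Monday’s meeting”), which is presumably why Guldi qualifies her list as accurate only “to the best of the machine’s knowledge.”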
Guldi’s general thesis, that certain words or phrases mark disciplinary fissures, is compelling, if limited in application. Not every fissure, of course, manifests itself in semantic markers, nor does every recurring accusation suggest a deep break, but a great deal of authoritative research could be done with a method such as this one.
Guldi also points the way to scholarly uses of web-accessible archives for purposes other than mere ‘visualization’, and her timing couldn’t be better. Over the last five months or so, The New York Times has been rolling out a series of APIs for interacting with its 158-year corpus, an ambitious project that culminated recently in the February 20 Times Open developer seminar. Artist-developers like Tim Schwartz and Jer Thorp have since produced some interesting visualizations of the archive (of the frequency of “socialism” vs. “capitalism” between 1984 and 2009, for instance), and other artists are doing similar work without APIs, such as Jeff Clark’s News Spectrum or Dave Bowker’s “One week of The Guardian”. Again, though, these projects are for the most part aimed at producing visually pleasing representations of data and take little interest in identifying discursive ‘fissures’ or mining historical archives for definitive ‘macro’ patterns. It’s only a matter of time before social scientists and historians like Guldi turn their attention to archive APIs like the Times’ and begin making all sorts of interesting discoveries.
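The counting that sits underneath a frequency chart like the “socialism” vs. “capitalism” one is simple enough to sketch. The Python below does not call the Times APIs and is not how Thorp or Schwartz built their pieces; it assumes the articles have already been retrieved into a hypothetical in-memory list of `(year, text)` pairs, and the three sample records are invented for illustration.

```python
import re
from collections import defaultdict


def yearly_counts(articles, terms):
    """Return {term: {year: count}} using whole-word, case-insensitive matches."""
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    counts = {term: defaultdict(int) for term in terms}
    for year, text in articles:
        for term, pattern in patterns.items():
            # Adding zero still records the year, so gaps show up as 0.
            counts[term][year] += len(pattern.findall(text))
    return {term: dict(by_year) for term, by_year in counts.items()}


# Made-up records for illustration only; not real Times data.
articles = [
    (1981, "Capitalism and socialism were both debated."),
    (1981, "Critics of capitalism spoke."),
    (2009, "Socialism returned to the headlines; socialism again."),
]
result = yearly_counts(articles, ["socialism", "capitalism"])
for term, by_year in result.items():
    print(term, sorted(by_year.items()))
```

Raw counts like these are what the visualizations plot. Turning them into evidence of a ‘fissure’ would take further steps, such as normalizing by the number of articles published each year, that this sketch leaves out.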