Citation Software Automates the Wrong Step

Yesterday I read with interest an article on the Chronicle’s blog Lingua Franca about citation software. In it, the author argues for the importance of consistency in citation style and notes that most academics try to achieve it with citation management software. However, because the software is misused, the quality and consistency of citations are degenerating. The proposed solution is better training, especially of current students, who will one day be the major manuscript contributors and copy editors. I argue that the problem is with the tools themselves, in this case both the citation management software and the writing environment, and that there are two potential solutions.

At first blush, citation management seems like a problem just begging for a software solution: citation information is highly structured, and its insertion into a working manuscript should follow a set of rules. Various programs have done a good job of data entry and management. Papers, Sente, and Zotero are my favorites in this regard. I’ve been a Papers user almost since its release because of its lovely and highly usable interface. All of these programs (and many others) are very good at managing references.

But the second part of the equation is inserting a citation into a manuscript, and I still find this process entirely too fragile. As pointed out in the comments on the original article, much of this fragility stems from the thousands of citation styles, many of which are not even internally consistent. Still, when I choose a particular journal from the list of 1400 supported formats in Papers (if my journal is on the list, which it usually is not), I expect the inserted citations and bibliographic information to be perfect. They almost never are, so I argue that the existence of the named journal in the style list sets up a false expectation: automatic = perfect.

One solution to this problem is to focus so closely on the details that the software handles every possible rule and permutation, which is the direction things seem to be going (did I mention there are 1400 named styles in Papers?). But if these styles don’t produce a picture-perfect citation, with not a comma out of place and the correct abbreviations and colons and bolds and italics, you’re in for trouble. What I have found is that, at least for the volume of writing I do, I can format my citations by hand as fast as I can correct that misplaced comma, as long as the program spits out a citation that is close enough. The situation is improving, largely because of concerted efforts to standardize the definition files that encode the rules (CSL files) and because online tools now make editing those files easy. But it still seems so fragile. This system still conflates display and content, so we’ve automated the wrong step in the process.

I guess I’ve taken all these words to say that I think we’re doing it wrong. Why haven’t we greatly simplified the display of citations by relying completely on DOIs and letting our display software present human-readable citations based on a lookup of the record associated with each DOI? At least this would move the complexity of styles out of the users’ hands and onto the network. I’m thinking here about doing for citations what the combination of HTML and CSS does for web pages: it separates the display style from the content, which is really the distinction we mean when we talk about citations and their style. We want to display certain information from a citation in a context-dependent way. Why not build that into our reading software?
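
To make the idea a little more concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption made for illustration: the `Record` fields, the `REGISTRY` dictionary (a local stand-in for a real network lookup by DOI), the made-up DOI, and the two style functions, which do not correspond to any real journal's rules. The point is only that the manuscript stores the DOI, and the reader's software decides how to display it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """Content only: the bibliographic facts, with no formatting."""
    authors: List[str]  # family names, in order
    year: int
    title: str
    journal: str
    volume: str
    pages: str


# Hypothetical stand-in for a network lookup keyed by DOI.
# The DOI and the record below are invented for this example.
REGISTRY: Dict[str, Record] = {
    "10.1000/example.1": Record(
        authors=["Smith", "Jones"],
        year=2012,
        title="An example article",
        journal="Journal of Examples",
        volume="4",
        pages="10-20",
    ),
}


def lookup(doi: str) -> Record:
    """Resolve a DOI to its record (here, from the local registry)."""
    return REGISTRY[doi]


# Display only: each style is a function from content to text.
def author_year(r: Record) -> str:
    names = " and ".join(r.authors)
    return f"{names} ({r.year}). {r.title}. {r.journal} {r.volume}: {r.pages}."


def compact(r: Record) -> str:
    first = r.authors[0] + (" et al." if len(r.authors) > 1 else "")
    return f"{first}, {r.journal} {r.volume}, {r.pages} ({r.year})"


STYLES: Dict[str, Callable[[Record], str]] = {
    "author-year": author_year,
    "compact": compact,
}


def render(doi: str, style: str) -> str:
    """The manuscript holds only the DOI; the style is chosen at display time."""
    return STYLES[style](lookup(doi))


if __name__ == "__main__":
    for name in STYLES:
        print(render("10.1000/example.1", name))
```

Switching styles here means choosing a different function at display time. The manuscript itself never changes, which is exactly the HTML and CSS arrangement.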
