Friday, March 18, 2005

Cite it Right!

Over the last few months I have found that obtaining correct bibliographic references for the publications I want to cite is a time-consuming job.

Like the majority of mathematical and technical scientists, I use LaTeX for my publications and combine it with BibTeX to generate reference lists automatically. BibTeX takes your document and your bibliography database file, extracts only the entries you actually cite from the latter, and produces a bibliography for your LaTeX document. The key point here is to have an extensive bibliography database of your own.

Of course, one starts by adding some entries by hand: title, author, and a few more fields that are evident from the paper copy you just printed. This soon becomes boring, though, especially when you get to @techreport or @incollection entries, where you always have to look up again which fields are mandatory and which optional fields might be handy as well.
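To save the repeated lookups, the mandatory fields can be written down once and checked by a script. The following is only a minimal sketch: the table lists the required fields of a few standard BibTeX entry types, while the function name `missing_fields` and the plain-dictionary representation of an entry are my own assumptions, not part of BibTeX or any tool mentioned here.

```python
# Required fields for a few standard BibTeX entry types.
# The table is deliberately incomplete; extend it as needed.
REQUIRED_FIELDS = {
    "article": {"author", "title", "journal", "year"},
    "techreport": {"author", "title", "institution", "year"},
    "incollection": {"author", "title", "booktitle", "publisher", "year"},
    "inproceedings": {"author", "title", "booktitle", "year"},
}


def missing_fields(entry_type, fields):
    """Return the sorted required fields that are absent or empty.

    `entry_type` may be given with or without the leading '@'.
    `fields` is a hypothetical dict mapping field names to string values.
    Entry types not in the table are reported as having nothing missing.
    """
    required = REQUIRED_FIELDS.get(entry_type.lower().lstrip("@"), set())
    present = {name.lower() for name, value in fields.items() if value.strip()}
    return sorted(required - present)


if __name__ == "__main__":
    # A made-up, half-finished entry, as typed in from a printed report.
    draft = {"author": "A. Author", "title": "Some Report", "year": "2005"}
    print(missing_fields("@techreport", draft))
```

Optional fields are left out on purpose; which of those are "handy" depends on the bibliography style in use.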

All this data should be readily available, of course, at the publisher's website for example. Elsevier's ScienceDirect is pretty good at this, but it takes some searching. After clicking twice, saving a .ris file, converting it with Bibutils to XML and subsequently to BibTeX, pasting it into one of my central .bib files, and finally copy-pasting the DOI URL, I'm done. Quite some effort, isn't it? Other services (e.g. the ACM Digital Library) offer only limited data in their BibTeX, like author, title and a URL. So availability can be a problem, and uniformity -- or better: completeness -- is also missing when getting bibliographic references from the web.
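To make the conversion step concrete, here is a rough sketch of what turning a RIS record into a BibTeX entry involves. It is not how Bibutils works and is no replacement for it: it assumes a single journal article per file, always emits an @article entry, and maps only a handful of common RIS tags. The function name `ris_to_bibtex`, the citation key, and the sample record are all invented for illustration.

```python
import re

# Mapping from a few common RIS tags to BibTeX field names (incomplete).
RIS_TO_BIBTEX = {
    "TI": "title", "T1": "title",
    "JO": "journal", "JF": "journal",
    "VL": "volume", "IS": "number",
    "DO": "doi", "UR": "url",
}

# A RIS line looks like "AU  - Doe, Jane": tag, two spaces, hyphen, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def ris_to_bibtex(ris_text, key):
    """Convert one RIS journal record to an @article entry (sketch only)."""
    authors, fields, pages = [], {}, {}
    for line in ris_text.splitlines():
        match = RIS_LINE.match(line.strip())
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag in ("AU", "A1"):
            authors.append(value)
        elif tag in ("PY", "Y1"):
            # RIS dates may look like "2005///"; keep only the year part.
            fields["year"] = value.split("/")[0]
        elif tag in ("SP", "EP"):
            pages[tag] = value
        elif tag in RIS_TO_BIBTEX and value:
            # Keep the first value seen for each BibTeX field.
            fields.setdefault(RIS_TO_BIBTEX[tag], value)

    entry = {}
    if authors:
        entry["author"] = " and ".join(authors)
    entry.update(fields)
    if "SP" in pages:
        entry["pages"] = pages["SP"] + ("--" + pages["EP"] if "EP" in pages else "")

    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in entry.items())
    return f"@article{{{key},\n{body}\n}}"


# An invented record, only to show the shape of the input.
SAMPLE_RIS = """TY  - JOUR
AU  - Doe, Jane
AU  - Roe, Richard
T1  - An invented title
JO  - Journal of Examples
VL  - 1
SP  - 1
EP  - 10
PY  - 2005///
ER  -
"""

if __name__ == "__main__":
    print(ris_to_bibtex(SAMPLE_RIS, "doe2005"))
```

Anything beyond plain journal articles (reports, book chapters, special characters in names) is exactly where a dedicated converter earns its keep.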

Another solution would be for the "list of publications" that every researcher has on his/her webpage to accompany each entry with citation data in RIS or BibTeX format; I assume the author is willing to put some effort into getting the correct and complete data together. It would be even better to have a central, worldwide database that can be queried both interactively and automatically (e.g. through scripts and prescribed URLs), and that is filled with official, complete data by the publishers. Just like the CDDB music album database.

An intermediate solution is for institutions to set up their own local citation database. It seems to me that RefDB is a nice software package for that; I might try it some day soon. For now, I'll just maintain my own .bib files by copy-pasting a lot, and query them with BibTool and some quick-n-dirty shell scripts.
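As an impression of what such a quick-n-dirty query looks like, here is a small sketch in Python rather than shell. It is not BibTool and does not parse BibTeX properly: it assumes every entry starts with an '@' at the beginning of a line and simply does a case-insensitive text search. The function name `find_entries` and the file name `central.bib` are made up.

```python
import re

# Zero-width split point just before an '@' that starts a line.
ENTRY_START = re.compile(r"^(?=@)", re.MULTILINE)


def find_entries(bib_text, keyword):
    """Return the entries in `bib_text` that mention `keyword` anywhere."""
    needle = keyword.lower()
    chunks = ENTRY_START.split(bib_text)
    return [
        chunk.strip()
        for chunk in chunks
        if chunk.lstrip().startswith("@") and needle in chunk.lower()
    ]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python findbib.py central.bib keyword
    path, keyword = sys.argv[1], sys.argv[2]
    with open(path, encoding="utf-8") as handle:
        for entry in find_entries(handle.read(), keyword):
            print(entry, end="\n\n")
```

Because the match is on raw text, a keyword also hits citation keys and field names, which is usually acceptable for a personal database.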