look up bibliographical information from an arxiv id

This Python script takes one or more arXiv ids as command-line arguments and prints, for each of them, a BibTeX entry carrying the bibliographic information.

#!/usr/bin/env python3

import re
import sys
import urllib.request
from xml.dom import minidom

for arg in sys.argv[1:]:
    # get the arxiv id: remove known prefixes one at a time
    # (str.strip(chars) removes a set of characters, not a prefix,
    # and would mangle old-style ids such as hep-th/9901001)
    xid = arg.strip()
    for prefix in ("arxiv:", "http://", "https://", "www.", "arxiv.org/abs/"):
        if xid.lower().startswith(prefix):
            xid = xid[len(prefix):]
    # drop a trailing version suffix such as v2
    xid = re.sub(r"v\d+$", "", xid)

    # download the xml
    usock = urllib.request.urlopen('http://export.arxiv.org/api/query?id_list='+xid)
    xmldoc = minidom.parse(usock)
    usock.close()

    print(xmldoc.toxml())
    print("")

    d = xmldoc.getElementsByTagName("entry")[0]

    # the year is taken from the "updated" date of the entry
    date = d.getElementsByTagName("updated")[0].firstChild.data
    text_year = date[:4]

    title = d.getElementsByTagName("title")[0]
    # collapse any line breaks and repeated spaces in the title
    text_title = ' '.join(title.firstChild.data.split())

    authorlist = []
    first = True
    for person_name in d.getElementsByTagName("author"):
        # get names: the last word is treated as the surname
        name = person_name.getElementsByTagName("name")[0]
        text_name = name.firstChild.data
        text_given_name = ' '.join(text_name.split()[:-1])
        text_surname = text_name.split()[-1]
        authorlist.append(text_surname+", "+text_given_name)
        # first author?
        if first:
            text_first_author_surname = text_surname
            first = False

    # output

    print("@MISC{"+text_first_author_surname+text_year[-2:]+",")
    print("author = {"+" and ".join(authorlist)+"},")
    print("title = {"+text_title+"},")
    print("year = {"+text_year+"},")
    print("eprint = {"+xid+"},")
    print("URL = {http://www.arxiv.org/abs/"+xid+"},")
    print("}")
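
The id clean-up at the top of the script can also be pulled out into a small function that is easy to test on its own. The following is only a sketch and not part of the original script: the name normalize_arxiv_id and the two regular expressions are assumptions about what new-style ids (1234.5678) and old-style ids (hep-th/9901001) look like, and the function additionally accepts links to the pdf.

```python
import re

# Hypothetical helper; the patterns below are assumptions, not an official
# definition of the arXiv identifier scheme.
_NEW_STYLE = r"\d{4}\.\d{4,5}"                 # e.g. 1234.5678 or 1501.00001
_OLD_STYLE = r"[a-z-]+(?:\.[A-Z]{2})?/\d{7}"   # e.g. hep-th/9901001, math.GT/0309136

# an id at the end of the input, optionally followed by a version and ".pdf"
_ID_RE = re.compile(r"(?P<id>%s|%s)(?:v\d+)?(?:\.pdf)?$" % (_NEW_STYLE, _OLD_STYLE))


def normalize_arxiv_id(text):
    """Return the bare arXiv id at the end of text, without the version suffix."""
    match = _ID_RE.search(text.strip())
    if match is None:
        raise ValueError("no arXiv id found in %r" % text)
    return match.group("id")


if __name__ == "__main__":
    for example in ("arxiv:1234.5678v2",
                    "http://www.arxiv.org/abs/1234.5678",
                    "https://arxiv.org/pdf/hep-th/9901001v1.pdf"):
        print(normalize_arxiv_id(example))
```

In the script, xid = normalize_arxiv_id(arg) could then stand in for the clean-up lines at the top of the loop. Raising a ValueError for unrecognised input is a design choice of this sketch; the script itself does not validate its arguments.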

This entry was posted on Tuesday, October 18th, 2011 at 13:47 and is filed under misc.

3 Responses to “look up bibliographical information from an arxiv id”

Dan Stahlke says:

2012/08/16 at 14:59

Thank you for this script, it has saved me tons of time! I have made two small changes: first, to accept a greater variety of inputs (you can give a URL to the PDF), and second, to handle funny characters in authors’ names. If you would like a copy of these changes, send me an email.

Automatically look up arXiv reference information | DL-UAT says:

2015/03/16 at 06:37

[…] a python script that returns the corresponding bib-entry from an arxiv ID, you can find it on http://www.thamnos.de/misc/look-up-bibliographical-information-from-an-arxiv-id/ . If you save it e.g. as arxiv2bib.py, you can call it as arxiv2bib.py 1234.5678 or as arxiv2bib.py […]

Anonymous says:

2024/04/30 at 15:30

Thanks! You are a hero!
