I am trying to re-learn Python, so my skills are lacking. I am currently playing with the PubMed APIs. I am trying to parse the XML file that is given here, then loop through each child ('PubmedArticle'), grab a few things (for now just the article title), and enter them into a dictionary under the key of the PubMed ID (PMID).
i.e. the output should look like:
{'29150897': {'title': 'Determining best outcomes from community-acquired pneumonia and how to achieve them.'},
 '29149862': {'title': 'Telemedicine as an effective intervention to improve antibiotic appropriateness prescription and to reduce costs in pediatrics.'}}
Later I will add more fields such as author and journal; for now I just want to figure out how to use the lxml package to get the data I want into a dictionary. I know there are plenty of packages that can do this for me, but that defeats the purpose of learning. I've tried a bunch of different things, and this is what I'm currently trying:
import requests
from lxml import etree

article_url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&tool=PMA&id=29150897,29149862"
page = requests.get(article_url)
tree = etree.fromstring(page.content)

dict_out = {}
for x in tree.xpath('//PubmedArticle'):
    pmid = ''.join([x.text for x in x.xpath('//MedlineCitation/PMID[@Version="1"]')])
    title = ''.join([x.text for x in x.xpath('//ArticleTitle')])
    dict_out[pmid] = {'title': title}
print(dict_out)
I probably have a misunderstanding about how to go about this process, but if anyone can offer insight or lead me in the right direction for resources, that would be greatly appreciated.
Edit: My apologies. I wrote this far more quickly than I should have, and I have now fixed the capitalization of the tag names. Also, the result seems to combine the PMIDs into a single key and join the titles together, rather than giving one entry per article:
{'2725403628806902': {'title': 'Handshake Stewardship: A Highly Effective Rounding-based Antimicrobial Optimization Service.Monitoring, documenting and reporting the quality of antibiotic use in the Netherlands: a pilot study to establish a national antimicrobial stewardship registry.'}}
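One likely explanation for the merged key: an XPath that starts with `//` searches from the document root even when it is called on a child element, so every pass through the loop sees every PMID and every title. Prefixing the inner paths with `.` (as in `.//ArticleTitle`) keeps the search inside the current article. The sketch below illustrates this under some assumptions: `SAMPLE_XML` is a hand-trimmed stand-in for the real efetch response, `titles_by_pmid` is a hypothetical helper name, and it uses `findall`/`findtext` so that it also runs with the standard library's ElementTree if lxml is not installed. The same `.//` prefix should work with `xpath()` in lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Fallback: the standard library exposes the same calls used below.
    import xml.etree.ElementTree as etree

# Assumption: a trimmed sample shaped like the efetch response, not the real payload.
SAMPLE_XML = b"""
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">29150897</PMID>
      <Article>
        <ArticleTitle>Determining best outcomes from community-acquired pneumonia and how to achieve them.</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">29149862</PMID>
      <Article>
        <ArticleTitle>Telemedicine as an effective intervention to improve antibiotic appropriateness prescription and to reduce costs in pediatrics.</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
""".strip()


def titles_by_pmid(xml_bytes):
    """Return {pmid: {'title': ...}} with one entry per PubmedArticle element."""
    tree = etree.fromstring(xml_bytes)
    result = {}
    for article in tree.findall('.//PubmedArticle'):
        # The leading '.' makes each path relative to the current article,
        # so only this article's PMID and title are matched.
        pmid = article.findtext('.//MedlineCitation/PMID')
        title = article.findtext('.//ArticleTitle')
        result[pmid] = {'title': title}
    return result


if __name__ == "__main__":
    print(titles_by_pmid(SAMPLE_XML))
```

With the sample above this should produce two separate keys, one per PMID. Against the live response, `page.content` could be passed in place of `SAMPLE_XML`.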
Ta