
I am trying to work directly with the PubMed API from R using httr. There are excellent packages available, such as RISmed and easyPubMed, but for this particular task I need to interact with the API directly.

Following these instructions (https://www.ncbi.nlm.nih.gov/books/NBK25500/), I started with the code below, but what is returned is a list without any details or PMIDs. Any guidance is appreciated, as are pointers to tutorials on using R in this setting.

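library(XML)
library(httr)
library(glue)

# PubMed search term; "+" stands in for spaces in the URL
query <- 'asthma[mesh]+AND+leukotrienes[mesh]+AND+2009[pdat]'

# Build the ESearch request URL
reqq <- glue('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term={query}')

# Send the request; op is an httr response object
op <- GET(reqq)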

I also tried the code from this post (why i get that error : XML content does not seem to be XML), but it gives this error: Error in read_xml.raw(x, encoding = encoding, ...) : Opening and ending tag mismatch: meta line 17 and head [76]

Bahi8482
  • 2
    Cannot repro. I had no problem with `httr::GET(url = glue('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term={query}'))` – QHarr Sep 12 '21 at 03:12
  • 2
    What are you expecting to get? I ran this, got a 200 HTTP status code, and an XML document as `content(op)` – camille Sep 12 '21 at 04:25
  • 1
    your `op` contains a valid XML with a list of PMIDs - do you have problems extracting them? – Vasily A Sep 12 '21 at 07:58
  • Thanks for your input. I had difficulty viewing the XML content, but after using `content(op)`, I can. – Bahi8482 Sep 12 '21 at 12:13

1 Answer

You can read the XML response and parse it to collect the PubMed IDs, for instance:

library(magrittr) 
df_op <- op %>% xml2::read_xml() %>% xml2::as_list()

pmids <- df_op$eSearchResult$IdList %>% unlist(use.names = FALSE)

which gives:

> pmids
 [1] "20113659" "20074456" "20046412" "20021457" "20008883" "20008181" "19912318" "19897276" "19895589"
[10] "19894390" "19852204" "19839969" "19811112" "19757309" "19749079" "19739647" "19706339" "19665766"
[19] "19648384" "19647860"
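
If you would rather not go through a nested list, an XPath query is another option. The following is only a minimal sketch: `sample_xml` is a made-up, shortened stand-in for the response body (its IDs and counts are illustrative), so it runs without a network call. With the real response you would pass `op` or `httr::content(op, as = "text", encoding = "UTF-8")` to `read_xml()` instead.

```r
library(xml2)

# Hypothetical, shortened stand-in for the body of an ESearch response.
# The element names follow the eSearchResult structure used above;
# the values are illustrative only.
sample_xml <- '<eSearchResult>
  <Count>3</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>20113659</Id>
    <Id>20074456</Id>
  </IdList>
</eSearchResult>'

doc <- read_xml(sample_xml)

# Every <Id> under <IdList>, as a character vector of PMIDs
pmids <- xml_text(xml_find_all(doc, "/eSearchResult/IdList/Id"))

# <Count> is the total number of matches; <RetMax> is how many IDs came back
total   <- as.integer(xml_text(xml_find_first(doc, "/eSearchResult/Count")))
ret_max <- as.integer(xml_text(xml_find_first(doc, "/eSearchResult/RetMax")))
```

Comparing `total` with `ret_max` tells you whether you received every match. As far as I know, ESearch returns 20 IDs by default, which would explain the 20 PMIDs above; adding `&retmax=...` to the request URL should raise that limit.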
Guillaume
  • thanks for your help and for providing the code. – Bahi8482 Sep 12 '21 at 12:13
  • 1
    Why magrittr? R 4.1.0 has its own pipe now in base package: `|>` ! – Parfait Sep 12 '21 at 16:14
  • @Parfait, (1) `|>` is a nice feature, but I'm not sure everybody runs R 4.1 yet. It's recent, and in enterprise settings people rarely have up-to-date systems, so magrittr's `%>%` is the safe choice. (2) I find this left-to-right syntax easier to read for learning purposes. (3) Code written with `|>` does not run on R versions before 4.1, which I don't think is real progress for teams learning R basics on different platforms. – Guillaume Sep 14 '21 at 06:25