
I'm trying to generate an API query in R. I have the following MWE:

ids <- c('P19366','P0DH99','Q9FN92','Q9C8E6','O49347','P27140','P10896','Q9C534','O04130','Q9XI93')
query <- paste0('"', ids, '"', collapse = '+or+')
query <- paste0('https://www.uniprot.org/uniprot/?query=', query, '&format=tab&columns=id,genes(OLN)')
# Printing query shows:
> query
[1] "https://www.uniprot.org/uniprot/?query=\"P19366\"+or+\"P0DH99\"+or+\"Q9FN92\"+or+\"Q9C8E6\"+or+\"O49347\"+or+\"P27140\"+or+\"P10896\"+or+\"Q9C534\"+or+\"O04130\"+or+\"Q9XI93\"&format=tab&columns=id,genes(OLN)"

I then try to read it with vroom and get the following:

library(vroom)
vroom(query)
Error in vroom_(file, delim = delim, col_names = col_names, col_types = col_types,  : 
  Evaluation error: HTTP error 400..

This makes sense, since a \ appears in the printed string. But if I output the string with message() and paste it into my web browser, it works just fine:

message(query)
# Copy the link below into your browser
https://www.uniprot.org/uniprot/?query="P19366"+or+"P0DH99"+or+"Q9FN92"+or+"Q9C8E6"+or+"O49347"+or+"P27140"+or+"P10896"+or+"Q9C534"+or+"O04130"+or+"Q9XI93"&format=tab&columns=id,genes(OLN)

Copying the link into the browser works, so how do I get rid of the \ when reading the URL with vroom?

MrFlick
Baraliuh

1 Answer


When you paste the URL into your browser, the browser URL-encodes it before sending the request. Double quotes are not allowed in URLs; they must be percent-encoded as %22. So this should work:

ids <- c('P19366','P0DH99','Q9FN92','Q9C8E6','O49347','P27140','P10896','Q9C534','O04130','Q9XI93')
query <- paste0('%22', ids, '%22', collapse = '+or+')
query <- paste0('https://www.uniprot.org/uniprot/?query=', query, '&format=tab&columns=id,genes(OLN)')
cat(query)
# https://www.uniprot.org/uniprot/?query=%22P19366%22+or+%22P0DH99%22+or+%22Q9FN92%22+or+%22Q9C8E6%22+or+%22O49347%22+or+%22P27140%22+or+%22P10896%22+or+%22Q9C534%22+or+%22O04130%22+or+%22Q9XI93%22&format=tab&columns=id,genes(OLN)

So the problem is not the backslash at all; it is the quotation mark. The backslashes are only how print() displays the quotes inside the string. They are not part of the string itself, which is why message() shows none. I found this by making the request in the browser and looking at the network tab, where the request as sent clearly has the quotes escaped.
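
As an alternative to typing %22 by hand, base R's utils::URLencode() can apply the percent-encoding to the finished URL. The sketch below is only an illustration: it uses three of the IDs to keep the output short, and the names raw_query, raw_url and encoded_url are made up for this example. It also checks that the original string contains no backslash.

```r
# Build the query with literal double quotes, as in the question.
ids <- c('P19366', 'P0DH99', 'Q9FN92')
raw_query <- paste0('"', ids, '"', collapse = '+or+')
raw_url <- paste0('https://www.uniprot.org/uniprot/?query=', raw_query,
                  '&format=tab&columns=id,genes(OLN)')

# The string holds no backslash; print() merely shows each " as \".
has_backslash <- grepl("\\", raw_url, fixed = TRUE)
has_backslash
# [1] FALSE

# With the default reserved = FALSE, URLencode() percent-encodes characters
# that are not valid in a URL (such as ") and leaves reserved characters
# like +, &, = and , untouched.
encoded_url <- utils::URLencode(raw_url)
cat(encoded_url, "\n")
# https://www.uniprot.org/uniprot/?query=%22P19366%22+or+%22P0DH99%22+or+%22Q9FN92%22&format=tab&columns=id,genes(OLN)
```

Passing encoded_url to vroom() should then behave like the hand-written %22 version above, assuming the endpoint still accepts this query format.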

MrFlick