
How can I modify this script to read from a list of IDs and return the results?

# Load the httr library for HTTP requests (install it first with install.packages("httr") if needed)
library(httr)

# Set gene_id variable for AR (androgen receptor)
gene_id <- "ENSG00000169083"

# Build query string to get general information about AR and genetic constraint and tractability assessments
query_string <- "
  query target($ensemblId: String!){
    target(ensemblId: $ensemblId){
      id
      approvedSymbol
      biotype
      geneticConstraint {
        constraintType
        exp
        obs
        score
        oe
        oeLower
        oeUpper
      }
      tractability {
        id
        modality
        value
      }
    }
  }
"

# Set base URL of GraphQL API endpoint
base_url <- "https://api.platform.opentargets.org/api/v4/graphql"

# Set variables object of arguments to be passed to endpoint
variables <- list("ensemblId" = gene_id)

# Construct POST request body object with query string and variables
post_body <- list(query = query_string, variables = variables)

# Perform POST request
r <- POST(url=base_url, body=post_body, encode='json')

# Print data to RStudio console
print(content(r)$data)

I tried just a simple query from the Open Targets documentation. Their GraphQL API doesn't support multiple IDs.

1 Answer


Iterating over multiple IDs might be done like this:

IDs <- c("ENSG00000169083", "ENSG00000169084", ...)
alldata <- lapply(IDs, function(gene_id) {
  post_body <- list(query = query_string, variables = list("ensemblId" = gene_id))
  res <- tryCatch(
    POST(url=base_url, body=post_body, encode='json'),
    error = function(e) e)
  if (inherits(res, "error")) {
    warning("error for gene ", sQuote(gene_id, FALSE), ": ",
            conditionMessage(res), call. = FALSE)
    res <- NULL
  } else if (status_code(res) != 200) {
    warning("abnormal return for gene ", sQuote(gene_id, FALSE), ": ",
            status_code(res), call. = FALSE)
    res <- NULL
  } else {
    res <- content(res)$data
  }
  res
})

From here, you should have a list with one element per ID (`NULL` for any ID whose request failed). Combining the data is over to you, whether with `rbind`, `bind_rows`, `rbindlist`, or something list-specific.
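As a minimal sketch of one way to combine the results, assuming each element has the shape returned by the query above: `alldata_example` and `target_to_row` are hypothetical names introduced only for illustration, and the example values are made up to mimic a response rather than fetched from the API.

```r
# Hypothetical stand-in for `alldata`: one element per ID, shaped like
# content(res)$data for the query above, with NULL for a failed request.
# The values are illustrative only, not real API output.
alldata_example <- list(
  list(target = list(id = "ENSG00000169083",
                     approvedSymbol = "AR",
                     biotype = "protein_coding")),
  NULL
)

# Hypothetical helper: pull a few scalar fields out of one response
# and return them as a one-row data frame.
target_to_row <- function(x) {
  tgt <- x$target
  data.frame(
    id             = tgt$id,
    approvedSymbol = tgt$approvedSymbol,
    biotype        = tgt$biotype,
    stringsAsFactors = FALSE
  )
}

# Drop failed requests and responses without a `target`, then stack the rows.
ok <- Filter(function(x) !is.null(x$target), alldata_example)
summary_df <- do.call(rbind, lapply(ok, target_to_row))
print(summary_df)
```

Nested fields such as `geneticConstraint` and `tractability` are lists of records, so they would probably need their own per-ID frames rather than extra columns here.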

r2evans
  • So `gene_id` gets its values from `IDs`? Then I no longer need the first line, `gene_id <- "ENSG00000169083"`, and can replace it with this part of the code, right? – Dimitris Zisis Mar 31 '23 at 13:49
  • Yes. You mentioned wanting to repeat this over multiple genes but never specified where that gene list/vector was stored, so I thought I'd label it `IDs` here. Use what you have. – r2evans Mar 31 '23 at 13:50
  • Yes, I can have a list of 10 Ensembl IDs, or a CSV for example. I want to repeat the query for different IDs and then get the data in JSON format or, as you mentioned, combine some of it into a data frame. – Dimitris Zisis Mar 31 '23 at 13:55
  • Yes. Try this method first with 2 or 3 such ids. The _symbol_ `IDs` is not critical, you can use whatever you want, so long as the first argument to `lapply` (`IDs` here) provides a vector or list of individual genes, each a single string. (Remove the `...` from my `IDs` definition, that is a placeholder to indicate "as many genes as you have/need".) – r2evans Mar 31 '23 at 13:57
  • Great, thank you! I already tried it with 2 Ensembl IDs and it works. It returns a list for the 2 IDs with the information, which can be used as we wish, for example to construct a data frame. What if I want to save this result in JSON format? Is that possible? – Dimitris Zisis Mar 31 '23 at 14:05
  • Since each return is a JSON, you can likely do something like `writeLines(paste(alldata, collapse="\n"), "somefile.json")` to create a newline-delimited json (ndjson), which is consumable by many packages including `jsonlite` in R. If you only need it in the current session, it may be simpler for you (if you don't want/need to save it) to `lapply(alldata, jsonlite::fromJSON)` (with any of _its_ options/arguments) and then consider if/how to combine into one frame or a [list of frames](https://stackoverflow.com/a/24376207/3358227). – r2evans Mar 31 '23 at 14:09
  • Can we use `head(content(res)$data, 1)`, for example, outside the function in a `print`? I want to use some of the content, for example to create a data frame for each ID from the info the function returns for multiple IDs. – Dimitris Zisis Apr 03 '23 at 14:04
  • You can always `print` something from inside the function, but nothing `print`ed is going to be used to form a dataframe (well, doing so is particularly obscure and a LOT more work than it is worth). If you need to take the return value and create a frame with a sample of data, then I suggest you create another function that iterates over the embedded frames/list-structures, extracts one or two elements, and returns a frame. Wrap the call to that function in `data.table::rbindlist` or `dplyr::bind_rows` and you have a full summary. – r2evans Apr 03 '23 at 14:12
  • I tried `writeLines(paste(alldata, collapse="\n"), "test_gwas.json")`, but the file it creates looks like `list(search = list(variants = list(list(id = "1_2909753_G_A", rsId = "rs114818383", mostSevereConsequence = "intergenic_variant", nearestGene = list(chromosome = "1", start = 2636986, end = 2801693, id = "ENSG00000215912", symbol = "TTC34", tss = 2801693, description = "tetratricopeptide repeat domain 34 [Source:HGNC Symbol;Acc:HGNC:34297]", bioType = "protein_coding", `__typename` = "Gene"), nearestGeneDistance = 108060))))`, that is, it contains `list(search = list(variants = list(list(` rather than JSON. – Dimitris Zisis Apr 05 '23 at 07:40
  • If you need the unparsed text, read [`?content`](https://httr.r-lib.org/reference/content.html). It suggests `content(res, as="text")`. – r2evans Apr 05 '23 at 13:50
  • I used `json_data <- toJSON(alldata, force = T)` to make it proper JSON, and I can view it in a browser, but it still doesn't look good to me. Can I simplify the format of the JSON file? Say it looks like `[ { "search": { "variants": [ { "id": "1_2909753_G_A", "rsId": "rs114818383"}]}`. Is it possible to remove the `[ { "search": { "variants": [ {` and just have the results? – Dimitris Zisis Apr 05 '23 at 13:57
  • You say you want to save to a json file, why are you converting out of json before writing it? – r2evans Apr 05 '23 at 14:30
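The JSON question in the comments above could be approached roughly as follows. This is only a sketch, assuming `alldata` holds parsed R lists (as `content(res)$data` returns them) and that the `jsonlite` package is installed. `alldata_example` and `out_file` are hypothetical names, and the values are made up for illustration. Unwrapping the outer element before serializing is what removes the leading wrapper from the output. To keep the server's JSON untouched instead, `content(res, as = "text")` returns the raw text, as noted above.

```r
library(jsonlite)

# Hypothetical stand-in for `alldata`: parsed R lists, one per ID.
# The values are illustrative only, not real API output.
alldata_example <- list(
  list(target = list(id = "ENSG00000169083", approvedSymbol = "AR"))
)

# Unwrap the outer `target` element so only the records themselves remain.
targets <- lapply(alldata_example, function(x) x$target)

# Serialize the R lists back to JSON; auto_unbox = TRUE writes length-1
# vectors as scalars ("AR") rather than one-element arrays (["AR"]).
json_text <- toJSON(targets, auto_unbox = TRUE, pretty = TRUE)

# Hypothetical output path; replace with a real file name as needed.
out_file <- tempfile(fileext = ".json")
writeLines(json_text, out_file)

# Read the file back to check that it parses as JSON.
str(fromJSON(out_file))
```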