I would like to store in a CSV file the list of all publicly funded projects in France, which are listed on the website below:

https://aides-territoires.beta.gouv.fr/aides/?integration=&targeted_audiences=&perimeter=&text=&apply_before=&is_charged=all&action=search-filter&page=1

I used the website's API to get the JSON containing all the projects, with the following command (using the "httr" package; "jsonlite" is used further down for parsing):

my_url <- "https://aides-territoires.beta.gouv.fr/api/aids/all/"

results <- 
  httr::content(
    httr::GET(my_url),
    as="text",  
    httr::content_type_json(),  
    encoding= "UTF-8"    
  )

The problem comes afterwards. I am a complete beginner at manipulating JSON, and I cannot manage to convert the information contained in "results" into a data frame with one column per project field ("id", "slug", "url", "name", etc.). Some project fields are lists, others are character vectors, etc.

I tried some commands I found, such as the one below:

df <- data.frame(
  lapply(c("id","slug","url","name","name_initial","short_title","financers",
           "instructors","programs","description","eligibility","perimeter",
           "mobilization_steps","origin_url","is_call_for_project",
           "application_url","is_charged",
           "destinations","start_date","predeposit_date","submission_deadline",
           "subvention_rate_lower_bound","subvention_rate_upper_bound",
           "loan_amount","recoverable_advance_amount","contact","recurrence",
           "project_examples","import_data_url","import_data_mention",
           "import_share_licence","date_created","date_updated"), 
         function(x){fromJSON(results,flatten = TRUE)$results[[x]]})
)

But I get the message below:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 2, 0, 3, 4, 11, 7, 5, 15

  • What do you want the final table to look like exactly? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. With nested JSON data it's not clear how you would transform that into a "clean" data.frame. Data frames are inherently a "rectangular" data structure and JSON files are not. What do you need to do with it after it's in a data.frame? – MrFlick Jan 19 '23 at 18:40
  • When keeping the "flat" portion in one df / csv and each nested feature in its own 2-col table together with IDs from main table it's actually a pretty manageable dataset, 1 + 8 tables, ready to load into duckdb, sqlite or what not, ~12MB when saved as csv-s. – margusl Jan 19 '23 at 21:23
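The layout described in the last comment (one flat table plus one two-column table per nested field, linked by `id`) can be sketched as follows. This is only an illustration on a toy data frame with hypothetical values standing in for the parsed API result, and it assumes every element of a list column is a plain atomic vector; `main_table` and `nested_tables` are names made up for the example.

```r
# Toy stand-in for the parsed API result (hypothetical values):
# two flat columns and one list column.
aids <- data.frame(id = c(1L, 2L), name = c("Aid A", "Aid B"))
aids$financers <- list(c("Region", "State"), character(0))

# Identify which columns are list columns.
is_list_col <- vapply(aids, is.list, logical(1))

# The flat part: one row per project, no nested fields.
main_table <- aids[!is_list_col]

# One long (id, value) table per nested field.
# Assumes each list element is an atomic vector.
nested_tables <- lapply(aids[is_list_col], function(col) {
  data.frame(
    id = rep(aids$id, lengths(col)),
    value = as.character(unlist(col)),
    stringsAsFactors = FALSE
  )
})
```

Each element of `nested_tables` can then be written to its own CSV file with `write.csv()`, and joined back to `main_table` on `id` when needed.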

1 Answer

With the httr2 package you can do:

library(tidyverse)
library(httr2)

"https://aides-territoires.beta.gouv.fr/api/aids/all/" %>% 
  request() %>% 
  req_perform() %>% 
  resp_body_json(simplifyVector = TRUE) %>% # simplifyVector is the real hero
  pluck("results") %>% # Grab the results list
  as_tibble() # Create a tibble

# A tibble: 3,282 × 31
       id slug            url   name  short_title financers instructors programs
    <int> <chr>           <chr> <chr> <chr>       <list>    <list>      <list>  
 1  70202 2d94-se-former… /aid… Se f… ""          <chr [1]> <chr [0]>   <chr>   
 2   8075 ae3b-etude-reh… /aid… Mett… ""          <chr [1]> <chr [0]>   <chr>   
 3 117392 c650-preserver… /aid… Prés… ""          <chr [1]> <chr [0]>   <chr>   
 4 117180 e8e0-soutenir-… /aid… Sout… ""          <chr [1]> <chr [0]>   <chr>   
 5  78196 ef73-soutenir-… /aid… Sout… ""          <chr [1]> <chr [0]>   <chr>   
 6  22827 c372-aide-a-la… /aid… Fina… ""          <chr [1]> <chr [0]>   <chr>   
 7  90762 6564-creer-une… /aid… Crée… ""          <chr [2]> <chr [0]>   <chr>   
 8  30762 9e6a-soutien-d… /aid… Sout… ""          <chr [1]> <chr [0]>   <chr>   
 9  90797 f299-activites… /aid… Sout… ""          <chr [1]> <chr [0]>   <chr>   
10  94752 46de-accelerer… /aid… Déve… ""          <chr [2]> <chr [0]>   <chr>   
# … with 3,272 more rows, and 23 more variables: description <chr>,
#   eligibility <chr>, perimeter <chr>, mobilization_steps <list>,
#   origin_url <chr>, categories <list>, is_call_for_project <lgl>,
#   application_url <chr>, targeted_audiences <list>, aid_types <list>,
#   destinations <list>, start_date <chr>, predeposit_date <chr>,
#   submission_deadline <chr>, subvention_rate_lower_bound <int>,
#   subvention_rate_upper_bound <int>, loan_amount <int>, …
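
Since the goal is a CSV file, note that the `<list>` columns in a table like this cannot be written by `write.csv()` as they are, because each cell must hold a single value. One possible way around it, shown here as a minimal sketch on a toy data frame with hypothetical values rather than on the real API result, is to collapse each list cell into one delimited string first. The `"; "` separator and the `out_file` name are arbitrary choices for the example.

```r
# Toy stand-in for the parsed API result (hypothetical values):
# two flat columns and one list column.
aids <- data.frame(id = c(1L, 2L), name = c("Aid A", "Aid B"))
aids$financers <- list(c("Region", "State"), character(0))

# Collapse every list column into one string per row,
# so that each cell holds a single value.
is_list_col <- vapply(aids, is.list, logical(1))
aids[is_list_col] <- lapply(
  aids[is_list_col],
  function(col) {
    vapply(col, function(x) paste(unlist(x), collapse = "; "), character(1))
  }
)

# Write the now rectangular table to a CSV file.
out_file <- tempfile(fileext = ".csv")
write.csv(aids, out_file, row.names = FALSE, fileEncoding = "UTF-8")
```

Empty list cells become empty strings. If a nested value can itself contain `"; "`, a different separator would be needed.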
Chamkrai