I have a file ("my_file") in R that looks something like this:
NAME Address_Parse
1 name1 [('372', 'StreetNumber'), ('river', 'StreetName'), ('St', 'StreetType'), ('S', 'StreetDirection'), ('toronto', 'Municipality'), ('ON', 'Province'), ('A1C', 'PostalCode'), ('9R7', 'PostalCode')]
2 name2 [('208', 'StreetNumber'), ('ocean', 'StreetName'), ('St', 'StreetType'), ('E', 'StreetDirection'), ('Toronto', 'Municipality'), ('ON', 'Province'), ('J8N', 'PostalCode'), ('1G8', 'PostalCode')]
In case the structure is confusing, here is how the file looks like
my_file = structure(list(NAME = c("name1", "name2"), Address_Parse = c("[('372', 'StreetNumber'), ('river', 'StreetName'), ('St', 'StreetType'), ('S', 'StreetDirection'), ('toronto', 'Municipality'), ('ON', 'Province'), ('A1C', 'PostalCode'), ('9R7', 'PostalCode')]",
"[('208', 'StreetNumber'), ('ocean', 'StreetName'), ('St', 'StreetType'), ('E', 'StreetDirection'), ('Toronto', 'Municipality'), ('ON', 'Province'), ('J8N', 'PostalCode'), ('1G8', 'PostalCode')]"
)), class = "data.frame", row.names = c(NA, -2L))
Objective: For each row, I would like to take each of the "elements" (e.g. "StreetNumber", "StreetName", "StreetType", etc.) and convert it into a new column. This would look something like this:
name StreetNumber StreetName StreetType StreetDirection Municipality Province PostalCode
1 name1 372 river St S toronto ON A1C9R7
2 name2 208 ocean St E Toronto ON J8N1G8
To me, it appears that the address field is in JSON format (I could be wrong about this). I tried to look at different ways I could parse the JSON. For example, I tried to apply the answer provided here (R: convert nested JSON in a data frame column to addtional columns in the same data frame):
library(dplyr)
library(tidyr)
library(purrr)
library(jsonlite)
final = my_file %>%
mutate(
json_parsed = map(Address_Parse, ~ fromJSON(., flatten=TRUE))
) %>%
unnest(json_parsed)
However, this is giving me the following error:
Error in `mutate()`:
! Problem while computing `json_parsed = map(Address_Parse, ~fromJSON(., flatten = TRUE))`.
Caused by error:
! lexical error: invalid char in json text.
[('372', 'StreetNumber'), ('rive
(right here) ------^
Run `rlang::last_error()` to see where the error occurred.
I then tried another approach:
final <- my_file %>%
rowwise() %>%
do(data.frame(fromJSON(.$Address_Parse , flatten = T))) %>%
ungroup() %>%
bind_cols(my_file %>% select(-Address_Parse ))
But I now get a new error:
Error: lexical error: invalid char in json text.
[('372', 'StreetNumber'), ('rive
(right here) ------^
Can someone please show me to resolve this?
Thank you!