I'm starting my journey with R. I find it incredibly fun and useful, but I still have a lot to learn, so I'd appreciate any suggestions:

I have a large list of 351 elements (called 'geo_info'), which is the result of applying the cartociudad_geocode() function (which retrieves the available geographic info from character strings containing addresses) to my 'addresses' vector.

Here's part of the str() output of 'geo_info' (the full output is too long):

 $ Jesus Martin Nº 32 Planta 3 Castellon De La Plana/Castello De La Plana             : NULL
 $ Obispo Climent Nº S/N Planta Pb+2 Castellon De La Plana/Castello De La Plana       :'data.frame':    1 obs. of  13 variables:
  ..$ id               : chr "120400000973"
  ..$ province         : chr "Castellón/Castelló"
  ..$ comunidadAutonoma: chr "Comunitat Valenciana"
  ..$ muni             : chr "Castelló de la Plana"
  ..$ type             : chr "portal"
  ..$ address          : chr "OBISPO CLIMENT"
  ..$ geom             : chr "POINT(-0.0364751229999456 39.985197242)"
  ..$ tip_via          : chr "CALLE"
  ..$ lat              : num 40
  ..$ lng              : num -0.0365
  ..$ stateMsg         : chr "Resultado exacto de la búsqueda"
  ..$ state            : chr "1"
  ..$ countryCode      : chr "011"
 $ Huerto Sogueros, Plaza Nº Sn Planta 3 Castellon De La Plana/Castello De La Plana   :'data.frame':    1 obs. of  14 variables:
  ..$ id               : chr "120400005053"
  ..$ province         : chr "Castellón/Castelló"
  ..$ comunidadAutonoma: chr "Comunitat Valenciana"
  ..$ muni             : chr "Castelló de la Plana"
  ..$ type             : chr "portal"
  ..$ address          : chr "HUERTO SOGUEROS"
  ..$ geom             : chr "POINT(-0.0421460729999694 39.986166069)"
  ..$ tip_via          : chr "PLAZA"
  ..$ lat              : num 40
  ..$ lng              : num -0.0421
  ..$ portalNumber     : chr "5"
  ..$ stateMsg         : chr "Resultado exacto de la búsqueda"
  ..$ state            : chr "1"
  ..$ countryCode      : chr "011"
 $ Mijares, Ronda Nº S/N Castellon De La Plana/Castello De La Plana                  : NULL

I want to transform this list into a data frame. DF<-dplyr::bind_rows(geo_info, .id = 'data.frame') works, but it drops the NULL elements. As you can see, some of the strings didn't get geolocated (for whatever reason) and appear as NULL. Is there a way to transform my list into a data frame while keeping the NULL elements, filled with NA or 0?

For extra info, here's the code I used for the geolocalization: geo_info<-sapply(addresses, cartociudad_geocode, on.error="warn")

And here's the head() of the list

> dput(head(geo_info, 4))
list(`Valencia, Avenida Nº S.n. Planta 1 Castellon De La Plana/Castello De La Plana` = structure(list(
    id = "2061380170886", province = "Badajoz", comunidadAutonoma = "Extremadura", 
    muni = "Valdetorres", type = "portal", address = "PLAN PARCIAL Nº 1", 
    postalCode = "06474", poblacion = "Valdetorres", geom = "POINT(-6.07339711488708 38.91670317033)", 
    tip_via = "BARRIO", lat = 38.91670317033, lng = -6.07339711488708, 
    portalNumber = "0", stateMsg = "Resultado exacto de la búsqueda", 
    extension = "", state = "1", countryCode = "011"), row.names = c(NA, 
-1L), class = "data.frame"), `Mayor Nº S.n. Planta 1 Castellon De La Plana/Castello De La Plana` = structure(list(
    id = "2061380170886", province = "Badajoz", comunidadAutonoma = "Extremadura", 
    muni = "Valdetorres", type = "portal", address = "PLAN PARCIAL Nº 1", 
    postalCode = "06474", poblacion = "Valdetorres", geom = "POINT(-6.07339711488708 38.91670317033)", 
    tip_via = "BARRIO", lat = 38.91670317033, lng = -6.07339711488708, 
    portalNumber = "0", stateMsg = "Resultado exacto de la búsqueda", 
    extension = "", state = "1", countryCode = "011"), row.names = c(NA, 
-1L), class = "data.frame"), `Notario Mas, Plaza Nº 3 Piso 1º Castellon De La Plana` = structure(list(
    id = "120400001216", province = "Castellón/Castelló", comunidadAutonoma = "Comunitat Valenciana", 
    muni = "Castelló de la Plana", type = "portal", address = "NOTARIO MAS", 
    geom = "POINT(-0.0414310339999702 39.9877939630001)", tip_via = "PLAZA", 
    lat = 39.9877939630001, lng = -0.0414310339999702, portalNumber = "5", 
    stateMsg = "Resultado exacto de la búsqueda", state = "1", 
    countryCode = "011"), row.names = c(NA, -1L), class = "data.frame"), 
    `Mayor Nº 56 Piso 1º Castellon De La Plana` = NULL)
  • If your data `geolong` is a list, could you run `dput(head(geolong, 4))` and paste the result into your post? It will be much easier to help with a reproducible example. – Park Jun 03 '22 at 07:53
  • @Park OK! Sorry, even when I go through the question before posting it, I still make some mistakes. – Dulipip Truman Miller Jun 03 '22 at 08:25
  • Thanks @Park, and don't worry. I obviously didn't explain myself correctly: I would like to keep the NULL values in the data frame while preserving the order in which they appear, since my intention is to join the df with the main data frame from which I extracted the 'addresses' vector. For example, in the str() output the first string didn't get geolocated; after converting the list into a df (with NA or NULL for that row) and joining it horizontally with its 'parent' df, the 1st row in the 'parent' df and the 1st row in the 'child' df should correspond to the same observation. Don't worry if it's too complicated! – Dulipip Truman Miller Jun 06 '22 at 08:16

1 Answer

Sorry for answering late after your reply.

You can achieve your goal fairly easily by adding the rows for the NULL elements manually.

Let your list be called df.

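library(dplyr)
# Bind the geocoded elements, then add one all-NA row per NULL element (matched by name)
bind_rows(df, .id = 'data.frame') %>%
  full_join(data.frame(data.frame = names(df)[sapply(df, is.null)]), by = 'data.frame')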

                                                                      data.frame            id           province    comunidadAutonoma                 muni
1 Valencia, Avenida Nº S.n. Planta 1 Castellon De La Plana/Castello De La Plana 2061380170886            Badajoz          Extremadura          Valdetorres
2             Mayor Nº S.n. Planta 1 Castellon De La Plana/Castello De La Plana 2061380170886            Badajoz          Extremadura          Valdetorres
3                        Notario Mas, Plaza Nº 3 Piso 1º Castellon De La Plana  120400001216 Castellon/Castello Comunitat Valenciana Castello de la Plana
4                                    Mayor Nº 56 Piso 1º Castellon De La Plana          <NA>               <NA>                 <NA>                 <NA>
    type            address postalCode   poblacion                                        geom tip_via      lat         lng portalNumber
1 portal PLAN PARCIAL Nº 1      06474 Valdetorres     POINT(-6.07339711488708 38.91670317033)  BARRIO 38.91670 -6.07339711            0
2 portal PLAN PARCIAL Nº 1      06474 Valdetorres     POINT(-6.07339711488708 38.91670317033)  BARRIO 38.91670 -6.07339711            0
3 portal        NOTARIO MAS       <NA>        <NA> POINT(-0.0414310339999702 39.9877939630001)   PLAZA 39.98779 -0.04143103            5
4   <NA>               <NA>       <NA>        <NA>                                        <NA>    <NA>       NA          NA         <NA>
                         stateMsg extension state countryCode
1 Resultado exacto de la busqueda               1         011
2 Resultado exacto de la busqueda               1         011
3 Resultado exacto de la busqueda      <NA>     1         011
4                            <NA>      <NA>  <NA>        <NA>
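
Note that full_join() appends the unmatched (NULL) names after the geocoded rows, so the result follows the order of the list only when the NULL elements happen to come last, as in this four-element example. The following is a minimal sketch of restoring the original order with match(). It uses a small made-up list `toy` (hypothetical names and values, not the real geocoding output) and assumes the list names are unique:

```r
library(dplyr)

# Hypothetical toy list standing in for the geocoding result: two hits, two NULLs
toy <- list(
  "addr A" = NULL,
  "addr B" = data.frame(id = "1", lat = 40.0, lng = -0.03),
  "addr C" = data.frame(id = "2", lat = 39.9, lng = -0.04),
  "addr D" = NULL
)

# Names of the elements that were not geocoded
null_names <- names(toy)[sapply(toy, is.null)]

# bind_rows() skips the NULL elements; full_join() adds them back at the end
joined <- bind_rows(toy, .id = "data.frame") %>%
  full_join(data.frame(data.frame = null_names), by = "data.frame")

# Reorder the rows so they follow the original order of the list
# (this relies on the list names being unique)
ordered <- joined[match(names(toy), joined$data.frame), ]
ordered$data.frame
```

If the same address string can appear more than once in the list, matching by name is ambiguous and a positional approach is safer.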

New

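This approach is taken from @ytu's answer to the question "rbinding with a list containing empty lists for NAs in R". It replaces every NULL element with a one-row data frame of NAs, so the rows keep the order of the list.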

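library(dplyr)

# Use the first non-NULL element as the column template
# (df[[1]] can itself be NULL, as in the str() output of the question)
template <- Filter(is.data.frame, df)[[1]]

change_others_to_dataframe <- function(x) {
  if (is.data.frame(x)) return(x)
  # One all-NA row with the template's column names
  setNames(data.frame(matrix(ncol = ncol(template), nrow = 1)), names(template))
}

mynewList <- lapply(df, change_others_to_dataframe)

# The geocoding results already contain an "id" column, so use another name for .id
bind_rows(mynewList, .id = "data.frame")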
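
Once every element contributes exactly one row in list order, the result can be placed side by side with the data frame the addresses came from, which is what the comments ask for. The following is only a sketch under assumptions: `toy` and `parent` are made-up stand-ins for the real list and main data frame, every geocoded element is assumed to have exactly one row, and the placeholder for NULL elements reuses the `id` column seen in the question's output.

```r
library(dplyr)

# Hypothetical stand-ins for the geocoding list and the main data frame
toy <- list(
  "addr A" = NULL,
  "addr B" = data.frame(id = "1", lat = 40.0, lng = -0.03),
  "addr C" = NULL
)
parent <- data.frame(address = names(toy), client = c("x", "y", "z"))

# Replace each NULL with a one-row placeholder holding a single NA column;
# bind_rows() fills the remaining columns with NA
filled <- lapply(toy, function(x) {
  if (is.null(x)) data.frame(id = NA_character_) else x
})

child <- bind_rows(filled, .id = "address_query")

# Guard against multi-row geocoding results before binding columns by position
stopifnot(nrow(child) == nrow(parent))
combined <- cbind(parent, child)
```

Keeping the `address_query` column makes it possible to check that each row of `parent` really sits next to its own geocoding result.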
Park
  • Thank you Park, and don't worry. I obviously didn't explain myself correctly: I would like to keep the NULL values in the data frame while preserving the order in which they appear, since my intention is to join the df with the main data frame from which I extracted the 'addresses' vector. For example, in the str() output the first string didn't get geolocated; after converting the list into a df (with NA or NULL for that row) and joining it horizontally with its 'parent' df, the 1st row in the 'parent' df and the 1st row in the 'child' df should correspond to the same observation. Don't worry if it's too complicated! @Park – Dulipip Truman Miller Jun 06 '22 at 08:14
  • @DulipipTrumanMiller I'm not sure I understand your purpose correctly. If you want the data frame to keep the original order, the new code may work. – Park Jun 07 '22 at 00:04