I have seen the answer to "Error in bind_rows_(x, .id) : Column can't be converted from factor to numeric", but I can't use mutate_all() on a list.

library(rvest)
library(dplyr)
library(tidyr)

fips <- read_html("https://www.census.gov/geo/reference/ansi_statetables.html") %>% 
  html_nodes("table") %>% 
  html_table() %>% 
  bind_rows(.[[1]][1:3] %>% 
          transmute(name = `Name`,
                    fips = as.character(`FIPS State Numeric Code`),
                    abb = `Official USPS Code`),
        filter(.[[2]][1:3], !grepl("Status",c(`Area Name`))) %>% 
          transmute(name = `Area Name`,
                    fips = as.character(`FIPS State Numeric Code`),
                    abb = `Official USPS Code`))

Error in bind_rows_(list_or_dots(...), id = NULL) : 
  Column `FIPS State Numeric Code` can't be converted from integer to character

This code, however, works just fine:

fips <- read_html("https://www.census.gov/geo/reference/ansi_statetables.html")

dat3 <- fips %>%
  html_nodes("table") %>% 
  html_table()

rbind(dat3[[1]][1:3] %>% 
        transmute(name = `Name`,
                  fips = `FIPS State Numeric Code`,
                  abb = `Official USPS Code`),
filter(dat3[[2]][1:3], !grepl("Status",c(`Area Name`))) %>% 
  transmute(name = `Area Name`,
            fips = `FIPS State Numeric Code`,
            abb = `Official USPS Code`))
SCDCE

1 Answer

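As @akrun pointed out in the comments, bind_rows() is type-sensitive: it will not combine a column that is integer in one data frame and character in another. I would therefore first use lapply() to apply mutate_if() to each table in the list, converting the integer columns to character, and then bind_rows() the resulting data frames. setNames() replaces the spaces in the column names with underscores, so that Area_Name can be referenced directly in the filter() step: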

fips <- read_html("https://www.census.gov/geo/reference/ansi_statetables.html") %>% 
  html_nodes("table") %>% 
  html_table() %>% 
  lapply(., mutate_if, is.integer, as.character) %>%
  bind_rows() %>%
  setNames(gsub(" ", "_", names(.))) %>%
  filter(!grepl("Status", Area_Name)) %>%
  mutate(Name = ifelse(is.na(Name), Area_Name, Name)) %>%
  select(Name, FIPS_State_Numeric_Code, Official_USPS_Code)
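The type sensitivity can be reproduced without scraping anything. The following is only a minimal sketch: the data frames `states` and `areas` and the column name `Code` are made-up stand-ins for the two Census tables, not the real data.

```r
library(dplyr)

# Hypothetical stand-ins for the two scraped tables (not the real Census data).
# In the first table the code column is parsed as integer; in the second it is
# character, because a non-numeric "Status" row sits in the same column.
states <- data.frame(Name = c("Alabama", "Alaska"),
                     Code = c(1L, 2L),
                     stringsAsFactors = FALSE)
areas <- data.frame(Name = c("Guam", "Status"),
                    Code = c("66", "Status"),
                    stringsAsFactors = FALSE)
tables <- list(states, areas)

# bind_rows() refuses to combine an integer column with a character column
res <- try(bind_rows(tables), silent = TRUE)
failed <- inherits(res, "try-error")

# Converting every integer column to character first makes the types agree
combined <- tables %>%
  lapply(mutate_if, is.integer, as.character) %>%
  bind_rows()
```

Here `failed` should be TRUE, while `combined` has four rows and a character `Code` column. Base rbind() coerces the integer column silently instead of stopping, which would explain why the rbind() version in the question runs.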
Nic
  • I think there is something else going on in the evaluation, maybe the order of evaluation, as the OP did convert both datasets to the same class – akrun Jul 11 '18 at 15:26
  • If you do this separately, i.e. create two objects and then do the `bind_rows`, it works well – akrun Jul 11 '18 at 15:29
  • I edited the code (the last two lines, `mutate` and `select`) and combined Name and Area Name. Not sure if this is the result that he expects though. If he doesn't want the rows where Code is `NA`, just add `%>% filter(!is.na(Official_USPS_Code))` – Nic Jul 11 '18 at 15:40
  • Looks good, thanks! I'm still not really understanding why my code above doesn't work though. – SCDCE Jul 11 '18 at 15:51
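
On the remaining question of why the original pipeline fails: a likely explanation, in line with @akrun's remark about evaluation, is the way `%>%` handles the dot. When `.` appears only inside nested calls on the right-hand side (such as `.[[1]]`), magrittr still inserts the left-hand side as the first argument. The `bind_rows(...)` call in the question would therefore also receive the whole unconverted list of tables, whose `FIPS State Numeric Code` columns have different types. A small sketch of that rule, using a hypothetical helper `count_args()` that only counts its arguments:

```r
library(magrittr)

# Hypothetical helper: reports how many arguments it was called with
count_args <- function(...) length(list(...))

# The dot appears only inside a nested call (`[[`), so the left-hand side is
# ALSO passed as the first argument: count_args(lhs, lhs[[1]])
n_piped <- list(1, 2) %>% count_args(.[[1]])

# Wrapping the right-hand side in braces suppresses the automatic
# first-argument insertion: count_args(lhs[[1]])
n_braced <- list(1, 2) %>% { count_args(.[[1]]) }

n_piped   # 2
n_braced  # 1
```

If this is indeed the cause, wrapping the `bind_rows(...)` call in the question in `{ }`, or creating the two data frames separately as suggested in the comments, should avoid passing the raw list along.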