
I'm very new to R, but I'm hoping this will be simple. I have an HTML table that I scraped into R, and it looks like this:

`'data.frame':  238 obs. of  6 variables:
$ Facility Name      : chr  "Affinity Healthcare Center"     "Alameda Care Center" "Alcott Rehabilitation Hospital" "Alden Terrace Convalescent Hospital" ...
$ City               : chr  "Paramount" "Burbank" "Los Angeles" "Los Angeles" ...
$ State              : chr  " CA" " CA" " CA" " CA" ...
$ Confirmed Staff    : chr  "26" "36" "14" "27" ...
$ Confirmed Residents: chr  "29" "49" "26" "85" ...
$ Total Deaths       : chr  26 36 14 27 19 3 1 7 16 3 ...`

I want Confirmed Staff, Confirmed Residents and Total Deaths to be integers so I can do some math on them and sort, order, etc.

I tried this for one variable and it seemed to work ok:

``tbls_ls4$`Total Deaths` <- as.integer(tbls_ls4$`Total Deaths`)``

But I'd like to apply it to all three variables, and I'm not sure how to do that.

jkandel
  • Wouldn't it be better to use `factors`? – linog May 08 '20 at 17:00
  • Does this answer your question? [Convert data.frame columns from factors to characters](https://stackoverflow.com/questions/2851015/convert-data-frame-columns-from-factors-to-characters) – NelsonGon May 08 '20 at 17:02
  • `lapply(df, as.integer)` or `dplyr::mutate(across(where(is.character), as.integer))` – NelsonGon May 08 '20 at 17:03
  • Be aware that `across` is only available in development versions of `dplyr`. – Ian Campbell May 08 '20 at 17:08
  • @NelsonGon: I was asking @jkandel whether he/she knew that `factor` variables allow sorting, ordering, etc. without losing the information about the label. I think that, given the question, `factors` are better for the first two variables – linog May 08 '20 at 17:08

2 Answers


Several ways to do that:

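library(data.table)
setDT(df)  # convert df to a data.table by reference
cols <- c("Confirmed Staff", "Confirmed Residents", "Total Deaths")
# Wrap cols in parentheses so := assigns to the columns named in cols,
# rather than to a single new column called "cols"
df[, (cols) := lapply(.SD, as.integer), .SDcols = cols]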

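Or, if you prefer base R (on a plain data.frame, without the setDT() call above):

cols <- c("Confirmed Staff", "Confirmed Residents", "Total Deaths")
df[cols] <- lapply(df[cols], as.integer)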
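To make the base R approach easy to try out, here is a minimal, self-contained sketch. The toy data frame and its values are made up for illustration (only the column names follow the question), so treat it as an assumption-laden example rather than the asker's actual data.

```r
# Toy data frame mimicking the structure in the question.
# The values are hypothetical; check.names = FALSE keeps the spaces
# in the column names, as in the scraped table.
df <- data.frame(
  "Facility Name"       = c("Facility A", "Facility B", "Facility C"),
  "Confirmed Staff"     = c("26", "36", "14"),
  "Confirmed Residents" = c("29", "49", "26"),
  "Total Deaths"        = c("5", "12", "3"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

cols <- c("Confirmed Staff", "Confirmed Residents", "Total Deaths")

# Convert only the selected columns; all other columns are left untouched.
df[cols] <- lapply(df[cols], as.integer)

# Inspect the resulting column types.
str(df)

# The converted columns can now be sorted on and summed.
sorted_df   <- df[order(df$`Total Deaths`, decreasing = TRUE), ]
total_staff <- sum(df$`Confirmed Staff`)
```

One thing to watch for with scraped tables: if a value contains something other than digits (a thousands separator such as "1,234", for example), `as.integer()` returns `NA` with a warning, so it may be worth checking for `NA`s after converting.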
linog

You could also use mutate_at() from dplyr (part of the tidyverse) to convert specific columns from character to integer.

library(tidyverse)

df <- df %>%
  # List the columns to convert inside vars();
  # names containing spaces must be wrapped in backticks
  mutate_at(vars(`Confirmed Staff`, `Confirmed Residents`, `Total Deaths`),
            # The function to apply to each column
            as.integer)
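
The same conversion can also be sketched with `across()`, which supersedes the scoped verbs such as `mutate_at()`. This assumes dplyr 1.0.0 or later is installed; the toy tibble and its values are hypothetical and only the column names come from the question.

```r
library(dplyr)

# Hypothetical stand-in for the scraped table.
df <- tibble(
  `Facility Name`       = c("Facility A", "Facility B", "Facility C"),
  `Confirmed Staff`     = c("26", "36", "14"),
  `Confirmed Residents` = c("29", "49", "26"),
  `Total Deaths`        = c("5", "12", "3")
)

cols <- c("Confirmed Staff", "Confirmed Residents", "Total Deaths")

# all_of() selects the columns named in the character vector cols;
# as.integer is then applied to each of them.
df <- df %>%
  mutate(across(all_of(cols), as.integer))

# Sorting now works numerically rather than alphabetically.
df_sorted <- df %>% arrange(desc(`Confirmed Residents`))
```

Keeping the column names in a character vector and selecting them with `all_of()` avoids typing backticks around each name that contains a space.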