-2

I have the attached csv file. How can I fill the blank cells with NA or remove them?

I have tried following codes from Change the Blank Cells to "NA" but nothing happens.

sm2=read.csv('sample_info_2.csv',header=T, na.strings=" ")
sm3 <- sm2 %>% mutate_all(na_if,"")
sm3=sm2[complete.cases(sm2),]

enter image description here I want to remove the blank cells or fill them with NA. Any help? Best, Zillur

zillur rahman
  • 355
  • 1
  • 13
  • 2
    May be you have `""` instead of `NA` use `read.csv('sample_info_2.csv',header=T, na.strings=c(" ", ""))` and then with `complete.cases` should work – akrun Aug 01 '19 at 15:28
  • 1
    If akrun's comment or my answer do not address your issue, however, please do not post an image of code/data/errors: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: https://meta.stackoverflow.com/a/285557/3358272 (and https://xkcd.com/2116/). Please just include the code or data (e.g., `dput(head(x))` or `data.frame(...)`) directly. – r2evans Aug 01 '19 at 15:49

2 Answers2

1

I think @akrun's comment is right (and this really should be handled on data-import if possible), but if you need to handle it post-import then it might be missing one point: since your mutate_all(na_if, "") did not work, you have fields that are whitespace-only with one or more whitespace characters. If all of these fields have the same amount of whitespace, then perhaps

sm2 %>% mutate_all(na_if, " ") # or "  " or "   " ...

will work, but it is inflexible. A more accommodating technique might be:

sm2 %>% mutate_if(is.character, ~ if_else(grepl("\\S", .), ., NA_character_))

which does two things:

  1. I do not assume that all of your columns are strings, so mutate_if(is.character, ...) instead of mutate_all, perhaps minor; and

  2. grepl("\\S", .) will return TRUE for all elements of a vector that have one (or more) non-whitespace character, and the if_else handles replacement when false. (It might also be feasible to picture this as nzchar(gsub("\\s", "", .)) which first removes all whitespace and then returns true for string length greater than 0.)

r2evans
  • 141,215
  • 6
  • 77
  • 149
1

To fill all of the blanks with NA you need to cover more options with the na.strings argument. Here is a dummy example that demonstrates how to fill the blanks in numeric and string columns.

col1 <- rep(c(1,2,"",3), 3)
col2 <- rep(c(2,3,"",4), 3)
col3 <- rep(c("a","","b","c"), 3)
my_df <- data.frame(col1, col2, col3)

csv_file_path <- paste0(getwd(), "/", "data_with_gaps.csv")
write.csv(my_df, csv_file_path, row.names = FALSE)

df <- read.csv(csv_file_path, header = TRUE, na.strings = c(""," ", "NA"))