
This seems like something that has been treated before, but after checking I couldn't find a solution. I load a table from a file, and somehow some entire lines are empty. So the data frame I get looks like this:

  #   id c1 c2
  # 1  a  1  2
  # 2  b  2  4
  # 3    NA NA
  # 4  d  6  1
  # 5  e  7  5
  # 6    NA NA

if I do

apply(df, 1, function(x) all(is.na(x)))

I get all FALSE, because the first column is character rather than numeric (the real table is much bigger, with mixed character and numeric columns), so I can't filter these lines out. na.omit and complete.cases don't sort it out either. Is there a function or expression to check for empty rows?
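The behaviour described can be reproduced if the "empty" cells in the character column came in as empty strings (`""`) rather than NA, which is what read.table/read.csv produce by default. A sketch, assuming that is the case, where the row test is extended to treat `""` the same as NA:

```r
# Reconstruct the data frame as it would look after reading the file:
# the blank rows carry "" in the character column, not NA.
df <- data.frame(
  id = c("a", "b", "", "d", "e", ""),
  c1 = c(1, 2, NA, 6, 7, NA),
  c2 = c(2, 4, NA, 1, 5, NA),
  stringsAsFactors = FALSE
)

# apply() coerces the data frame to a character matrix, so test
# each cell for either NA or the empty string.
empty <- apply(df, 1, function(x) all(is.na(x) | x == ""))

df[!empty, ]   # keeps rows 1, 2, 4 and 5
```

Note that `is.na(x)` must come first in the `|` so that NA cells (where `x == ""` evaluates to NA) still count as empty.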

asked by Stefano, edited by Frank

1 Answer


You may be able to cut this problem off at the source with the parameters you pass to read.csv:

For instance, if the blanks are empty strings or single spaces, you could use:

df <- read.csv(<your other logic here>, na.strings = c("NA", "", " "))

This question seems to raise similar issues: read.csv blank fields to NA

If this works, then you can use the apply logic to work with the offending rows.
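Putting the two pieces together, a minimal sketch (using an inline string via read.csv's `text` argument to stand in for the file; the real file's separator and columns may differ):

```r
# Hypothetical file contents: two fully blank rows among the data.
csv_text <- "id,c1,c2
a,1,2
b,2,4
,,
d,6,1
e,7,5
,,"

# Map "NA", "" and a single space to NA at read time.
df <- read.csv(text = csv_text,
               na.strings = c("NA", "", " "),
               stringsAsFactors = FALSE)

# Now the all-NA row test from the question works as intended.
df <- df[!apply(df, 1, function(x) all(is.na(x))), ]
```

After the filter, only the four non-empty rows remain.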

answered by Josh Rumbut
  • So do the opposite of what's done there and replace blank spaces by `NA`, and then the apply logic? – Stefano Mar 07 '16 at 16:01
  • Yeah, for me the linked issue is definitely distinct from yours; however, it brings up some options, such as the na.strings parameter, that I believe will solve your problem. Could you post an excerpt of the file that causes the issue? Then maybe I can figure out something more specific. – Josh Rumbut Mar 07 '16 at 16:10
  • Actually you are totally right!! using `na.strings` I can set empty data by NA and then I can use the apply function! thanks a lot! – Stefano Mar 07 '16 at 16:12
  • Glad to help @Stefano! – Josh Rumbut Mar 07 '16 at 16:19