I have a column (species) in my data set that is a factor with two levels. However, the summary function and the environment show an additional level: an empty string "". The empty string has zero counts, and I can't see it when I examine the data frame manually. I have tried several ways to remove the empty string.

dat <- dat[dat$species != "",] #directly removing it

dat[dat$species == "", ] <- NA # converting to NA and then removing it
dat <- dat[!(is.na(dat$species) | dat$species == ""), ]

dat[complete.cases(dat$species),] #complete cases

However, I've had no luck. The empty string persists. I suspect that this is causing a big problem in further analysis.
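
The behaviour described above can be reproduced with a small made-up data frame. This is only a sketch: the values and the `len` column are invented for illustration, and the unused "" level is created by hand here, whereas in real data it is typically left over from an import or an earlier subset. It shows that row subsetting removes rows but leaves the factor's level set untouched.

```r
# Hypothetical data: species has three levels, but "" is never used.
dat <- data.frame(
  species = factor(c("setosa", "virginica", "setosa"),
                   levels = c("", "setosa", "virginica")),
  len = c(5.1, 6.3, 4.9)  # made-up measurement column
)

# No row actually holds the empty string.
sum(dat$species == "")

# Subsetting on the values keeps every row and every level.
dat2 <- dat[dat$species != "", ]

# The unused "" level is still part of the factor.
levels(dat2$species)
nlevels(dat2$species)
```

Because no row contains "", each of the attempts above has nothing to remove, which is why the extra level persists in `summary()`.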

Rspacer
  • Use `dat <- droplevels(dat)` – MrFlick Feb 18 '20 at 21:02
  • Do you mean `dat <- droplevels(dat$species)`? I may have some unfilled values in other columns. However, I know species is full – Rspacer Feb 18 '20 at 21:04
  • If you only want to clean up one column, then `dat$species <- droplevels(dat$species)` would work. But if you want to drop unused factor levels from all columns, then you can just run it on the whole data.frame – MrFlick Feb 18 '20 at 21:05
  • I see that you have closed the topic and said the question had already been answered. There is no way I would have gotten to the terms "drop factor level" if I hadn't seen your answer. – Rspacer Feb 18 '20 at 21:45
  • Right. We try to point you to the questions with the correct terms that can answer your question. In a perfect world, every question would already have an answer somewhere on Stack Overflow, but they may not always be easy to find. Rather than answering redundant questions, we just try to point people to the right place. – MrFlick Feb 18 '20 at 21:49
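
The two `droplevels()` variants suggested in the comments can be compared on a small made-up data frame. This is a sketch under assumptions: the data and the `site` column are hypothetical, added only so that a second factor with an unused level exists.

```r
# Hypothetical data: both factors carry one unused level.
dat <- data.frame(
  species = factor(c("setosa", "virginica", "setosa"),
                   levels = c("", "setosa", "virginica")),
  site = factor(c("north", "north", "south"),
                levels = c("north", "south", "west"))
)

# Variant 1: clean a single column; other factors are left alone.
one_col <- dat
one_col$species <- droplevels(one_col$species)
levels(one_col$species)  # "setosa" "virginica"
levels(one_col$site)     # "north" "south" "west"

# Variant 2: clean every factor column in the data frame at once.
all_cols <- droplevels(dat)
levels(all_cols$species) # "setosa" "virginica"
levels(all_cols$site)    # "north" "south"
```

Neither variant removes any rows; only the unused levels are dropped.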
