I have a data frame with a '#' in one row of the column 'PLZ'. Calling str() on the data frame gives me

 $ PLZ                 : Factor w/ 1939 levels "#","10115","10117",..: 1

It shows the '#' as a level of the data in that column.

When I filter out all entries with '#' into a new data frame and call str() on the filtered data frame, '#' is still shown as a level of the data in column 'PLZ'.

    filtered_data <- filter(data_frame, PLZ != "#")
    str(filtered_data)
    'data.frame':   85297 obs. of  25 variables:
    ...
     $ PLZ                 : Factor w/ 1939 levels "#","10115","10117",..: 647 588 499 499 499 499 499 499 499 499 ...
    ...

Since all entries with '#' in 'PLZ' were filtered out, I expected '#' not to appear among the levels in the output of str().

Is there an explanation for this?

1 Answer

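Filtering removes the rows, but the factor keeps its full set of levels, including ones that no longer occur in the data. Try using droplevels() to remove the unused levels, as follows: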

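    library(dplyr)  # provides filter() and %>%

    data_frame <- data.frame(PLZ = as.factor(c("#", "10115", "10117")))
    # filter() drops the rows; droplevels() then removes the now-unused "#" level
    filtered_data <- filter(data_frame, PLZ != "#") %>% droplevels()
    str(filtered_data)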
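
To see why the level survives the filter, here is a minimal base R sketch that needs no packages. The vector `plz` and its values are made up for illustration and only stand in for the real 'PLZ' column.

```r
# Hypothetical toy data standing in for the real 'PLZ' column
plz <- factor(c("#", "10115", "10117", "10115"))

# Subsetting removes the elements but keeps every level of the original factor
kept <- plz[plz != "#"]
levels(kept)   # "#" is still listed as a level
table(kept)    # "#" shows up with a count of 0

# droplevels() rebuilds the factor using only the levels that still occur
cleaned <- droplevels(kept)
levels(cleaned)

# Re-creating the factor with factor() drops the unused level as well
identical(levels(factor(kept)), levels(cleaned))
```

The levels are stored as an attribute of the factor, separate from the values, so removing rows does not change them unless you ask for it explicitly.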
FALL Gora