How to remove certain junk values from a column in a data frame?

Question

I have a column called Region in a data frame which is of character type. It has certain junk values as below which I want to remove:

"#VALUE!","10.1","10.2","138","145","161"

But when I try to remove them with something like subset, as follows, they are not removed:

subset(pro_202_data,Region != c("#VALUE!","10.1","10.2","138","145","161"))

I have tried using only != but that also doesn't work.

Please help.

Does this answer your question? [Filter rows which contain a certain string](https://stackoverflow.com/questions/22850026/filter-rows-which-contain-a-certain-string) — user438383, Oct 13 '20 at 09:36

score 0 · Answer 1 · answered Oct 13 '20 at 09:42

Does this answer your question? I created a data frame from the values you provided and filtered out the first two. You can do the same for the full set of values in your own data frame.

> pro_202_data
   Region
1 #VALUE!
2    10.1
3    10.2
4     138
5     145
6     161
> subset(pro_202_data, !(Region %in% c("#VALUE!","10.1")))
  Region
3   10.2
4    138
5    145
6    161
>
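The likely reason the original `Region != c(...)` attempt keeps every row is that `!=` compares element by element and recycles the shorter vector, so each row is checked against only one of the junk values. `%in%` instead tests each row against the whole set. A minimal sketch, assuming a small made-up data frame (the "North" and "South" rows and the names `junk`, `wrong` and `clean` are introduced here for illustration only):

```r
# Hypothetical example data: two valid regions mixed with junk values
pro_202_data <- data.frame(
  Region = c("North", "#VALUE!", "10.1", "South", "161", "138"),
  stringsAsFactors = FALSE
)
junk <- c("#VALUE!", "10.1", "10.2", "138", "145", "161")

# Element-wise comparison: row i is compared only with junk[i],
# so junk values that sit in a different position slip through
wrong <- subset(pro_202_data, Region != junk)

# Set membership: every row is compared with every junk value
clean <- subset(pro_202_data, !(Region %in% junk))
```

In this sketch `wrong` still contains all six rows, while `clean` keeps only the two valid regions.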

score 0 · Answer 2 · answered Oct 13 '20 at 09:54

You can subset like this:

Single vector:

x <- c("#VALUE!","10.1","10.2","138","145","161")
x[!x=="#VALUE!"]
[1] "10.1" "10.2" "138"  "145"  "161"

Dataframe:

df <- data.frame(
  Region = c("#VALUE!","10.1","10.2","138","145","161"), stringsAsFactors = FALSE
)

df[!df$Region=="#VALUE!",]
[1] "10.1" "10.2" "138"  "145"  "161"

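Note the addition of the `,` to select all columns of the data frame. Because df has only one column here, R simplifies the result to a character vector, which is why the output above prints as a vector rather than a data frame.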
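If the result should stay a data frame even when there is only one column, one option is to pass `drop = FALSE` to `[`, and to use `%in%` when more than one value has to go. A minimal sketch, assuming the same `df` as above (`junk` and `kept` are names introduced here for illustration):

```r
df <- data.frame(
  Region = c("#VALUE!", "10.1", "10.2", "138", "145", "161"),
  stringsAsFactors = FALSE
)
junk <- c("#VALUE!", "10.1")  # hypothetical choice of values to drop

# drop = FALSE stops the one-column result from collapsing to a vector
kept <- df[!(df$Region %in% junk), , drop = FALSE]
```

Here `kept` is still a data frame with a single `Region` column, and the original row names of the remaining rows are preserved.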
