
I am trying to remove data from a data frame based off of conditions from two columns.

Data1 <- Data[- grep("",Data$Item# || "12345" Data$Charge)

Basically I would like to remove the entire row if there is no value in Data$Item# and if there is the value "12345" in Data$Charge. I can do them each separately but cannot combine them.

Here's the data

Item#   Charge 
50      00000
61      12345
        12345
43      00000
        02521
7       12345

What I am trying to get to is

50      00000
61      12345
43      00000 
        02521
7       12345
Allan Cameron
Cherub
  • In your code you use 'or' but in your question you used 'and', so I'm not sure what the condition is, but something like this should work: `Data1 <- Data[ -c(which(Data$Item# %in% "" | Data$Charge %in% "12345" ))` – dario Mar 07 '20 at 19:21
  • If you need more assistance, it would be really helpful if you added a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610), with an example of how the corresponding **result** should look, in an edit to your question. That way it's easier for others to find and test their answers. – dario Mar 07 '20 at 19:24
  • Your data and your result are identical and neither has two columns. Probably something like `Data1 <- Data[ -(Data$Item=="" | Data$Charge=="12345"), ]`. Note the comma after the logical expression and the closing bracket. – dcarlson Mar 07 '20 at 21:42

1 Answer


There are a couple of problems with your code. Firstly, you have a "#" in one of your column names, which means you need to quote the column name; otherwise R will think you are starting a comment and will ignore the rest of the line. So your line

Data1 <- Data[- grep("",Data$Item# || "12345" Data$Charge)

will be interpreted as

Data1 <- Data[- grep("",Data$Item

by R, which will give you a syntax error.
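As a small illustration of the quoting point, here is a sketch using a hypothetical two-row data frame `d` (not the asker's data). It assumes the column really is named `Item#`; `check.names = FALSE` is needed so that `data.frame()` does not rename it.

```r
# Hypothetical data frame with a non-syntactic column name.
# check.names = FALSE stops data.frame() from turning "Item#" into "Item."
d <- data.frame(`Item#` = c("50", ""),
                Charge  = c("00000", "12345"),
                check.names = FALSE, stringsAsFactors = FALSE)

# Two ways to reach the column without the "#" starting a comment
by_backticks <- d$`Item#`
by_string    <- d[["Item#"]]
```

Both forms return the same character vector, so either can be used inside the row condition.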

In any case, this isn't how grep works. If you want to test columns against several regexes, call grepl once per pattern; each call returns a logical vector, and you can combine these. However, you can't usefully call grep or grepl with an empty pattern, since every string contains the empty string! Test for an empty string with == "" instead.
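To make the empty-pattern point concrete, here is a quick sketch on a made-up vector `x` (an assumption for illustration only):

```r
# Made-up values: two non-empty strings and one empty string
x <- c("50", "", "7")

matches_empty_pattern <- grepl("", x)  # TRUE TRUE TRUE: the empty pattern matches everything
is_empty              <- x == ""       # FALSE TRUE FALSE: only the truly empty element
```

Note also that `grepl("12345", ...)` matches "12345" anywhere inside a longer string; if an exact match is what you want, `== "12345"` may be the safer test.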

When subsetting by rows, you need a comma after the conditions and the closing bracket, as @dcarlson pointed out.
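A minimal sketch of the comma's role, using a hypothetical data frame `d`: the expression before the comma picks rows, and leaving the part after the comma empty keeps every column.

```r
# Hypothetical three-row data frame
d <- data.frame(id = 1:3, val = c("a", "b", "c"), stringsAsFactors = FALSE)

keep <- d$val != "b"   # logical vector, one element per row
rows <- d[keep, ]      # comma present: filter rows, keep all columns
```

Without the comma, `d[keep]` would be treated as a column selection rather than a row filter.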

Lastly, use & rather than || if you want only the rows where both conditions apply: & is the element-wise "and", whereas || is a scalar "or" that does not compare the vectors element by element.
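The difference between the operators can be seen on two made-up logical vectors `a` and `b`:

```r
# Made-up logical vectors standing in for the two column tests
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

both   <- a & b   # TRUE FALSE FALSE: element-wise "and"
either <- a | b   # TRUE TRUE TRUE: element-wise "or"
```

The double forms `&&` and `||` are meant for single values, as in `if` conditions. In R 4.3.0 and later they raise an error when given vectors longer than one; older versions silently used only the first element.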

Therefore your code should be:

Data1 <- Data[-which(Data$'Item#' == "" & grepl("12345", Data$Charge)),]
Data1
#>   Item# Charge
#> 1    50      0
#> 2    61  12345
#> 4    43      0
#> 5         2521
#> 6     7  12345

I believe this matches your expected output.
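One caveat worth knowing, offered as a sketch rather than a correction: `-which(...)` returns zero rows when nothing matches, because negating an empty index still gives an empty index. Negating the logical vector with `!` avoids this. The example below rebuilds the sample data with `data.frame()` and uses an exact `==` test on Charge, which is an assumption about the intent.

```r
# Same sample data as in the answer, rebuilt so this block runs on its own
Data <- data.frame(`Item#` = c("50", "61", "", "43", "", "7"),
                   Charge  = c("0", "12345", "12345", "0", "2521", "12345"),
                   check.names = FALSE, stringsAsFactors = FALSE)

# Rows to drop: empty Item# AND Charge exactly "12345"
drop  <- Data$`Item#` == "" & Data$Charge == "12345"
Data1 <- Data[!drop, ]                      # keeps rows 1, 2, 4, 5, 6

# The pitfall: a condition that matches no rows
none       <- rep(FALSE, nrow(Data))
with_which <- nrow(Data[-which(none), ])    # 0: every row is lost
with_not   <- nrow(Data[!none, ])           # 6: every row is kept
```

If either column can contain `NA`, the condition itself becomes `NA` for those rows, so you may want to handle missing values explicitly before subsetting.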


Data used

Data <- structure(list(`Item#` = c("50", "61", "", "43", "", "7"), Charge = c("0", 
"12345", "12345", "0", "2521", "12345")), row.names = c(NA, -6L
), class = "data.frame")
Allan Cameron