8

I want to delete some rows based on two conditions. Here is my code

test <-datasetjoin[!(datasetjoin$Occupation == "Clerical" & datasetjoin$AvgMonthSpend > 58.515 ),]  
test <- test[!(test$Occupation == "Management" & test$AvgMonthSpend > 59.24 ),] 
test <- test[!(test$Occupation == "Manual" & test$AvgMonthSpend > 54.28 ),] 
test <- test[!(test$Occupation == "Professional" & test$AvgMonthSpend > 60.08 ),]   
test <- test[!(test$Occupation == "Skilled Manual" & test$AvgMonthSpend > 57.06 ),] 
test <- test[!(test$NumberCarsOwned == "1" & test$YearlyIncome > (81300-51140) * 1.5 + 81300),] 

Is it possible to get the same result in a more elegant way?

Thanks in advance

Occupation MonthlySpend 
Clerical   60           
Management 59           
Clerical   62           
Clerical   58           
Clerical   63              
Management 56
Management 58      

If Occupation = clerical and MonthlySpend > 60 then drop these rows If Occupation = management and MonthlySpend > 57 then drop these rows. At the end I should get this:

Occupation MonthlySpend
Clerical   58
Management 56
Taher A. Ghaleb
  • 5,120
  • 5
  • 31
  • 44
Mohamed Khafagy
  • 81
  • 1
  • 1
  • 3
  • Please provide [reproducible examples](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for your question. – Adam Quek Apr 17 '17 at 05:21
  • @AdamQuek Please check the post, I edited with an example. Is there a way to do this with a loop ? or apply() ? – Mohamed Khafagy Apr 17 '17 at 07:47

2 Answers2

15

Combine all conditions by using OR :|

Like:

test <- test[!(test$Occupation == "Management" & test$AvgMonthSpend > 59.24 ) | !(test$Occupation == "Manual" & test$AvgMonthSpend > 54.28 ),] 
Erdem Akkas
  • 2,062
  • 10
  • 15
3

you can try something like this.

step 1. define the limits :

df <- read.table(text="Occupation MonthlySpend 
Clerical   60           
Management 59           
Clerical   62           
Clerical   58           
Clerical   63              
Management 56
Management 58 ", stringsAsFactors=FALSE, header = TRUE)


df2 <- read.table(text="Occupation lmt 
Clerical   60           
Management 57           
", stringsAsFactors=FALSE, header = TRUE)

Step2. Join and filter

df %>% left_join(df2, by = "Occupation") %>%
  group_by(Occupation) %>%
  filter(MonthlySpend < lmt ) %>%
  select(MonthlySpend)

which gives:

Source: local data frame [2 x 2]
Groups: Occupation [2]

  Occupation MonthlySpend
       <chr>        <int>
1   Clerical           58
2 Management           56

This way, you have to spend some resources in defining the second dataframe, but the actual process of filtering is streamlined.

Aramis7d
  • 2,444
  • 19
  • 25