I'm trying to keep only the rows of my table that have the word "dog" in the title column, but I cannot get it to work.

Here's a data example:

    ID NozamaItemID                                                    NozamaTitle 
1 4557  12000017544 Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each) 
2 4558  12000021992                                        Pepsi, 8Ct, 12Oz Bottle 
3 4559  12000024542                     Zuke'S Natural Hip Action dog Treats, 3 Oz 
4 4560  12000030680                  Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans 
5 4561  12000030680                  Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans 
6 4562  12000030680                  Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans 

The following code should work but does not:

amzp <- select(amz, ID, NozamaItemID, NozamaTitle, NozamaCustomerID)

searchTerm="cat|dog"
amzp.a <- mutate(amzp, animalFood = ifelse(grepl(searchTerm, amzp$NozamaTitle, ignore.case = TRUE) == TRUE, TRUE, FALSE))

I would expect to see a TRUE for row 3. Any help is appreciated. Thanks

Jaap
DirkLX
  • @DirkLX please refrain from adding code snippets when they don't work – Jaap Dec 12 '15 at 12:49
  • Furthermore: you are much more likely to get a good answer when you include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Dec 12 '15 at 12:51
  • @jaap, I don't agree with your first comment. If they didn't provide any code, people would ask them 'what have you tried?', and it will be a better learning experience for them if people can tell them why their approach failed, imo. – talat Dec 12 '15 at 12:56
  • @docendodiscimus Reading my comment again, I see that I expressed myself poorly. I agree that code needs to be included. But I disagree about including blue "run code snippet" buttons which do nothing more than repeat the code already shown above the button. They are just a distraction in this case – Jaap Dec 12 '15 at 13:01
  • @jaap, in that case I do agree with you. – talat Dec 12 '15 at 13:14
  • I noticed that you didn't accept any answer on any of the questions you asked before. Although it is not mandatory to accept an answer, it is considered good practice to do so if one of the answers worked for you. This will give future readers a clue about the value of the solution. See also this help page: [What should I do when someone answers my question?](http://stackoverflow.com/help/someone-answers) – Jaap Dec 12 '15 at 13:38
  • Although your original code contains several components which are not needed (the `ifelse`-statement and using `data$column` inside a standard dplyr function), it does work. Using `amzp.a <- mutate(amzp, animalFood = grepl(searchTerm, NozamaTitle, ignore.case = TRUE))` gives you the same result. Voting to close as no longer reproducible. – Jaap Dec 12 '15 at 19:59
  • Sorry. Next time I'll do better. – DirkLX Dec 12 '15 at 20:19

2 Answers

You are close; you just need to get rid of the ifelse:

amzp.a <- mutate(amzp, animalFood = grepl(searchTerm, 
                         NozamaTitle, ignore.case = TRUE))

which gives:

> amzp.a
    ID NozamaItemID                                                     NozamaTitle animalFood
1 4557  12000017544  Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)      FALSE
2 4558  12000021992                                         Pepsi, 8Ct, 12Oz Bottle      FALSE
3 4559  12000024542                      Zuke'S Natural Hip Action dog Treats, 3 Oz       TRUE
4 4560  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE
5 4561  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE
6 4562  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE

Data used:

amzp <- structure(list(ID = 4557:4562,
                       NozamaItemID = c(12000017544, 12000021992, 12000024542, 12000030680, 12000030680, 12000030680),
                       NozamaTitle = structure(c(4L, 1L, 2L, 3L, 3L, 3L), .Label = c("Pepsi, 8Ct, 12Oz Bottle","Zuke'S Natural Hip Action dog Treats, 3 Oz","Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans","Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)"), class = "factor")),
                  .Names = c("ID", "NozamaItemID", "NozamaTitle"), class = "data.frame", row.names = c(NA, -6L))

EDIT: Your original code:

amzp.a <- mutate(amzp, animalFood = ifelse(grepl(searchTerm, amzp$NozamaTitle, ignore.case = TRUE) == TRUE, TRUE, FALSE))

does actually work. Although it contains several components which are not needed (the ifelse-statement and using data$column inside a standard dplyr function), it gives the desired result:

> amzp.a
    ID NozamaItemID                                                     NozamaTitle animalFood
1 4557  12000017544  Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)      FALSE
2 4558  12000021992                                         Pepsi, 8Ct, 12Oz Bottle      FALSE
3 4559  12000024542                      Zuke'S Natural Hip Action dog Treats, 3 Oz       TRUE
4 4560  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE
5 4561  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE
6 4562  12000030680                   Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans      FALSE

So you might want to describe in more detail what "does not work" means.
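To illustrate why the ifelse wrapper is redundant, here is a minimal base R sketch. The two-element `titles` vector and the names `direct` and `wrapped` are made up for illustration; the titles themselves are copied from the sample data in the question.

```r
# Two titles copied from the question's sample data (illustrative subset)
titles <- c("Pepsi, 8Ct, 12Oz Bottle",
            "Zuke'S Natural Hip Action dog Treats, 3 Oz")
searchTerm <- "cat|dog"

# grepl() already returns a logical vector, one element per title
direct <- grepl(searchTerm, titles, ignore.case = TRUE)

# Adding `== TRUE` and ifelse() only maps TRUE to TRUE and FALSE to FALSE
wrapped <- ifelse(grepl(searchTerm, titles, ignore.case = TRUE) == TRUE, TRUE, FALSE)

direct
# [1] FALSE  TRUE
identical(direct, wrapped)
# [1] TRUE
```

Because both versions produce the same logical vector, the shorter grepl call can go straight into mutate.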

akrun
Jaap
  • Thank you all! I guess I overlooked something as the solution indeed works. However, I learned a lot and so this writing was worth it. I appreciate the help. – DirkLX Dec 12 '15 at 15:23

I'm not exactly sure what you're trying to achieve but if your aim is just to be left with only rows where the word "dog" appears in the NozamaTitle column, you just need to use dplyr::filter. Using chickwts as an example in lieu of a minimal reproducible example:

levels(chickwts$feed)
# [1] "casein"    "horsebean" "linseed"   "meatmeal"  "soybean"  
# [6] "sunflower"

df <- filter(chickwts, grepl("bean", feed))
df
#    weight      feed
# 1     179 horsebean
# 2     160 horsebean
# 3     136 horsebean
# ...
# 11    243   soybean
# 12    230   soybean
# 13    248   soybean
# ...

Is this what you're after?
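Applied to the data from the question, the same idea might look like the sketch below. It assumes dplyr is installed, rebuilds only the first three sample rows (with NozamaTitle as character rather than factor), and uses the made-up name `dogs` for the result.

```r
library(dplyr)

# First three rows of the question's sample data, rebuilt by hand
amzp <- data.frame(
  ID = 4557:4559,
  NozamaTitle = c("Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)",
                  "Pepsi, 8Ct, 12Oz Bottle",
                  "Zuke'S Natural Hip Action dog Treats, 3 Oz"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose title matches "dog", ignoring case
dogs <- filter(amzp, grepl("dog", NozamaTitle, ignore.case = TRUE))
dogs$ID
# [1] 4559
```

Note that this matches "dog" anywhere in the title, including inside longer words, so a stricter pattern may be needed if only the standalone word should count.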

Phil