I have a data frame called `df` with the variables `AllCustomerName` and `Sum.of.FY.Total`. The first column, `AllCustomerName`, lists all clients. I have a separate list containing the names of the customers I need information on. The code below is meant to loop through the column `AllCustomerName`, find all values that match an entry in my list, and sum `Sum.of.FY.Total` for each list entry.

    y <- list("client 1", "client 2", "client 3")
    for (i in y) {
      if (df$AllCustomerName == i) {
        sum(df$Sum.of.FY.Total)
      }
    }

However, when I run the code I get the warning "the condition has length > 1 and only the first element will be used".

Thanks

Lyle
  • So you're trying to code your own version of `merge`? Perhaps you could make a reproducible example? [There are great tips here](http://stackoverflow.com/q/5963269/903061). Either simulate data or share it with `dput()`. – Gregor Thomas Apr 12 '16 at 17:15
  • Please consider removing your rstudio tag. R and rstudio are distinct pieces of software and your question is not related to rstudio. – lmo Apr 12 '16 at 17:29
  • Or maybe you want `aggregate(sum.of.FY.Total ~ AllCustomerName, FUN = sum, data = subset(df, AllCustomerName %in% y))`. – Gregor Thomas Apr 12 '16 at 18:10
  • Why not just subset df where `df$AllCustomerName %in% y`, then `group_by(AllCustomerName)` and `summarise`? – Rajesh S Apr 13 '16 at 01:04
  • @lmo You are right. Not sure how or why rstudio was tagged. – Lyle Apr 13 '16 at 13:35
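
To illustrate the warning and the `%in%` / `aggregate` suggestions from the comments, here is a minimal base R sketch. The data are simulated (the client names and totals are invented, not the asker's data). The warning arises because `df$AllCustomerName == i` yields one logical value per row while `if` expects a single one; in R 4.2.0 and later this is an error rather than a warning.

```r
# Simulated data; the client names and totals are made up for illustration.
df <- data.frame(
  AllCustomerName = c("client 1", "client 2", "client 1", "client 4", "client 3"),
  Sum.of.FY.Total = c(100, 50, 25, 10, 5),
  stringsAsFactors = FALSE
)
y <- list("client 1", "client 2", "client 3")

# `df$AllCustomerName == i` is a logical vector with one element per row.
# Used as a row filter (instead of inside `if`), it selects the rows to sum,
# giving one total per client in y.
totals <- sapply(unlist(y), function(i) {
  sum(df$Sum.of.FY.Total[df$AllCustomerName == i])
})
print(totals)

# The same totals in a single call, following the aggregate() comment above.
agg <- aggregate(Sum.of.FY.Total ~ AllCustomerName, FUN = sum,
                 data = subset(df, AllCustomerName %in% unlist(y)))
print(agg)
```

The loop version returns a named numeric vector, while `aggregate()` returns a data frame with one row per matching client; which is more convenient depends on what is done with the totals afterwards.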

1 Answer


This could be done with data.table:

    library(data.table)
    # Convert df to a data.table, keep only the clients in y, then sum per client
    setDT(df)[AllCustomerName %chin% unlist(y), .(Sum = sum(Sum.of.FY.Total)),
              by = AllCustomerName]
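
For anyone who wants to try this without the original data, the following is a self-contained sketch of the same call on simulated input. The client names and totals are invented for illustration, and `res` is just a hypothetical name for the result.

```r
library(data.table)

# Simulated data; names and totals are invented for illustration.
df <- data.frame(
  AllCustomerName = c("client 1", "client 2", "client 1", "client 4", "client 3"),
  Sum.of.FY.Total = c(100, 50, 25, 10, 5),
  stringsAsFactors = FALSE
)
y <- list("client 1", "client 2", "client 3")

# setDT() converts df to a data.table by reference (df itself is changed).
# %chin% is data.table's character-only version of %in%, so the list y
# is flattened with unlist() first.
res <- setDT(df)[AllCustomerName %chin% unlist(y),
                 .(Sum = sum(Sum.of.FY.Total)),
                 by = AllCustomerName]
print(res)
```

Note that after `setDT(df)` the object `df` is a data.table; use `as.data.table(df)` instead if the original data frame should stay untouched.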
akrun
  • That works perfectly. I didn't think about using this package. Thanks! – Lyle Apr 13 '16 at 13:28
  • Yep. Did that a few minutes ago. – Lyle Apr 13 '16 at 13:37
  • Quick question, and pardon me if it sounds trivial. Keeping the same data.table syntax, is it possible to look for and sum all partial matches for the parent items in list `y`? The only way I can think of is to use the `pmatch()` function, which works but isn't effective with your solution. – Lyle Apr 13 '16 at 13:56
  • @Lyle Can you post this as a new question with a reproducible example? – akrun Apr 14 '16 at 02:00
  • I just did [link](http://stackoverflow.com/questions/36649749/how-to-search-dataframe-for-partial-matches-and-sum-separate-column-then-return) @akrun – Lyle Apr 15 '16 at 15:00