-3

I have created a data frame with three columns: Name, Month and Amount. There are multiple names in each month, and each name/month combination has an amount. I want to find the top 5 users by monthly spending, so the final data frame should contain only the top 5 amounts for each month. This is how I have calculated the data so far:

Extract_Month<- months(Credit$Transaction.Date)
Extract_Month
TopSpend<-aggregate(Credit$Amount, 
                    by=list(Credit$User,Extract_Month)
                    , FUN=mean)

I am stuck beyond this point. Please help.

Here is some sample data:

User<-c(6,2,3,4,5,6)
Transaction.Date<-c("11-1-2019","11-2-2019","11-3-2019",
"12-1-2019","12-2-2019","11-1-2019")
Amount<-c(100,200,300,400,500,150)

Credit<-data.frame(User,Transaction.Date,Amount)
Chabo
S_Gupta

2 Answers

1

Here is a solution:

library(tidyverse)
df <- data.frame(Name = c("A", "B", "C"), Month = as.factor(c(11, 11, 11)), Amount = c(123, 456, 789))
df %>%
  arrange(desc(Amount)) %>%
  top_n(2, Amount) # change 2 to 5

It is best to provide sample data. For example, using the built-in iris data with one group per species:

iris %>% 
  group_by(Species) %>% 
  arrange(desc(Sepal.Length)) %>% 
  top_n(5,Sepal.Length)

Or, based on @Chabo's data:

User<-c(6,2,3,4,5,6)
Transaction.Date<-c("11-1-2019","11-2-2019","11-3-2019",
                    "12-1-2019","12-2-2019","11-1-2019")
Amount<-c(100,200,300,400,500,150)
df1<-data.frame(Amount,Transaction.Date,User)
df1 %>% 
  group_by(User,Transaction.Date) %>% 
  arrange(desc(Amount)) %>% 
  top_n(5,Amount) %>% 
  ungroup() %>% 
  top_n(5,Amount)
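
Note that the pipeline above groups by User and Transaction.Date, so it does not by itself give a top 5 per calendar month. A minimal base R sketch of the per-month version follows. It assumes the dates are month-day-year and that spending is summed per user per month (the question used mean instead); the names Month, Spend, Rank and TopN are only illustrative.

```r
# Sample data from the question
Credit <- data.frame(
  User = c(6, 2, 3, 4, 5, 6),
  Transaction.Date = c("11-1-2019", "11-2-2019", "11-3-2019",
                       "12-1-2019", "12-2-2019", "11-1-2019"),
  Amount = c(100, 200, 300, 400, 500, 150)
)

# Parse the dates (assumed month-day-year) and derive a year-month key
Credit$Month <- format(as.Date(Credit$Transaction.Date, format = "%m-%d-%Y"), "%Y-%m")

# Total spend per user per month (sum is an assumption; swap in mean if preferred)
Spend <- aggregate(Amount ~ User + Month, data = Credit, FUN = sum)

# Rank within each month, largest amount first, and keep the top n
n <- 5
Spend$Rank <- ave(-Spend$Amount, Spend$Month,
                  FUN = function(a) rank(a, ties.method = "first"))
TopN <- Spend[Spend$Rank <= n, ]
TopN <- TopN[order(TopN$Month, TopN$Rank), ]
print(TopN)
```

Months with fewer than n users simply return all of their rows, and ties are broken by row order because of `ties.method = "first"`.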
Chabo
NelsonGon
  • Error in arrange_impl(.data, dots) : Evaluation error: `as_dictionary()` is defunct as of rlang 0.3.0. Please use `as_data_pronoun()` instead. – S_Gupta Jan 18 '19 at 20:40
  • What code are you using? Reinstall tidyverse. Could you also add sample data to your question so that we don't have to make things up? https://stackoverflow.com/questions/52957136/defunct-as-of-rlang-0-3-0-and-mutate-impl – NelsonGon Jan 18 '19 at 20:41
  • Error in TopSpend %>% group_by(Group.1, Group.2) %>% arrange(desc(x)) %>% : could not find function "%>%" – S_Gupta Jan 18 '19 at 20:47
  • TopSpend %>% group_by(Group.1,Group.2) %>% arrange(desc(x)) %>% top_n(5,x) – S_Gupta Jan 18 '19 at 20:47
  • Did you call `library(tidyverse)`? What is x? Use `dput` to add data to your question. – NelsonGon Jan 18 '19 at 20:48
  • Yes, I called it. I reinstalled the package and restarted the Studio as well. – S_Gupta Jan 18 '19 at 20:49
  • Hmmm...use `dplyr` directly instead. `library(dplyr)` – NelsonGon Jan 18 '19 at 20:50
  • Error: package or namespace load failed for ‘tidyverse’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): there is no package called ‘Rcpp’ – S_Gupta Jan 18 '19 at 20:53
  • `install.packages("Rcpp",dependencies=T)` or uninstall and reinstall the tidyverse with dep set to T. – NelsonGon Jan 18 '19 at 20:55
1

This uses made-up data spanning multiple months. It may not be the best approach, but it works. I would recommend working with @NelsonGon on the tidyverse approach.

Data Creation:

library(dplyr)

User<-c(6,2,3,4,5,6)
Transaction.Date<-c("11-1-2019","11-2-2019","11-3-2019",
"12-1-2019","12-2-2019","11-1-2019")
Amount<-c(100,200,300,400,500,150)

Credit<-data.frame(User,Transaction.Date,Amount)

Aggregate, Arrange and Subset:

#Aggregate user by avg amount spent and date
TopSpend<-aggregate(Credit$Amount, 
                by=list(Credit$User,Credit$Transaction.Date)
                , FUN=mean)

#Reverse the row order so the high amounts come first
TopSpend<-arrange(TopSpend, rev(rownames(TopSpend)))
print(TopSpend)

#Rename for clarity                
names(TopSpend)<-c("User", "Date","Amount")

#Format date for split              
TopSpend$Date<-as.POSIXct(TopSpend$Date, format="%m-%d-%Y")

#Split based on month             
TopSpend_Fin<-split(TopSpend, format(TopSpend$Date, "%Y-%m"))

#Get the first 5 rows per month (months with fewer than 5 rows return what they have, without an error)
TopSpend_Fin<-lapply(TopSpend_Fin, head, n = 5L)

Output:

$`2019-11`
  User       Date Amount
3    3 2019-11-03    300
4    2 2019-11-02    200
5    6 2019-11-01    125

$`2019-12`
  User       Date Amount
1    5 2019-12-02    500
2    4 2019-12-01    400
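
One caveat, as far as I can tell: `aggregate` returns its rows ordered by the grouping values (here the date strings), so reversing the row names puts later dates first rather than larger amounts. It only looks sorted by amount because the sample amounts happen to grow with the date. The sketch below sorts explicitly by amount in base R and then binds the per-month pieces back into one data frame, which is what the question asks for. The column name Month and the object TopSpend_All are illustrative, and the dates are assumed to be month-day-year.

```r
# Sample data from above
Credit <- data.frame(
  User = c(6, 2, 3, 4, 5, 6),
  Transaction.Date = c("11-1-2019", "11-2-2019", "11-3-2019",
                       "12-1-2019", "12-2-2019", "11-1-2019"),
  Amount = c(100, 200, 300, 400, 500, 150)
)

# Average amount per user and date, as in the answer
TopSpend <- aggregate(Credit$Amount,
                      by = list(User = Credit$User, Date = Credit$Transaction.Date),
                      FUN = mean)
names(TopSpend)[3] <- "Amount"

# Parse the dates and derive a year-month key for splitting
TopSpend$Date <- as.Date(TopSpend$Date, format = "%m-%d-%Y")
TopSpend$Month <- format(TopSpend$Date, "%Y-%m")

# Sort explicitly by amount, largest first
TopSpend <- TopSpend[order(TopSpend$Amount, decreasing = TRUE), ]

# Keep at most 5 rows per month, then combine into a single data frame
TopSpend_Fin <- lapply(split(TopSpend, TopSpend$Month), head, n = 5L)
TopSpend_All <- do.call(rbind, TopSpend_Fin)
rownames(TopSpend_All) <- NULL
print(TopSpend_All)
```

Because the sort happens before `split`, each month's piece is already in decreasing order of amount when `head` takes its first 5 rows.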
Chabo
  • How would I get the top 5 for each month? – S_Gupta Jan 18 '19 at 20:53
  • @StutiGupta see edits, `lapply(TopSpend_Fin, head, n = 5L)` – Chabo Jan 18 '19 at 21:32
  • What about sorting in decreasing order and then getting the top 5? – S_Gupta Jan 19 '19 at 05:56
  • @StutiGupta The program already sorts in decreasing order, and the top 5 is already being pulled. In the example there are not 5 total options per month so it pulls as many as it can, in decreasing order. If you want increasing order, delete `TopSpend<-arrange(TopSpend, rev(rownames(TopSpend)))` as this reverses the default which is increasing. – Chabo Jan 22 '19 at 15:21