I have a transaction data set for 10 customers covering 01-01-2013 to 01-11-2016. I split the data set by customer manually, as shown below, but I couldn't work out how to write a loop to do it. What is the best way to loop over this?

This is the manual split, one block of rows per customer:

customer_1 <- transactions[1:47,]
customer_2 <- transactions[48:94,]
customer_3 <- transactions[95:141,]
customer_4 <- transactions[142:188,]
customer_5 <- transactions[189:235,]
customer_6 <- transactions[236:282,]
customer_7 <- transactions[283:329,]
customer_8 <- transactions[330:376,]
customer_9 <- transactions[377:423,]
customer_10 <- transactions[424:468,]
  • Lots of options for manipulating a data frame by group. The `group_by()` function in the `dplyr` package is a good place to start. Using base R, you can use the `split()` function or `tapply()`. Or the data.table package has a `by` argument. See this question for ideas https://stackoverflow.com/q/11562656/134830 – Richie Cotton Sep 04 '17 at 13:55
  • Though it'd be possible to use a vector of indices to iteratively partition the data and `assign` to create the variable dynamically, I think it's a better idea to break it up into a list of data.frames (https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207) or (as @RichieCotton suggested) keep it one frame and work group-wise. – r2evans Sep 04 '17 at 13:56
  • `out <- split(transactions, f = transactions$customer_id)` will give you a list in which each element contains all transactions for one customer – Emmanuel-Lin Sep 04 '17 at 13:57

1 Answer

Use `split()` to split your data frame by customer:

out <- split( transactions, f = transactions$customer_id)
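To see what `split()` returns, here is a minimal sketch on toy data. The `amount` column and the ids "A", "B", "C" are made up for illustration, since the real columns of `transactions` are not shown in the question.

```r
# Hypothetical toy data; the real `transactions` columns are not shown in the question.
transactions <- data.frame(
  customer_id = c("A", "A", "B", "B", "B", "C"),
  amount      = c(10, 20, 5, 15, 25, 40),
  stringsAsFactors = FALSE
)

out <- split(transactions, f = transactions$customer_id)

names(out)         # one list element per distinct customer_id value
sapply(out, nrow)  # number of rows held for each customer
out[["B"]]         # the data frame for customer "B"
```

The result is an ordinary named list, so each customer's data frame can be reached by name or by position.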

Then, if you want to assign one variable per customer, you can do:

counter <- 1
for (elt in out) {
  # create customer_1, customer_2, ... in the current environment
  assign(paste("customer", counter, sep = "_"), elt)
  counter <- counter + 1
}

This creates the variables customer_1, customer_2, and so on.
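If the same operation is going to be applied to every customer, the list can also be used directly, without creating separate variables. A minimal sketch, assuming a hypothetical numeric `amount` column (not taken from the question):

```r
# Hypothetical toy data; `amount` is an assumed column name.
transactions <- data.frame(
  customer_id = c(1, 1, 2, 2, 2, 3),
  amount      = c(10, 20, 5, 15, 25, 40)
)

out <- split(transactions, f = transactions$customer_id)

# Same computation for every customer, no per-customer variables needed.
totals <- vapply(out, function(d) sum(d$amount), numeric(1))
totals

# Individual customers remain accessible by name.
out[["2"]]
```

`vapply()` is used here rather than `sapply()` only because it checks that each result is a single number; either would work for a sketch like this.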

Emmanuel-Lin
  • This certainly technically does what is asked, but I recommend against doing it this way: with data like this, generally whatever you do to one data.frame you'll be doing to the others as well. When it is broken into different variables like this, you have to either code each one manually or handle them dynamically using `ls()` and `get()`, which is a hack. It's more straightforward (to code, to follow, to debug) to deal with a list of data.frames. – r2evans Sep 04 '17 at 14:22
  • Thank you guys. @Emmanuel-Lin, I used the code you shared and it worked, but the customers are not in order: customer_1 starts from row 377. How can I make customer_1 start from row 1, so the customers are in order? – Melike Anıtmak Sep 04 '17 at 14:51
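On the ordering question: `split()` returns the groups in the order of the factor levels, which by default is the sorted order of the ids, not the order in which they appear in the rows. One possible fix, sketched on made-up ids (the real `customer_id` values are not shown in the thread), is to set the levels to the order of first appearance before splitting:

```r
# Hypothetical toy data: ids appear in the rows in the order "z9", "a3", "m5".
transactions <- data.frame(
  customer_id = c("z9", "z9", "a3", "m5", "m5"),
  amount      = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Default behaviour: groups come out in sorted order of customer_id.
names(split(transactions, transactions$customer_id))

# Fix the level order to first appearance before splitting.
f   <- factor(transactions$customer_id, levels = unique(transactions$customer_id))
out <- split(transactions, f)
names(out)

# Numbering now follows row order: customer_1 is the first customer in the data.
for (i in seq_along(out)) {
  assign(paste("customer", i, sep = "_"), out[[i]])
}
```

This assumes the rows of each customer are meant to be numbered by where that customer first appears in `transactions`; if a different order is wanted, pass that order as `levels` instead.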