I have a data set that I need to sample. Part of the data looks like this:
row.names customer_ID
1 10000000
2 10000000
3 10000000
4 10000000
5 10000005
6 10000005
7 10000008
8 10000008
9 10000008
10 10000008
11 10000008
12 10000008
...
For each customer, take the first 2 rows. Then, before including each further row, do a check: there is a 65% chance we take the next row and a 35% chance we quit and move on to the next customer. If we take the row, we repeat the same 65%/35% check for the row after it, and so on until we either run out of data for that customer or fail the check and move on to the next customer. Repeat this for each customer.
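
To make the rule concrete, here is a minimal Python sketch of how I read it. It assumes the rows are already sorted so that each customer's rows are contiguous, as in the sample above. The names `sample_rows`, `keep_first`, `p_continue`, and `rng` are made up for illustration and do not come from any library.

```python
import random
from itertools import groupby


def sample_rows(rows, keep_first=2, p_continue=0.65, rng=None):
    """Sample rows per customer.

    rows: list of (row_name, customer_id) tuples, with each customer's
    rows contiguous (assumption). Returns the list of rows that were kept.
    """
    rng = rng or random.Random()
    sampled = []
    # groupby yields one run of consecutive rows per customer_id.
    for _, group in groupby(rows, key=lambda r: r[1]):
        for i, row in enumerate(group):
            # The first `keep_first` rows are always taken. Every later row
            # is taken only if a draw in [0, 1) falls below p_continue.
            if i >= keep_first and rng.random() >= p_continue:
                break  # failed the check: move on to the next customer
            sampled.append(row)
    return sampled


# The sample data from above as (row_name, customer_ID) pairs.
data = (
    [(i, 10000000) for i in range(1, 5)]
    + [(i, 10000005) for i in range(5, 7)]
    + [(i, 10000008) for i in range(7, 13)]
)

# Passing a seeded generator makes a run reproducible.
picked = sample_rows(data, rng=random.Random(42))
```

With this rule every customer keeps at least 2 rows (or all of its rows if it has fewer), and the number of extra rows per customer follows a geometric pattern capped by how many rows that customer has.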