I have a data.frame of customer shopping fees and need to divide the customers into 4 groups by quantile. How should I write the R code?

So far I only have this:

quantile(cus.df$Fee, probs=seq(from=0,to=1,by=0.2))
Jaap
  • You may want to check out `?cut`. – Prem Apr 25 '18 at 07:29
  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Jaap Apr 25 '18 at 07:31

2 Answers

If you just want to create a new variable in which each row gets a 1, 2, 3 or 4 depending on the value in one column, you could do:

library(dplyr)

mtcars %>% 
  mutate(quantilegroup = ntile(qsec, 4)) %>% 
  head(6)

   mpg cyl disp  hp drat    wt  qsec vs am gear carb quantilegroup
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4             1
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4             2
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1             3
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1             4
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2             2
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1             4
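
The same idea applied to the data in the question might look like the sketch below. The question does not show `cus.df`, so the data frame here is made up for illustration: the `Customer` column, the fee values and the new `FeeGroup` column name are all assumptions. Only `Fee` comes from the question.

```r
library(dplyr)

# Hypothetical stand-in for cus.df; only the Fee column name is from the question
set.seed(42)
cus.df <- data.frame(
  Customer = sprintf("C%02d", 1:20),
  Fee      = round(runif(20, min = 5, max = 500), 2)
)

# ntile() ranks Fee and splits the rows into 4 buckets of (nearly) equal size
cus.df <- cus.df %>%
  mutate(FeeGroup = ntile(Fee, 4))

# Number of customers per group
table(cus.df$FeeGroup)
```

Note that `ntile()` buckets by rank, so customers with identical fees can end up in different groups. If tied values must stay together, cutting at the `quantile()` break points is the safer choice.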
Lennyy

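Look at the reproducible example below: combine dplyr::group_by with cut, with the breaks defined by quantile(...). Note that probs=seq(0, 1, by=0.2) gives quintiles, so use by=0.25 for four groups. Also, cut leaves the lowest break out of the first interval by default, so the minimum mpg ends up in an NA group (hence the 6 groups reported in the output). Pass include.lowest=TRUE to avoid that.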

library(dplyr)
mtcars %>% 
  group_by(G = cut(mpg, breaks=quantile(mpg, probs=seq(0, 1, by=0.2))))

# A tibble: 32 x 12
# Groups: G [6]
#      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb G
#    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
#  1  21.0    6.  160.  110.  3.90  2.62  16.5    0.    1.    4.    4. (17.9,21]
#  2  21.0    6.  160.  110.  3.90  2.88  17.0    0.    1.    4.    4. (17.9,21]
#  3  22.8    4.  108.   93.  3.85  2.32  18.6    1.    1.    4.    1. (21,24.1]
#  4  21.4    6.  258.  110.  3.08  3.22  19.4    1.    0.    3.    1. (21,24.1]
#  5  18.7    8.  360.  175.  3.15  3.44  17.0    0.    0.    3.    2. (17.9,21]
#  6  18.1    6.  225.  105.  2.76  3.46  20.2    1.    0.    3.    1. (17.9,21]
#  7  14.3    8.  360.  245.  3.21  3.57  15.8    0.    0.    3.    4. (10.4,15.~
#  8  24.4    4.  147.   62.  3.69  3.19  20.0    1.    0.    4.    2. (24.1,33.~
#  9  22.8    4.  141.   95.  3.92  3.15  22.9    1.    0.    4.    2. (21,24.1]
# 10  19.2    6.  168.  123.  3.92  3.44  18.3    1.    0.    4.    4. (17.9,21]
# ... with 22 more rows
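
For the four groups asked about in the question, a base R variant of the same `cut` plus `quantile` approach could look like the sketch below. It is an illustration rather than part of the original answer: the copy named `cars`, the integer labels and the `include.lowest = TRUE` argument are choices made here.

```r
# Work on a copy so the built-in mtcars dataset is left untouched
cars <- mtcars

# Quartile break points of mpg: 0%, 25%, 50%, 75%, 100%
breaks <- quantile(cars$mpg, probs = seq(0, 1, by = 0.25))

# include.lowest = TRUE closes the first interval on the left,
# so the minimum mpg is assigned to group 1 instead of NA
cars$G <- cut(cars$mpg, breaks = breaks, include.lowest = TRUE, labels = 1:4)

# Group sizes; useNA = "ifany" would reveal any rows that missed every interval
table(cars$G, useNA = "ifany")
```

Unlike `ntile()`, this keeps tied values in the same group, so the groups are not guaranteed to be the same size. `cut` will also stop with an error if two quantiles coincide and the breaks are not unique.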
CPak