How to make new columns by counting up an existing column

Question

I'd like to make some other columns by doing group_by in R.

When the original table is like below

 userID   cat1    cat2
    a        f       3
    a        f       3
    a        u       1
    a        m       1
    b        u       2
    b        m       1
    b        m       2

I group them by userID and want it come like

userID   cat1_f  cat1_m  cat1_u  cat2_1  cat2_2  cat2_3
a        2       1       1       2       0       1
b        0       2       1       1       2       0

Ronak Shah · Accepted Answer · 2019-02-05T03:18:27.370

2

We could gather all the values then count them, create a new column by pasteing cat and value values and then spread it back to wide format with fill=0.

library(tidyverse)

df %>%
  gather(cat, value, -userID) %>%
  count(userID, cat, value) %>%
  unite(cat, c(cat, value)) %>%
  spread(cat, n, fill = 0)

#  userID cat1_f cat1_m cat1_u cat2_1 cat2_2 cat2_3
#  <fct>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#1  a          2      1      1      2      0      2
#2  b          0      2      1      1      2      0

edited Feb 05 '19 at 03:18

answered Feb 04 '19 at 04:39

Ronak Shah

377,200
20
156
213

Thank you very much! I go with it and get an error like `Warning message: attributes are not identical across measure variables; they will be dropped` . I don't want any row dropped. What do I do then?? – user9191983 Feb 05 '19 at 04:12
@user9191983 that's a warning message. It is because the `cat1` column (in this example) is a factor. If you convert it to character then that warning is gone. – Ronak Shah Feb 05 '19 at 04:17
Thanks a lot and I made another question asking the same (`https://stackoverflow.com/questions/54528876/pandas-hwo-to-groupby-create-other-columns-by-counting-values-of-existing-column`)but for python pandas this time. Your help is expected if you're familiar with it. – user9191983 Feb 05 '19 at 06:36

akrun · Answer 2 · 2019-02-04T06:47:01.213

1

We can just use table from base R

table(df)
#       cat1
#userID f m u
#     a 2 1 1
#     b 0 2 1

Or with dcast from data.table

library(data.table)
dcast(setDT(df), userID ~ paste0('cat1_', cat1))

data

df <- structure(list(userID = c("a", "a", "a", "a", "b", "b", "b"), 
cat1 = c("f", "f", "u", "m", "u", "m", "m")), class = "data.frame", 
 row.names = c(NA, -7L))

edited Feb 04 '19 at 06:47

answered Feb 04 '19 at 06:32

akrun

874,273
37
540
662

How to make new columns by counting up an existing column

2 Answers2

data

Linked