I have this data.

     datetime user_id song_id
1  2019-03-26       6      31
2  2019-03-26       4      30
3  2019-03-26       3      31
4  2019-03-26       9      34
5  2019-03-26      10      21
6  2019-03-26       8      38
7  2019-03-26       8      33
8  2019-03-26       8      28
9  2019-03-26       6      30

I'd like to add a fourth column, usersong_id, that concatenates user_id and song_id, so the data looks like this:

     datetime user_id song_id    usersong_id
1  2019-03-26       6      31    631
2  2019-03-26       4      30    430
3  2019-03-26       3      31    331
4  2019-03-26       9      34    934
5  2019-03-26      10      21    1021
6  2019-03-26       8      38    838
7  2019-03-26       8      33    833
8  2019-03-26       8      28    828
9  2019-03-26       6      30    630

I tried this code.

df %>%
  group_by(user_id, song_id) %>% 
  summarize(count = n()) %>% 
  mutate(usersong_id = c(user_id, song_id))

But it gave me this error:

Error: Column usersong_id must be length 1 (the group size), not 2

zx8754
Cauder

2 Answers

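We can use `unite` from tidyr to paste the two columns together. `remove = FALSE` keeps the original columns, and the `select` call moves the new column to the end: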

library(dplyr)
library(tidyr)
df %>% 
     unite(user_song_id, user_id, song_id, sep = "", remove = FALSE) %>%
     select(names(df), user_song_id)
#    datetime user_id song_id user_song_id
#1 2019-03-26       6      31          631
#2 2019-03-26       4      30          430
#3 2019-03-26       3      31          331
#4 2019-03-26       9      34          934
#5 2019-03-26      10      21         1021
#6 2019-03-26       8      38          838
#7 2019-03-26       8      33          833
#8 2019-03-26       8      28          828
#9 2019-03-26       6      30          630
akrun
  • Off topic - do you know if there is any way to use `left_join` without getting `.x` and `.y` added to columns of the same name? especially when they have the same values. – Cauder Mar 27 '19 at 05:36
  • .x and .y will get added if the column names clash – Sonny Mar 27 '19 at 05:38
  • @Cauder Not clear about the situation. If you don't want to have extra columns, subset one of the dataset columns without the intersecting columns before the left_join – akrun Mar 27 '19 at 05:39
  • Yeah, but what about cases where the column names are the same and the values are the same? I still get `.x` and `.y` added. No biggie, just curious. – Cauder Mar 27 '19 at 05:40
  • Aha, that's really a great idea – Cauder Mar 27 '19 at 05:40

You could use any of the below:

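library(dplyr)
library(tidyr)

# Option 1: dplyr::mutate() with paste0()
df <- df %>%
  mutate(usersong_id = paste0(user_id, song_id))

# Option 2: tidyr::unite(), keeping the original columns
df <- df %>%
  unite(user_song_id, user_id, song_id, sep = "", remove = FALSE)

# Option 3: base R
df$usersong_id <- paste0(df$user_id, df$song_id)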
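
For anyone who wants to run this end to end, here is a minimal sketch of the `paste0` approach. It assumes the sample data from the question is rebuilt by hand and that the IDs are stored as numbers. The column `usersong_num` is a hypothetical name added only to show the type conversion; it is not part of the question.

```r
library(dplyr)

# Rebuild the sample data from the question (values copied by hand;
# the single date is recycled across all nine rows).
df <- data.frame(
  datetime = as.Date("2019-03-26"),
  user_id  = c(6, 4, 3, 9, 10, 8, 8, 8, 6),
  song_id  = c(31, 30, 31, 34, 21, 38, 33, 28, 30)
)

# paste0() is vectorised, so it works row by row without
# group_by()/summarize().
df <- df %>%
  mutate(usersong_id = paste0(user_id, song_id))

# The new column is character; convert it if a numeric key is wanted.
# `usersong_num` is a hypothetical column name used for illustration.
df$usersong_num <- as.integer(df$usersong_id)

str(df)
```

One thing to keep in mind: with no separator, different pairs can produce the same key (user 1 with song 121 and user 11 with song 21 both give 1121). If the IDs can vary in length, a separator such as `"_"` avoids that, although the result then has to stay a character column.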
Sonny