
I'm trying to figure out a way to group data, and then create a column based on the content of the grouped rows.

Sample df to be manipulated

df <- tibble::tribble(
              ~name, ~position, ~G,
      "DJ LeMahieu",      "1B", 40,
      "DJ LeMahieu",      "2B", 75,
      "DJ LeMahieu",      "3B", 52,
        "Max Muncy",      "1B", 65,
        "Max Muncy",      "2B", 70,
        "Max Muncy",      "3B", 35,
  "Whit Merrifield",      "2B", 82,
  "Whit Merrifield",      "OF", 61
  )

I then want this content to be grouped at the name level, with a new column called extra_position. This column would be a concatenation of the values in the position column, separated by a "/". Example output below:

output_df <- tibble::tribble(
              ~name,  ~extra_position,
      "DJ LeMahieu", "1B/2B/3B",
        "Max Muncy", "1B/2B/3B",
  "Whit Merrifield",    "2B/OF"
  )

I'd like to stay within the tidyverse if possible. In addition, I'm curious to know whether you can also control the order of the data being concatenated. For example, can you make DJ LeMahieu's extra_position content show as: "3B/2B/1B"?

Jazzmatazz

1 Answer

We can group by 'name' and then paste (or str_c) the 'position' values together, collapsing them into a single string per group

library(dplyr)
library(stringr)
df %>%
    group_by(name) %>% 
    summarise(extra_position = str_c(position, collapse="/"))

If we need to reverse the order within each name

df %>% 
    group_by(name) %>% 
    summarise(extra_position = str_c(rev(position), collapse="/"))

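Or, if the order should be based on the position values themselves (here sorted in decreasing order, using mixedsort from the gtools package)

df %>% 
    group_by(name) %>%
    summarise(extra_position = str_c(gtools::mixedsort(position,
            decreasing = TRUE), collapse="/"))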
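The order can also be driven by another column. As a minimal sketch, assuming we want each player's positions listed from most to fewest games played (using the G column is my assumption, not something the question asked for), the positions can be indexed with order() inside summarise. The name by_games is only for illustration.

```r
library(dplyr)
library(stringr)

# Sample data from the question
df <- tibble::tribble(
              ~name, ~position, ~G,
      "DJ LeMahieu",      "1B", 40,
      "DJ LeMahieu",      "2B", 75,
      "DJ LeMahieu",      "3B", 52,
        "Max Muncy",      "1B", 65,
        "Max Muncy",      "2B", 70,
        "Max Muncy",      "3B", 35,
  "Whit Merrifield",      "2B", 82,
  "Whit Merrifield",      "OF", 61
  )

# Within each name, reorder 'position' by decreasing G before collapsing
by_games <- df %>%
    group_by(name) %>%
    summarise(extra_position = str_c(position[order(G, decreasing = TRUE)],
                                     collapse = "/"))

by_games
```

Dropping decreasing = TRUE would list the positions from fewest to most games instead.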

Or with data.table

library(data.table)
setDT(df)[, .(extra_position = paste(position, collapse="/")), .(name)]

In base R, use aggregate

aggregate(position ~ name, df, paste, collapse="/")
akrun
    If you also want to control the order of the concatenated data, you can add `arrange(desc(position))` or `arrange(position)`, before doing the `summarise` to concatenate them in descending or ascending order, respectively. – Jonathan V. Solórzano Mar 02 '20 at 21:39
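The arrange() suggestion in the comment above can be sketched as follows. This assumes plain alphabetical sorting of the position strings is the order wanted, and the name sorted_desc is only for illustration. Rows keep their arranged order within each group, so the collapsed string follows it.

```r
library(dplyr)
library(stringr)

# Sample data from the question
df <- tibble::tribble(
              ~name, ~position, ~G,
      "DJ LeMahieu",      "1B", 40,
      "DJ LeMahieu",      "2B", 75,
      "DJ LeMahieu",      "3B", 52,
        "Max Muncy",      "1B", 65,
        "Max Muncy",      "2B", 70,
        "Max Muncy",      "3B", 35,
  "Whit Merrifield",      "2B", 82,
  "Whit Merrifield",      "OF", 61
  )

# Sort the rows by position (descending) first, then collapse per name
sorted_desc <- df %>%
    arrange(desc(position)) %>%
    group_by(name) %>%
    summarise(extra_position = str_c(position, collapse = "/"))

sorted_desc
```

With this data, DJ LeMahieu's extra_position comes out as "3B/2B/1B", which is the ordering asked about in the question. Swapping in arrange(position) gives the ascending version.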