0

Sorry if the title is not very descriptive. I have data like this:

data

What I am trying to do is pivot over Product and count the ocurrences of each type of Client they have, in order to dynamically add columns like "most frequent client", "second most frequent client", etc.

If we group by Product and Client and count, we see the ocurrences of each Client in each Product:

data grouped

From here my goal would be to get to something like this:

goal

So "number 1 client" for A is C1, for B is C2, etc. Again, the thing is this should be dynamic, as I don´t know beforehand how many "nX_Client" columns are going to be.

Breaking ties (for Product C there are 2 Client with 1 ocurrence each) can be done randomly, by alphabetical order or however you want. It´s not important for my use case.

reprex:

ID <- c(1,2,3,4,5,6,7,8,9,10)
Size <- c(1,2,1,2,1,2,1,2,1,2)
Product <- c("A", "A", "A", "A", "A", "B", "B", "B", "C", "C")
Client <- c("C0", "C0", "C1", "C1", "C1", "C2", "C2", "C1", "C0", "C1")
df <- data.frame( ID, Size, Product, Client )

df <- df %>% 
  group_by(Product, Client) %>% 
  summarise(count = n())

Best regards.

AleG
  • 153
  • 8

1 Answers1

2

You can count number of rows for every Product and Client and sort them. Create a column names for each Product based on their frequency and cast the data in wide format.

library(dplyr)

df %>%
  count(Product, Client, sort = TRUE) %>%
  group_by(Product) %>%
  mutate(n = sprintf('n%d_client', row_number())) %>%
  tidyr::pivot_wider(names_from = n, values_from = Client)

# Product n1_client n2_client
#  <chr>   <chr>     <chr>    
#1 A       C1        C0       
#2 B       C2        C1       
#3 C       C0        C1      
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213