
I know this question has been asked many times, but I could not find exactly what I am looking for in the previous topics. Please feel free to close this topic if it is a duplicate.

I have a dataframe as follows:

> data %>% arrange(customer_id)
           region market unit_key
1             2      98      320
2             2      98      321
3             4     184      287
4             4       4        7
5             4       4      287
6            66     521      899
7            66     521      900
8            66    3012      899
9            66     521      916
10           66    3011      900

I would like to add a fourth column, a unique identifier called combination id, formed as follows:

enter image description here

So basically, for each unique pair of region and market I should get a unique identifier that will allow me to retrieve the unit_keys linked with the combination of markets for a specific region.

I tried to do it with a cross join and with tidyr::crossing(), but I didn't get the expected results.

Any hints on this topic?

BR /Edgar

tfkLSTM
  • A unique id for each `region` and `market` combination? Try `df %>% group_by(region, market) %>% mutate(id = cur_group_id())`. Also https://stackoverflow.com/questions/42921674/assign-unique-id-based-on-two-columns – Ronak Shah Sep 08 '20 at 09:04
  • Hi Ronak, unfortunately this is not what I was trying to do. – tfkLSTM Sep 08 '20 at 11:09
  • I don't see how the picture helps in understanding the expected output. It is not clear, at least to me. It would be helpful if you edited the post to show the expected output for the data shared. – Ronak Shah Sep 08 '20 at 12:17

1 Answer


Unfortunately, the solution proposed in the comments:

df %>% group_by(region, market) %>% mutate(id = cur_group_id())

does not work, as I get the following result:

    combination_id %>% arrange(region)
    # A tibble: 373 x 4
    # Groups:   region, market [182]
              region market unit_key    id
             <dbl>   <dbl>    <dbl> <int>
     1           2      98      320     1
     2           2      98      321     1
     3           4     184      287     3
     4           4       4        7     2
     5           4       4      287     2
     6          66     521      899     4
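
For reference, the ids in the output above can be reproduced on the ten sample rows from the question. This is only a sketch: the name `pair_ids` is made up for illustration, and `cur_group_id()` assumes dplyr 1.0.0 or later. It shows that the call numbers each distinct (region, market) pair in sorted key order, which is why region 4 only ever receives two ids.

```r
library(dplyr)

# Sample rows copied from the question (the full data has 373 rows).
data <- tibble(
  region   = c(2, 2, 4, 4, 4, 66, 66, 66, 66, 66),
  market   = c(98, 98, 184, 4, 4, 521, 521, 3012, 521, 3011),
  unit_key = c(320, 321, 287, 7, 287, 899, 900, 899, 916, 900)
)

# One id per distinct (region, market) pair.
# Groups are numbered in sorted order of the grouping keys,
# so (4, 4) is numbered before (4, 184).
pair_ids <- data %>%
  group_by(region, market) %>%
  mutate(id = cur_group_id()) %>%
  ungroup()

print(pair_ids)
```

Because each row belongs to exactly one (region, market) group, this approach cannot produce a separate id for "market 4 and 184 together".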

In this case, for region 4 we should have the following combinations:

  • id=2 where market is 184
  • id=3 where market is 4
  • id=4 where market is 4 and 184
tfkLSTM
  • Please delete this post because it is not an answer to the question. Edit your post to include the expected output. – Ronak Shah Sep 08 '20 at 11:58