2

I have set of marbles, of different colors and weights, and I want to split them into groups based on their weight and color.

The conditions are:

  1. A group cannot weigh more than 100 units
  2. A group cannot have more than 5 different-colored marbles.

A reproducible example:

marbles <- data.frame(color=sample(1:20, 20), weight=sample(1:40, 20, replace=T))

   color weight
1      1     22
2     15     33
3     13     35
4     11     13
5      6     26
6      8     15
7     10      3
8     16     22
9     14     21
10     3     16
11     4     26
12    20     30
13     9     31
14     2     16
15     7     12
16    17     13
17    19     19
18     5     17
19    12     12
20    18     40

And what I want is this group column:

   color weight group
1      1     22     1
2     15     33     1
3     13     35     1
4     11     13     2
5      6     26     2
6      8     15     2
7     10      3     2
8     16     22     2
9     14     21     3
10     3     16     3
11     4     26     3
12    20     30     3
13     9     31     4
14     2     16     4
15     7     12     4
16    17     13     4
17    19     19     4
18     5     17     5
19    12     12     5
20    18     40     5

TIA.

jake2389
  • 1,166
  • 8
  • 22
  • 1
    This sounds like a "bin-packing" (aka "backpack" or even "knapsack", I think) problem, you will need to research optimization to get that right. – r2evans Sep 04 '19 at 23:42
  • Do you have any condition for minimum number of colors – akrun Sep 04 '19 at 23:47
  • Possibilities (unverified, I don't use them): https://cran.r-project.org/web/packages/gbp/, https://cran.r-project.org/web/packages/mknapsack/, https://stackoverflow.com/q/10467579/3358272 – r2evans Sep 04 '19 at 23:48

2 Answers2

1

The below isn't an optimal assignment to the groups, it just does it sequentially through the data frame. It's uses rowwise and might not be the most efficient way as it's not a vectorized approach.

library(dplyr)

marbles <- data.frame(color=sample(1:20, 20), weight=sample(1:40, 20, replace=T))

Below I create a rowwise function which we can apply using dplyr

assign_group <- function(color, weight) {
  # Conditions
  clists = append(color_list, color)
  sum_val = group_sum + weight
  num_colors = length(unique(color_list))

  assign_condition = (sum_val <= 100 & num_colors <= 5)
  #assign globals
  cval <- if(assign_condition) clists else c(color)
  sval <- ifelse(assign_condition, sum_val, weight)
  gval <- ifelse(assign_condition, group_number, group_number + 1)
  assign("color_list", cval, envir = .GlobalEnv)
  assign("group_sum", sval, envir = .GlobalEnv)
  assign("group_number", gval, envir = .GlobalEnv)
  res = group_number
  return(res)
}

I then setup a few global variables to track the allocation of the marbles to each group.

# globals
color_list <<- c()
group_sum <<- 0
group_number <<- 1

Finally run this function using mutate

test <- marbles %>% rowwise() %>% mutate(group = assign_group(color,weight)) %>% data.frame() 

Which results in the below

 color weight group
1      6     27     1
2     12     16     1
3     15     32     1
4     20     25     1
5     19      5     2
6      2     21     2
7     16     39     2
8     17      4     2
9     11     16     2
10     7      7     3
11    10      5     3
12     1     30     3
13    13      7     3
14     9     39     3
15    14      7     4
16     8     17     4
17    18      9     4
18     4     36     4
19     3      1     4
20     5      3     5

And seems to meet the constraints

test %>% group_by(group) %>% summarise(tot_w = sum(weight), n_c = length(unique(color)) )

  group tot_w   n_c
  <dbl> <int> <int>
1     1   100     4
2     2    85     5
3     3    88     5
4     4    70     5
5     5     3     1
RK1
  • 2,384
  • 1
  • 19
  • 36
0

in base R you could write a recursive function as shown below:

create_group = function(df,a){
  if(missing(a)) a = cumsum(df$weight)%/%100
  b = !ave(df$color,a,FUN=seq_along)%%6
  d = ave(df$weight,a+b,FUN=cumsum)>100
  a = a+b+d
  if (any(b|d)) create_group(df,a) else cbind(df,group = a+1)
}

create_group(df)
   color weight group
1      1     22     1
2     15     33     1
3     13     35     1
4     11     13     2
5      6     26     2
6      8     15     2
7     10      3     2
8     16     22     2
9     14     21     3
10     3     16     3
11     4     26     3
12    20     30     3
13     9     31     4
14     2     16     4
15     7     12     4
16    17     13     4
17    19     19     4
18     5     17     5
19    12     12     5
20    18     40     5
Onyambu
  • 67,392
  • 3
  • 24
  • 53