0

I have a data frame with parcel weights in one column and now I have to assign them to a bag that matches the requirement.

My code:

df = pd.DataFrame({'parcel':[a,b,c,d,e],
                    'weight':[85,60,15,30,150]})
# I have bags that can take 100 kg parcels. Now I want to name the parcels 
# that go into a specific bag. The Bags are numbered 1,2,3,4,5. I want to use minimal bags possible. 

Expected answer:

df = 
  parcel  weight  bag_num
0      a      85      1
1      b      60      2
2      c      15      1
3      d      30      2
4      e     150      NaN # This parcel is overweight, cannot be accommodated

My answer:

df['bag_num'] = df['weight']<100
df['bag_num'].replace(False,np.nan,inplace=True)
df=
  parcel  weight bag_num
4      e     150     NaN
0      a      85    True
1      b      60    True
3      d      30    True
2      c      15    True

I reached this place. I could not proceed further?

Mainland
  • 4,110
  • 3
  • 25
  • 56
  • Might be helpful: https://stackoverflow.com/a/46197147/10473393 – Alexander Santos Oct 15 '21 at 22:14
  • @AlexanderSantos I see that it helps to take multiple columns and produce one column. Mine is totally different. I should be able to access multiple rows while iterating at the same row. – Mainland Oct 15 '21 at 22:20
  • What you want is to access the weight and based on it's value change bag_num to some kind of category, correct? You can do the same approach from apply's fuction above, you just need a different function. (It doesn't produces one column, it creates/updates one column based on apply's return, that is why zip is used (two columns are getting updated at the same time) – Alexander Santos Oct 15 '21 at 22:24
  • 1
    @AlexanderSantos I think you misunderstand the question. To OP: I don't think this is a simple problem. A greedy algorithm might be feasible, but a global solution... – Quang Hoang Oct 15 '21 at 22:28
  • @QuangHoang Yes! you are spot on and it is a bigger problem requiring a global solution. I am working right now on a board to imagine the problem and produce a solution. – Mainland Oct 15 '21 at 22:31
  • 1
    pandas does not help you here. Just keep the data in lists. And do understand the only way to find the optimal solution to this kind of problem is to try every possibility. It's known that there are no shortcuts. – Tim Roberts Oct 15 '21 at 22:36
  • Related, not entirely the same: https://cs.stackexchange.com/questions/52766/partitioning-a-set-to-the-maximum-number-of-subsets-summing-to-zero suggests that this might be NP-hard :-) – Quang Hoang Oct 15 '21 at 22:38
  • @QuangHoang well I heard here for the first time NP-hard. No idea what it means. – Mainland Oct 15 '21 at 22:40

1 Answers1

1

You can solve the problem iterating the rows of the dataframe and allocating the bag_number accordingly:

import pandas as pd

df = pd.DataFrame(
    {"parcel": ["a", "b", "c", "d", "e"], "weight": [85, 60, 15, 30, 150]}
)


MIN_BAG = 1
MAX_BAG = 5
bags_range = range(MIN_BAG, MAX_BAG + 1)

# We keep track of the bags and how much weight they hold at any moment
used_bags = {bag_idx: 0 for bag_idx in bags_range}

# Create empty df column
df["bag_num"] = pd.NA

for row in df.itertuples():

    parcel_weight = row.weight

    if parcel_weight > 100:
        continue

    for bag in bags_range:
        temp_weight = used_bags.get(bag) + parcel_weight
        if temp_weight <= 100:
            used_bags[bag] = temp_weight
            df.at[row.Index, "bag_num"] = bag
            break


print(df)

This produces this result:

  parcel  weight bag_num
0      a      85       1
1      b      60       2
2      c      15       1
3      d      30       2
4      e     150    <NA>
tidus4400
  • 56
  • 1
  • 6
  • I appreciate your answer and it works. I tried one more combination to see how it works. It finds a solution but may not be the global solution. But I must appreciate you for writing this and solving the complex problem. I accepted it. The combination I tried was: `df = pd.DataFrame({"parcel": ["a", "b", "c", "d", "e","f","g","h","i"], "weight": [90,5, 15, 30, 150,85,75,95,10]})` In this solution, I thought it would assign 90+10 into one bag, 5+95 into another bag but it did not. – Mainland Oct 16 '21 at 00:30