
I have this list:

lst =[['BOX_187_090_31', 'BOX_187_090_32', 'BOX_187_090_34', 'BOX_187_090_35', 'BOX_187_090_36', 'BOX_187_090_37', 
   'BOX_187_090_38', 'BOX_187_090_48', 'BOX_187_090_49', 'BOX_187_090_50', 'BOX_187_090_51', 'BOX_187_090_52', 
   'BOX_187_090_53', 'BOX_187_090_54', 'BOX_187_090_55', 'BOX_187_090_56', 'BOX_187_090_57', 'BOX_187_090_58', 
   'BOX_187_090_59', 'BOX_187_090_60'], 
  ['BOX_187_090_33', 'BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 
    'BOX_187_090_42', 'BOX_187_090_43'], 
  ['BOX_187_090_61', 'BOX_187_090_62'], 
  ['BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 'BOX_187_090_42', 'BOX_187_090_43'], 
  ['BOX_187_090_33', 'BOX_187_090_42', 'BOX_187_090_43'], ['BOX_187_090_33', 'BOX_187_090_01']]

I want to combine all sublists that share one or more elements. For example, the sublists ['BOX_187_090_33', 'BOX_187_090_01'] and ['BOX_187_090_33', 'BOX_187_090_42', 'BOX_187_090_43'] overlap in the element 'BOX_187_090_33'. The merged result would then be: ['BOX_187_090_33', 'BOX_187_090_42', 'BOX_187_090_43', 'BOX_187_090_01'].

Furthermore, with three sublists it is possible that two of them do not overlap with each other but both overlap with the third. In that case all three have to be merged. For example:

['BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 'BOX_187_090_42', 'BOX_187_090_43'], 
['BOX_187_090_33', 'BOX_187_090_42', 'BOX_187_090_43']
['BOX_187_090_33', 'BOX_187_090_01']

Becomes: ['BOX_187_090_33', 'BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 'BOX_187_090_42', 'BOX_187_090_43', 'BOX_187_090_01']

The final result for my list should be:

lst =[['BOX_187_090_31', 'BOX_187_090_32', 'BOX_187_090_34', 'BOX_187_090_35', 'BOX_187_090_36', 'BOX_187_090_37', 
   'BOX_187_090_38', 'BOX_187_090_48', 'BOX_187_090_49', 'BOX_187_090_50', 'BOX_187_090_51', 'BOX_187_090_52', 
   'BOX_187_090_53', 'BOX_187_090_54', 'BOX_187_090_55', 'BOX_187_090_56', 'BOX_187_090_57', 'BOX_187_090_58', 
   'BOX_187_090_59', 'BOX_187_090_60'], 
  ['BOX_187_090_33', 'BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 
    'BOX_187_090_42', 'BOX_187_090_43', 'BOX_187_090_01'], 
  ['BOX_187_090_61', 'BOX_187_090_62']]

Does anyone have an idea how to do this?

Regards,

Dante

azro
  • 1
    I don't understand how you got to your expected result. The use of [sets](https://docs.python.org/3/tutorial/datastructures.html#sets), however, will be your friend. – Tom McLean Aug 02 '22 at 21:02
  • What if you had ABC/CDE/EFG/XYZ/AX (each letter being a sublist item), would you merge all? – mozway Aug 02 '22 at 21:16
  • 3
    I was about to suggest a `networkx` solution. [It already exists in this duplicate](https://stackoverflow.com/a/4843408/16343464). – mozway Aug 02 '22 at 21:28
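
The graph idea behind the `networkx` suggestion can be sketched with the standard library alone: treat every item as a node, link the items that appear in the same sublist, and read off the connected components. This is only a rough sketch, not the code from the linked duplicate; the function name `connected_groups` is made up for illustration, and it assumes the items are hashable and that merging should be transitive, as the question describes.

```python
from collections import defaultdict

def connected_groups(sublists):
    """Group items that are linked, directly or indirectly, by a shared sublist."""
    # Build an undirected graph; linking consecutive items is enough to
    # connect every item of a sublist to all the others.
    graph = defaultdict(set)
    for sub in sublists:
        for item in sub:
            graph[item]  # make sure single-item sublists become nodes too
        for a, b in zip(sub, sub[1:]):
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    groups = []
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(component))
    return groups

# The ABC/CDE/EFG/XYZ/AX case from the comment above
example = [list("ABC"), list("CDE"), list("EFG"), list("XYZ"), list("AX")]
print(connected_groups(example))
```

Under transitive merging, that example collapses into a single group, because AX bridges the XYZ sublist to the chain ABC/CDE/EFG.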

1 Answer


Sets were built for this task:

lst =[['BOX_187_090_31', 'BOX_187_090_32', 'BOX_187_090_34', 'BOX_187_090_35', 'BOX_187_090_36', 'BOX_187_090_37', 
   'BOX_187_090_38', 'BOX_187_090_48', 'BOX_187_090_49', 'BOX_187_090_50', 'BOX_187_090_51', 'BOX_187_090_52', 
   'BOX_187_090_53', 'BOX_187_090_54', 'BOX_187_090_55', 'BOX_187_090_56', 'BOX_187_090_57', 'BOX_187_090_58', 
   'BOX_187_090_59', 'BOX_187_090_60'], 
  ['BOX_187_090_33', 'BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 
    'BOX_187_090_42', 'BOX_187_090_43'], 
  ['BOX_187_090_61', 'BOX_187_090_62'], 
  ['BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 'BOX_187_090_42', 'BOX_187_090_43'], 
  ['BOX_187_090_33', 'BOX_187_090_42', 'BOX_187_090_43'], ['BOX_187_090_33', 'BOX_187_090_01']]


out = []
for sub in lst:
    for o in out:
        # Does this sublist share any element with an existing group?
        if any(k in o for k in sub):
            # update() modifies the set in place; union() would return
            # a new set and leave the group unchanged
            o.update(sub)
            break
    else:
        # No overlap with any existing group: start a new one
        out.append(set(sub))
for sub in out:
    # Sets are unordered, so sort for stable output
    print(sorted(sub))

Output:

['BOX_187_090_31', 'BOX_187_090_32', 'BOX_187_090_34', 'BOX_187_090_35', 'BOX_187_090_36', 'BOX_187_090_37', 'BOX_187_090_38', 'BOX_187_090_48', 'BOX_187_090_49', 'BOX_187_090_50', 'BOX_187_090_51', 'BOX_187_090_52', 'BOX_187_090_53', 'BOX_187_090_54', 'BOX_187_090_55', 'BOX_187_090_56', 'BOX_187_090_57', 'BOX_187_090_58', 'BOX_187_090_59', 'BOX_187_090_60']
['BOX_187_090_01', 'BOX_187_090_33', 'BOX_187_090_39', 'BOX_187_090_40', 'BOX_187_090_41', 'BOX_187_090_42', 'BOX_187_090_43']
['BOX_187_090_61', 'BOX_187_090_62']
Tim Roberts
  • 1
    Haven't tried, but I suspect this would fail to merge lists that each overlap with a third one but not with each other (AB/CD/BC), if the third one comes later in the loop. – mozway Aug 02 '22 at 21:19
  • True. It would merge BC into CD, but not combine all 3. That would require repeating this iteratively until the final list length did not change. – Tim Roberts Aug 02 '22 at 21:32
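
The "repeat until the list length stops changing" idea from the last comment could look roughly like the sketch below. It is one possible reading of that comment, not code from the answer; the function name `merge_overlapping` is hypothetical, and it assumes the items are hashable.

```python
def merge_overlapping(groups):
    """Merge sublists that share elements, repeating passes until nothing changes."""
    merged = [set(g) for g in groups]  # copies, so the input is not mutated
    changed = True
    while changed:
        changed = False
        result = []
        for current in merged:
            for existing in result:
                if not existing.isdisjoint(current):
                    existing |= current  # in-place union into the earlier group
                    changed = True
                    break
            else:
                # No overlap with any group collected so far in this pass
                result.append(current)
        merged = result
    return merged

# The AB/CD/BC case from the comments: a single pass is not enough here
groups = merge_overlapping([["A", "B"], ["C", "D"], ["B", "C"]])
print([sorted(g) for g in groups])
```

Every pass that performs a merge shrinks the list by at least one group, so the loop terminates. A pass with no merges means the remaining groups are pairwise disjoint.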