I currently have a list that looks something like:
>>> list_in_question = [[0, 4], [0, 3], [0, 4], [3, 4], [6, 7]]
Each list element represents a span/range with the first element being the start and the second being the end. I want to take the overlapping elements and combine them so the largest range is remaining. There are no cases where the lists are "partially overlapping" (e.g., [3, 5], [4, 6]
).
The list that I want is something like:
>>> list_in_question = [[0, 4], [0, 3], [0, 4], [3, 4], [6, 7]]
>>> list_i_want = [[0, 4], [6, 7]]
Some ways that I've thought of doing this is to make two separate lists to keep track of whether an element is valid or not, but the code is hard to read and (I assume) prone to errors. It looks something like this:
valid_lists = []
invalid_lists = []
# Outer loop to go through element lists.
for i, _ in enumerate(list_in_question):
if list_in_question[i] in valid_lists:
continue
# Inner loop to go through subsequent element lists after the i'th one.
for j in range(i + 1, len(list_in_question)):
first_range = list_in_question[i]
second_range = list_in_question[j]
first_range_expanded = set(range(first_range[0], first_range[1])
second_range_expanded = set(range(second_range[0], second_range[1])
if first_range_expanded.issubset(second_range_expanded):
if second_range not in valid_lists:
valid_lists.append(second_range)
invalid_lists.append(first_range)
elif second_range_expanded.issubset(first_range_expanded):
if first_range not in valid_lists:
valid_lists.append(first_range)
invalid_lists.append(second_range)
for range_list in list_in_question:
if range_list not in valid_lists and range_list not in invalid_lists:
valid_lists.append(range_list)
It works for some specific cases, but I'd like to know if there's a better way to go about this. Thanks.