0

I have made a function that is supposed to delete any similar occurrences in the lists of list but I was surprised by the following ERROR: IndexError: list index out of range why it is so?

for example:

 input: [['a0', 'a3'], ['a1', 'a2'], ['a0', 'a3'], ['a2', 'a1'], ['a3', 'a1']]
 expected output:[['a0', 'a3'], ['a1', 'a2'], ['a3', 'a1']]
def getList(a):
    b=a
    lena = len(a)
    print(len(a))
    for i in range(lena):

        for j in range (i+1,lena):
            print(i,j)
            print(a[i],a[j])

            if(a[i][0],a[i][1])==(a[j][1],a[j][0]) or (a[i][0],a[i][1])==(a[j][0],a[j][1]):

                print(a)

a = [['a0', 'a3'], ['a1', 'a2'], ['a0','a3'], ['a2','a1'], ['a3', 'a1']]
getList(a)

OUTPUT:

[['a0', 'a3'], ['a1', 'a2'], ['a0', 'a3'], ['a2', 'a1'], ['a3', 'a1']]
5
0 1
['a0', 'a3'] ['a1', 'a2']
0 2
['a0', 'a3'] ['a0', 'a3']
[['a0', 'a3'], ['a1', 'a2'], ['a0', 'a3'], ['a2', 'a1'], ['a3', 'a1']]
0 3
['a0', 'a3'] ['a2', 'a1']
0 4
['a0', 'a3'] ['a3', 'a1']
1 2
['a1', 'a2'] ['a0', 'a3']
1 3
['a1', 'a2'] ['a2', 'a1']
[['a0', 'a3'], ['a1', 'a2'], ['a0', 'a3'], ['a2', 'a1'], ['a3', 'a1']]
1 4
['a1', 'a2'] ['a3', 'a1']
2 3
['a0', 'a3'] ['a2', 'a1']
2 4
['a0', 'a3'] ['a3', 'a1']
3 4
['a2', 'a1'] ['a3', 'a1']

When I Modify the Code by adding b.pop(j) or anything as for example:

def getList(a):
    b=a
    lena = len(a)
    print(len(a))
    for i in range(lena):

        for j in range (i+1,lena):
            print(i,j)
            print(a[i],a[j])

            if(a[i][0],a[i][1])==(a[j][1],a[j][0]) or (a[i][0],a[i][1])==(a[j][0],a[j][1]):

                print(a)
                b.pop(j)

a = [['a0', 'a3'], ['a1', 'a2'], ['a0','a3'], ['a2','a1'], ['a3', 'a1']]
getList(a)

RESULT:

    5
0 1
['a0', 'a3'] ['a1', 'a2']
0 2
['a0', 'a3'] ['a0', 'a3']
[['a0', 'a3'], ['a1', 'a2'], ['a0', 'a3'], ['a2', 'a1'], ['a3', 'a1']]
0 3
['a0', 'a3'] ['a3', 'a1']
0 4
Traceback (most recent call last):
  File "C:/Users/I/Desktop/papers/test.py", line 21, in <module>
    getList(a)
  File "C:/Users/I/Desktop/papers/test.py", line 13, in getList
    print(a[i],a[j])
IndexError: list index out of range

I am wondering what could be the problem?

Prune
  • 76,765
  • 14
  • 60
  • 81
JIsam
  • 91
  • 2
  • 7
  • This is the classic error that occurs when you remove items from a list while iterating over it (google search shows lots of alerts on this error). – DarrylG Mar 31 '20 at 19:28

2 Answers2

1

Manipulating a list while iterating over it is a recipe for disaster and almost always can be avoided by accumulating results onto a separate structure, or at least making a copy of the original list for iteration purposes (b = a creates an alias, not a copy, which can be done with a.copy() or a[:]). When you pop, the list length changes and the iterator refers to nonexistent list elements.

Also, it's best not to conflate printing with programmatic output. The result of most algorithms should not be stdout. Instead, results should be written to a data structure and returned for the caller to use or dump as they see fit.

Another problem is efficiency: the nested loops mean O(n2) running time. Using extra space can give you a linear algorithm.

If you convert each sublist into tuples, then they become hashable and you can stick the data into a set to eliminate duplicates, then convert everything back to lists:

>>> [list(x) for x in set(tuple(sorted(x)) for x in a)]
[['a1', 'a2'], ['a1', 'a3'], ['a0', 'a3']]

The problem is that order is lost. If order should be kept, you might use a set as a lookup table:

>>> lookup = set()
>>> result = []
>>> for pair in a:
...     key = tuple(sorted(pair))
...     if key not in lookup:
...         lookup.add(key)
...         result.append(pair)
...
>>> result
[['a0', 'a3'], ['a1', 'a2'], ['a3', 'a1']]

If you're using CPython 3.6+, you can take advantage of dictionary ordering to improve the set approach shown above:

>>> [list(x) for x in dict([tuple(sorted(x)), None] for x in a)]
[['a0', 'a3'], ['a1', 'a2'], ['a1', 'a3']]

Pre-3.6 versions can use collections.OrderedDict to achieve the same result:

>>> from collections import OrderedDict
>>> [list(x) for x in OrderedDict([tuple(sorted(x)), None] for x in a)]
[['a0', 'a3'], ['a1', 'a2'], ['a1', 'a3']]
ggorlen
  • 44,755
  • 7
  • 76
  • 106
1

Your code workes fine if you call b = a.copy() and return b. b = a means that the variable name b points to the same Object(list) as a.

Hygrasiel
  • 45
  • 5