
environment: python 3.6.4

I have two lists.
list1 is a nested list of words, like

[['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'],
 ['this', 'is', 'an', 'apple']]

list2 is a list of words to remove from list1, like

['a', 'an']

I want to get a new list like

[['this', 'is', 'pen', 'that', 'is', 'desk'],
 ['this', 'is', 'apple']]

without changing list1.

I wrote the code below, but it modifies list1. What is wrong with my code?

def remove_duplicate_element_in_nested_list(li1, li2):
    """
    :param li1: <list> nested_sentences
    :param li2: <list> words_to_remove
    :return: <list>
    """
    ret = []
    for el1 in li1:
        ret.append(el1)

    for i in range(len(ret)):
        for el2 in li2:
            try:
                # list.remove() remove only one element. so loop this.
                for el in ret[i]:
                    ret[i].remove(el2)
            except ValueError:
                None

    return ret

words = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
stop_words = ['a', 'an']

print(words)
# shows [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
new_words = remove_duplicate_element_in_nested_list(words, stop_words)
print(words)
# shows [['this', 'is', 'pen', 'that', 'is', 'desk'], ['this', 'is', 'apple']]
Hariom Singh
rootpetit
  • 1
    Possible duplicate of [Python variable reference assignment](https://stackoverflow.com/questions/11222440/python-variable-reference-assignment) – Adelin Jan 17 '18 at 07:53
  • `ret.append(el1)` appends a reference to the same list held in `li1` to `ret`, so when you mutate the list with `ret[i].remove(el2)`, those changes are visible everywhere where you hold a reference to that list. – Ilja Everilä Jan 17 '18 at 07:53
  • @Adelin u probably got the wrong link – user1767754 Jan 17 '18 at 07:54
  • ??? how do you call it removal of duplicate element??? your requirement seems that you want to remove element from list 1 which are exist in list 2 – Gahan Jan 17 '18 at 07:54
  • 2
    Possible duplicate of [Python copy a list of lists](https://stackoverflow.com/questions/28684154/python-copy-a-list-of-lists), also relevant: [How to clone or copy a list?](https://stackoverflow.com/questions/2612802/how-to-clone-or-copy-a-list) – Ilja Everilä Jan 17 '18 at 07:54
  • Actually there are multiple possible questions. Even [this one](https://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference). The problem is the list modified is being passed by reference, so I linked to a similar question – Adelin Jan 17 '18 at 07:55
  • 1
    @Adelin Though I don't actually know how, a question can be closed as a duplicate of multiple targets. I guess that could possibly apply here. Otherwise this will gather a ton of almost identical answers to a problem that has been explained elsewhere already. – Ilja Everilä Jan 17 '18 at 07:56
  • @rootpetit You should read ["Facts and myths about Python names and values"](https://nedbatchelder.com/text/names.html) by Ned Batchelder. It explains how Python's variables (names) work in detail, but in a way that is approachable. – Ilja Everilä Jan 17 '18 at 08:03
  • You're also mutating a list while iterating over it in `for el in ret[i]: ret[i].remove(el2)`, which is a bad idea (tm). – Ilja Everilä Jan 17 '18 at 08:36
  • I have to learn python's reference mechanism. I'll read links. Thanks. – rootpetit Jan 17 '18 at 08:39

7 Answers

2

ret.append(el1) does not copy the inner list; it appends a reference to the same inner list, so removing items through ret also changes li1.

Try ret.append(el1[:]), which uses slicing to create a (shallow) copy of the inner list. Other ways of copying a list are covered in the question "How to clone or copy a list?".
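To make the difference concrete, here is a minimal sketch comparing a plain append with a slice copy. The names aliased and copied and the shortened sample data are made up for illustration; they are not from the question.

```python
# Shortened sample data, assumed for illustration only.
words = [['this', 'is', 'a', 'pen'], ['this', 'is', 'an', 'apple']]

aliased = []  # will hold references to the original inner lists
copied = []   # will hold new lists with the same elements
for sentence in words:
    aliased.append(sentence)
    copied.append(sentence[:])

# Removing from the slice copy leaves the original untouched.
copied[0].remove('a')
print(words[0])  # ['this', 'is', 'a', 'pen']

# Removing through the alias changes the original as well.
aliased[1].remove('an')
print(words[1])  # ['this', 'is', 'apple']
```

The slice only copies one level, which is enough here because the inner lists contain strings.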

0

A simple for loop method.

def remove_duplicate_element_in_nested_list(li1, li2):
    """
    :param li1: <list> nested_sentences
    :param li2: <list> words_to_remove
    :return: <list>
    """    
    ret = []
    for i in li1:
        r = []
        for k in i:
            if k not in li2:
                r.append(k)
        ret.append(r)

    return ret

A = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
B =  ['a', 'an'] 
print(remove_duplicate_element_in_nested_list(A, B))

Result:

[['this', 'is', 'pen', 'that', 'is', 'desk'], ['this', 'is', 'apple']]
princelySid
Rakesh
  • If you really want to help, why don't you also add explanation to your code, instead of simply providing the solution? OP should be aware what causes his code to be incorrect (*where's wrong my code?*), not a new solution – Adelin Jan 17 '18 at 07:56
  • @Adelin: I was just providing a simple solution to what the OP had. – Rakesh Jan 17 '18 at 08:08
  • I know. But it didn't answer OP's question – Adelin Jan 17 '18 at 08:09
0

The problem in your code is with this line

ret.append(el1)

After that line, li1 and ret both contain the same inner list objects. So when you do ret[i].remove(el2), the word is removed from the list that both li1 and ret refer to.

You can get your code working by changing the line ret.append(el1) to ret.append(list(el1))
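As a rough sketch of how the whole function might look with that change applied: the function name remove_words_from_nested_list is made up here, and the inner try/for loop is swapped for a while loop (my substitution, not part of the original code) so that the list is not mutated while it is being iterated.

```python
def remove_words_from_nested_list(li1, li2):
    """Return a new nested list with every word in li2 removed; li1 is left alone."""
    ret = []
    for el1 in li1:
        ret.append(list(el1))  # copy each inner list instead of sharing it

    for inner in ret:
        for el2 in li2:
            # list.remove() deletes only the first match, so repeat until none are left
            while el2 in inner:
                inner.remove(el2)
    return ret


words = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
stop_words = ['a', 'an']

new_words = remove_words_from_nested_list(words, stop_words)
print(new_words)  # [['this', 'is', 'pen', 'that', 'is', 'desk'], ['this', 'is', 'apple']]
print(words)      # unchanged
```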

Hari
0

Because everything in Python is an object and lists are mutable, appending an inner list to another list appends a reference to it, not a copy. This is easy to test:

>>> lst = [[1], [2]]
>>> new_lst = []
>>> for e in lst:
...     new_lst.append(e)
...
>>> new_lst[0] is lst[0]
True
>>> new_lst[0].append(10)
>>> new_lst
[[1, 10], [2]]
>>> lst
[[1, 10], [2]]

copy.deepcopy is one way to avoid this, since it copies the inner lists as well.
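For comparison, a minimal sketch of the same experiment with copy.deepcopy from the standard library, reusing the names lst and new_lst from the session above:

```python
import copy

lst = [[1], [2]]
new_lst = copy.deepcopy(lst)  # copies the outer list and every inner list

print(new_lst[0] is lst[0])  # False

new_lst[0].append(10)
print(new_lst)  # [[1, 10], [2]]
print(lst)      # [[1], [2]]
```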

xybaby
  • This doesn't answer OPs question in any way – Arne Jan 17 '18 at 08:19
  • 1
    @ArneRecknagel I believe that's a bit harsh. I wouldn't say *in any way* but indeed the explanation could use quite some improvements (adding an explanation is a start) – Adelin Jan 17 '18 at 08:21
0

Lists are mutable, and when you pass one to a function, the function receives a reference to the same object, not a copy. This can produce unexpected results if you're not aware of how it works. For example...

# BAD:

def filter_foo(some_list):
    while 'foo' in some_list:
        some_list.remove('foo')
    return some_list

This will alter the list passed to it as well as return the same list to the caller.

>>> a = ['foo', 'bar', 'baz']
>>> b = filter_foo(a)
>>> a # was modified; BAD
['bar', 'baz']
>>> b is a # they're actually the same object
True

The following avoids this problem by creating a new list

# GOOD:

def filter_foo(some_list):
    new_list = []
    for item in some_list:
        if item != 'foo':
            new_list.append(item)
    return new_list

The list passed was not modified and a separate list with the expected result is returned to the caller.

>>> b = filter_foo(a)
>>> a # not modified
['foo', 'bar', 'baz']
>>> b
['bar', 'baz']
>>> a is b
False

Though, that took a refactor. To fix places where you do this, a simple solution is to make a copy.

# Drop-in fix for bad example:

def filter_foo(some_list):
    some_list = some_list[:] # make a copy
    # rest of code as it was
    return some_list

A different, easy-to-read solution to your problem with simple recursion. Added some comments in case anything wasn't clear.

def filter_words(word_list, filtered_words):
    new_list = []
    for item in word_list:
        if isinstance(item, list):
            # if it's a list... filter that list then append it
            new_list.append(filter_words(item, filtered_words))
        # otherwise it must be a word...
        elif item in filtered_words:
            # if it's in our excluded words, skip it
            continue
        else:
            # it's a word, it's not excluded, so we append it.
            new_list.append(item)
    return new_list

Testing

>>> l = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
>>> filter_words(l, ['a', 'an'])
[['this', 'is', 'pen', 'that', 'is', 'desk'], ['this', 'is', 'apple']]

This should work no matter how deeply nested (up to the recursion limit) the lists are. Could be refactored to any desired level of nestedness, too.

sytech
0

My way of copying the list copied references to the inner lists, not their values.

 ret = []
 for el1 in li1:
     ret.append(el1)

In this case I have to copy the values, and the ways to do that are below.

ret.append(el1[:])

or

import copy
ret = copy.deepcopy(li1)

or

ret.append(list(el1))

or something else.
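One such alternative, sketched here under the assumption that only a filtered result is needed, is to build the new nested list directly with a comprehension so no copy has to be made first. The names remove_words and blocked are made up for this example.

```python
def remove_words(nested, words_to_remove):
    """Build a new nested list without touching the input (hypothetical helper)."""
    blocked = set(words_to_remove)  # set gives fast membership tests
    return [[w for w in sentence if w not in blocked] for sentence in nested]


words = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
new_words = remove_words(words, ['a', 'an'])
print(new_words)  # [['this', 'is', 'pen', 'that', 'is', 'desk'], ['this', 'is', 'apple']]
```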

Thanks for all the answers.

rootpetit
0

Try this code

list1 = [['this', 'is', 'a', 'pen', 'that', 'is', 'a', 'desk'], ['this', 'is', 'an', 'apple']]
list2 = ['a', 'an']
# Build a new nested list so that list1 itself is not modified.
new_list = []
for sentence in list1:
    # keep only the words that do not appear in list2
    new_list.append([word for word in sentence if word not in list2])
print(new_list)
Usman