
I have this piece of code that accomplishes a text sequence processing task. It generally does what I want, but when I make a copy of one of my lists, the original gets modified. See below:

import pprint
from copy import copy

list_1 = [
      {'word': 'hey hows it going?', 's1': 1.2, 's2': 3.6},
      {'word': 'um', 's1': 3.7, 's2': 4.2},
      {'word': 'its raining outside today', 's1': 4.3, 's2': 5.0},
      {'word': 'and its really cold', 's1': 5.1, 's2': 6.6},
      {'word': 'dont you think?', 's1': 6.7, 's2': 7.6},
      {'word': 'its awful', 's1': 7.7, 's2': 9.0}
]
list_2 = [
  {'category': 0, 's1': 0.0, 's2': 3.8},
  {'category': 1, 's1': 3.9, 's2': 4.9},
  {'category': 1, 's1': 5.0, 's2': 7.2},
  {'category': 0, 's1': 7.3, 's2': 7.6},
  {'category': 1, 's1': 7.7, 's2': 9.0}
]


def combine_dicts(list_1, list_2):
    
    outputs = []

    for cat in list_2:
        start = cat['s1']
        end = cat['s2']
        out = copy(cat)
        out['words'] = ''
        outputs.append(out)
        list_1_copy = list_1.copy()

        for interval in list_1_copy:
            if interval['s1'] >= start and interval['s2'] <= end:
                out['words'] += interval['word'] + ' '
                interval['word'] = ''
            elif interval['s1'] <= start and interval['s2'] <= end:
                out['words'] += interval['word'] + ' '
                interval['word'] = ''
        out['words'] = out['words'].strip()
    pprint.pprint(outputs)
    
combine_dicts(list_1, list_2)

In the first for loop, I make a copy of list_1, called list_1_copy, so that I can blank out items in it as I iterate, without affecting the original list_1. However, running the code always changes the items in both list_1 and list_1_copy.

I thought I understood the copy() method. For example, I expected behaviour like this:

list_to_not_change = [1,2,3]
list_to_change = list_to_not_change.copy()

while list_to_change:
    list_to_change.pop(0)
    
print(list_to_not_change)
print(list_to_change)

Output:
[1, 2, 3]
[]

Where is my code/understanding wrong and why does the original list_1 become modified in my code? Thank you for any advice.

If useful, the output of my algorithm is:

[{'category': 0, 's1': 0.0, 's2': 3.8, 'words': 'hey hows it going?'},
 {'category': 1, 's1': 3.9, 's2': 4.9, 'words': 'um'},
 {'category': 1,
  's1': 5.0,
  's2': 7.2,
  'words': 'its raining outside today and its really cold'},
 {'category': 0, 's1': 7.3, 's2': 7.6, 'words': 'dont you think?'},
 {'category': 1, 's1': 7.7, 's2': 9.0, 'words': 'its awful'}]
connor449
  • Because `.copy()` is only *shallow*, and your list contains mutable dictionaries. – jonrsharpe Dec 13 '19 at 17:28
  • `copy()` makes a separate copy of _the list itself_, but not _the objects in the list_. For that, you need `deepcopy()`. – John Gordon Dec 13 '19 at 17:29
  • Does this answer your question? [How to clone or copy a list?](https://stackoverflow.com/questions/2612802/how-to-clone-or-copy-a-list) – Arn Dec 13 '19 at 17:30

2 Answers

You made a shallow copy. list_1_copy is a new list object, but its elements were not copied, so the two lists share the same dict objects. When you change a value inside any of those dicts, the change is visible through both lists.
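To make the sharing visible, here is a minimal sketch that compares a shallow and a deep copy of a small list of dicts. The names `original`, `shallow` and `deep` are made up for illustration and are not part of the question's code.

```python
import copy

original = [{'word': 'um'}, {'word': 'its awful'}]

shallow = original.copy()       # new outer list, same dict objects inside
deep = copy.deepcopy(original)  # new outer list and new dict objects

# The shallow copy holds the very same dict objects as the original.
shares_dicts = shallow[0] is original[0]
# The deep copy holds separate dict objects.
deep_is_separate = deep[0] is not original[0]

deep[0]['word'] = ''            # only the deep copy changes
after_deep_edit = original[0]['word']

shallow[1]['word'] = ''         # the original changes too
after_shallow_edit = original[1]['word']

print(shares_dicts, deep_is_separate)
print(repr(after_deep_edit), repr(after_shallow_edit))
```

The `is` checks compare object identity, which is why an edit made through `shallow` shows up in `original` while an edit made through `deep` does not.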

You need to use copy.deepcopy (from the standard copy module) to separate them entirely.
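Applied to the question's function, this could look like the sketch below. It rests on a few assumptions of mine: the deep copy is taken once before the outer loop (taking it inside the loop, where the question calls `list_1.copy()`, would restore the blanked words on every pass and earlier words would be added again), the function returns the result instead of printing it, and the data is shortened to two entries per list. The name `remaining` is hypothetical.

```python
import copy

def combine_dicts(list_1, list_2):
    # Deep-copy once, before the outer loop: consumed words stay blanked
    # across categories, and the caller's list_1 is never modified.
    remaining = copy.deepcopy(list_1)
    outputs = []
    for cat in list_2:
        start, end = cat['s1'], cat['s2']
        out = dict(cat)  # a shallow copy is enough here: the values are numbers
        out['words'] = ''
        for interval in remaining:
            # Same two conditions as in the question, merged into one test.
            starts_inside = interval['s1'] >= start
            starts_before = interval['s1'] <= start
            if (starts_inside or starts_before) and interval['s2'] <= end:
                out['words'] += interval['word'] + ' '
                interval['word'] = ''
        out['words'] = out['words'].strip()
        outputs.append(out)
    return outputs

list_1 = [
    {'word': 'hey hows it going?', 's1': 1.2, 's2': 3.6},
    {'word': 'um', 's1': 3.7, 's2': 4.2},
]
list_2 = [
    {'category': 0, 's1': 0.0, 's2': 3.8},
    {'category': 1, 's1': 3.9, 's2': 4.9},
]

result = combine_dicts(list_1, list_2)
print(result)
print(list_1)  # still contains the original words
```

Only the private deep copy is blanked, so `list_1` can be reused after the call.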

Prune

Try with deepcopy:

A deep copy is built recursively: a new collection object is created first and then populated with copies of the child objects found in the original. Because every nested object is itself copied, changes made to the copy are not reflected in the original. In Python this is done with the copy.deepcopy() function.

Example:

import copy 

# initializing list 1 
li1 = [1, 2, [3,5], 4] 

# using deepcopy to deep copy  
li2 = copy.deepcopy(li1) 
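As a possible continuation of the example above, the following sketch changes the copy, including its nested list, and checks that `li1` is left alone. The specific edits (`7` and `99`) are arbitrary values chosen for illustration.

```python
import copy

li1 = [1, 2, [3, 5], 4]
li2 = copy.deepcopy(li1)

li2[2][0] = 7   # change the nested list inside the copy
li2[0] = 99     # change a top-level element of the copy

print(li1)  # [1, 2, [3, 5], 4]
print(li2)  # [99, 2, [7, 5], 4]
```

With `li1.copy()` in place of `copy.deepcopy(li1)`, the nested list `[3, 5]` would be shared, and the first assignment would also change `li1`.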
Harsha Biyani