I have a function that parses a tab-separated text file. The parsing simply takes each column and stores it in a dictionary. The code is as follows:

import pprint

# filePath is assumed to be defined elsewhere in the module
def parseRatingsFile():
    jsonArray = []
    json = {
        "userID": "placeholder",
        "itemID": "placeholder",
        "rating": "placeholder",
        "timestamp": "placeholder"
    }
    with open(filePath, 'r') as file:
        data = file.readlines()
        data = [x.strip() for x in data]
        for row in data:
            cols = row.split("\t")
            json["userID"] = cols[0]
            json["itemID"] = cols[1]
            json["rating"] = cols[2]
            json["timestamp"] = cols[3]
            # print(json)
            jsonArray.append(json)
    pprint.pprint(jsonArray)
    return jsonArray

If I print out just the json dictionary in the for loop, I get this output (which is right):

{'userID': '276', 'itemID': '246', 'rating': '4', 'timestamp': '874786686'}
{'userID': '557', 'itemID': '529', 'rating': '5', 'timestamp': '881179455'}
{'userID': '913', 'itemID': '258', 'rating': '4', 'timestamp': '889331049'}

If I print out the jsonArray list, I get this output:

[{'itemID': '258', 'rating': '4', 'timestamp': '889331049', 'userID': '913'},
 {'itemID': '258', 'rating': '4', 'timestamp': '889331049', 'userID': '913'},
 {'itemID': '258', 'rating': '4', 'timestamp': '889331049', 'userID': '913'}]

I don't particularly care about the change in order. What is strange is that jsonArray only contains the last updated json dictionary, repeated once for each iteration of the loop. I am using Python 3.6.6.

Does append in Python ignore the changes that occur in for loops? Otherwise, I have no clue why this is not working.

bitscuit
  • `json` is a single dictionary object. When you modify `json`, it reflects in all instances of `json` in your list because dictionaries are mutable. – roganjosh Oct 01 '18 at 20:02
  • You are adding **one and the same** dictionary to the list, over and over again. Printing after each alteration only shows you snapshots in time, writing out the current state of that one dictionary. That still doesn't give you separate copies. – Martijn Pieters Oct 01 '18 at 20:03
  • Create a new dictionary for each entry in the loop, or create a copy each time. `list.append()` doesn't create copies. – Martijn Pieters Oct 01 '18 at 20:04
  • You may want to read up on [how names and objects work in Python](https://nedbatchelder.com/text/names.html) – Martijn Pieters Oct 01 '18 at 20:05
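
What the comments describe can be sketched roughly as follows. This is only a minimal illustration, not the asker's final code: the names `parse_ratings`, `sample`, `shared` and `aliased` are made up for the example, and the parser takes an iterable of lines rather than opening `filePath`, so that it runs on its own. The first part reproduces the aliasing, the second builds a fresh dictionary on every iteration.

```python
# Part 1: the problem in miniature. One dict is created once and
# appended three times, so the list holds three references to it.
shared = {"userID": "placeholder"}
aliased = []
for uid in ["276", "557", "913"]:
    shared["userID"] = uid   # mutates the one existing dict
    aliased.append(shared)   # appends a reference, not a copy

# Every entry is the same object, so all show the last value written.
print(aliased[0] is aliased[2])


# Part 2: one possible fix. A new dict is created inside the loop,
# so each appended entry is an independent object.
def parse_ratings(lines):
    """Parse tab-separated rating rows into a list of separate dicts."""
    records = []
    for row in lines:
        cols = row.strip().split("\t")
        record = {
            "userID": cols[0],
            "itemID": cols[1],
            "rating": cols[2],
            "timestamp": cols[3],
        }
        records.append(record)
    return records


# Hypothetical sample rows, taken from the output shown in the question.
sample = [
    "276\t246\t4\t874786686\n",
    "557\t529\t5\t881179455\n",
    "913\t258\t4\t889331049\n",
]
parsed = parse_ratings(sample)
print(parsed)
```

If the placeholder template is worth keeping, appending `json.copy()` instead of `json` inside the original loop should have the same effect, since `dict.copy()` returns a new (shallow) dictionary each time.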
