
I am currently writing some code that reads lines in from a text file. The line is split into 3 different segments, with the first segment being a user ID.

For example, one line would look like this:

11 490 5 

I have a list with as many elements as there are users, where each element corresponds to a user (e.g., exampleList[4] stores data for the 5th user).

Each list element contains a dictionary of indefinite length, where the key is the second segment of the line, and the value is the third segment of the line.

The length of the dictionary (the number of key-value pairs) increases if the same user's ID occurs in another line. The idea is that when another line with the same user ID is encountered, the data from that line is appended to the dictionary in the list element that corresponds to that user.

For example, the above line would be stored in something like this:

exampleList[10] = {490:5}

and if the program read another line like this: 11 23 9

the list item would update itself to this:

exampleList[10] = {490:5, 23:9}

The way my program works is that it first collects the number of users, and then creates a list like this:

exampleList = [{}] * numberOfUsers

It then finds the positions of the whitespace in the line using re.finditer, and uses those positions to extract the numbers through basic string slicing.

That part works perfectly, but I'm unsure of how to update dictionaries within a list, namely appending new key-value pairs to the dictionary.

I've read about using a for loop here, but that won't work for me, since it adds the pair to every dictionary in the list instead of only to the dictionary in a certain cell.

Sample code:

    oFile = open("file.txt", encoding = "ISO-8859-1")
    text = oFile.readlines()
    cL = [{}] * numOfUsers #imported from another method
    for line in text:
        a = [m.start() for m in re.finditer('\t', line)]
        userID = int(line[0:a[0]])
        uIDIndex = userID - 1
        cL[uIDIndex].update({int(line[a[0]+1:a[1]]):int(line[a[1]+1:a[2]])})
    print(cL)

file.txt: 
1   242 3   
3   302 3   
5   333 10  
1   666 9   

expected output:
[{242:3 , 666:9},{},{302:3},{},{333:10}]

actual output:
[{242: 3, 333: 10, 302: 3, 666: 9}, {242: 3, 333: 10, 302: 3, 666: 9}, {242: 3, 333: 10, 302: 3, 666: 9}, {242: 3, 333: 10, 302: 3, 666: 9}, {242: 3, 333: 10, 302: 3, 666: 9}]

For some reason, it populates all dictionaries in the list with all the values.

user132520
  • Will you please share some of your code. – Hassan Mehmood May 18 '16 at 03:47
  • If you want a list of dictionaries, why do you create a list of lists? – TessellatingHeckler May 18 '16 at 03:48
  • Can you put in some solid sample and expected output? – Rajesh Yogeshwar May 18 '16 at 03:59
  • I have edited to include code, an example of the file being read, and expected output and actual output. – user132520 May 18 '16 at 04:14
  • @user132520 Why are there empty dictionaries being created in your expected output? – Keatinge May 18 '16 at 04:25
  • @Racialz The program won't necessarily populate every single dictionary. In the expected output above, the 2nd and 4th dictionaries are empty because none of the first segments of the lines in file.txt (1, 3, 5, 1) correspond to them. If, for example, there was a line "2 541 3 1234", then I'd expect the second dictionary to contain {541:3}. – user132520 May 18 '16 at 04:31
  • Does the index `1` and the fourth value `881250949` have to be the same for two items to be qualified to go into the same dictionary? What if just one of them is the same? – Keatinge May 18 '16 at 04:34
  • @Racialz The fourth value doesn't actually matter, so I'll delete it accordingly. All that matters for the two items to go into the same dictionary is the first value. – user132520 May 18 '16 at 04:36
  • `cL = [{}] * numOfUsers` is a bad idea: see [here](http://stackoverflow.com/questions/240178/python-list-of-lists-changes-reflected-across-sublists-unexpectedly) for a list example, but the same issue affects dictionaries as well. I also just noticed `exampleList = [[]] * numberOfUsers`, which is a bad idea for the same reason. – DSM May 18 '16 at 04:48
  • @DSM I didn't know how that worked, thank you. – user132520 May 18 '16 at 05:16
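
The point in DSM's comment can be checked directly. Below is a minimal sketch (the names `shared` and `distinct` are illustrative and not from the question) showing that `[{}] * n` repeats one dictionary object n times, whereas a list comprehension creates n separate dictionaries:

```python
n = 3

# Multiplying a list copies references, so all slots hold the SAME dict.
shared = [{}] * n
shared[0][490] = 5
print(shared)                      # every slot shows the new key
print(shared[0] is shared[1])      # True: one object, three references

# A list comprehension evaluates {} once per iteration: n separate dicts.
distinct = [{} for _ in range(n)]
distinct[0][490] = 5
print(distinct)                    # only the first slot changes
print(distinct[0] is distinct[1])  # False
```

This matches the "actual output" in the question: every update went into the single shared dictionary, so every list slot displayed all the pairs.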

2 Answers


You can just access the dictionary by the index. Here is a simple example:

    >>> A = []
    >>> A.append(dict())
    >>> A.append(dict())
    >>> A[0][5] = 7
    >>> A
    [{5: 7}, {}]
    >>> A[1][4] = 8
    >>> A[0][3] = 9
    >>> A[1][8] = 10
    >>> A
    [{3: 9, 5: 7}, {8: 10, 4: 8}]
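
Index access like this only updates one user's dictionary if every list element is its own dictionary object. Here is a sketch applied to the question's data, assuming whitespace-separated integer fields and 1-based user IDs. `build_user_dicts` is a hypothetical helper name, and `str.split()` is swapped in for the question's `re.finditer` slicing:

```python
def build_user_dicts(lines, num_users):
    """Return a list with one dict per user, filled from 'userID key value' lines."""
    # One fresh dict per user; [{}] * num_users would repeat a single dict.
    result = [{} for _ in range(num_users)]
    for line in lines:
        fields = line.split()   # splits on any run of spaces or tabs
        if len(fields) < 3:     # skip blank or malformed lines
            continue
        user_id, key, value = (int(x) for x in fields[:3])
        result[user_id - 1][key] = value   # user IDs are 1-based
    return result


sample = ["1\t242\t3\n", "3\t302\t3\n", "5\t333\t10\n", "1\t666\t9\n"]
print(build_user_dicts(sample, 5))
# [{242: 3, 666: 9}, {}, {302: 3}, {}, {333: 10}]
```

Any extra fields after the third are ignored, which is consistent with the asker's note that the fourth value does not matter.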
Sagnik Ghosh

I'm not positive I understand your problem correctly, but I was able to get the output you desired. Note that this solution completely ignores the fourth value on each line.

    import re

    fileData = []  # (userID, key, value) tuples parsed from file.txt

    with open("file.txt") as f:
        for line in f:
            regExp = re.match(r"(\d+)\s+(\d+)\s+(\d+)", line)  # extract the first three numbers in the row
            if regExp is None:
                continue  # skip blank or malformed lines
            fileData.append((int(regExp.group(1)), int(regExp.group(2)), int(regExp.group(3))))
    maxIndex = max(fileData, key=lambda x: x[0])[0]  # largest user ID in the file (5 in this case)

    finaList = []  # the list where your output will be stored
    for i in range(1, maxIndex + 1):  # user IDs in your example are 1-based
        thisDict = {}  # start with an empty dict
        for item in fileData:
            if item[0] == i:
                thisDict[item[1]] = item[2]  # add a key-value pair for every row with this user ID
        finaList.append(thisDict)  # add this dict to the output list

    print(finaList)
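
The nested loop above rescans `fileData` once per user. If that matters for larger files, a single pass is possible. This is a sketch under the same assumptions (integer fields, 1-based user IDs); `group_by_user` is a hypothetical name, and the lines are passed in as an iterable so the example runs without `file.txt`:

```python
from collections import defaultdict


def group_by_user(lines):
    """Group 'userID key value' lines into a list of dicts in one pass."""
    by_user = defaultdict(dict)   # user ID -> {key: value}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:       # skip blank or malformed lines
            continue
        user_id, key, value = (int(x) for x in fields[:3])
        by_user[user_id][key] = value
    if not by_user:
        return []
    # Users that never appear get an empty dict, as in the expected output.
    return [by_user.get(i, {}) for i in range(1, max(by_user) + 1)]


print(group_by_user(["1 242 3", "3 302 3", "5 333 10", "1 666 9"]))
# [{242: 3, 666: 9}, {}, {302: 3}, {}, {333: 10}]
```

An open file object can be passed in place of the list, since iterating over a file yields its lines.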
Keatinge
  • Perfect! Thank you so much. Sorry if I made the problem hard to understand; I'm not the greatest at communicating. – user132520 May 18 '16 at 04:45
  • I don't think the problem was your explanation; I think it's just a complicated problem that's hard to convey in text. But yes, understanding the problem was 99% of the battle for me. Coding the solution wasn't very complicated. – Keatinge May 18 '16 at 04:46