2

I have a table that looks like this:

Header = Category | US | UK | CA
Row 1 = A | value1 | value1 | value2
Row 2 = B | value2 | value1 | value3
Row 3 = C | value1 | value3 | value1

The "Category" column contains unique values. The other columns contain values that may or may not be unique. The way to read it is: for category A, US items have value1.

I am trying to create a dictionary whose keys are the categories and whose values are themselves dictionaries, with the countries as keys and the table entries as values.

Dict = {A: {US: value1, UK: value1, CA: value2},
        B: {US: value2, UK: value1, CA: value3},
        C: {US: value1, UK: value3, CA: value1}}

It's a long list, so I need to create it through iteration. I've been stuck on it all day. I manage to create the keys correctly, but I can't get the "dictionary-values" right.

Is there an easy way to do this?

glibdud
Sapehi

5 Answers

2

Something like this should work and be easy enough to understand. Basically, just split each line on " | ":

import pprint


def main():
    pp = pprint.PrettyPrinter(indent=2)
    path = "table.txt"
    res = {}
    with open(path, "r") as f:
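        # Header line: skip the "Header = Category" field, keep the country names.
        countries = f.readline().strip().split(" | ")[1:]
        for line in f:
            # "Row 1 = A | value1 | ..." splits into "Row 1 = A" and the values.
            key_part, *values = line.strip().split(" | ")
            key = key_part.split()[-1]  # last token of "Row 1 = A" is the category
            res[key] = dict(zip(countries, values))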
    pp.pprint(res)


if __name__ == "__main__":
    main()

table.txt:

Header = Category | US | UK | CA
Row 1 = A | value1 | value1 | value2
Row 2 = B | value2 | value1 | value3
Row 3 = C | value1 | value3 | value1

Output:

{ 
  'A': {'CA': 'value2', 'UK': 'value1', 'US': 'value1'},
  'B': {'CA': 'value3', 'UK': 'value1', 'US': 'value2'},
  'C': {'CA': 'value1', 'UK': 'value3', 'US': 'value1'}
}
Sash Sinha
  • Thanks for this answer. I spent hours trying to figure out why creating a dictionary and then assigning it as another dictionary's value ended up with all resulting assignments being the same! I still can't figure it out. So I used your assigning using a live, unrolled (for ... in) { ..... } style. My attempt would have worked in perl. Curse you python! – gkd720 Feb 26 '20 at 17:25
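
The symptom described in the comment above (every key ending up with the same inner dictionary) is commonly caused by reusing one dict object across loop iterations. That is an assumption, since the commenter's code is not shown. A minimal sketch with made-up data, contrasting the pitfall with building a fresh inner dict each time:

```python
# Hypothetical data; the commenter's actual code is not shown.
countries = ["US", "UK", "CA"]
rows = {"A": ["value1", "value1", "value2"],
        "B": ["value2", "value1", "value3"]}

# Pitfall: one inner dict is created once and reused for every key.
shared = {}
aliased = {}
for category, values in rows.items():
    for country, value in zip(countries, values):
        shared[country] = value
    aliased[category] = shared  # stores a reference to the same object, not a copy

# Fix: build a fresh inner dict on each iteration.
fresh = {}
for category, values in rows.items():
    fresh[category] = dict(zip(countries, values))

print(aliased["A"] is aliased["B"])  # True: both keys point at the same object
print(fresh["A"] is fresh["B"])      # False: each key has its own dict
```

Assigning a dict stores a reference to it, so later changes to that dict show up under every key that refers to it.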
1

Assuming your table is an array of arrays:

table = [['Category', 'US', 'UK', 'CA'],
         ['A', 'value1', 'value1', 'value2'],
         ['B', 'value2', 'value1', 'value2']]

# Named `result` rather than `dict` so the built-in type is not shadowed.
result = {table[i][0]: {table[0][j]: table[i][j] for j in range(1, len(table[i]))}
          for i in range(1, len(table))}
print(result)

Gives you:

{'A': {'US': 'value1', 'UK': 'value1', 'CA': 'value2'}, 'B': {'US': 'value2', 'UK': 'value1', 'CA': 'value2'}}
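
The same nested dictionary can also be built without index arithmetic. The following is only a sketch of an alternative, reusing the sample table from this answer; the names `header`, `rows`, `countries` and `nested` are illustrative choices.

```python
# Same sample table as in the answer above.
table = [['Category', 'US', 'UK', 'CA'],
         ['A', 'value1', 'value1', 'value2'],
         ['B', 'value2', 'value1', 'value2']]

header, *rows = table   # first row is the header, the rest are data rows
countries = header[1:]  # drop the 'Category' label

# The first cell of each row is the key; the remaining cells are
# paired with the country names by zip().
nested = {category: dict(zip(countries, values))
          for category, *values in rows}
print(nested)
```

Pairing with `zip` avoids the `range(1, len(...))` bookkeeping, at the cost of silently truncating if a row is shorter than the header.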

skymon
0

Pandas to dict: perhaps load your text file into pandas and then convert it to a dict, setting the index to Category.

For example:

import pandas as pd

df = pd.read_csv("data.csv", sep=",")
# orient "index" yields {index value: {column: value}}, i.e. {Category: {country: value}}
s = df.set_index('Category').to_dict('index')

print(s)

data.csv:

Category,US,UK,CA
A,1,1,1
B,2,2,2
C,3,3,3
ramm
0

The correct way to put values into a dictionary is, well, to just assign them to a key:

dictionary[key] = v

Because you want dictionaries as values, just write {US : value1, UK : value1, CA : value2} or something similar in place of v, with value1, value2 and so on bound to the correct values.

And if US, UK and CA are meant to be string keys in the inner dictionaries, rather than variables holding the keys, write "UK": value1 instead of UK: value1.
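
Put together, the assignment pattern described here might look like the following sketch. The `rows` list is placeholder data mirroring the question's table, not something read from a real file.

```python
# Placeholder rows: (category, US value, UK value, CA value).
rows = [("A", "value1", "value1", "value2"),
        ("B", "value2", "value1", "value3"),
        ("C", "value1", "value3", "value1")]

dictionary = {}
for category, us, uk, ca in rows:
    # Keys are string literals; a new inner dict is assigned on every pass.
    dictionary[category] = {"US": us, "UK": uk, "CA": ca}

print(dictionary["B"]["CA"])  # value3
```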

hajduzs
0

Assuming your table is exactly as described in your question and is in the file sapehi.txt, this should do what you want.

with open("sapehi.txt", "r") as f:  # The file is closed automatically on exit
    first = f.readline()
    table = first.split("=", 1)[1]  # Throw away the "Header =" part
    columns = [x.strip() for x in table.split("|")][1:]  # Create a list of the column headings

    output = {}  # Create the output dictionary

    for line in f:
        if not line.strip():  # Skip blank lines, e.g. at the end of the file
            continue
        row = line.split("=", 1)[1]  # Throw away the "Row x" part
        values = [x.strip() for x in row.split("|")]  # Create a list of the row elements
        category = values[0]  # Category is the first row element
        values = values[1:]  # The values are all the rest
        output[category] = {}  # Create a dict for this category
        for index, value in enumerate(values):
            output[category][columns[index]] = value  # Populate it with the values

print(output)

This script gets the column headers into a list, which is accessed for each row.

TechnoSam