I looked up "nested dict" and "nested list" but neither method worked.

I have a python object with the following structure:

    [{
    'id': 'productID1', 'name': 'productname A',
    'option': {
        'size': {
            'type': 'list',
            'name': 'size',
            'choices': [
                {'value': 'M'},
            ]}},

    'variant': [{
        'id': 'variantID1',
        'choices':
        {'size': 'M'},
        'attributes':
        {'currency': 'USD', 'price': 1}}]
}]

What I need to output is a CSV file in the following flattened structure:

id, productname, variantid, size, currency, price
productID1, productname A, variantID1, M, USD, 1
productID1, productname A, variantID2, L, USD, 2
productID2, productname A, variantID3, XL, USD, 3

i tried this solution: Python: Writing Nested Dictionary to CSV or this one: From Nested Dictionary to CSV File

I got rid of the [] around and within the data and, for example, used the code snippet from the second link, adapted to my needs. IRL I can't get rid of the [] because that's simply the format I get when calling the API.

with open('productdata.csv', 'w', newline='', encoding='utf-8') as output:
    writer = csv.writer(output, delimiter=';', quotechar = '"', quoting=csv.QUOTE_NONNUMERIC)
    for key in sorted(data):
        value = data[key]
        if len(value) > 0:
            writer.writerow([key, value])
        else:
            for i in value:
                writer.writerow([key, i, value])

but the output is like this:

"id";"productID1"
"name";"productname A"
"option";"{'size': {'type': 'list', 'name': 'size', 'choices': {'value': 'M'}}}"
"variant";"{'id': 'variantID1', 'choices': {'size': 'M'}, 'attributes': {'currency': 'USD', 'price': 1}}"

Can anyone help me out, please?

thanks in advance

boese
  • Can you show us what you wrote that raised the error? – C.Nivs Feb 02 '21 at 22:22
  • yes, i added the code used & the error – boese Feb 02 '21 at 22:56
  • Either you're confusing where the error is actually being raised (based on your code, it should be at `data[sessionId]`), or the code you've posted is incomplete – C.Nivs Feb 02 '21 at 23:28
  • @c.Nivs yes, you're right, the `data` with `[ ]` around it, will produce the error at `data[sessionId]`. i removed the `[ ]` and got it at `writer.writerow([sessionId, item, ratings[item]])`. – boese Feb 03 '21 at 08:06
  • @C.Nivs i tweaked it a bit, but the output still isn't satisfying... ^^ and IRL i can't get rid of the `[ ]` because that's what i get from the API call. – boese Feb 03 '21 at 08:24

3 Answers

list indices must be integers not strings

The following presents a visual example of a python list:

0 carrot.
1 broccoli.
2 asparagus.
3 cauliflower.
4 corn.
5 cucumber.
6 eggplant.
7 bell pepper

0, 1, 2 are all "indices".
"carrot", "broccoli", etc... are all said to be "values"

Essentially, a python list is a machine which has integer inputs and arbitrary outputs.

Think of a python list as a black-box:

  1. A number, such as 5, goes into the box.
  2. you turn a crank handle attached to the box.
  3. Maybe the string "cucumber" comes out of the box

You got an error: TypeError: list indices must be integers or slices, not str
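
A minimal sketch of how this error arises, reusing the vegetable list from above (the variable names `vegetables` and `message` are only for illustration):

```python
# Hypothetical list mirroring the vegetable example above.
vegetables = ["carrot", "broccoli", "asparagus", "cauliflower",
              "corn", "cucumber", "eggplant", "bell pepper"]

# An integer index works.
print(vegetables[5])  # cucumber

# A string index raises TypeError, even if the string looks like a number.
try:
    vegetables["5"]
except TypeError as error:
    message = str(error)
    print(message)
```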

There are various solutions.

Convert Strings into Integers

Convert the string into an integer.

listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]

string_index = "2"
integer_index = int(string_index)

element = listy_the_list[integer_index]

so yeah.... that works as long as your string-indices look like numbers (e.g. "456" or "7")

The integer class constructor, int(), only accepts strings that look like whole numbers.

It does ignore leading and trailing white-space characters, so x = int("3 ") works and gives 3; you do not need to call strip() first.

However, x = int("3.0") or x = int("three") will raise a ValueError.
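
If the strings come from somewhere you do not control, the conversion can be wrapped so that a bad string does not crash the program. This is only a sketch; the helper name `to_index` is made up for this example:

```python
def to_index(text):
    """Return text as an int, or None if it does not look like a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]

index = to_index("2")
if index is not None:
    print(listy_the_list[index])  # asparagus

print(to_index("two"))  # None
```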

Use a Container which Allows Keys to be Strings

Long ago, before electronic computers existed, there were various types of containers in the world:

  • cookie jars
  • muffin tins
  • cardboard boxes
  • glass jars
  • steel cans.
  • back-packs
  • duffel bags
  • closets/wardrobes
  • brief-cases

In computer programming there are also various types of "containers"
You do not have to use a list as your container, if you do not want to.

There are containers where the keys (AKA indices) are allowed to be strings, instead of integers.

In python, the standard container which is like a list, but where the keys/indices can be strings, is a dictionary.

thisdict = {
  "make": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict["make"] == "Ford"  # True; there is no "brand" key, so thisdict["brand"] would raise KeyError

If you want to index into a container using strings, instead of integers, then use a dict, instead of a list
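
Containers can also be nested, which is the situation in the question: an outer list holding dicts, which in turn hold more lists and dicts. A rough sketch, assuming a trimmed-down copy of that data (the variable names are only for illustration), of using integers for the list levels and strings for the dict levels:

```python
# Shortened, illustrative copy of the structure from the question.
data = [{
    'id': 'productID1',
    'name': 'productname A',
    'variant': [{
        'id': 'variantID1',
        'choices': {'size': 'M'},
        'attributes': {'currency': 'USD', 'price': 1},
    }],
}]

first_product = data[0]                      # list level: integer index
product_id = first_product['id']             # dict level: string key
first_variant = first_product['variant'][0]  # dict key, then list index
size = first_variant['choices']['size']

print(product_id, size)  # productID1 M
```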

The following is an example of a python dict which has state names as input and state abbreviations as output:

us_state_abbrev = {
    'Alabama': 'AL',
    'Alaska': 'AK',
    'American Samoa': 'AS',
    'Arizona': 'AZ',
    'Arkansas': 'AR',
    'California': 'CA',
    'Colorado': 'CO',
    'Connecticut': 'CT',
    'Delaware': 'DE',
    'District of Columbia': 'DC',
    'Florida': 'FL',
    'Georgia': 'GA',
    'Guam': 'GU',
    'Hawaii': 'HI',
    'Idaho': 'ID',
    'Illinois': 'IL',
    'Indiana': 'IN',
    'Iowa': 'IA',
    'Kansas': 'KS',
    'Kentucky': 'KY',
    'Louisiana': 'LA',
    'Maine': 'ME',
    'Maryland': 'MD',
    'Massachusetts': 'MA',
    'Michigan': 'MI',
    'Minnesota': 'MN',
    'Mississippi': 'MS',
    'Missouri': 'MO',
    'Montana': 'MT',
    'Nebraska': 'NE',
    'Nevada': 'NV',
    'New Hampshire': 'NH',
    'New Jersey': 'NJ',
    'New Mexico': 'NM',
    'New York': 'NY',
    'North Carolina': 'NC',
    'North Dakota': 'ND',
    'Northern Mariana Islands':'MP',
    'Ohio': 'OH',
    'Oklahoma': 'OK',
    'Oregon': 'OR',
    'Pennsylvania': 'PA',
    'Puerto Rico': 'PR',
    'Rhode Island': 'RI',
    'South Carolina': 'SC',
    'South Dakota': 'SD',
    'Tennessee': 'TN',
    'Texas': 'TX',
    'Utah': 'UT',
    'Vermont': 'VT',
    'Virgin Islands': 'VI',
    'Virginia': 'VA',
    'Washington': 'WA',
    'West Virginia': 'WV',
    'Wisconsin': 'WI',
    'Wyoming': 'WY'
}
Toothpick Anemone
  • Hi there, thanks for the reply, please see my edits.. i got rid of that type error by not indexing a list with a string-index. but still, the output isn't satisfying.. and IRL i can't get rid of the `[ ]` because that's what i get from the API call. – boese Feb 03 '21 at 08:26
  • @boese If you have more than one problem with your code, you should post each problem as a separate question on stackoverflow.com. The idea is that people google something, like "UnicodeEncodeError: 'ascii' codec can't encode character", and find the question. If you re-write your stack overflow question so that the error has been fixed, then nobody else can learn from your mistake. The purpose of stack overflow is to learn from other programmers' mistakes. If you have a second error in your code, post a new question about that specific error; do not re-write the original question. – Toothpick Anemone Feb 09 '21 at 16:24
  • @boese Ideally, every stack overflow question only discusses one specific error. You need to break the mistakes in your program into separate pieces. Post each mistake separately. Ideally, you will delete as much of the code in your program as possible such that you still get the same error message. Simplify your code such that the same mistake is created, but the code is easier to read. Then you will post that short snippet on stack-overflow.com – Toothpick Anemone Feb 09 '21 at 16:25
  • that's exactly why i posted the final solution. the error was not part of the question, actually. but thanks for the excursion. – boese Feb 10 '21 at 11:44

i could actually iterate this list and create my own sublist, e.g. a list of variants

data = [{
    'id': 'productID1', 'name': 'productname A',
    'option': {
        'size': {
            'type': 'list',
            'name': 'size',
            'choices': [
                {'value': 'M'},
            ]}},

    'variant': [{
        'id': 'variantID1',
        'choices':
        {'size': 'M'},
        'attributes':
        {'currency': 'USD', 'price': 1}}]
},
    {'id': 'productID2', 'name': 'productname B',
    'option': {
        'size': {
            'type': 'list',
            'name': 'size',
            'choices': [
                {'value': 'XL', 'salue':'XXL'},
            ]}},

    'variant': [{
        'id': 'variantID2',
        'choices':
        {'size': 'XL', 'size2':'XXL'},
        'attributes':
        {'currency': 'USD', 'price': 2}}]
    }

]

new_list = {}

for item in data:

    new_list.update(id=item['id'])
    new_list.update(name=item['name'])

    for variant in item['variant']:
        new_list.update(varid=variant['id'])

        for vchoice in variant['choices']:
            new_list.update(vsize=variant['choices'][vchoice])

        for attribute in variant['attributes']:
            new_list.update(vprice=variant['attributes'][attribute])

    for option in item['option']['size']['choices']:
        new_list.update(osize=option['value'])

print(new_list)

but the output is always the last item of the iteration, because i always overwrite new_list with update().

{'id': 'productID2', 'name': 'productname B', 'varid': 'variantID2', 'vsize': 'XXL', 'vprice': 2, 'osize': 'XL'}
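
One way around the overwriting, as a rough sketch: build a fresh dict for every variant and append it to a list, instead of updating one shared dict. The names `rows` and `row` are made up for this example, and the data is a shortened copy of the one above:

```python
# Shortened, illustrative copy of the data above.
data = [
    {'id': 'productID1', 'name': 'productname A',
     'variant': [{'id': 'variantID1',
                  'choices': {'size': 'M'},
                  'attributes': {'currency': 'USD', 'price': 1}}]},
    {'id': 'productID2', 'name': 'productname B',
     'variant': [{'id': 'variantID2',
                  'choices': {'size': 'XL'},
                  'attributes': {'currency': 'USD', 'price': 2}}]},
]

rows = []
for item in data:
    for variant in item['variant']:
        # A new dict per variant, so earlier rows are never overwritten.
        row = {
            'id': item['id'],
            'name': item['name'],
            'varid': variant['id'],
            'vsize': variant['choices']['size'],
            'vprice': variant['attributes']['price'],
        }
        rows.append(row)

print(rows)
```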
boese

here's the final solution which worked for me:

data = [{
    'id': 'productID1', 'name': 'productname A',

    'variant': [{
        'id': 'variantID1',
        'choices':
        {'size': 'M'},
        'attributes':
        {'currency': 'USD', 'price': 1}},
        
        {'id':'variantID2',
        'choices':
        {'size': 'L'},
        'attributes':
        {'currency':'USD', 'price':2}}
        ]
},
{
    'id': 'productID2', 'name': 'productname B',

    'variant': [{
        'id': 'variantID3',
        'choices':
        {'size': 'XL'},
        'attributes':
        {'currency': 'USD', 'price': 3}},
        
        {'id':'variantID4',
        'choices':
        {'size': 'XXL'},
        'attributes':
        {'currency':'USD', 'price':4}}
        ]
}
]

import csv

products = []  # one flat dict per variant

for item in data:
    for variant in item['variant']:
        # start a fresh dict for every variant so earlier rows are not overwritten
        dic = {}
        dic.update(ProductID=item['id'])
        dic.update(Name=item['name'].title())
        dic.update(ID=variant['id'])
        dic.update(size=variant['choices']['size'])
        dic.update(Price=variant['attributes']['price'])
        products.append(dic)

# the keys of the first row become the CSV header
keys = products[0].keys()

with open('productdata.csv', 'w', newline='', encoding='utf-8') as output_file:
    dict_writer = csv.DictWriter(output_file, keys, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    dict_writer.writeheader()
    dict_writer.writerows(products)

with the following output:

"ProductID";"Name";"ID";"size";"Price"
"productID1";"Productname A";"variantID1";"M";1
"productID1";"Productname A";"variantID2";"L";2
"productID2";"Productname B";"variantID3";"XL";3
"productID2";"Productname B";"variantID4";"XXL";4

which is exactly what i wanted.
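
The same flattening could also be wrapped in a small generator. This is only a sketch under a few assumptions: the function name `flatten_products` and the column names are made up for this example, the `currency` column from the original question is included, and the CSV is written to an in-memory `io.StringIO` buffer so the result can be inspected without touching the file system.

```python
import csv
import io


def flatten_products(products):
    """Yield one flat dict per variant of every product."""
    for product in products:
        for variant in product['variant']:
            yield {
                'id': product['id'],
                'productname': product['name'],
                'variantid': variant['id'],
                'size': variant['choices']['size'],
                'currency': variant['attributes']['currency'],
                'price': variant['attributes']['price'],
            }


# Shortened, illustrative sample in the same shape as the API data.
sample = [{
    'id': 'productID1', 'name': 'productname A',
    'variant': [
        {'id': 'variantID1', 'choices': {'size': 'M'},
         'attributes': {'currency': 'USD', 'price': 1}},
        {'id': 'variantID2', 'choices': {'size': 'L'},
         'attributes': {'currency': 'USD', 'price': 2}},
    ],
}]

fieldnames = ['id', 'productname', 'variantid', 'size', 'currency', 'price']
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter=';',
                        quoting=csv.QUOTE_NONNUMERIC)
writer.writeheader()
writer.writerows(flatten_products(sample))

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead, the buffer can be swapped for the `open('productdata.csv', 'w', newline='', encoding='utf-8')` call used above.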

boese