Creating a function to read a file and create a dictionary

Question

I am trying to read from a file that contains elements that I am going to edit and store in a dictionary. However, I get this error when running the code:

def read_file(filename):
    infile = open(filename, 'r')
    elements = {}
    for line in infile:
        infile.readline()
        words = line.split(';')
        for i in range(len(words)):
            element = words[i].split('-')[0].upper().strip()
            density = words[i].split('-')[1].strip().replace(',', '')
            elements[element] = float(density)
    return elements


read_file('atm_moon.txt')

Error:

Traceback (most recent call last):
  File "atm_moon.py", line 14, in <module>
    read_file('atm_moon.txt')
  File "atm_moon.py", line 9, in read_file
    density = words[i].split('-')[1].strip().replace(',', '')
IndexError: list index out of range

The file looks like this: https://www.uio.no/studier/emner/matnat/ifi/IN1900/h20/ressurser/live_programmering/atm_moon.txt

`Estimated Composition (night, particles per cubic cm):` doesn't have any `;`, so `split` will keep it as it is and your list only has 1 value — , Aug 19 '21 at 10:07
I moved 'infile.readline()' outside of the for-loop, and it all worked out! Thank you :) — N.A., Aug 19 '21 at 11:09
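
The fix described in the last comment could look roughly like the sketch below. In Python 3, calling readline() inside the loop throws away every other line and does nothing to stop the header itself from being parsed; calling it once before the loop consumes the header instead. This assumes the file has exactly one header line and no blank lines. The name read_elements, the sample text, and the use of io.StringIO in place of the real file are made up so the snippet runs on its own.

```python
import io

def read_elements(infile):
    """Parse 'name - value ; name - value' lines into a dict of floats."""
    elements = {}
    infile.readline()  # consume the header once, before the loop
    for line in infile:
        for word in line.split(';'):
            element = word.split('-')[0].upper().strip()
            density = word.split('-')[1].strip().replace(',', '')
            elements[element] = float(density)
    return elements

# Hypothetical sample data, not the real contents of atm_moon.txt
sample = io.StringIO(
    "Header line without separators:\n"
    "Helium - 40,000 ; Neon - 5,000\n"
    "Methane - 1000\n"
)
result = read_elements(sample)
print(result)
```

A blank line anywhere after the header would still raise the same IndexError, which is what the checks in the answers guard against.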

score 0 · Answer 1 · answered Aug 19 '21 at 10:19

The problem is the line:

Estimated Composition (night, particles per cubic cm):

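This line doesn't have any semicolons, so split will leave the string untouched and the list will contain only 1 element. That element has no '-' in it either, so splitting it again gives another one-element list. When you try to fetch [1], the list element doesn't exist and an IndexError is raised.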

One workaround is to check whether the list's length is greater than 1.

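from decimal import Decimal
from re import sub

for line in infile:
    words = line.strip("\n").split(';')
    # Only lines containing at least one ';' hold element entries
    if len(words) > 1:
        for i in words:
            a, b = i.strip().split('-')
            # Drop everything except digits and '.', e.g. ' 40,000' -> '40000'
            elements[a.strip()] = float(Decimal(sub(r'[^\d.]', '', b)))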
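
To see what the conversion on the last line does, here is a small sketch that isolates it. The helper name to_float and the sample strings are made up for illustration; re.sub and decimal.Decimal are the standard-library calls used in the answer.

```python
from decimal import Decimal
from re import sub

def to_float(text):
    """Strip everything except digits and '.', then convert to float."""
    cleaned = sub(r'[^\d.]', '', text)
    return float(Decimal(cleaned))

print(to_float(' 40,000 '))  # spaces and the thousands separator are removed
print(to_float('1000\n'))    # the trailing newline is removed as well
```

Note that the pattern also removes a minus sign, so this only suits non-negative values. Going through Decimal is optional here, since float() accepts the same cleaned string directly.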

References:
How do I convert a currency string to a floating point number in Python?

score 0 · Answer 2 · answered Aug 19 '21 at 10:21

Your first line, Estimated Composition (night, particles per cubic cm):, doesn't contain a - character, so words[i].split('-') will produce a single-element list here.

>>> "Estimated Composition (night, particles per cubic cm):".split("-")
['Estimated Composition (night, particles per cubic cm):']

and you can't get the element at index 1 from it.

I would suggest skipping all words without a -. I also think that you don't need the line infile.readline(), so I would rewrite the loop as follows:

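def read_file(filename):
    elements = {}
    # The with-block closes the file even if parsing fails
    with open(filename, 'r') as infile:
        for line in infile:
            words = line.split(';')
            for i in range(len(words)):
                # Skip pieces without a '-', such as the header line
                if "-" not in words[i]:
                    continue
                element = words[i].split('-')[0].upper().strip()
                density = words[i].split('-')[1].strip().replace(',', '')
                elements[element] = float(density)
    return elements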

Tõnis Piip · Answer 3 · 2021-08-19T10:36:36.687

I'm not an expert in Python, but here's what I think/found: the first line is a special case that doesn't work with the rest of the code to begin with (it has no "-" or ";" in it). You can just read the first line as a header with readline(), which removes it from the rest of the text that is read in:

elements = {}
header = infile.readline().rstrip()
elements["header"] = header

Then when going through the list of elements and their densities:

for i in range(len(words)):
    element, density = words[i].split('-')
    elements[element.strip()] = float(density.strip().replace(',', ''))

You can create the two variables element and density with a single split call:

element, density = words[i].split('-')

since splitting on ; first and then on - leaves just two parts anyway. Side note: if you know how many values a function returns, you can assign its result to that many variables, e.g.:

var1, var2, ..., varN = function_that_returns_N_variables(foo)

Then, when adding them to the dict, you can strip(), replace() and/or convert with float() in the same step, as necessary.
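
As a small illustration of the unpacking described above, the sketch below shows both the case where it works and what happens when the number of parts does not match. The sample strings are hypothetical; the second one imitates a header line with no - in it.

```python
entry = "Neon - 5,000"

# Exactly two parts, so unpacking into two names works
element, density = entry.split('-')
value = float(density.strip().replace(',', ''))
print(element.strip(), value)

# A string without '-' yields a one-element list, so unpacking fails
header = "Header line without separators:"
try:
    element, density = header.split('-')
except ValueError as err:
    message = str(err)
    print("Cannot unpack:", message)
```

So unpacking only replaces the IndexError with a ValueError for the header line; the header still has to be skipped or read separately, as in the readline() step above.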
