I am trying to parse a file with consistent formatting: a header, followed by several lines of text split by whitespace. I want to start a new dictionary key whenever a line contains a single value, then read the following lines into a list of lists, where each inner list holds the split words of one line. My first attempt uses an index counter that increments on each marker line to set a new key, and splits every other line into words.

Here is what my code currently looks like:

import sys

def openfile(file):
    frames = {}
    index = 0
    with open(file, 'r') as f:
        for line in f:
            if line.strip() == '5310':
                index +=1
            else:
                newline = line
                print newline
                frames[index] = []
                frames[index].append([newline.split()])
        print frames

openfile(sys.argv[1])

The index counts correctly and `print newline` prints all of the lines that I want, but the dictionary that is finally printed contains only the last line, as a nested list:

{1:[['last', 'line', 'of', 'input', 'file']]}

What I want instead is:

{1:[[line1],[line2] ...], 2:[[nextline], [nextline] ...], ... , key n : [[line], [line]....[lastline]]}

I have also tried:

def openfile(file):
    frames = {}
    index = 0
    with open(file) as f:
         for line in f:
            if str(line.strip()) == '5310':
                index += 1
            else:
                frames[index] = []
                frames[index].append([line.split()])
    return frames

This does not work either. That leaves me with two questions: 1. Why does my current code print the lines I want but not append them? 2. What else can I try to get this to work?

Edit: Thanks! I managed to get it to work. If anyone is having a similar issue, here is the code that works:

import sys

def openfile(file):
    frames = {}
    index = 0
    with open(file, 'r') as f:
        for line in f:
            if line.strip() == '5310':
                index +=1
                frames[index] = []
            else:
                newline = line
                print newline
                frames[index].append([newline.split()])
        print frames

openfile(sys.argv[1])
mshyu24
  • I think you might want to look at [`defaultdict`](https://docs.python.org/3.3/library/collections.html#collections.defaultdict). `frames[index] = []` wipes all values stored against that key. Without setting up a test case for this, I would use `frames = defaultdict(list)` and get rid of `frames[index] = []` from the loop. Does that work? – roganjosh Jun 27 '18 at 18:34
  • `frames[index].append([newline.split()])` makes `frames[index]` into a list of lists of lists. Use `extend` or remove the extra `[...]` – Mad Physicist Jun 27 '18 at 18:50

1 Answer

Your problem is obvious ... once you see the problem :-)

            frames[index] = []
            frames[index].append([newline.split()])

Every time through the loop, you wipe out the earlier progress, and start with a new, empty list. Thus, only the last iteration's result is in frames.
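
To make that concrete, here is a minimal sketch (written for Python 3, whereas the question uses Python 2 `print` statements) that contrasts resetting the list on every line with initializing it once. The function names and the three-line sample input are made up for illustration and are not part of the original code.

```python
def reset_each_time(lines):
    """Mimics the question's loop: the list is re-created for every line."""
    frames = {}
    index = 1
    for line in lines:
        frames[index] = []                  # rebinds the key, discarding earlier lines
        frames[index].append(line.split())
    return frames


def init_once(lines):
    """Creates the list a single time, before the loop starts."""
    frames = {}
    index = 1
    frames[index] = []                      # runs once, so appended lines accumulate
    for line in lines:
        frames[index].append(line.split())
    return frames


# Hypothetical sample input, for illustration only.
sample = ["a b", "c d", "e f"]

print(reset_each_time(sample))  # {1: [['e', 'f']]}
print(init_once(sample))        # {1: [['a', 'b'], ['c', 'd'], ['e', 'f']]}
```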

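Initialization code has to run only once for each key, before you start appending to it, not on every pass through the loop. For a single key, that means before the loop:

with open(file) as f:
    frames[index] = []
    for line in f:

... or another appropriate point for your application, such as the branch where index is incremented, so that each new key starts with its own empty list.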
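
Putting this together with the two suggestions from the comments (`defaultdict(list)` and dropping the extra brackets around `split()`), one possible version could look like the sketch below. It is written for Python 3 and rests on a few assumptions of mine: the name `parse_frames`, the `marker` parameter, and the sample lines are hypothetical; blank lines are skipped; and any lines before the first marker (such as the header) end up under key 0.

```python
from collections import defaultdict


def parse_frames(lines, marker="5310"):
    """Group whitespace-split lines under a counter that advances at each marker line."""
    frames = defaultdict(list)   # a missing key starts as an empty list automatically
    index = 0
    for line in lines:
        stripped = line.strip()
        if stripped == marker:
            index += 1           # the next data lines go under a new key
        elif stripped:           # assumption: blank lines are ignored
            # append the split list itself, not [split list], to avoid extra nesting
            frames[index].append(stripped.split())
    return dict(frames)


# Hypothetical sample input, for illustration only.
sample = ["header text\n", "5310\n", "a b c\n", "d e\n", "5310\n", "f g\n"]
print(parse_frames(sample))
# {0: [['header', 'text']], 1: [['a', 'b', 'c'], ['d', 'e']], 2: [['f', 'g']]}
```

Because an open file object is itself an iterable of lines, it can be passed in directly, for example `with open(path) as f: frames = parse_frames(f)`. One trade-off to be aware of: with `defaultdict`, a marker that is not followed by any data lines never creates a key, whereas an explicit `frames[index] = []` at the marker would leave an empty list there.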

Prune