I am not very experienced with Python, so I would appreciate help improving my code.

I am trying to extract "Steve", which appears under the field "Name":

xxxx xxxx xxxx Name
zzzz zzzz zzzz Steve

My code looks like this:

for line in myfile.readlines():
    [..]
    if re.search(r'Name =', line):
        print("Destination = ")
        samples+=line[15:19]
        nextline = "y"
    if nextline == 'y':
        samples+=line[15:19]

Eventually I will print everything:

[..]    
for s in samples:
   myfile2.write(s)

It does work, but I can't believe there is no smarter way to do this (like accessing the following line once the condition is met).

This is an example of the file I need to parse, but the structure may vary. For instance:

#This is another example
Name =
Steve

Any help is appreciated.

psk

3 Answers

list.txt:

zzzz zzzz zzzz Abcde
xxxx xxxx xxxx Name
zzzz zzzz zzzz Steve
zzzz zzzz zzzz Efghs

and then:

logFile = "list.txt"

with open(logFile) as f:
    content = f.readlines()    
# you may also want to remove empty lines
content = [l.strip() for l in content if l.strip()]

# flag for next line
nextLine = False

for line in content:
    find_Name = line.find('Name')       # index of 'Name' in the line, or -1 if absent

    if find_Name >= 0:                  # If Name exists, set the nextLine flag
        nextLine = True
    else:
        if nextLine:                    # If the flag is set, grab the Name
            print(line.split(" ")[-1])  # Grabbing the last word of the line
            nextLine = False

OUTPUT:

Steve
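
The flag can also be avoided by pairing each line with the one after it. This is only a minimal sketch, assuming the value sits in the last column as in the sample above; the helper name names_after_header is made up for illustration:

```python
def names_after_header(lines, header="Name"):
    """Return the last word of every line that directly follows a header line."""
    # Drop blank lines so they do not separate a header from its value.
    rows = [line.split() for line in lines if line.strip()]
    found = []
    # zip(rows, rows[1:]) yields (current, following) pairs of adjacent lines.
    for current, following in zip(rows, rows[1:]):
        if header in current:
            found.append(following[-1])
    return found


sample = [
    "zzzz zzzz zzzz Abcde\n",
    "xxxx xxxx xxxx Name\n",
    "zzzz zzzz zzzz Steve\n",
    "zzzz zzzz zzzz Efghs\n",
]
print(names_after_header(sample))
```

Because the header is matched as a whole word, the same helper should also cope with the "Name =" layout from the question, where the value is alone on the next line.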
DirtyBit

Don't reinvent the wheel. Use the csv module, for example with a DictReader:

import csv
with open("input") as f:
    reader = csv.DictReader(f, delimiter=" ")
    for line in reader:
        print(line["Name"])

This looks up the value by column rather than by character position: "Steve" need not sit literally below "Name", since its offset shifts when the items in the other columns are longer or shorter. It also assumes that the line containing "Name" is the first line of the file.
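
To see the column lookup in isolation, here is a small sketch that reads from an in-memory string instead of a file. The header names col1 to col3 and the data rows are invented for the example, and it assumes the columns are separated by exactly one space:

```python
import csv
import io

# In-memory stand-in for the input file; header names and rows are made up.
text = (
    "col1 col2 col3 Name\n"
    "zzzz zzzz zzzz Steve\n"
    "a bb ccccc Anna\n"
)

# The first line supplies the dictionary keys for every following row.
reader = csv.DictReader(io.StringIO(text), delimiter=" ")
names = [row["Name"] for row in reader]
print(names)
```

The second data row has items of different widths, yet the name is still found, because the lookup goes by column and not by character offset. Runs of several spaces between items would produce empty columns with this delimiter, so the sketch only fits single-space separated input.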

If the header is not on the first line, so that Name could appear on any line, and you want only the name on the line below it, you can call next on the same iterator that the for loop uses:

import re
with open("input") as f:
    for line in f:  # note: no readlines!
        if re.search(r'\bName\b', line):  # \b == word boundary
            pos = line.split().index("Name")
            name = next(f).split()[pos]
            print(name)
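
If the header might be the last line of the file, next(f) raises StopIteration. A hedged variant of the loop above passes a default to next and checks the column count; the function name names_below_header and the in-memory test inputs are assumptions made for this sketch:

```python
import io


def names_below_header(f, header="Name"):
    """Collect the word under `header` from the line after each header line."""
    names = []
    for line in f:
        words = line.split()
        if header in words:
            pos = words.index(header)
            # next(f, "") returns "" instead of raising StopIteration at end of file.
            below = next(f, "").split()
            if pos < len(below):
                names.append(below[pos])
    return names


table = io.StringIO("xxxx xxxx xxxx Name\nzzzz zzzz zzzz Steve\n")
key_value = io.StringIO("#This is another example\nName =\nSteve\n")
truncated = io.StringIO("xxxx xxxx xxxx Name\n")

print(names_below_header(table))
print(names_below_header(key_value))
print(names_below_header(truncated))
```

The three inputs cover the table layout, the "Name =" layout from the question, and a file that ends right after the header.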
tobias_k

list.txt:

zzzz zzzz zzzz Abcde
xxxx xxxx xxxx Name
zzzz zzzz zzzz Steve
zzzz zzzz zzzz Efghs

You can split each line on a space and then read the array index of interest.

For example:

logFile = "list.txt"

with open(logFile) as f:
    lines = f.readlines()

    for line in lines:
        # split on whitespace (this also drops the trailing newline)
        result = line.split()
        # you can access the name directly:
        #    name = line.split()[3]
        # Python list indices start at 0,
        # so [3] accesses the 4th column.
        print(result[3])

Alternatively, you can use numpy to print just column 4 of your data:

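import numpy
logFile = "list.txt"

data = []
with open(logFile) as f:
    lines = f.readlines()

    for line in lines:
        # split on whitespace so the last item carries no trailing newline
        result = line.split()
        data.append(result)

# numpy.array is used here because numpy.matrix is no longer recommended
matrix = numpy.array(data)
print(matrix[:, [3]])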

You can read more about this here: StackOverflow Question Some matrix info

Dawid Czerwinski