I am fairly new to Python, and I have been given an assignment in my research group to extract the important data from an output file. The output file is very large, containing data split into sections. Each section is headed with a title in all capital letters, such as "SURFACE TEMPERATURE," and the following 100-600 lines all contain relevant data. Essentially, I need to read in the file and search for the line that has the string that indicates the data. The number of rows for each data set is fixed, but the location in the text file is not. I then need to save the desired data to a different list. Any help or direction would be appreciated.

I have a decent idea of how to open and read the file in Python, but I am at a loss when it comes to searching for the section of data and saving it to a new list/array.

Josteris
  • So you are searching for a line with a specific section title, and then you want to read in the lines in just that section, and to stop reading when you encounter the next section title or EOF? – President James K. Polk Jun 17 '19 at 00:04
  • See if this helps: https://stackoverflow.com/questions/4940032/how-to-search-for-a-string-in-text-files – Dileep Jun 17 '19 at 00:56
  • @JamesKPolk: Yes. That is essentially what I need to do. Each section is followed by a few hundred lines, each line being filled with a string of about 5-10 data points. – Josteris Jun 17 '19 at 16:07

1 Answer

This is how I understand your data file to be structured:

TEST1
asdf
asdf
asdf

TEST2
asdf
asdf
asdf

DATA WE WANT
xxxx
xxxx
xxxx

To parse this, we would do the following:

# opening the data file with a context manager is a best practice
with open("tfile.txt") as infile:
    data = infile.readlines()

# clean up the data (strip surrounding whitespace and newlines)
data = [x.strip() for x in data]

# set up the list we'll store the data in
data_list = []

# loop through the data
saving_data = False
for item in data:
    if item == "DATA WE WANT":
        # we're at the right header, so start saving
        print("Data found")
        saving_data = True
    elif item == "":
        # an empty line ends the current section
        saving_data = False
    elif item.isupper():
        # any other all-caps line is a different header; isupper() is False
        # for purely numeric lines, whereas item == item.upper() would be True
        # note: values such as 1.0E+05 still count as upper-case
        print("Header:", item)
        saving_data = False
    elif saving_data:
        data_list.append(item)

print(data_list)

It's important to run all of the checks before saving a line, and to print the headers as you go: with a file as large as yours, it can be hard to tell whether the extraction succeeded.
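
Since you mention that the number of rows in each data set is fixed, another option is to find the header and then take exactly that many lines, without loading the whole file into memory. This is only a rough sketch: the function name `read_section` and its parameters are made up for illustration, and it assumes the header line matches the title exactly once surrounding whitespace is stripped.

```python
from itertools import islice


def read_section(path, header, n_rows):
    """Return the n_rows lines that follow the first line equal to header.

    Hypothetical helper: assumes the header matches exactly after stripping.
    """
    with open(path) as infile:
        for line in infile:
            if line.strip() == header:
                # islice takes the next n_rows lines from the same file iterator
                return [row.strip() for row in islice(infile, n_rows)]
    # the header never appeared in the file
    return []
```

For the sample file above, `read_section("tfile.txt", "DATA WE WANT", 3)` would collect the three lines under that header. If the file ends early, the list is simply shorter than `n_rows`, so comparing its length to the expected row count is a cheap sanity check.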

wolfy