
I have a text file that looks like this:

1
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
2
Subpop 0 best fitness of run: Fitness: Standardized=61.0 Adjusted=0.016129032258064516 Hits=28
3
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
4
Subpop 0 best fitness of run: Fitness: Standardized=70.0 Adjusted=0.014084507042253521 Hits=19
5
Subpop 0 best fitness of run: Fitness: Standardized=72.0 Adjusted=0.0136986301369863 Hits=17
6
Subpop 0 best fitness of run: Fitness: Standardized=67.0 Adjusted=0.014705882352941176 Hits=22
7
Subpop 0 best fitness of run: Fitness: Standardized=65.0 Adjusted=0.015151515151515152 Hits=24
8
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
9
Subpop 0 best fitness of run: Fitness: Standardized=78.0 Adjusted=0.012658227848101266 Hits=11
10
Subpop 0 best fitness of run: Fitness: Standardized=65.0 Adjusted=0.015151515151515152 Hits=24

I am trying to use Python to extract the numbers from the "Standardized" and "Hits" fields on each line and put them in two separate lists, but I am unfamiliar with reading from files in Python. What would be the best way to do this?

theguyty
  • what have you tried so far? perhaps start here: http://stackoverflow.com/questions/19508703/how-to-open-a-file-through-python – jprockbelly Nov 25 '16 at 03:05
  • Currently I can open the file and I have tried putting each line's contents into a list, but from here things get reasonably complicated and I think a more elegant solution must exist. – theguyty Nov 25 '16 at 03:13

1 Answer


We do not usually write code for people, but this looks like it might not be homework. I also want to make an important point.

A file is an iterable of newline-terminated strings. A list of newline-terminated strings is also such an iterable. So start with a list for development, and switch to an opened file later, once the code works on the in-code list. Not doing this is, in my opinion, a big mistake and a source of problems.
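As a small illustration of that point (a sketch, not part of the original approach): the hypothetical helper `longest_line` below only iterates over its argument, so it behaves the same for an in-code list and for a file-like object. `io.StringIO` stands in for a real opened file here, so the snippet runs without anything on disk.

```python
import io

def longest_line(lines):
    """Return the longest line without its trailing newline (hypothetical helper)."""
    return max((line.rstrip('\n') for line in lines), key=len)

text = (
    '1\n'
    'Subpop 0 best fitness of run: Fitness: Standardized=73.0 '
    'Adjusted=0.013513513513513514 Hits=16\n'
)

# Same function, two different iterables of newline-terminated strings.
from_list = longest_line(text.splitlines(keepends=True))
from_file = longest_line(io.StringIO(text))  # iterates line by line, like an opened file

print(from_list == from_file)
```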

Next, iterate and toss the 'junk' lines. Then parse the lines that carry the data and process the extracted values as needed. How to parse depends on the problem; below I use the splitlines and split methods.

file = '''\
1
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
2
Subpop 0 best fitness of run: Fitness: Standardized=61.0 Adjusted=0.016129032258064516 Hits=28
3
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
4
Subpop 0 best fitness of run: Fitness: Standardized=70.0 Adjusted=0.014084507042253521 Hits=19
5
Subpop 0 best fitness of run: Fitness: Standardized=72.0 Adjusted=0.0136986301369863 Hits=17
'''.splitlines(keepends=True)

stand = []
hits = []

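for line in file:
    # Run-number lines ('1', '2', ...) are short; data lines are long.
    if len(line) < 50:
        continue
    # Splitting on '=' leaves each value at the start of the next field.
    fields = line.split('=')
    stand.append(float(fields[1].split()[0]))  # text after 'Standardized='
    hits.append(int(fields[3].split()[0]))     # text after 'Hits='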

print(stand)
print(hits)
# prints
# [73.0, 61.0, 73.0, 70.0, 72.0]
# [16, 28, 16, 19, 17]
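Once the loop works on the in-code list, the switch to a real file could look roughly like the sketch below. Note that it swaps in a regular expression for the split-based parsing and the length check; that is an alternative, not the approach used above. The function name `read_fitness` and the pattern are assumptions based only on the sample lines shown in the question.

```python
import re

# Assumed line layout, taken from the sample data:
# "... Standardized=<float> Adjusted=<float> Hits=<int>"
PATTERN = re.compile(r'Standardized=(\S+).*Hits=(\d+)')

def read_fitness(lines):
    """Collect Standardized and Hits values from any iterable of lines."""
    stand, hits = [], []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:  # run-number lines such as '1' do not match
            continue
        stand.append(float(match.group(1)))
        hits.append(int(match.group(2)))
    return stand, hits

sample = [
    '1\n',
    'Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16\n',
    '2\n',
    'Subpop 0 best fitness of run: Fitness: Standardized=61.0 Adjusted=0.016129032258064516 Hits=28\n',
]
stand, hits = read_fitness(sample)
print(stand)
print(hits)
```

Because `read_fitness` accepts any iterable of lines, an opened file can be passed in directly, for example inside `with open('fitness.txt') as f:` followed by `stand, hits = read_fitness(f)`, where `fitness.txt` is a placeholder name.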
Terry Jan Reedy