I have an output file from a program that I'm running (not one of my own creation), and some of the data that I need to access is on commented (leading #) lines within the output file. The segment of the output file that I want will always start and end with the same lines, but their location relative to the beginning of the file and to each other will not always be the same.
Let's say that my output file is called output.txt. What I've tried to do for accessing the wanted lines within output.txt is the following:
with open("output.txt", "r") as data_file:
    block = ""
    found = False
    for line in data_file:
        if found:
            block += line
            # Stop once the closing marker line has been appended.
            if line.strip() == "# This isn't the actual line either, but I want to stop here:":
                break
        elif line.strip() == "# This isn't the actual line, but I'm making a working example:":
            found = True
            # Start the block with the marker line itself, newline included.
            block = line
And that does indeed get me the lines that I want. However, what this leaves me with is something that I'm not sure how to use. All I want out of this are the columns of numerical values. I've thought about using the split() method, but I don't want to break block into strings... I want to keep the nice tab-delimited columns and put them into a NumPy array.
# This isn't the actual line, but I'm making a working example:
#
# point c[0] c[1] c[2]
# -0.473359 7161.325229 -609.475403 49128.219132
# -0.459864 7162.047233 -102.060363 1189.270542
# -0.404065 7160.055198 467.778393 -23832.885052
# -0.385952 7160.708981 0.675271 2.177786
#
# This isn't the actual line either, but I want to stop here:
So what I ultimately need is:
- a way to obtain the lines of output.txt that I want (if there is something better than what I'm doing at present);
- a way to read only the lines from block that are numerical data, in such a way that they can be put into a NumPy array;
- a way to accomplish 1 & 2 that (if possible) doesn't involve strings.
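To make these requirements concrete, here is a rough sketch of the kind of thing I have in mind. It is only an illustration under assumptions: the function name `read_block` is made up, the two markers are the placeholder lines from my example rather than the real ones, and a "data line" is assumed to be any commented line whose fields all parse as floats. It still goes through strings, since the file is text.

```python
import numpy as np

# Placeholder marker lines from the example above (not the real ones).
START = "# This isn't the actual line, but I'm making a working example:"
END = "# This isn't the actual line either, but I want to stop here:"


def read_block(lines, start=START, end=END):
    """Return the numeric rows found between the start and end marker lines."""
    rows = []
    found = False
    for line in lines:
        stripped = line.strip()
        if not found:
            # Skip everything until the start marker has been seen.
            found = stripped == start
            continue
        if stripped == end:
            break
        # Drop the leading '#' and split on any whitespace (tabs included).
        fields = stripped.lstrip("#").split()
        try:
            row = [float(field) for field in fields]
        except ValueError:
            continue  # non-numeric line, e.g. the "point c[0] c[1] c[2]" header
        if row:  # ignore lines that contain only '#'
            rows.append(row)
    return np.array(rows)


# Small in-memory stand-in for output.txt, including uncommented data.
sample = [
    "1.0\t2.0\t3.0\n",
    START + "\n",
    "#\n",
    "# point\tc[0]\tc[1]\tc[2]\n",
    "# -0.473359\t7161.325229\t-609.475403\t49128.219132\n",
    "# -0.459864\t7162.047233\t-102.060363\t1189.270542\n",
    "# -0.404065\t7160.055198\t467.778393\t-23832.885052\n",
    "# -0.385952\t7160.708981\t0.675271\t2.177786\n",
    "#\n",
    END + "\n",
    "4.0\t5.0\t6.0\n",
]
data = read_block(sample)
```

Since `read_block` only iterates over its argument, the file object from `open("output.txt")` could be passed in place of `sample`.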
As a final note, I haven't been using numpy.genfromtxt() because there are also data within this file that are not behind comments (#).
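One possible way around this, as far as I understand the documentation, is that `numpy.genfromtxt()` does not need a filename: it also accepts a list or generator of lines. So it could be given only the extracted segment, with the leading `#` removed. A minimal sketch, assuming `block` is a string like the one built by my loop, that every data line starts with a digit, sign, or decimal point once the `#` is stripped, and with `block_to_array` being a made-up helper name:

```python
import numpy as np


def block_to_array(block):
    """Parse the numeric lines of an extracted, '#'-commented block."""
    cleaned = []
    for line in block.splitlines():
        body = line.strip().lstrip("#").strip()
        # Assumption: data lines begin with a digit, a sign, or a decimal point;
        # marker and header lines begin with a letter and are skipped.
        if body and (body[0].isdigit() or body[0] in "+-."):
            cleaned.append(body)
    # genfromtxt splits on any whitespace by default, so tabs are handled.
    # Note: a block with a single data row would come back as a 1-D array.
    return np.genfromtxt(cleaned)


# Stand-in for the block string extracted from output.txt.
block = "\n".join([
    "# This isn't the actual line, but I'm making a working example:",
    "#",
    "# point\tc[0]\tc[1]\tc[2]",
    "# -0.473359\t7161.325229\t-609.475403\t49128.219132",
    "# -0.459864\t7162.047233\t-102.060363\t1189.270542",
    "# -0.404065\t7160.055198\t467.778393\t-23832.885052",
    "# -0.385952\t7160.708981\t0.675271\t2.177786",
    "#",
    "# This isn't the actual line either, but I want to stop here:",
])
arr = block_to_array(block)
```

The uncommented data elsewhere in the file never reaches `genfromtxt()` here, because only the lines inside `block` are passed to it.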
Any recommendations would be appreciated.