
I have some previous experience with Python, but it has been a while, so I'm a bit rusty. I'm trying to figure out how to extract certain parts of a log file into an array.

Below is a sample (3 lines, 14 numerical entries each) of the log file:

       -3.440208377846361E-002 -3.640975490509869E-002   3.77129385321508       7.937315452622962E+040  1.067031475475027E-015  6.626932578094536E+039  2.637269012342617E+034  6.626906205404414E+039  2.008451522885638E+025   2426438437.29153        13424548.8207020       1013967360829.11        364214556916.216        1100.16964475087
       -3.442345778664616E-002 -3.643241462492964E-002   3.77129983957511       1.588956060345964E+041  2.136069984437443E-015  6.626924938142817E+039  1.056889619379146E+035  6.626819249180878E+039  8.048900417930891E+025   2426441623.69160        13424487.5716696       2029898474163.94        729111075239.864        1100.17676257806
       -3.447047146128363E-002 -3.644149740258100E-002   3.77129262754527       2.781765670453510E+041  3.739591232686748E-015  6.626924955173501E+039  3.239268437345529E+035  6.626601028329767E+039  2.466913157350972E+026   2426441630.05298        13424487.4034776       3553717920905.67        1276445706704.12        1100.17678094667

This continues for up to hundreds of lines, depending on the run. Currently I have it set to save 601 lines per data run, but that number cannot be trusted: I have noticed the number of lines vary from about 595 to 605. I think I would first have to determine the number of lines in the file before extracting anything.

I have used the following code to test reading the log file (similar to an answer from "Iterating on a file using Python"):

with open("output.log", 'r') as f:
    for line in f:
        print(line)

and that works fine.

My issue is: how do I extract, say, the 3rd number from each line and put it into an array? It would be more straightforward if the entries in the log file were labeled with letters as well as numbers (e.g. the 3rd element could be "M_3.7729385321508", so that I could search for "M_" in each line and extract the 15 characters following the underscore into an array; see http://www.wellho.net/solutions/python-log-file-analysis-short-python-example.html), but that's not the case.

When I read the log file, it is formatted as a list containing strings. Each string corresponds to one line of the log file.

Any help would be greatly appreciated!

Canada709

2 Answers

1

If it's always going to be the third number on each line, this can be accomplished very easily with str.split(), which splits a line on whitespace.

>>> # s is assumed to hold the log text as a single string
>>> for line in s.splitlines():
...     print(line.split()[2])


3.77129385321508
3.77129983957511
3.77129262754527
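
The same idea can be applied to the lines of a file rather than to a string `s`. The following is only a minimal sketch, not the answerer's code: the helper name `extract_column` is hypothetical, and it assumes that fields are separated by whitespace and that lines with too few fields (such as blank lines) should simply be skipped.

```python
def extract_column(lines, index):
    """Return the field at position `index` of each line as a float.

    `lines` is any iterable of strings (an open file works). Lines with
    too few fields, such as blank lines, are skipped (an assumption).
    """
    values = []
    for line in lines:
        fields = line.split()  # split on any run of whitespace
        if len(fields) > index:
            values.append(float(fields[index]))
    return values


# Two shortened lines in the style of the sample log, plus a blank line
sample = [
    "  -3.440208377846361E-002 -3.640975490509869E-002   3.77129385321508\n",
    "\n",
    "  -3.442345778664616E-002 -3.643241462492964E-002   3.77129983957511\n",
]
third = extract_column(sample, 2)
```

With the real log, the list `sample` would be replaced by the file object obtained from `open("output.log")`, since iterating over a file yields its lines one at a time.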
Blair
  • Hi Brien, I have it figured out now (with your comment guiding me). Thanks! I will update the question slightly because in trying your solution I learned something about the structure of the log file. – Canada709 Jun 29 '15 at 15:42
0

The code that I ended up using is:

with open("output.log", 'r') as f:
    list_of_strings = f.readlines()

array = []

for line in list_of_strings:
    # Third whitespace-separated entry on the line (index 2)
    wanted_element = line.split()[2]
    wanted_element_numerical_value = float(wanted_element)
    array.append(wanted_element_numerical_value)

# No explicit close() is needed: the with block closes the file.

This did what I was hoping to achieve.
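
Because a list grows as values are appended, the number of lines does not need to be known in advance; it can be read off afterwards with `len()`. A minimal sketch along those lines follows. The names `parse_rows` and `expected_fields` are hypothetical, and the default of 14 fields per line is based only on the sample shown in the question.

```python
def parse_rows(lines, expected_fields=14):
    """Convert each non-blank line into a list of floats.

    Raises ValueError if a line does not have `expected_fields` entries,
    which guards against a truncated or partially written line.
    """
    rows = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != expected_fields:
            raise ValueError(
                "line %d has %d fields, expected %d"
                % (number, len(fields), expected_fields)
            )
        rows.append([float(field) for field in fields])
    return rows


# Small made-up example with 3 fields per line instead of 14
demo = ["1.0 2.0E+001 3.5\n", "\n", "4.0 5.0 6.5E-001\n"]
rows = parse_rows(demo, expected_fields=3)
line_count = len(rows)                    # however many data lines were read
third_column = [row[2] for row in rows]   # the column asked about in the question
```

Keeping every column in `rows` means any other entry can be pulled out later without reading the file again, at the cost of holding the whole table in memory.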

Canada709