finding the latest modified table in a .txt file

Question

I'm new to Python programming. I am working with a text file that is the result file of a piece of software. Whenever we work in that software, it writes all of its messages to this result file (similar to a log file).

Now my problem is that the file has many tables like the one below:

(there may be millions of lines above this point)
*  ============================== INERTIA ==============================
* File: /home/hamanda/transfer/cradle_vs30_dkaplus01_fwd_dl140606_fem140704_v00.bif
* Solver: Nastran
* Date: 24/09/14
* Time: 10:29:50
* Text: 
* 
* Area                               +1.517220e+06
* Volume                             +5.852672e+06
*   
* Structural mass                    +4.594348e-02
* MASS elements                      +0.000000e+00
* NSM on property entry              +0.000000e+00
* NSM by parts (VMAGen and MPBalanc) +0.000000e+00
* NSM by NSMCreate                   +0.000000e+00
* Total mass                         +4.594348e-02
* 
* Center of gravity
* in the global         +1.538605e+02  +3.010898e+00  -2.524868e+02
* coordinate system
* 
* Moments of inertia    +8.346990e+03  +6.187810e-01  +1.653922e+03
* about the global      +6.187810e-01  +5.476398e+03  +4.176218e+01
* coordinate system     +1.653922e+03  +4.176218e+01  +7.746156e+03
* 
* Steiner share         +2.929294e+03  +4.016500e+03  +1.088039e+03
* 
* Moments of inertia    +5.417696e+03  +2.190247e+01  -1.308790e+02
* about the center      +2.190247e+01  +1.459898e+03  +6.835397e+00
* of gravity            -1.308790e+02  +6.835397e+00  +6.658117e+03
*  ---------------------------------------------------------------------
(some more lines follow, and the table may be repeated whenever the user changes the area or volume values)

My question is how to print the latest table on the console. I am able to print the first occurrence of the table, but I cannot get the latest occurrence.

How can I do this? This is my code:

    input = open(fileName,'r')
    intable = False
    for line in input:
        if line.strip() == "*  ============================== INERTIA ==============================":
            intable = True
        if line.strip() == "*  ---------------------------------------------------------------------":
            intable = False
            break
        if intable and line.strip().startswith("*"):
            z1=(line.strip())
            print(z1)

You have a good start but it's not clear what you are stuck on. Parse out the dates and compare them to the newest so far. If this one is newer, keep it. At the end of the file, print the one you kept. Which part do you have trouble with? — tripleee, Sep 24 '14 at 05:21
If you can change the overall process, a better approach might be to save each table to an individual file. Even better, have the thing which generates these tables write them in some machine-readable format -- JSON is popular and easy to work with. — tripleee, Sep 24 '14 at 05:23
I can do that, but sometimes the date and time are not written: when the software is live and you change the area or volume, it just writes a fresh table, possibly without the date and time. That is my difficulty: I cannot demarcate the tables. @tripleee – ayaan, Sep 24 '14 at 05:25
We will need significant examples of the actual problem cases in order to help you, beyond "that sounds challenging, good luck". — tripleee, Sep 24 '14 at 05:31

score 1 · Accepted Answer · edited Sep 24 '14 at 06:26

Try this:

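header = "*  ============================== INERTIA =============================="
footer = "*  ---------------------------------------------------------------------"
with open(fileName, 'r') as f:
    content = f.readlines()
# Find the last header line by scanning backwards.
start = None
for i in range(len(content) - 1, -1, -1):
    if content[i].strip() == header:
        start = i
        break
# Print from that header through the closing dashed line.
if start is not None:
    for line in content[start:]:
        print(line.rstrip('\n'))
        if line.strip() == footer:
            break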
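The approach above loads every line with readlines(), which can be costly for a file with millions of lines. A minimal sketch of an alternative, assuming the header and footer lines look exactly like the ones in the question: read the file once from the top and remember only the most recent complete table. The names `last_table`, `HEADER` and `FOOTER` are hypothetical, chosen for illustration.

```python
HEADER = "*  ============================== INERTIA =============================="
FOOTER = "*  ---------------------------------------------------------------------"


def last_table(lines):
    """Return the lines of the last complete table found in `lines`.

    `lines` can be any iterable of strings, for example an open file.
    An empty list is returned when no complete table is present.
    """
    latest = []     # last complete table seen so far
    current = None  # table being collected, or None when outside a table
    for line in lines:
        text = line.strip()
        if text == HEADER:
            # A new table starts; drop any unfinished one.
            current = [line.rstrip("\n")]
        elif current is not None:
            current.append(line.rstrip("\n"))
            if text == FOOTER:
                # The table is complete, so it becomes the newest candidate.
                latest, current = current, None
    return latest
```

With a real file this could be used as `with open(fileName) as f: print("\n".join(last_table(f)))`. Only one table is held in memory at a time, at the cost of still reading the whole file from start to end.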

edited Sep 24 '14 at 06:26

sunkehappy

answered Sep 24 '14 at 05:24

Stephen Lin

score 1 · Answer 2 · edited May 23 '17 at 12:29

If you can use bash, the approach below is more efficient.

RESULT_FILE="result_text_file_name"
# Line numbers of the last header and of the last closing dashed line
START_LINE=$(grep -n "===== INERTIA ====" "$RESULT_FILE" | tail -1 | cut -d":" -f1)
END_LINE=$(grep -n " --------------" "$RESULT_FILE" | tail -1 | cut -d":" -f1)
LINE_COUNT=$(wc -l < "$RESULT_FILE")
tail -n $((LINE_COUNT - START_LINE + 1)) "$RESULT_FILE" | head -n $((END_LINE - START_LINE + 1))

If you still want Python, read the post "How to read lines from a file in python starting from the end".

I wrote the code below with reference to that page (it reads the lines in reverse order).

I assumed that the result file is "test.txt".

#!/usr/bin/env python
import sys
import os
import string

"""read a file returning the lines in reverse order for each call of readline()
This actually just reads blocks (4096 bytes by default) of data from the end of
the file and returns last line in an internal buffer.  I believe all the corner
cases are handled, but never can be sure..."""

class BackwardsReader:
  def readline(self):
    while len(self.data) == 1 and ((self.blkcount * self.blksize) < self.size):
      self.blkcount = self.blkcount + 1
      line = self.data[0]
      try:
        self.f.seek(-self.blksize * self.blkcount, 2) # read from end of file
        self.data = string.split(self.f.read(self.blksize) + line, '\n')
      except IOError:  # can't seek before the beginning of the file
        self.f.seek(0)
        self.data = string.split(self.f.read(self.size - (self.blksize * (self.blkcount-1))) + line, '\n')

    if len(self.data) == 0:
      return ""

    # self.data.pop()
    # make it compatible with python <= 1.5.1
    line = self.data[-1]
    self.data = self.data[:-1]
    return line + '\n'

  def __init__(self, file, blksize=4096):
    """initialize the internal structures"""
    # get the file size
    self.size = os.stat(file)[6]
    # how big of a block to read from the file...
    self.blksize = blksize
    # how many blocks we've read
    self.blkcount = 1
    self.f = open(file, 'rb')
    # if the file is smaller than the blocksize, read a block,
    # otherwise, read the whole thing...
    if self.size > self.blksize:
      self.f.seek(-self.blksize * self.blkcount, 2) # read from end of file
    self.data = string.split(self.f.read(self.blksize), '\n')
    # strip the last item if it's empty...  a byproduct of the last line having
    # a newline at the end of it
    if not self.data[-1]:
      # self.data.pop()
      self.data = self.data[:-1]


if(__name__ == "__main__"):
  f = BackwardsReader("test.txt")
  end_line = "---------------------------------------------------"
  start_line = "========= INERTIA ======="
  lines = []

  intable = False
  line = f.readline()
  while line:
    if line.find(end_line) >= 0:
      intable = True
    if intable:
      lines.append(line)
      if line.find(start_line) >= 0:
        break
    line = f.readline()

  lines.reverse()

  print "".join(lines)

And here is the result of the test:

[my server....]$ wc -l test.txt
34008720 test.txt

[my server....]$ time python test.py
*  ============================== INERTIA ==============================
* File: /home/hamanda/transfer/cradle_vs30_dkaplus01_fwd_dl140606_fem140704_v00.bif
* Solver: Nastran
* Date: 24/09/14
* Time: 10:29:50
* Text: 
* 
* Area                               +1.517220e+06
* Volume                             +5.852672e+06
*   
* Structural mass                    +4.594348e-02
* MASS elements                      +0.000000e+00
* NSM on property entry              +0.000000e+00
* NSM by parts (VMAGen and MPBalanc) +0.000000e+00
* NSM by NSMCreate                   +0.000000e+00
* Total mass                         +4.594348e-02
* 
* Center of gravity
* in the global         +1.538605e+02  +3.010898e+00  -2.524868e+02
* coordinate system
* 
* Moments of inertia    +8.346990e+03  +6.187810e-01  +1.653922e+03
* about the global      +6.187810e-01  +5.476398e+03  +4.176218e+01
* coordinate system     +1.653922e+03  +4.176218e+01  +7.746156e+03
* 
* Steiner share         +2.929294e+03  +4.016500e+03  +1.088039e+03
* 
* Moments of inertia    +5.417696e+03  +2.190247e+01  -1.308790e+02
* about the center      +2.190247e+01  +1.459898e+03  +6.835397e+00
* of gravity            -1.308790e+02  +6.835397e+00  +6.658117e+03
*  ---------------------------------------------------------------------


real 0m0.025s
user 0m0.018s
sys 0m0.006s
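The BackwardsReader above relies on string.split and the print statement, so it only runs on Python 2. A minimal sketch of the same idea for Python 3, assuming a UTF-8 (or ASCII) file and the exact header and footer lines from the question: read fixed-size blocks from the end of the file and stop as soon as the last table has been collected. The names `reversed_lines` and `last_table_from_end` are hypothetical.

```python
import os

HEADER = "*  ============================== INERTIA =============================="
FOOTER = "*  ---------------------------------------------------------------------"


def reversed_lines(path, blksize=4096):
    """Yield the lines of a file from last to first, without line endings."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        if pos == 0:
            return  # empty file: nothing to yield
        # Ignore a single trailing newline so no spurious empty line is produced.
        f.seek(pos - 1)
        if f.read(1) == b"\n":
            pos -= 1
        tail = b""  # start of a line whose beginning lies in an earlier block
        while pos > 0:
            step = min(blksize, pos)
            pos -= step
            f.seek(pos)
            parts = (f.read(step) + tail).split(b"\n")
            tail = parts[0]  # possibly incomplete; completed by the next block
            for part in reversed(parts[1:]):
                yield part.decode("utf-8", errors="replace")
        yield tail.decode("utf-8", errors="replace")


def last_table_from_end(path, blksize=4096):
    """Return the last complete table as a list of lines, or [] if none."""
    table = []
    intable = False
    for line in reversed_lines(path, blksize):
        text = line.strip()
        if not intable and text == FOOTER:
            intable = True  # reading backwards, the footer comes first
        if intable:
            table.append(line.rstrip("\r"))
            if text == HEADER:
                table.reverse()
                return table
    return []
```

Something like `print("\n".join(last_table_from_end("test.txt")))` would then print the table. Because the scan stops at the first header found from the end, only the tail of the file is read when the last table sits near the end.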

Erm, why not just `awk '/INERTIA/ { i=1; delete a } { a[i++] = $0 } END { for (j=1; j<i; ++j) print a[j] }' "$RESULT_FILE"` – tripleee, Sep 24 '14 at 06:46
I knew about this way, but I'm not good at awk, so I wrote it the non-elegant way! – han058, Sep 24 '14 at 07:17
Thanks for the example @Han-youngPark, I understood it very well. – ayaan, Sep 24 '14 at 08:23

score 0 · Answer 3 · answered Sep 24 '14 at 05:53

You can also capture your file data in a list as shown below:

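    delimiter = '*  ============================== INERTIA ==============================\n'
    with open(filepath) as f:
        filedata = f.read().split(delimiter)
    # The last chunk is everything after the final header line; split() drops
    # the header itself, so add it back when printing the latest table.
    print(delimiter + filedata[-1])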

I am not sure how efficient this is, but it works. The other elements of filedata hold the earlier occurrences of the table, in case you need them.
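Since split() reads the entire file into one string and then copies it into pieces, a lighter variant is to memory-map the file and search backwards for the last header. This is only a sketch, assuming the header and footer bytes match the question's lines exactly; `last_table_mmap` is a hypothetical name.

```python
import mmap
import os

HEADER = b"*  ============================== INERTIA =============================="
FOOTER = b"*  ---------------------------------------------------------------------"


def last_table_mmap(path):
    """Return the text of the last table in the file, or "" if there is none."""
    if os.path.getsize(path) == 0:
        return ""  # an empty file cannot be memory-mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = mm.rfind(HEADER)  # position of the last header
            if start == -1:
                return ""
            end = mm.find(FOOTER, start)  # first footer after that header
            # Without a footer, fall back to the end of the file.
            end = len(mm) if end == -1 else end + len(FOOTER)
            return mm[start:end].decode("utf-8", errors="replace")
```

Calling `print(last_table_mmap(filepath))` prints the table. Only the slice between the last header and its footer is copied into a Python string; the rest of the file is paged in by the operating system as the search needs it.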

answered Sep 24 '14 at 05:53

panr

It's the quickest approach to write, but it is inefficient: if the result file is very large, it takes too long. – han058 Sep 24 '14 at 06:09
..., @ayaan then I recommend using bash; the best way to do it in Python is a little complicated. – han058 Sep 24 '14 at 06:29
@Han-youngPark you said that doing it in Python is complicated. Do you mean that the code is longer, or that it is difficult to write? If you have an example, could you share it? – ayaan Sep 24 '14 at 06:39
Yes, the code will be longer than yours, which is why I said "complicated". I'll try to write the code and share it, please wait! – han058 Sep 24 '14 at 06:58
