How to print the latest modified portion of a file?

Question

I have a file with the following contents:

--------------------------Thu Jun  7 12:00:01 UTC 2018 -----------------
"Ec2InstanceId":"i-0ec314eafd40e5ad5"
"Ec2InstanceId":"i-0200e84d07ff2c5ed"
"Ec2InstanceId":"i-00a46fde81549e56b"
"Ec2InstanceId":"i-02013e0f353f9aa79"
"Ec2InstanceId":"i-0f5c65a35ef4a7a39"
"Ec2InstanceId":"i-0bddc318b2a5d886b"
"Ec2InstanceId":"i-0e661050aadb9966c"
--------------------------Wed Jun 13 11:26:01 IST 2018 ------------------
"Ec2InstanceId": "i-0ec314eafd40e5ad5",
"Ec2InstanceId": "i-0200e84d07ff2c5ed",
"Ec2InstanceId": "i-00a46fde81549e56b",
"Ec2InstanceId": "i-0cd1f8f7a0c93f7a3",
"Ec2InstanceId": "i-07b291d818a31104b", 
"Ec2InstanceId": "i-003e928cf6faaa441",
"Ec2InstanceId": "i-084383a6edec97d31",
"Ec2InstanceId": "i-0a1ce363d8c8bd773",
"Ec2InstanceId": "i-018771107b26ddfc6",
"Ec2InstanceId": "i-055c6516e3b1fe03d",

Now I want to print only the latest modified part of this file, in this case, the following part from the file:

--------------------------Wed Jun 13 11:26:01 IST 2018 ------------------
"Ec2InstanceId": "i-0ec314eafd40e5ad5",
"Ec2InstanceId": "i-0200e84d07ff2c5ed",
"Ec2InstanceId": "i-00a46fde81549e56b",
"Ec2InstanceId": "i-0cd1f8f7a0c93f7a3",
"Ec2InstanceId": "i-07b291d818a31104b", 
"Ec2InstanceId": "i-003e928cf6faaa441",
"Ec2InstanceId": "i-084383a6edec97d31",
"Ec2InstanceId": "i-0a1ce363d8c8bd773",
"Ec2InstanceId": "i-018771107b26ddfc6",
"Ec2InstanceId": "i-055c6516e3b1fe03d",

I am a complete newbie in Python, and I haven't tried anything yet because I have no clue how to approach this problem.

Even if we down vote, you won't lose anything, you can always delete the question and moreover down voting should encourage you to at least make an attempt yourself. Aside from that, SO isn't a platform for people requesting advice and suggestions even before starting to code. Look for irc or slack channels for this kind of purpose. — BcK, Jun 13 '18 at 06:49
A quick bash solution which explains the core concept: `tac | sed '/^-/q' | tac` - print the file lines in reverse, print up to the first line starting with a `-`, reverse the lines again. — liborm, Jun 13 '18 at 07:05

score 0 · Answer 1 · answered Jun 13 '18 at 06:47

Here is one approach: use datetime to build today's date stamp, then iterate over the file and collect every line from the matching separator onward.

Demo:

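import datetime

filename = "your_file.txt"  # placeholder path; point this at your file

# Build today's stamp the way the file writes it, e.g. "Wed Jun 13" or "Thu Jun  7".
# The day is space-padded here because strftime's %d would zero-pad it ("Jun 07").
now = datetime.datetime.now()
cDate = "{0:%a %b} {0.day:2d}".format(now)

checkString = "--------------------------{0}".format(cDate)
flag = False
res = []
with open(filename, "r") as infile:
    for line in infile:
        if checkString in line:
            flag = True  # today's separator found; keep everything from here on
        if flag:
            res.append(line)

print("".join(res))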

Output:

--------------------------Wed Jun 13 11:26:01 IST 2018 ------------------
"Ec2InstanceId": "i-0ec314eafd40e5ad5",
"Ec2InstanceId": "i-0200e84d07ff2c5ed",
"Ec2InstanceId": "i-00a46fde81549e56b",
"Ec2InstanceId": "i-0cd1f8f7a0c93f7a3",
"Ec2InstanceId": "i-07b291d818a31104b", 
"Ec2InstanceId": "i-003e928cf6faaa441",
"Ec2InstanceId": "i-084383a6edec97d31",
"Ec2InstanceId": "i-0a1ce363d8c8bd773",
"Ec2InstanceId": "i-018771107b26ddfc6",
"Ec2InstanceId": "i-055c6516e3b1fe03d",
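
Note that the date check above only matches when the newest block was written today, according to the local clock. A minimal sketch of a variant that does not depend on the clock: keep a buffer and restart it whenever a separator line appears, so that only the last block is left after one forward pass. The function name `latest_block` and the assumption that every separator line starts with `--` are mine, not part of the answer.

```python
def latest_block(lines, marker="--"):
    """Return the lines of the last block, including its separator line.

    Assumes (hypothetically) that every separator line starts with `marker`.
    """
    block = []
    for line in lines:
        if line.startswith(marker):
            block = []  # a newer separator: discard the older block
        block.append(line)
    return block


# Usage with a few sample lines; a file object from open() can be passed the same way.
sample = [
    "--------------------------Thu Jun  7 12:00:01 UTC 2018 -----------------\n",
    '"Ec2InstanceId":"i-0ec314eafd40e5ad5"\n',
    "--------------------------Wed Jun 13 11:26:01 IST 2018 ------------------\n",
    '"Ec2InstanceId": "i-0cd1f8f7a0c93f7a3",\n',
]
print("".join(latest_block(sample)), end="")
```

Only one block is held in memory at a time, so this should also work for files that are larger than memory, at the cost of reading the whole file once.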

score 0 · Answer 2 · answered Jun 13 '18 at 06:48

If the file is small enough to fit in memory, you can simply read it whole and search backwards for the last separator, i.e. the last run of dashes. Since the whole file content is basically one long string, this can be done with str.rindex, and everything from that index to the end is the latest block. Note that rindex matches the trailing dashes of the last separator line, so the date line itself is not printed:

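with open('myfile.txt') as f:
    contents = f.read()
    # index of the last run of six dashes (raises ValueError if there is none)
    last_separator_index = contents.rindex('------')
    # everything from there to the end of the file
    last_data = contents[last_separator_index:]
    # drop the leftover dashes before printing
    print(last_data.strip('-'))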

Output:

"Ec2InstanceId": "i-0ec314eafd40e5ad5",
"Ec2InstanceId": "i-0200e84d07ff2c5ed",
"Ec2InstanceId": "i-00a46fde81549e56b",
"Ec2InstanceId": "i-0cd1f8f7a0c93f7a3",
"Ec2InstanceId": "i-07b291d818a31104b", 
"Ec2InstanceId": "i-003e928cf6faaa441",
"Ec2InstanceId": "i-084383a6edec97d31",
"Ec2InstanceId": "i-0a1ce363d8c8bd773",
"Ec2InstanceId": "i-018771107b26ddfc6",
"Ec2InstanceId": "i-055c6516e3b1fe03d",

If, however, the file is too big for memory, you will have to read it from the end in a more involved but more memory-efficient way. I'll leave that to you, but you can start here: Read a file in reverse order using python
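
As a rough sketch of what reading from the end could look like: open the file in binary mode, step backwards in fixed-size chunks, and stop as soon as the accumulated tail contains a newline followed by dashes. The function name `tail_from_last_separator`, the `"\n--"` marker, the chunk size, and the UTF-8 encoding are all assumptions for illustration.

```python
import os


def tail_from_last_separator(path, marker=b"\n--", chunk_size=4096):
    """Return the text from the last separator line to the end of the file.

    Assumes separator lines start with "--" and the file is UTF-8 encoded.
    If no separator follows a newline, the whole file is returned.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            # prepend the next chunk, so a marker split across chunks is still found
            tail = f.read(step) + tail
            idx = tail.rfind(marker)
            if idx != -1:
                # skip the newline so the result starts at the separator line
                return tail[idx + 1:].decode("utf-8")
        return tail.decode("utf-8")
```

Only the part of the file after the last separator (plus at most one extra chunk) is read, which is the point of working backwards. Unlike the rindex version above, this keeps the date line.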

score 0 · Answer 3 · answered Jun 13 '18 at 06:52

The way to approach this problem is to first read the file, like so:

with open("your_file.txt", "r") as f:
    # now you can do stuff with your file, like read the lines:
    lines = f.readlines()

This gives you all the lines in a list, with one line per entry.
For the sake of simplicity, I am assuming that your file is written sequentially, i.e. that the latest output is always at the bottom of the file. I will get to the opposite case later.

You can then simply reverse the order of the lines with

lines.reverse()

Now we simply collect everything up to the line that has the date in it (the one starting with a "-"):

your_data = []
for line in lines:
    # the first time we encounter a line starting with "-", we have our most recent date, so we stop.
    if line[0] == "-":
        break
    # otherwise read this line and append it to the data you want.
    your_data.append(line)

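If you want the data to include the date line as well, simply add another your_data.append(line) on the line before the break statement. Because the lines were reversed, your_data ends up in bottom-to-top order; call your_data.reverse() afterwards if you need the original file order.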

If your data goes in the other direction (i.e. your most recent date is at the top of the file), simply skip the lines.reverse() operation. Note that this is a quick but rather hacky approach: it reads the whole file into memory and walks it line by line, so it will not be particularly fast. This is only noticeable for larger files, but should still be kept in mind.

Also, this does nothing in terms of processing each individual line, so you would have to do that yourself. If you need the data in a special format (e.g. a numpy array or similar), there is already plenty of literature out there.

Here is the complete code again:

with open("your_file.txt", "r") as f:
    # now you can do stuff with your file, like read the lines:
    lines = f.readlines()

# reverse the lines so that the most recent block comes first
lines.reverse()

your_data = []
for line in lines:
    # the first time we encounter a line starting with "-", we have our most recent date, so we stop.
    if line[0] == "-":
        break
    # otherwise read this line and append it to the data you want.
    your_data.append(line)

# restore the original top-to-bottom order of the collected lines
your_data.reverse()
# now we can view the data!
print(your_data)
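
For the per-line processing that this answer leaves open, one possible sketch is to pull the instance IDs out with a regular expression. The pattern assumes the lines look like the ones in the question (an "Ec2InstanceId" key, an optional space after the colon, an optional trailing comma); the names `ID_PATTERN` and `extract_instance_ids` are made up for illustration.

```python
import re

# Matches both `"Ec2InstanceId":"i-..."` and `"Ec2InstanceId": "i-...",`
ID_PATTERN = re.compile(r'"Ec2InstanceId"\s*:\s*"(i-[0-9a-f]+)"')


def extract_instance_ids(lines):
    """Return the instance IDs found in `lines`, skipping lines without one."""
    ids = []
    for line in lines:
        match = ID_PATTERN.search(line)
        if match:
            ids.append(match.group(1))
    return ids
```

Calling extract_instance_ids(your_data) would then give a plain list of ID strings, which is easy to convert to whatever format is needed.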
