
I am writing a script which will open a txt file with contents as follows:

/1320  12-22-16   data0/impr789.dcm     sent
/1340  12-22-18   data1/ir6789.dcm      sent
/1310  12-22-16   data0/impr789.dcm
/1321  12-22-16   data0/impr789.dcm

I want to read only the lines that are not tagged. For example, in the txt file above, read line /1310, do some operation to send that data to the cloud, and tag it as sent. In the next iteration, read line /1321, send it, and then tag it as sent at the end.

How should I do this?

Thanks!

2 Answers

    with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
        for line in infile:
            fields = line.split()
            # copy blank lines and already-tagged lines through unchanged
            if not fields or fields[-1] == "sent":
                outfile.write(line)
                continue
            doCloudStuff(line)  # placeholder for the upload step
            outfile.write(line.rstrip() + '\tsent\n')
inspectorG4dget
  • Thank you for the elegant solution. But I want to tag lines as 'sent' in the same txt file itself instead of another output file. I'm not sure the above code will help with that – learnningprogramming Jan 25 '16 at 18:51
  • Technically correct, but unusable in practice, since the output file will not contain unsent lines should the process be interrupted half-way through (which seems to be the point of what the OP wants to do). – spectras Jan 25 '16 at 18:51
  • @tryeverylanguage You have to rewrite the whole file to disk every time you complete a line to do this. Is that what you want to do? – spectras Jan 25 '16 at 18:52
  • @spectras: I read somewhere that the `fileinput` module might solve that issue, but I like having versioned backups of my files, in case I have a bug in my code (and I typically work with large files and complex functions, so I don't want to go through the process to re-create the last file). So, I typically write to a new file anyway :P – inspectorG4dget Jan 25 '16 at 18:54
  • I am fine with rewriting the file. The only requirement is that in the next iteration I should read from the lines which are untagged. – learnningprogramming Jan 25 '16 at 18:54
  • @tryeverylanguage: I'll let you modify my code to figure out how to do that. Simply, this is what you need to do: open the file, find an untagged line, and add all previous lines to a list; `doCloudStuff` with that line, tag the line, and add the tagged line to the list; then add the remaining lines to the list. Close the file, reopen with `'w'`, write every line in that list to the file – inspectorG4dget Jan 25 '16 at 18:56
  • @inspectorG4dget I will give it a try. I am new to programming and Python, so it will take me hours to figure it out :D – learnningprogramming Jan 25 '16 at 19:01
  • You can use this answer mashed up with http://stackoverflow.com/a/5463419/1182891 to accomplish the in-place requirement – Josh J Jan 25 '16 at 19:16
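
The steps described in the comments above (collect the lines, send and tag the first untagged one, then reopen the file with `'w'` and write everything back) could look roughly like the sketch below. It is not the answerer's code: `send_next_untagged` and the `send` callback are hypothetical names, and `send` stands in for whatever does the actual upload.

```python
def send_next_untagged(path, send):
    """Send the first line not ending in 'sent', tag it, rewrite the file.

    Returns the line that was sent, or None if nothing is left to send.
    `send` is a hypothetical callback standing in for the upload step.
    """
    with open(path) as infile:
        lines = infile.read().splitlines()

    for index, line in enumerate(lines):
        fields = line.split()
        # blank lines have no fields and are skipped
        if fields and fields[-1] != 'sent':
            send(line)
            lines[index] = line.rstrip() + '\tsent'
            break
    else:
        # the loop finished without finding an untagged line
        return None

    # reopen with 'w' and write every line back, including the new tag
    with open(path, 'w') as outfile:
        outfile.writelines(text + '\n' for text in lines)
    return line
```

Each call handles one line, so calling it in a loop until it returns `None` works through the file one upload at a time, and the tag for each line is on disk before the next upload starts.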

You can do it this way:

    lines = []
    with open('path_to_file', 'r+') as source:
        for line in source:
            line = line.strip()
            # skip the operation for blank lines and lines already tagged
            if line and line.split()[-1] != 'sent':
                # do some operation on line without 'sent' tag
                do_operation(line)
                # tag the line
                line += '\tsent'
            line += '\n'
            # temporarily save lines in a list
            lines.append(line)
        # move position to start of the file
        source.seek(0)
        # write back lines to the file
        source.writelines(lines)
        # drop any leftover bytes from the old contents
        source.truncate()
ironcladgeek
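
If an upload failing half-way is a concern (the point raised in the comments on the other answer), one possible variation is to write the updated lines to a temporary file in the same directory and swap it in with `os.replace`. This is only a sketch and not part of the answer above: `tag_unsent_lines` and `send` are hypothetical names, and it assumes `send` raises an exception when an upload fails.

```python
import os
import tempfile


def tag_unsent_lines(path, send):
    """Send every untagged line, keeping the tags of lines already sent.

    `send` is a hypothetical upload callback assumed to raise on failure.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # the temp file lives next to the original so os.replace stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    error = None
    with os.fdopen(fd, 'w') as tmp, open(path) as source:
        for line in source:
            text = line.strip()
            if error is None and text and text.split()[-1] != 'sent':
                try:
                    send(text)
                    text += '\tsent'
                except Exception as exc:
                    # stop sending, but keep copying the remaining lines
                    error = exc
            tmp.write(text + '\n')
    # swap the finished copy in place of the original
    os.replace(tmp_path, path)
    if error is not None:
        raise error
```

With this layout a failed upload still saves the tags of the lines sent before it, so the next run resumes at the failed line. If the process is killed outright, the original file is left as it was and a stray temporary file may remain. Note that `mkstemp` creates the file with restrictive permissions, which the replaced file inherits.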
  • I am facing one weird problem. In the section "do some operation on line without 'sent' tag" I am calling a function which uploads huge files to the cloud, and this takes some time. While the uploader is uploading, does the for loop go ahead and execute the statements after the call to the uploader function? – learnningprogramming Jan 28 '16 at 23:17
  • Generally the for loop won't execute the next operation until control returns from the function. Make sure your function waits for the file upload to finish; you may use a return at the end, `return None` for instance. – ironcladgeek Jan 29 '16 at 06:11
  • I added that `return None`, but it still seems that control goes to the next statement in the for loop. I am also getting the error `IndexError: list index out of range` – learnningprogramming Jan 29 '16 at 18:23
  • Understood the problem. For the last line in the txt file, the code appends \n to it, due to which it throws an IndexError while writing it back to the list. The txt file doesn't have that many lines. – learnningprogramming Jan 29 '16 at 19:46
  • Getting an error `with open('/home/sdcme/exam_list.txt', 'r+') as source: TypeError: an integer is required`. Not sure why, even though we are declaring `list[]` at the start. Any pointers? – learnningprogramming Feb 02 '16 at 17:12
  • You probably get the TypeError exception because you are importing something with *, for instance `from os import *`. And we need to declare an empty list so we can append lines to that temporary data structure. @learnningprogramming – ironcladgeek Feb 03 '16 at 16:54
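
To illustrate the cause suggested in the last comment: `from os import *` brings in `os.open`, which replaces the builtin `open` and expects integer flags rather than a mode string. The sketch below calls `os.open` directly to show the same failure. Whether this is what happened in the asker's script is an assumption, and the temporary file is only there to make the example self-contained.

```python
import os
import tempfile

# create a scratch file so both calls have a real path to open
fd, path = tempfile.mkstemp()
os.close(fd)

# os.open wants integer flags such as os.O_RDWR, so a mode string fails
try:
    os.open(path, 'r+')
    shadowed_open_failed = False
except TypeError:
    shadowed_open_failed = True

# with a plain `import os`, the builtin open is untouched and accepts 'r+'
with open(path, 'r+') as source:
    builtin_open_worked = True

os.remove(path)
print(shadowed_open_failed, builtin_open_worked)
```

Importing the module by name (`import os`) or importing only the names needed avoids the clash.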