I am trying to loop over the lines of a text file which is verifiably non-empty and I am running into problems with my script. In my attempt to debug what I wrote, I figured I would make sure my script is properly reading from the file, so I am currently trying to print every line in it.

At first I tried using the usual way of doing this in Python i.e.:

with open('file.txt') as fo:
    for line in fo:
        print line

but my script is not printing anything. I then tried storing all of the lines in a list like so:

with open('file.txt') as fo:
    flines = fo.readlines()
print flines

and yet my program still outputs an empty list (i.e. []). I have also tried making sure that my file pointer is pointing to the beginning of the file using fo.seek(0) before attempting to read from it, yet that also does not work.

I have spent some time reading solutions to similar questions posted on here, but so far nothing I have tried has worked. I do not know how such an elementary I/O operation is giving me so much trouble, but I must be missing something basic and would thus really appreciate any help/suggestions.

EDIT: Here is the part of my script which is causing the problem:

import subprocess as sbp
with open('conf_15000.xyz','w') as fo:
    p1 = sbp.Popen(['head', '-n', '300000', 'nnp-pos-1.xyz'], stdout=sbp.PIPE)
    p2 = sbp.Popen(['tail', '-n', '198'], stdin=p1.stdout, stdout=fo)

with open('conf_15000.xyz','r') as fp:
    fp.seek(0)
    flines = fp.readlines()

print flines

And here is an excerpt from the nnp-pos-1.xyz file (all lines have the same format and there are 370642 of them in total):

 Ti        32.9136715924       28.5387609200       24.6554922872
  O        39.9997000300       35.1489480846       22.8396092714
  O        33.7314699265       30.3398473499       23.8866085372
 Ti        27.7756767925       31.3455930970       25.9779887743
  O        31.1520937719       29.0752315770       25.4786577758
  O        26.1870965535       32.4876155555       26.3346205619
 Ti        38.4478275543       25.5734609650       22.0654953429
  O        24.1328940232       31.3858060129       28.8575469919
  O        38.6506317714       27.3779871011       22.6552032123
 Ti        40.5617501289       27.5095900385       22.8436684314
  O        38.2400600469       29.1828342919       20.7853056680
  O        38.8481088254       27.2704154737       26.9590081202

When running the script, the file being read from (conf_15000.xyz) gets written to properly, however I cannot seem to be able to read from it at runtime.

EDIT-2: Following sudonym's recommendation I am using the absolute file path and am checking whether or not the file is empty before reading from it by adding the following unindented lines between the two with statements I wrote in my previous edit:

print os.path.isfile(r'full/path/to/file')
print (os.stat(r'full/path/to/file').st_size != 0)

The first boolean evaluates to True (meaning the file exists) while the second evaluates to False (meaning the file is empty). This is very strange because both of these lines are added after I close the file pointer fo which writes to the file and also because the file being written to (and subsequently read from with fp) is not empty after I execute the script (in fact, it contains all the lines it is supposed to).

EDIT-3: It turns out my script saw the file as empty because it did not wait for the subprocess that writes to it (p2 in the example above) to finish. Popen returns immediately, so the first with statement exited, and the lines after it ran, before the file was done being written to. The fix was therefore to add p2.wait() at the end of the first with statement, like so:

import subprocess as sbp
with open('conf_15000.xyz','w') as fo:
    p1 = sbp.Popen(['head', '-n', '300000', 'nnp-pos-1.xyz'], stdout=sbp.PIPE)
    p2 = sbp.Popen(['tail', '-n', '198'], stdin=p1.stdout, stdout=fo)
    p2.wait()  # block until tail has finished writing to the file

with open('conf_15000.xyz','r') as fp:
    fp.seek(0)
    flines = fp.readlines()

print flines

Now everything works the way it is supposed to.
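For anyone who wants to reproduce the fix without the original data file, here is a minimal sketch in Python 3 syntax. It is an illustration under assumptions, not the exact script from the question: the child command (`CHILD_CODE`), the helper names `write_with_child` and `read_lines`, and the temporary output path are all hypothetical stand-ins for the `head | tail` pipeline above.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the head | tail pipeline: a child process that
# pauses briefly, then writes two lines to its stdout.
CHILD_CODE = "import time; time.sleep(0.5); print('line one'); print('line two')"


def write_with_child(path, wait):
    """Run the child with its stdout redirected to `path`; optionally wait for it."""
    with open(path, "w") as fo:
        child = subprocess.Popen([sys.executable, "-c", CHILD_CODE], stdout=fo)
        if wait:
            # Block until the child has exited, i.e. until it has written everything.
            child.wait()
    return child


def read_lines(path):
    """Read the whole file back as a list of lines."""
    with open(path) as fp:
        return fp.readlines()


out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
child = write_with_child(out_path, wait=True)
lines_after_wait = read_lines(out_path)
print(lines_after_wait)
```

Calling `write_with_child(out_path, wait=False)` and reading straight away will most likely give an empty list, because the child is still sleeping when the file is opened for reading. If the `Popen` handle is not needed, `subprocess.check_call(..., stdout=fo)` is another option, since it waits for the command to finish before returning.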

  • 3
    Please make sure that the code included in the question is self-contained enough that someone can copy-and-paste without modifications to see the problem themselves. If that means including code that writes to `file.txt` so we can be certain it isn't empty, for example, *doing that* in the content included in the question itself helps the "verifiable" element of the [mcve] definition be met. – Charles Duffy Jun 29 '18 at 21:07
  • 1
    I recommend trying the absolute path to `file.txt`. Depending on where the file is and where you are running the script from, it might be opening a new file that is empty. Note that this is intended to verify that you are opening exactly the `file.txt` you think you are, and not as the permanent solution. – MoxieBall Jun 29 '18 at 21:08
  • 1
    ...so, does the issue still reproduce if you run `f = open('file.txt', 'w'); f.write('line one\nline two\n'); f.close()` immediately before the code here? (Because it's creating the file with the same relative path used to open for read, that approach moots the concern MoxieBall raises in the other comment) – Charles Duffy Jun 29 '18 at 21:11
  • Hi, thank you for the prompt answers. I'm currently editing my question to make it more self-contained and to clarify certain things (I am indeed writing to the file in the same script as I am reading from it which is probably contributing to the bug). – Nicolas Gastellu Jun 29 '18 at 21:15
  • Could you modify the code to where the issue can reproduce with only the exact `nnp-pos-1.xyz` file you include here? (Obviously can't do that when you seek thousands of lines in before starting reading, when the sample given is only 12 lines long). The goal, again, is to let someone reproduce the problem on their own machine using only content directly copy-and-pasted from the question. – Charles Duffy Jun 29 '18 at 22:20
  • (That said, `head | tail` is reading the entire file from the front -- I don't see a compelling reason to do that using external commands rather than implementing it natively in Python). – Charles Duffy Jun 29 '18 at 22:22
  • That's fair, it just seemed to be the most straightforward and easiest to write way of doing it at the time... I just edited the code in my first edit to make it copy-pastable. – Nicolas Gastellu Jun 29 '18 at 22:34
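As a rough sketch of the native-Python approach suggested in the comments above (Python 3 syntax; the function name `tail_of_head` and the sample file are hypothetical, not from the question), the `head -n N | tail -n M` selection can be done in one pass without any subprocess, which also removes the need to wait on anything:

```python
import os
import tempfile
from collections import deque
from itertools import islice


def tail_of_head(path, head_count, tail_count):
    """Return the last `tail_count` lines among the first `head_count` lines."""
    with open(path) as src:
        # islice stops after head_count lines (like head -n);
        # deque(maxlen=...) keeps only the most recent tail_count of them (like tail -n).
        return list(deque(islice(src, head_count), maxlen=tail_count))


# Small made-up sample file so the example is copy-pastable.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w") as f:
    f.writelines("line %d\n" % i for i in range(1, 11))

selected = tail_of_head(sample_path, head_count=6, tail_count=2)
print(selected)
```

For the file in the question, the equivalent call would presumably be `tail_of_head('nnp-pos-1.xyz', 300000, 198)`. Only 198 lines are held in memory at any time, since the deque discards older lines as it goes.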

2 Answers

2

You probably need to flush() the buffers first (and maybe call os.fsync() too) - after writing and before reading.

See file.flush() and this post.
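A minimal sketch of what that looks like (Python 3 syntax, with a made-up temporary file), assuming the same Python process does both the writing and the reading:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

writer = open(path, "w")
writer.write("line one\nline two\n")
writer.flush()             # push Python's internal buffer to the operating system
os.fsync(writer.fileno())  # ask the OS to commit its own buffers to disk

# Read the file through a second handle while the writer is still open.
with open(path) as reader:
    lines_seen = reader.readlines()

writer.close()
print(lines_seen)
```

Note that this only covers data buffered by the Python file object itself. If a child process is the one writing (as in the question's subprocess code), flushing the Python handle does nothing for the child's output, and the script has to wait for the child to finish instead.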

Danny_ds
-1

First, include the absolute path. Second, check that the file actually exists and is not empty:

import os

FILEPATH = r'path\to\file.txt' # full path as raw string
if os.path.isfile(FILEPATH) and (os.stat(FILEPATH).st_size != 0):
    with open(FILEPATH) as fo:
        flines = fo.readlines()
        print flines
else:
    print FILEPATH, "doesn't exist or is empty"
sudonym
  • I have tried using the absolute path in the way you just suggested and the result is left unchanged. The script also correctly produces the file; I can `cat` it and verify that all the lines that need to be there have been written using `wc -l`. Thank you very much for your suggestion though! – Nicolas Gastellu Jun 29 '18 at 21:34
  • are you expecting content with any non-standard encoding? – sudonym Jun 29 '18 at 21:39
  • Actually, I just realised that you are right, the file is apparently empty when I want to read from it (even though I try reading from it after writing to it and closing the relevant file pointer); `(os.stat(FILEPATH).st_size != 0)` evaluates to `False` for some odd reason... – Nicolas Gastellu Jun 29 '18 at 21:45
  • No, the encoding is standard (AFAIK); the file contains only numbers and the letters `Ti` and `O`. – Nicolas Gastellu Jun 29 '18 at 21:46
  • Unfortunately it is not, I am really not sure why the file is empty after I write to it and close the file object that handles the writing. What's even weirder is that after the script is done running, the file is not empty and contains everything it is supposed to... – Nicolas Gastellu Jun 29 '18 at 21:50
  • When you handle all files in an object-oriented way (with file) this should not happen. Do you maybe have your whole logic under one with statement? That could explain why the file content only becomes apparent afterwards. – sudonym Jun 29 '18 at 21:52
  • At first I was doing everything under one `with` statement with the `w+` mode, however when I realised my script was not working properly, I separated the logic into two non-nested statements: one for writing and the other reading, as shown in my question. – Nicolas Gastellu Jun 29 '18 at 21:58