
I am having trouble loading some data into Python using the readlines() method. Strangely, the same code works perfectly fine in another directory, on a file with the same name and the same size. The file is quite large (about 2.6 GB), which is probably part of the problem. I've looked at the post "OSError: [Errno 22] Invalid argument" when read()ing a huge file, but it only provides a solution for checking hashes and uses a different method, which is not really what I'm aiming for here.

As I said, doing basically the same thing (same file name, code, file size) in another directory works fine.

The error is as follows:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_19332\1902527957.py in <module>
----> 1 data=wf_loader('adb', state=2)
      2 #wf_loader('adb', state=1)
      3 #wf_loader('diab', state=1)
      4 #wf_loader('diab', state=2)

~\AppData\Local\Temp\ipykernel_19332\168813414.py in wf_loader(representation, state, save)
     22     with open(file) as di_data:
     23         print(di_data)
---> 24         lines = di_data.readlines()
     25 
     26     xlims, ylims, x_gridpoints, y_gridpoints = get_data_shape_fromlog()

OSError: [Errno 22] Invalid argument

The part of the function 'wf_loader' that fails is shown below. You can safely ignore the 'state' and 'save' arguments; they are used later in the function and are probably not relevant.

import os

def wf_loader(representation='adb', state=2, save=True):
    '''
    Returns a numpy array for the wavefunction density progressing in time

    representation: 'adb' opens the adiabatic wf, 'diab' opens the diabatic wf, default 'adb'

    state: which state's wavefunction to look at, 1 or 2, default 2

    save: saves the numpy array as a binary file for easier loading later
    '''

    cwd = os.getcwd()

    # Pick the input file that matches the requested representation
    if representation == 'adb':
        file = os.path.join(cwd, 'adb2d_x_y')
    elif representation == 'diab':
        file = os.path.join(cwd, 'dens2d_x_y')
    else:
        print('Unknown representation')
        return

    # Loads the whole file into memory as a list of lines
    with open(file) as di_data:
        print(di_data)  # debug output: shows the file object, mode and encoding
        lines = di_data.readlines()  # the OSError is raised here

My file structure is as follows: I am running a Jupyter notebook in the same directory as the 'adb2d_x_y' and 'dens2d_x_y' files. I only added the os.getcwd() and os.path.join() parts to see if they would make a difference; in the other directory, where the code works, it runs perfectly well using just file = 'adb2d_x_y'.

Also, 'adb2d_x_y' and 'dens2d_x_y' do not have file extensions because they were imported from a niche program running on Linux, but they can be opened in a text editor.

Any help at all here? I feel like it could be some crazy memory error and I would be in way over my head getting into all that (not that it's not something to learn for another time).

Nemo
  • You forgot the "read" mode: `with open(file, "r") as di_data:`. – Guimoute Mar 21 '23 at 20:38
  • I don't know if this helps but I hope at least to gather information about the problem: Try to replace `lines = di_data.readlines()` by `lines = list(di_data)` – Michael Butscher Mar 21 '23 at 20:40
  • @Guimoute `"r"` is the default file mode if no mode given, see https://docs.python.org/3/library/functions.html?highlight=open#open – Michael Butscher Mar 21 '23 at 20:43
  • It's not necessary to join `cwd` -- files are looked up in the current directory by default. – Barmar Mar 21 '23 at 21:01
  • @michaelbutscher yeah that's what I thought. I'll try that and let you know – Nemo Mar 21 '23 at 22:00
  • @barmar I mentioned that yes – Nemo Mar 21 '23 at 22:01
  • Voting to re-open. The solution proposed in the "duplicate" is to split the file into small chunks. However, that is not trivial to implement if you eventually want to split the file into lines because the position of line ending characters within the file is not known a priori. – Joooeey Mar 22 '23 at 08:20
  • This question would benefit from using a stand-alone [mcve] code sample so we can test with minimum effort. The question is how to produce a sample file in the first place. Maybe something like this with `\n` thrown in would work: https://stackoverflow.com/a/45122730/4691830 – Joooeey Mar 22 '23 at 08:27
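
The two ideas raised in the comments, producing a stand-alone sample file and reading a large file in pieces without knowing in advance where the line endings fall, can be sketched together. This is only a minimal sketch under stated assumptions: `write_sample_file` and `iter_lines_chunked` are hypothetical helper names, the sample line content and the 64 MiB chunk size are arbitrary choices, the file is assumed to be UTF-8 (or plain ASCII) text, and it has not been confirmed that this avoids the OSError from the question.

```python
import os
import tempfile


def write_sample_file(path, n_lines, line="0.0 0.0 1.0e-3"):
    """Write a text file with n_lines identical lines (hypothetical test data)."""
    with open(path, "w", newline="\n") as f:
        for _ in range(n_lines):
            f.write(line + "\n")


def iter_lines_chunked(path, chunk_size=64 * 1024 * 1024):
    """Yield lines (without line endings) using fixed-size binary reads.

    Each read() call asks for at most chunk_size bytes. A line that is cut
    off at the end of a chunk is carried over and completed by the next one.
    """
    leftover = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parts = (leftover + chunk).split(b"\n")
            leftover = parts.pop()  # last piece may be an incomplete line
            for part in parts:
                yield part.rstrip(b"\r").decode()
    if leftover:
        # File did not end with a newline: emit the final partial line
        yield leftover.rstrip(b"\r").decode()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "adb2d_x_y")
        write_sample_file(sample, n_lines=1000)
        # A tiny chunk size forces many lines to straddle chunk boundaries
        lines = list(iter_lines_chunked(sample, chunk_size=100))
        print(len(lines))
```

Increasing `n_lines` gives a file large enough to try to reproduce the error, and `list(iter_lines_chunked(file))` could stand in for `di_data.readlines()` in `wf_loader`, although holding every line of a 2.6 GB file in memory may still be costly.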
