
I'm facing a dead-kernel problem. I'm trying to read over 2000 .txt files into a list of lists; my_path contains the paths to these 2000+ files. I tried try/except as below, but it didn't help. The kernel seems to die randomly: I tried to find the files where it breaks, but it breaks on files that weren't a problem during a previous run.

my_list = []
for path in my_path:
    try:
        with open(path) as f:  # the with block closes the file; no f.close() needed
            my_list.append(f.read().splitlines())
    except OSError:  # a bare except: would silently swallow unrelated errors
        print(path)

I also tried opening the files where the kernel died separately, and they seem to work fine. I assume something is wrong with my loop?

UPD: I'm using EndeavourOS, Jupyter in VS Code, 16 GB of RAM. I split the paths, and it looks like I'm running out of memory. I tried `del ...` and `gc.collect()`, but they don't free the memory, and once usage goes over 12 GB the kernel crashes.
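One way to cap memory is to keep only the first part of each file, e.g. with `itertools.islice` (a sketch; the 1000-line cap and the throwaway sample files are assumptions for illustration, standing in for the real `my_path`):

```python
import os
import tempfile
from itertools import islice

MAX_LINES = 1000  # assumed cap; tune to your data

# two small sample files standing in for the 2000+ real ones
tmpdir = tempfile.mkdtemp()
my_path = []
for i in range(2):
    p = os.path.join(tmpdir, f"file{i}.txt")
    with open(p, "w") as f:
        f.write("\n".join(f"line {n}" for n in range(5)))
    my_path.append(p)

my_list = []
for path in my_path:
    with open(path) as f:
        # islice stops after MAX_LINES lines, so the rest of a
        # huge file is never read into memory
        my_list.append([line.rstrip("\n") for line in islice(f, MAX_LINES)])
```

Because `islice` consumes the file object lazily, a file with millions of lines still only contributes at most `MAX_LINES` strings to `my_list`.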

haven
  • It would be good if you could add more information to the question, e.g. memory usage, system you're running, etc. – Timus Oct 13 '21 at 08:18
  • Okay, then my guess would be that `my_list` just gets too big. What are you trying to do with `my_list`? Maybe a more _lazier_ approach would avoid the problem (like producing the lines through a generator when needed)? – Timus Oct 13 '21 at 14:53
  • @Timus thanks a lot, I think you are right, the final list gets way too large, which overloads the system. I'm not quite sure what you mean by `producing the lines through a generator when needed`, but after some data inspection I'm just limiting it to the first 1000 lines per file, and it seems to work okay for my sample. – haven Oct 15 '21 at 01:59
  • Sorry for the cryptic language. What I meant is: By using generators you can construct objects which can parse large amounts of data pretty memory-efficiently (_"lazy"_). Whether that option is available here or not depends on what you want to achieve, therefore my question regarding your intentions with `my_list`. If you're interested look [here](http://www.dabeaz.com/generators-uk/GeneratorsUK.pdf) for example (or [here](http://www.dabeaz.com/generators2/index.html?utm_source=pocket_mylist)). – Timus Oct 15 '21 at 07:42
  • ... or [this](https://stackoverflow.com/questions/17444679/reading-a-huge-csv-file) question and the _accepted_ answer. (No guarantee, though, that it is applicable in your use case.) – Timus Oct 15 '21 at 11:08
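The generator approach Timus describes could look roughly like this (a sketch, not the commenter's actual code; the `iter_file_lines` name and the throwaway demo files are assumptions):

```python
import os
import tempfile

def iter_file_lines(paths):
    """Yield one file's lines at a time, so only a single
    file's contents are ever held in memory."""
    for path in paths:
        with open(path) as f:
            yield f.read().splitlines()

# demo: two throwaway files standing in for the real my_path
tmpdir = tempfile.mkdtemp()
my_path = []
for i in range(2):
    p = os.path.join(tmpdir, f"file{i}.txt")
    with open(p, "w") as f:
        f.write("a\nb\nc")
    my_path.append(p)

# consume lazily: process each file's lines, then let them be freed
line_counts = [len(lines) for lines in iter_file_lines(my_path)]
```

The key difference from building one big `my_list` is that each file's lines are processed and then discarded before the next file is read, so peak memory is bounded by the largest single file rather than the sum of all of them.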

0 Answers