
This is my code. The file has about 1 million lines and is 134 MB. I don't think that is a big file, so why do I always fail to load it? It shows "out of memory" after reading about 700,000 lines. Is there some Lua mechanism I don't know about? I am using LuaJIT.

local stringx = require('pl.stringx')  -- Penlight, provides stringx.split

function unsupervised_re.read_seq_ids(seq_path)
    local seq_ids = {}
    local file = assert(io.open(seq_path, 'r'))
    local count = 0
    while true do
        local line = file:read()        -- one line per call, nil at EOF
        if line == nil then break end
        count = count + 1
        print(count)                    -- progress trace
        local tokens = stringx.split(line, ' ')
        seq_ids[#seq_ids + 1] = tokens  -- keep every line's token table
    end
    file:close()
    return seq_ids
end
hidemyname
  • Why do you need to copy `tokens` to `ids`? – Oleg V. Volkov Mar 14 '16 at 15:57
  • Just for organizing the data conveniently. – hidemyname Mar 14 '16 at 15:59
  • How? You're just making an exact copy. – Oleg V. Volkov Mar 14 '16 at 16:00
  • Yeah, I see. Maybe that's a flaw, but I don't think it causes the "out of memory": when I put the task on the cluster, I assigned it 128 GB of RAM, and the error was still there. 128 GB should be enough for a file even 100 times this size. – hidemyname Mar 14 '16 at 16:06
  • I modified the code, and now it stops at 900,000 lines. Still "out of memory". – hidemyname Mar 14 '16 at 16:09
  • 1
    There's hard limit of 2Gb that can be used by LuaJIT thanks to some GC quirks. See http://stackoverflow.com/questions/27015150/how-to-get-past-1gb-memory-limit-of-64-bit-luajit-on-linux. You need to either reduce overhead (just like that minimal change did, but there's not much else you can do), use better data structures or manage memory manually with FFI. – Oleg V. Volkov Mar 14 '16 at 16:12
  • Thanks, that seems to be the problem. I just searched; somebody suggested I could switch to Lua 5.2. I'll give it a try. – hidemyname Mar 14 '16 at 16:14
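
A back-of-the-envelope estimate shows why this dies long before 134 MB of raw data would suggest: `stringx.split` creates a fresh Lua table plus one interned string per token, each costing tens of bytes of GC-heap overhead, so a million lines quickly add up to more memory than LuaJIT's GC can address. Below is a minimal sketch (not from the original post) of the FFI route suggested in the comments; it assumes the tokens are integer ids that fit in an int32_t, and stores each line's ids in a malloc'd buffer, which lives outside the GC-managed heap and therefore does not count toward the limit.

local ffi = require('ffi')
local stringx = require('pl.stringx')

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Allocate an int32_t array with malloc so the payload lives outside
-- the LuaJIT GC heap; ffi.gc arranges for free() when it is collected.
local function new_int32_buf(n)
    local ptr = ffi.C.malloc(n * 4)
    assert(ptr ~= nil, 'malloc failed')
    return ffi.gc(ffi.cast('int32_t *', ptr), ffi.C.free)
end

local function read_seq_ids_ffi(seq_path)
    local seqs, lens = {}, {}  -- per line: a cdata pointer and a length
    for line in io.lines(seq_path) do
        local tokens = stringx.split(line, ' ')
        local n = #tokens
        local buf = new_int32_buf(n)
        for i = 1, n do
            buf[i - 1] = tonumber(tokens[i])  -- C arrays are 0-based
        end
        seqs[#seqs + 1] = buf
        lens[#lens + 1] = n
        -- `line` and `tokens` become garbage right away; only a small
        -- cdata handle and two table slots stay on the GC heap per line
    end
    return seqs, lens
end

With this layout, only the pointers and the two bookkeeping tables remain under GC control, so the bulk of the token ids sits in plain C memory. Switching to Lua 5.2 (or plain Lua 5.1), whose allocator has no such limit, is the other way out.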

0 Answers