
I'm trying to read a 3.6 GB (3559267227 bytes) CSV file with this command:

import pandas as pd
dataset = pd.read_csv("dataAnalysis/data/featureEngineering/weather_VCI_data_with_skyc1_reindex.csv")

When I run it, the process dies and the terminal just prints "Killed". I suspect it's a memory problem, which is why I checked my limits with this command:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62927
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 62927
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

What should I change and how, to be able to load the data set?
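A plain "Killed" with no Python traceback usually means the Linux OOM killer terminated the process, so raising `ulimit` values won't help; the fix is to make `read_csv` use less memory. One common approach is to load only the columns you need and give text columns compact dtypes instead of the default Python-object strings. A minimal sketch, using an in-memory stand-in for the real file (the `station` and `tmpf` column names are hypothetical; only `skyc1` is suggested by the file name):

```python
import io
import pandas as pd

# Small in-memory stand-in for the real 3.6 GB CSV.
# Column names other than skyc1 are hypothetical examples.
csv_data = io.StringIO(
    "station,tmpf,skyc1\n"
    "KJFK,55.9,CLR\n"
    "KLGA,54.0,OVC\n"
)

# usecols drops columns you don't need at parse time;
# category/float32 dtypes shrink what remains considerably
# compared with object strings and float64.
df = pd.read_csv(
    csv_data,
    usecols=["station", "tmpf", "skyc1"],
    dtype={"station": "category", "skyc1": "category", "tmpf": "float32"},
)
```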

  • Please provide the full error stack. I suggest you have a look at chunks to read a dataset in smaller pieces – Grall Apr 12 '22 at 10:33
  • 1
    What also did you test? Take ideas from here: https://stackoverflow.com/q/25962114/842935 – dani herrera Apr 12 '22 at 10:36
  • How can I provide the full error stack? If I use the same command and data set on Windows I don't get an error, so it's related to Linux. – Scoby Apr 12 '22 at 10:37
  • What size is your `weather_VCI_data_with_skyc1_reindex.csv` file? – K.Mat Apr 12 '22 at 10:59
  • The size of the weather_VCI_data_with_skyc1_reindex.csv file is 3.6GB (3559267227 bytes) . – Scoby Apr 12 '22 at 11:03
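The chunked reading suggested in the comments can be sketched as follows. Passing `chunksize` to `pd.read_csv` returns an iterator of DataFrames instead of one giant frame, so each piece can be filtered or aggregated and released before the next one is read (here a tiny in-memory CSV stands in for the real file):

```python
import io
import pandas as pd

# Stand-in for the real 3.6 GB file: ten rows of a two-column CSV.
csv_data = io.StringIO("a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10)))

pieces = []
# chunksize is the number of rows per chunk; tune it so one chunk
# (plus whatever you keep from it) fits comfortably in RAM.
with pd.read_csv(csv_data, chunksize=4) as reader:
    for chunk in reader:
        # Reduce each chunk here (filter rows, aggregate, downcast dtypes)
        # before keeping it, so the full file never sits in memory at once.
        pieces.append(chunk)

# Only concatenate if the reduced pieces are small enough to fit in RAM.
df = pd.concat(pieces, ignore_index=True)
```

If even the reduced pieces don't fit, keep only running aggregates per chunk instead of appending the chunks themselves.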

0 Answers