I am working on an AWS EMR cluster with data stored in S3. After cleaning the data, I write it back to S3 using the s3fs library. The code works for files between roughly 200 and 500 MB, but when I upload files between 2.0 and 2.5 GB it fails with a MemoryError. Does anyone have any ideas or experience with this issue?
import s3fs

# Serialize the whole DataFrame to one CSV string in memory, then make a bytes copy
bytes_to_write = nyc_green_20161.to_csv(None).encode()

fs = s3fs.S3FileSystem(key='#', secret='#')
with fs.open('s3://ludditiesnyctaxi/new/2016/yellow/yellow_1.csv', 'wb') as f:
    f.write(bytes_to_write)
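
For context on where the MemoryError likely comes from: to_csv(None) builds the entire CSV as one Python string and .encode() then makes a second full copy as bytes, so a 2+ GB file needs several gigabytes of RAM before the upload even starts. Below is a minimal sketch of a lower-memory alternative, assuming nyc_green_20161 is a pandas DataFrame and reusing the same bucket/key; chunk_rows is a hypothetical slice size, not something from the original code:

import s3fs

fs = s3fs.S3FileSystem(key='#', secret='#')

chunk_rows = 100_000  # hypothetical slice size; tune to the instance's memory

with fs.open('s3://ludditiesnyctaxi/new/2016/yellow/yellow_1.csv', 'wb') as f:
    for start in range(0, len(nyc_green_20161), chunk_rows):
        chunk = nyc_green_20161.iloc[start:start + chunk_rows]
        # Render only this slice to CSV text; write the header just once, for the first slice
        f.write(chunk.to_csv(header=(start == 0)).encode())

As far as I understand, s3fs buffers the written bytes and uploads them to S3 in parts once its block size is exceeded, so peak memory stays around one slice plus the upload buffer rather than the full multi-gigabyte CSV string.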