The code below reads a CSV file, filters it, and writes the result to another CSV file. It worked perfectly fine, but once the number of rows in the CSV grew it started raising a MemoryError. I tried changing -Xms to 512m, -Xmx to 2024m, and -XX:ReservedCodeCacheSize to 480m, but I still get the memory error.
Traceback (most recent call last):
  File "/root/PycharmProjects/AppAct/statfile.py", line 5, in <module>
    df = df.astype(float)
  File "pandas/core/generic.py", line 5691, in astype
    **kwargs)
  File "pandas/core/internals/managers.py", line 531, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "pandas/core/internals/managers.py", line 402, in apply
    bm._consolidate_inplace()
  File "pandas/core/internals/managers.py", line 929, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "pandas/core/internals/managers.py", line 1899, in _consolidate
    _can_consolidate=_can_consolidate)
  File "pandas/core/internals/blocks.py", line 3149, in _merge_blocks
    new_values = new_values[argsort]
MemoryError
import pandas as pd

# load the full dataset into memory
all_df = pd.read_csv("/root/Desktop/Time-20ms/AllDataNew20ms.csv")

# take every column except the "activity" label column
df = all_df.loc[:, all_df.columns != "activity"]
df = df.astype(float)  # <-- this is the line that raises the MemoryError

# keep only rows that have at least one non-zero value
mask = (df != 0).any(axis=1)
df = df[mask]

# re-attach the "activity" column for the surviving rows and write out
recover_lines_of_activity_column = all_df["activity"][mask]
final_df = pd.concat([recover_lines_of_activity_column, df], axis=1)
final_df.to_csv("/root/Desktop/Dataset.csv", index=False)
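For context, here is a memory-friendlier sketch of the same filtering logic that streams the file with `read_csv(chunksize=...)` instead of loading it all at once, so only one chunk is in memory at a time. The function name, chunk size, and paths are illustrative assumptions, not part of the original script; the per-chunk filtering mirrors the code above.

```python
import pandas as pd

def filter_nonzero_rows(in_path: str, out_path: str, chunksize: int = 100_000) -> None:
    """Stream the CSV in chunks; keep rows where any non-'activity' column is non-zero."""
    first = True
    for chunk in pd.read_csv(in_path, chunksize=chunksize):
        # same logic as the original script, applied per chunk
        values = chunk.loc[:, chunk.columns != "activity"].astype(float)
        mask = (values != 0).any(axis=1)
        out = pd.concat([chunk["activity"][mask], values[mask]], axis=1)
        # write the header with the first chunk, then append
        out.to_csv(out_path, mode="w" if first else "a", header=first, index=False)
        first = False
```

Because each chunk is filtered and written out before the next one is read, peak memory is bounded by the chunk size rather than the file size.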