
I have a large dataframe with around 19,000 rows and 150 columns. Many of these columns contain -1s and -2s. When I try to replace the -1s and -2s with 0 using the code below, Jupyter times out and reports that there is no memory left. So I am curious whether you can select a range of columns and apply the replace function to just those. That way I could replace in batches, since I can't seem to do it in one pass with my available memory.

Here is the code I tried to use, which timed out when first replacing the -2s:

df.replace(to_replace=-2, value="0").

Thank you for any guidance!

Sean

Seano

1 Answer


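Let's say you want to divide your columns into chunks of 10. Then you could try something like this:

columns = your_df.columns
division_num = 10

# Step through the columns in chunks of division_num; the last chunk may be shorter.
for start in range(0, len(columns), division_num):
    chunk = columns[start:start + division_num]
    # replace() returns a new object, so assign the result back to the dataframe.
    # The integer 0 (not the string "0") keeps the columns numeric.
    your_df[chunk] = your_df[chunk].replace(to_replace=-2, value=0)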

If you still run out of memory, you can try splitting the data by rows instead of columns, for example with the loc indexer.
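
A rough sketch of the row-wise variant, in case it helps. The function name replace_in_row_chunks, the chunk_size parameter and the small demo dataframe are made up for illustration, and it uses iloc (positional slicing) rather than loc, so it does not depend on what the index labels look like. I have not measured whether it actually lowers peak memory on a dataframe of your size.

```python
import pandas as pd


def replace_in_row_chunks(df, to_replace, value, chunk_size=5000):
    """Replace values a block of rows at a time, writing each block back into df."""
    for start in range(0, len(df), chunk_size):
        stop = min(start + chunk_size, len(df))
        # iloc slices by position; stop is exclusive.
        # replace() returns a new object, so the result is assigned back.
        df.iloc[start:stop] = df.iloc[start:stop].replace(
            to_replace=to_replace, value=value
        )
    return df


# Tiny made-up dataframe, just to show the call.
demo = pd.DataFrame({"a": [1, -1, -2, 3, -2], "b": [-2, 5, -1, -1, 0]})
replace_in_row_chunks(demo, to_replace=[-1, -2], value=0, chunk_size=2)
print(demo)
```

Passing a list to to_replace handles the -1s and -2s in the same pass, so you only walk over the data once.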