I am working on unpivoting a pandas dataframe and I'm running into a MemoryError on the following line of code (which runs after a melt() operation):
delimited_table = df["value"].str.split(",", expand=True)
The dataframe looks a bit like this:
+-----------+---------+
| ContactID | value   |
+-----------+---------+
| pd.Data   | A,C     |
| pd.Data   | D,E,F   |
| pd.Data   | G,H,I,K |
| ...       | ...     |
+-----------+---------+
For kicks and giggles, here's the exact error message:
MemoryError: Unable to allocate array with shape (92, 12513354) and data type object
My problem is that I can't delete rows because it's all necessary data, and the df is 12.5 million rows, so expanding the whole column in memory at once (even on 64-bit) is clearly not feasible. What are some ways I can work through the df in smaller pieces (or row by row), apply the str.split method, and return the delimited values while making sure the number of columns is consistent across all rows to accommodate the expansion?
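For reference, one approach I've been considering is processing the column in fixed-size chunks and padding each chunk to a precomputed column count so the pieces align. This is only a sketch with made-up sample data mirroring the table above; the chunk size and column names are placeholders:

```python
import pandas as pd

# Hypothetical sample mirroring the structure described above.
df = pd.DataFrame({
    "ContactID": [1, 2, 3],
    "value": ["A,C", "D,E,F", "G,H,I,K"],
})

# Determine the maximum number of delimited fields up front so that
# every chunk expands to the same number of columns.
max_cols = df["value"].str.count(",").max() + 1

chunk_size = 2  # tiny here; something like 100_000 in practice
chunks = []
for start in range(0, len(df), chunk_size):
    part = df["value"].iloc[start:start + chunk_size].str.split(",", expand=True)
    # Pad with NaN columns so all chunks have max_cols columns.
    part = part.reindex(columns=range(max_cols))
    chunks.append(part)

delimited_table = pd.concat(chunks)
```

This still materializes the full result at the end, but avoids expanding everything in a single str.split call; I'm not sure whether the peak memory saving is enough for 12.5 million rows.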