I've been working with AWS Glue and PySpark DataFrames for the past few days. I need to calculate the size of my DataFrames, but I have been getting very inconsistent results. Calling Python's sys.getsizeof() on a DataFrame loaded from a 1.7 MB JSON file returns 56 bytes, and I am not sure how to deal with this. Are there any suggestions on what to do?
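For illustration, here is a minimal sketch of why sys.getsizeof() can report only a few dozen bytes for an object that represents far more data. It measures only the Python object passed to it, not anything that object refers to. A PySpark DataFrame is, as far as I understand, a thin Python handle to data held on the JVM side, so the same effect would apply. The `Handle` class below is a hypothetical stand-in, not a Spark class, and the 1.7 MB payload is made up to mirror the file size above.

```python
import sys


class Handle:
    # Hypothetical stand-in for a DataFrame: a small Python object that
    # only holds a reference to data stored elsewhere.
    def __init__(self, payload):
        self.payload = payload


# Roughly 1.7 MB of dummy data, mirroring the JSON file size in the question.
payload = b"x" * 1_700_000
handle = Handle(payload)

# getsizeof reports the wrapper object itself, not the data it points to.
handle_size = sys.getsizeof(handle)
# Measuring the bytes object directly does include its contents.
payload_size = sys.getsizeof(payload)

print("handle:", handle_size, "bytes")
print("payload:", payload_size, "bytes")
```

The handle's reported size stays tiny no matter how large the payload is, which matches the 56-byte result described above.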
- @Florian unfortunately I have read that thread and the solutions don't work. – codingEnthusiast Jul 12 '18 at 01:33
- Please elaborate then in your question exactly what you tried, and why it did not work. Did you run into errors? Did you get unexpected results? Otherwise it is difficult to help you. – Florian Jul 12 '18 at 05:31