Maximum number of index of a data-frame to plot any graph using matplotlib or seaborn

Asked Jan 17 '22 at 00:35

Active Jan 17 '22 at 07:23

I am trying to plot a Seaborn jointplot in a Jupyter notebook. My dataset has 4,446,966 rows. I get the plot if I select only around 5,000 rows, but if I select the complete dataset, the cell runs for a long time and never produces any output.

Python / Pandas / Seaborn / Matplotlib / Jupyter Notebook / Google Colab / EDA / Feature Engineering

Shaakir Ahamed

4.5 million rows x 29 columns is a pretty big dataframe. You can try increasing the default memory limit of the Jupyter notebook, as detailed [here](https://stackoverflow.com/questions/57948003/how-to-increase-jupyter-notebook-memory-limit). – Derek O Jan 17 '22 at 01:15
Another idea is to take a random subset of the data and plot that. You could start with 5000 random rows and slowly increase that number, convincing yourself that the random subset is (or is not) a good representation of the complete data. See e.g. [this post](https://stackoverflow.com/questions/22258491/read-a-small-random-sample-from-a-big-csv-file-into-a-python-data-frame) for some approaches to taking a random subset while reading a CSV file. – JohanC Jan 18 '22 at 09:59
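
A minimal sketch of the random-subset idea from the comment above, assuming the data is already loaded into a pandas DataFrame. The helper names `random_subset` and `plot_subset`, the column names `a` and `b`, and the synthetic data are all hypothetical stand-ins, not part of the original question. `kind="hex"` is only one possible choice; binned plots tend to cope better with many points than a scatter, but that is an assumption to check on the real data.

```python
import numpy as np
import pandas as pd


def random_subset(df: pd.DataFrame, n: int, seed: int = 0) -> pd.DataFrame:
    """Return n randomly chosen rows (or all rows if the frame has fewer than n)."""
    return df.sample(n=min(n, len(df)), random_state=seed)


def plot_subset(df: pd.DataFrame, x: str, y: str, n: int = 5000):
    """Draw a jointplot of a random subset of df (hypothetical helper)."""
    import seaborn as sns  # imported lazily; only needed when actually plotting

    return sns.jointplot(data=random_subset(df, n), x=x, y=y, kind="hex")


# Synthetic stand-in for the real dataset (column names are made up).
rng = np.random.default_rng(0)
demo = pd.DataFrame({"a": rng.normal(size=100_000), "b": rng.normal(size=100_000)})

# Grow the sample step by step and compare a summary statistic with the full data.
for n in (5_000, 20_000, 80_000):
    subset = random_subset(demo, n)
    print(n, round(float(subset["a"].mean()), 3), round(float(demo["a"].mean()), 3))
```

If the statistics (and the plots from `plot_subset(demo, "a", "b", n=...)`) stop changing noticeably as `n` grows, the subset is probably representative enough for exploratory work.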

0 Answers