By default, whenever I view a Series or DataFrame, it only gives me the first five rows and the last five rows as a preview. How do I view all the rows? Is there a method for that?

For example,

df[df["First Name"].duplicated()]
     First Name  Gender  Start Date      Last Login Time  Salary  Bonus %  Senior Management             Team
327       Aaron    Male  1994-01-29  2020-04-22 18:48:00   58755    5.097               True        Marketing
440       Aaron    Male  1990-07-22  2020-04-22 14:53:00   52119   11.343               True  Client Services
937       Aaron     NaN  1986-01-22  2020-04-22 19:39:00   63126   18.424              False  Client Services
141        Adam    Male  1990-12-24  2020-04-22 20:57:00  110194   14.727               True          Product
302        Adam    Male  2007-07-05  2020-04-22 11:59:00   71276    5.027               True  Human Resources
...         ...     ...         ...                  ...     ...      ...                ...              ...
902         NaN    Male  2001-05-23  2020-04-22 19:52:00  103877    6.322               True     Distribution
925         NaN  Female  2000-08-23  2020-04-22 16:19:00   95866   19.388               True            Sales
946         NaN  Female  1985-09-15  2020-04-22 01:50:00  133472   16.941               True     Distribution
947         NaN    Male  2012-07-30  2020-04-22 15:07:00  107351    5.329               True        Marketing
951         NaN  Female  2010-09-14  2020-04-22 05:19:00  143638    9.662               True              NaN
Brian Bergstrom
  • See options and settings here: https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html – chappers Apr 23 '20 at 00:51

3 Answers

You can change the pandas display options (which also control what Jupyter shows) like so:

pd.set_option('display.max_rows', df.shape[0])
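
If you would rather not change the option globally, here is a minimal sketch of a temporary override. It uses pd.option_context, which restores the previous setting when the with block exits. The 100-row DataFrame is made up purely for illustration.

```python
import pandas as pd

# Illustrative data only: 100 rows is more than the default display.max_rows.
df = pd.DataFrame({"value": range(100)})

# Inside the block, no row limit applies; the old limit returns afterwards.
with pd.option_context("display.max_rows", None):
    full_text = repr(df)

# Outside the block, the default truncated preview is back.
truncated_text = repr(df)

print(full_text)
```

In a notebook you would call display(df) or print(df) inside the with block instead of building the text with repr.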
Matthew Borish

An alternative to pd.set_option() is a custom loop that prints the DataFrame in chunks of 60 rows (or whatever your maximum number of printed rows is). The column headers are repeated for every chunk, but it was a fun alternative to code, and it seems to be a viable way to print a large number of rows (100,000 or more). I created a random DataFrame of floats with 100,000 rows, and the loop took less than 1 second to run.

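import numpy as np
import pandas as pd

nrows = 100000
chunk_size = 60  # keep this at or below display.max_rows (60 by default), or each chunk gets truncated
df = pd.DataFrame(np.random.rand(nrows, 4), columns=list('ABCD'))

# Print one chunk at a time; iloc clips the final slice automatically.
for start in range(0, nrows, chunk_size):
    print(df.iloc[start:start + chunk_size])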
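
If the repeated headers bother you, here is a rough sketch of a variation that prints the header only once. The helper name iter_chunk_text is hypothetical, and it relies on DataFrame.to_string(header=...) to suppress the header after the first chunk. One caveat: column widths are computed per chunk, so the alignment can shift slightly between chunks.

```python
import numpy as np
import pandas as pd


def iter_chunk_text(df, chunk_size=60):
    """Yield the DataFrame as text, chunk_size rows at a time (hypothetical helper)."""
    for start in range(0, len(df), chunk_size):
        chunk = df.iloc[start:start + chunk_size]
        # to_string() is not limited by display.max_rows;
        # only the first chunk keeps the column header.
        yield chunk.to_string(header=(start == 0))


# Small illustrative frame: 130 rows gives chunks of 60, 60 and 10 rows.
df = pd.DataFrame(np.random.rand(130, 4), columns=list("ABCD"))
chunks = list(iter_chunk_text(df))
for text in chunks:
    print(text)
```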

The benefit of my approach depends on how many rows you want to show. I tried raising the row limit with pd.set_option on 100,000 rows, and when I displayed the DataFrame by evaluating df (instead of print(df)), my page became unresponsive. That is because the rendered output creates a very long page with no scrollbar, whereas printed output gets a scrollbar. Printing is therefore far less intensive and, in my opinion, better practice for showing a large number of rows.

Okay, so why not just remove the limit with pd.set_option('display.max_rows', None) and call print(df)? Wouldn't that work?

Well, that worked for 10,000 rows, but I received this error with 100,000 rows:

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)

You could raise NotebookApp.iopub_data_rate_limit, but that gets more technical: you might have to go to the command line and change config settings. See the question "IOPub data rate exceeded in Jupyter notebook (when viewing image)".

My solution lets you print all rows without changing pd.options or manually editing these limits in configuration files. Of course, this again depends on how many rows you want to print in your terminal.

David Erickson

This is explained in the following link.

https://thispointer.com/python-pandas-how-to-display-full-dataframe-i-e-print-all-rows-columns-without-truncation/

An excerpt from the link provides these 4 options.

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)  # the linked article uses -1, which was deprecated in pandas 1.0; use None instead
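
These settings stay in effect for the rest of the session. As a small sketch of how to undo one of them, pd.reset_option puts an option back to its default. The variable names below are only for illustration.

```python
import pandas as pd

# Remember the default so the round trip can be checked.
default_max_rows = pd.get_option("display.max_rows")

# Remove the row limit, as in the options above.
pd.set_option("display.max_rows", None)
unlimited = pd.get_option("display.max_rows")

# Put the option back to its default value.
pd.reset_option("display.max_rows")
restored = pd.get_option("display.max_rows")

print(default_max_rows, unlimited, restored)
```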
BalooRM