I'm a beginner in Python and the Pandas library, and I'm rather confused by some basic functionality of DataFrame. I've got a pandas DataFrame as below:

>>>df.head()  
              X  Y       unixtime
0  652f5e69fcb3  1  1346689910622
1        400292  1  1346614723542
2  1c9d02e4f14e  1  1346862070161
3        610449  1  1346806384518
4        207664  1  1346723370096

However, after I applied a function to one of the columns:

import datetime as dt

def unixTodate(unix):
  # unix is a timestamp in milliseconds, so divide by 1000 to get seconds
  day = dt.datetime.utcfromtimestamp(unix/1000).strftime('%Y-%m-%d')
  return day

df['day'] = df['unixtime'].apply(unixTodate)

I could no longer make use of the df.head() function:

>>>df.head()  

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 190648 to 626582
Data columns:
X              5  non-null values
Y              5  non-null values
unixtime       5  non-null values
day            5  non-null values
dtypes: int64(3), object(5)

I can't see why this is happening. Am I doing something wrong here? Any pointer is welcome! Thanks.

Vaisakh Rajagopal
S.zhen

3 Answers

df.head(n) returns a DataFrame holding the first n rows of df. To display a DataFrame, pandas checks the width of the terminal by default; if the terminal is too narrow to show the DataFrame, a summary view is shown instead, which is what you get in the second case.

Could you increase the size of your terminal, or disable column autodetection with pd.set_printoptions(max_columns=10)?
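As a comment on this answer points out, `set_printoptions` was later deprecated in favour of `pandas.set_option`. A minimal sketch of the same idea with the options API follows. The sample frame is a stand-in built from the first rows shown in the question, and the values 10 and 200 are illustrative assumptions, not recommended settings.

```python
import pandas as pd

# Stand-in for the frame in the question (values copied from its first rows).
df = pd.DataFrame({
    "X": ["652f5e69fcb3", "400292", "1c9d02e4f14e"],
    "Y": [1, 1, 1],
    "unixtime": [1346689910622, 1346614723542, 1346862070161],
})

# Session-wide setting: allow up to 10 columns in the printed output.
pd.set_option("display.max_columns", 10)

# Temporary setting: widen the assumed console width only inside this block.
with pd.option_context("display.width", 200):
    rendered = repr(df.head())

print(rendered)
```

`pd.reset_option("display.max_columns")` restores the default if the session-wide change is no longer wanted.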

rafaelc
Wouter Overmeire
  • Thanks Wouter - it works now. But my second data frame is actually one column wider than my first one, so I'm a bit surprised that it couldn't be displayed. Is there any documentation you could point me to? – S.zhen Oct 26 '12 at 13:28
  • there is not much ... http://pandas.pydata.org/pandas-docs/stable/basics.html#console-output-formatting – Wouter Overmeire Oct 26 '12 at 13:30
  • btw what do you get for pd.util.terminal.get_terminal_size()? This should be (terminal_width, terminal_height). If pandas cannot autodetect it, (80, 25) is returned by default. – Wouter Overmeire Oct 26 '12 at 13:35
  • I see. Thanks. My terminal size is (112, 24). – S.zhen Oct 26 '12 at 13:40
  • Seems big enough to display df.head() in both cases? – Wouter Overmeire Oct 26 '12 at 14:45
  • Indeed. The fact is the data frame above is just for demonstration - my actual data frame is slightly bigger - 7 columns to start with, then another column is added. It displayed df.head() fine with the 7 columns, but only gave a summary of the 8-column data frame, and refused to display even the original 7-column data frame since then. I found it rather confusing.. – S.zhen Oct 26 '12 at 15:27
  • But yeah, after adjusting the terminal size manually, df.head() works. – S.zhen Oct 26 '12 at 15:32
  • Just FYI, `pandas.set_printoptions` is now deprecated. See http://pandas.pydata.org/pandas-docs/stable/basics.html#working-with-package-options The max columns can be changed with `pandas.set_option`. – Aman Nov 24 '13 at 23:56
Try the below code segment:

from IPython.display import display
display(df.head())
Kokul Jose
DataFrame.head(n=5)

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters:

n : int, default 5

Number of rows to select.

Returns:

obj_head : type of caller

The first n rows of the caller object.
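To illustrate the documented behaviour, here is a small sketch; the ten-row frame is a made-up example, not data from the question.

```python
import pandas as pd

# Hypothetical ten-row frame used only to demonstrate head().
df = pd.DataFrame({"Y": range(10)})

first_five = df.head()    # n defaults to 5
first_three = df.head(3)  # explicit n

print(len(first_five), len(first_three))
```

In both calls the result is a new DataFrame, so whether it prints as a table or as a summary is decided by the display settings, not by `head()` itself.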

slfan