22

Is there a way to apply df.describe() to just a single column in a DataFrame?

For example, if I have several columns and call df.describe(), it describes all of the columns. From my research, I understand I can use the following:

"A list-like of dtypes : Limits the results to the provided data types. To limit the result to numeric types submit numpy.number. To limit it instead to object columns submit the numpy.object data type. Strings can also be used in the style of select_dtypes (e.g. df.describe(include=['O'])). To select pandas categorical columns, use 'category'"

However, I don't quite know how to write this out in Python code. Thanks in advance.

Mr-Programs
Gitliong

5 Answers

52

Just put the column name in square brackets:

df['column_name'].describe()

Example:


To get a single column:

df['1']

To get several columns:

df[['1','2']]

To get a single row by name:

df.loc['B']

or by integer position:

df.iloc[0]

To get a specific field:

df['1']['C']
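
The selections above can be tried on a small frame. This is only a sketch: the data below is made up (the original example frame is not shown here), assuming string column labels '1' and '2' and row labels 'A' to 'C'. The last lookup uses df.loc with a row and a column label, which gives the same value as df['1']['C'] in a single step.

```python
import pandas as pd

# Hypothetical data: columns '1' and '2', rows 'A', 'B', 'C'.
df = pd.DataFrame(
    {"1": [10, 20, 30], "2": [1.5, 2.5, 3.5]},
    index=["A", "B", "C"],
)

single_column = df["1"]            # one column, returned as a Series
several_columns = df[["1", "2"]]   # list of labels, returned as a DataFrame
row_by_label = df.loc["B"]         # row selected by its index label
row_by_position = df.iloc[1]       # the same row, selected by integer position
field = df.loc["C", "1"]           # single value at row 'C', column '1'

# describe() on the selected column only
summary = df["1"].describe()
print(summary)
```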
William Miller
ma-ku
4
import pandas as pd
data=pd.read_csv('data.csv')
data[['column1', 'column2', 'column3']].describe()
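
Selecting columns by name, as above, is separate from the include parameter quoted in the question, which filters by dtype instead. A minimal sketch of both, assuming a made-up frame (the column names price, qty and label are hypothetical, and the label column is given an explicit object dtype so the 'O' filter has something to match):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two numeric columns and one object column.
df = pd.DataFrame({
    "price": [1.0, 2.0, 3.0],
    "qty": [1, 2, 3],
    "label": pd.Series(["a", "b", "a"], dtype=object),
})

# Filter by dtype: only numeric columns are described.
numeric_summary = df.describe(include=[np.number])

# Filter by dtype: only object columns (count, unique, top, freq).
object_summary = df.describe(include=["O"])

# Filter by name: describe exactly the columns listed.
by_name = df[["price", "qty"]].describe()

print(numeric_summary)
print(object_summary)
```

So include answers "which kinds of columns", while the square-bracket selection answers "which columns".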
  • 2
    While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. – β.εηοιτ.βε May 23 '20 at 22:22
2
import pandas as pd
data = pd.read_csv("ad.data", header=None)
data[111].describe()

or for example

lastindice = data[data.columns[-1]]
lastindice.describe()
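
For context: with header=None, pandas labels the columns with integers (0, 1, 2, ...), which is why data[111] works above. A small sketch with an in-memory CSV standing in for the file (the three-column data here is made up, not the contents of ad.data):

```python
import io

import pandas as pd

# Hypothetical stand-in for a header-less CSV file.
csv_text = "1,2,3\n4,5,6\n7,8,9\n"
data = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(data.columns))  # integer labels assigned by pandas

first = data[0].describe()                      # column labelled 0
last_by_label = data[data.columns[-1]].describe()  # last column via its label
last_by_position = data.iloc[:, -1].describe()     # last column via position
```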
Mr-Programs
0

In a PySpark DataFrame, you can describe a single column like this:

df.describe("col1").toPandas()

or several columns like this:

df.describe(["col1", "col2"]).toPandas()
qatz
0

To get the description as a table (a DataFrame), use double brackets:

df[['column_name']].describe()

To get the description as a Series, use single brackets:

df['column_name'].describe()
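
A quick way to see the difference between the two forms, sketched on a made-up one-column frame (the values are hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"column_name": [1, 2, 3, 4]})

as_table = df[["column_name"]].describe()   # double brackets
as_series = df["column_name"].describe()    # single brackets

print(type(as_table).__name__)
print(type(as_series).__name__)
```

The statistics are the same either way; only the container differs, which matters if you want to concatenate the result with other describe() tables.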