I have this DataFrame in Spark and I want to count the number of columns in it. I know how to count the number of rows in a column, but I want to count the number of columns.

import spark.implicits._ // needed for toDF outside the Spark shell (assumes a SparkSession named spark)

val df1 = Seq(
    ("spark", "scala", "2015-10-14", 10, "rahul"),
    ("spark", "scala", "2015-10-15", 11, "abhishek"),
    ("spark", "scala", "2015-10-16", 12, "Jay"),
    ("spark", "scala", null, 13, "Kiran"))
  .toDF("bu_name", "client_name", "date", "patient_id", "patient_name")
df1.show

Can anybody tell me how to count the number of columns in this DataFrame? I am using Scala.

Shaido
Rahul Pandey

6 Answers

To count the number of columns, simply do:

df1.columns.size
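
For example, in the Spark shell, with the df1 from the question (columns returns an Array[String] of the column names):

df1.columns       // Array(bu_name, client_name, date, patient_id, patient_name)
df1.columns.size  // 5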
Shaido

In Python, the following code worked for me:

print(len(df.columns))
Onema
jillm_5

df.columns accesses the list of column names. All you have to do is count the number of items in that list, so

len(df1.columns)

works. To get both dimensions (rows, columns) into a single variable, we can do:

rows = df.count()
columns = len(df.columns)
size = (rows, columns)
print(size)
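
A Scala equivalent of the same idea, using the df1 from the question; note that count() triggers a Spark job over the data, while the column list comes straight from the schema:

val rows = df1.count()           // 4 rows: runs a job over the data
val columns = df1.columns.length // 5 columns: read from the schema only
val size = (rows, columns)
println(size)                    // (4,5)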
Neville Lusimba

The length of the column name array also works; columns returns an Array[String], so length and size are equivalent:

df.columns.length
Kris

To count the columns of a Spark DataFrame in PySpark:

len(df1.columns)

and to count the number of rows of a DataFrame:

df1.count()
Saeid SOHEILY KHAH

In PySpark you can just do result.select("your column").count() (note that this counts the rows of the selected column, not the number of columns).

KeepLearning