How do I find how many rows are returned in a subset of a Pandas DataFrame when I'm selecting a column?
When I subset a Pandas DataFrame by row label and specify a column, the result is a Series if the subset has more than one row. If the subset has only one row, the result is the scalar value itself, and I can't get the length of that.
>>> import pandas as pd
>>> df1 = pd.DataFrame({'A': ['A1', 'A2', 'A1'], 'B': ['B1', 'B2', 'B3']})
>>> df2 = df1.set_index('A')
>>> df3 = df1.iloc[:2,].set_index('A')
>>> df2
     B
A
A1  B1
A2  B2
A1  B3
>>> df3
     B
A
A1  B1
A2  B2
>>> df2.loc['A1','B'].shape
(2,)
>>> df3.loc['A1','B'].shape
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'shape'
This is because, with a single row label and a single column label, Pandas returns a Series when the label matches more than one row and a scalar when it matches exactly one row.
>>> df2.loc['A1','B']
A
A1    B1
A1    B3
Name: B, dtype: object
>>> df3.loc['A1','B']
'B1'
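To make the behaviour above concrete, here is a minimal sketch of one possible way to get a consistent row count. It rests on an assumption not stated in the question: that passing the row label as a one-element list (`['A1']`) is acceptable for the use case. With a list-like row selector, `.loc` should return a Series whether the label matches one row or several, so `len()` works in both cases. The names `many`, `one`, `n_many`, and `n_one` are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A1', 'A2', 'A1'], 'B': ['B1', 'B2', 'B3']})
df2 = df1.set_index('A')           # 'A1' appears twice in the index
df3 = df1.iloc[:2].set_index('A')  # 'A1' appears once in the index

# Assumption: a list-like row selector is acceptable here.
# It keeps the result a Series even when only one row matches.
many = df2.loc[['A1'], 'B']
one = df3.loc[['A1'], 'B']

n_many = len(many)  # number of rows matched in df2
n_one = len(one)    # number of rows matched in df3
print(n_many, n_one)
```

The trade-off is that the single-row case now yields a one-element Series rather than the bare string, so code that wants the value itself would need to take it out explicitly, for example with `one.iloc[0]`.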