In a pandas DataFrame, getting the category codes after assigning a column to a variable (y = df.column) raises an AttributeError.

The same works fine if we pass df.column directly to the Categorical function.

cs95
Vikash Yadav
  • Please don't add your code as pictures. See [creating good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and create a [mcve] – G. Anderson Jun 21 '19 at 18:13
  • Rolling back your edit because it completely muddles up your post. Please copy paste the exact code in your images into your question, or not at all. – cs95 Jun 21 '19 at 18:40
  • `pd.Categorical(df.c1)` gives an `arrays.categorical.Categorical` object, while `df.c1` is a `series.Series` object – camel_case Jan 09 '23 at 23:10

1 Answer

The .cat attribute is a categorical accessor associated with categorical dtype Series:

import pandas as pd

s = pd.Series(['a', 'b', 'a']).astype('category')
s
0    a
1    b
2    a
dtype: category
Categories (2, object): [a, b]

s.cat.codes                                                                                                                               
0    0
1    1
2    0
dtype: int8
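
As a minimal sketch of why the dtype matters here: a Series that has not been converted to the category dtype does not expose `.cat` at all, and accessing it raises an AttributeError. The Series name `obj` and its values are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical Series that has NOT been converted to the category dtype
obj = pd.Series(['a', 'b', 'a'])

message = None
try:
    obj.cat.codes  # .cat is only available on category-dtype Series
except AttributeError as exc:
    message = str(exc)

# After converting to the category dtype, the accessor works
cat = obj.astype('category')
codes = cat.cat.codes.tolist()
```

So if a column was never cast to `category`, `astype('category')` is one way to make `.cat.codes` available.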

On the other hand, pd.Categorical returns a pandas.core.arrays.categorical.Categorical object, which has these attributes defined on the object directly:

pd.Categorical(['a', 'b', 'c'])                                                                                                           
# [a, b, c]

pd.Categorical(['a', 'b', 'c']).codes
# array([0, 1, 2], dtype=int8)
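
Putting the two cases side by side, here is a rough sketch of what probably happens in the question. The DataFrame `df` and its column `c1` are assumptions reconstructed from the comments, not the asker's actual code.

```python
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame
df = pd.DataFrame({'c1': ['a', 'b', 'a']})

# y is a Categorical array, not a Series, so it has no .cat accessor
y = pd.Categorical(df['c1'])
has_cat = hasattr(y, 'cat')

# The codes live directly on the Categorical object
codes_direct = y.codes.tolist()

# Wrapping the Categorical in a Series brings the .cat accessor back
s = pd.Series(y)
codes_via_accessor = s.cat.codes.tolist()
```

Either route gives the same codes. The only difference is whether you are holding a Categorical (use `.codes`) or a category-dtype Series (use `.cat.codes`).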
cs95
  • When printed, both 'y' and 'df.c1' seem to be of categorical type – Vikash Yadav Jun 21 '19 at 18:24
  • @VikashYadav Please carefully look at your code and images, they do not corroborate. Anyway, the whole point of my answer, is that `df['c1']` is a Series, while `pd.Categorical(...)` (and anything you assign it to) is a Categorical object. – cs95 Jun 21 '19 at 18:28