I have a pandas Series 'ids' containing only unique ids, and its dtype is object.

data_df.id.dtype

returns dtype('O')

I'm trying to follow the example here to create a sparse matrix from my df: Efficiently create sparse pivot tables in pandas?

reviewer_u = list(data_df.id.unique())
row = data_df.id.astype('category', categories=reviewer_u).cat.codes

and I get:

TypeError: data type "category" not understood

I'm not sure what this error means and I haven't been able to find much on it.

ctd25
  • Try `row = pd.Categorical(data_df['id'], categories=reviewer_u)` instead? – Chris Adams Oct 03 '18 at 16:49
  • Related: [Pandas: convert categories to numbers](https://stackoverflow.com/questions/38088652/pandas-convert-categories-to-numbers) – jpp Oct 03 '18 at 17:36

1 Answer

Try instead:

row = pd.Categorical(data_df['id'], categories=reviewer_u)

You can get the codes using:

row.codes
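
For reference, here is a minimal end-to-end sketch of the above. The sample `data_df` and its id values are made up for illustration and are not the asker's data. It also shows a second route that keeps the `astype` style from the question by passing a `CategoricalDtype`; this assumes a pandas version that provides `pandas.api.types.CategoricalDtype`.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical stand-in for the question's data_df (values are invented)
data_df = pd.DataFrame({"id": ["b17", "a03", "b17", "c42", "a03"]})

# Unique ids, in order of first appearance
reviewer_u = list(data_df["id"].unique())

# Option 1: build the Categorical directly, as suggested above
row = pd.Categorical(data_df["id"], categories=reviewer_u)
row_codes = row.codes  # integer position of each id within reviewer_u

# Option 2: keep the astype style by passing an explicit dtype object
id_dtype = CategoricalDtype(categories=reviewer_u)
row_codes_alt = data_df["id"].astype(id_dtype).cat.codes

print(list(row_codes))
print(list(row_codes_alt))
```

Both routes should give the same integer codes, which can then serve as row indices when building the sparse matrix.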
Chris Adams