I was trying to run this code in a Kaggle notebook (my laptop has no GPU). When I run the cell that defines load_cat_in_the_dat, the notebook throws the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-41-4269c4a7bc86> in <module>
----> 1 def load_cat_in_the_dat() -> tuple[pd.DataFrame, pd.Series]:
2 """Assuming you have already downloaded the data into `input` directory."""
3
4 df_train = pd.read_csv("./input/cat-in-the-dat/train.csv")
5
TypeError: 'type' object is not subscriptable
- I've searched, but I couldn't find the meaning of the `->` in the `def load_cat_in_the_dat() -> tuple[pd.DataFrame, pd.Series]:` line of code. What is this?
- Have you had any experience with directly using categorical features in an XGBoost model? Does the performance improve significantly compared with one-hot encoding?
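
For anyone hitting the same TypeError: as far as I can tell, `->` introduces a return type annotation, and subscripting the built-in `tuple` (as in `tuple[pd.DataFrame, pd.Series]`) only works from Python 3.9 onward, so an older interpreter in the notebook would explain the error. Below is a minimal sketch of a workaround using `typing.Tuple`, assuming the notebook really does run a Python version older than 3.9. The function `split_features_target` and its toy data are hypothetical stand-ins for `load_cat_in_the_dat`, so the snippet runs without pandas or the dataset.

```python
from typing import List, Tuple


# Hypothetical stand-in for load_cat_in_the_dat: the part after "->" is the
# return annotation. typing.Tuple / typing.List can be subscripted on Python
# versions older than 3.9, where tuple[...] raises
# "TypeError: 'type' object is not subscriptable".
def split_features_target(
    rows: List[Tuple[str, int]],
) -> Tuple[List[str], List[int]]:
    """Split (feature, target) pairs into two parallel lists."""
    features = [feature for feature, _ in rows]
    target = [label for _, label in rows]
    return features, target


# Toy data, made up for illustration only.
features, target = split_features_target([("red", 0), ("blue", 1)])
```

For the original function, the equivalent change would presumably be writing the annotation as `Tuple[pd.DataFrame, pd.Series]`. Another option that should work on Python 3.7 and 3.8 is putting `from __future__ import annotations` at the top of the cell, which stops the annotation from being evaluated when the function is defined.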