I was trying to run this code in a Kaggle notebook (my laptop has no GPU). When I run the cell that defines `load_cat_in_the_dat`, the notebook throws the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-41-4269c4a7bc86> in <module>
----> 1 def load_cat_in_the_dat() -> tuple[pd.DataFrame, pd.Series]:
      2     """Assuming you have already downloaded the data into `input` directory."""
      3 
      4     df_train = pd.read_csv("./input/cat-in-the-dat/train.csv")
      5 

TypeError: 'type' object is not subscriptable
  1. I've searched, but I couldn't find out what the `->` means in the line `def load_cat_in_the_dat() -> tuple[pd.DataFrame, pd.Series]:`. What is it?
  2. Have you had any experience using categorical features directly in an XGBoost model? Does performance improve significantly compared with one-hot encoding?
martineau
An old man in the sea.
  • The `->` is a type annotation. See [the documentation](https://docs.python.org/3/library/typing.html). You might be running an older version of Python in which the `tuple[x, y]` syntax isn't supported yet. In that case you can do `from typing import Tuple` and then use `Tuple[pd.DataFrame, pd.Series]` instead. – kuropan Feb 09 '22 at 11:06
  • Bottom line: you need a newer Python version, 3.9 or later, to run this code as written. – deceze Feb 09 '22 at 11:06
  • `->` is used for type hinting: https://docs.python.org/3/library/typing.html – Ervin Szilagyi Feb 09 '22 at 11:06
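
To make the suggestion in the comments concrete, here is a minimal sketch of the `typing.Tuple` workaround. The function `load_toy_data` and its list-based return value are hypothetical stand-ins for `load_cat_in_the_dat`, chosen so the snippet runs without pandas or the CSV file; in the original code the same substitution would presumably read `Tuple[pd.DataFrame, pd.Series]`.

```python
import sys
from typing import List, Tuple


# Hypothetical stand-in for load_cat_in_the_dat(): returns a (features, target)
# pair built from plain lists instead of a DataFrame and a Series.
# typing.Tuple[...] is subscriptable on Python versions older than 3.9,
# where the built-in tuple[...] raises "'type' object is not subscriptable".
def load_toy_data() -> Tuple[List[List[int]], List[int]]:
    features = [[1, 0], [0, 1], [1, 1]]
    target = [0, 1, 1]
    return features, target


features, target = load_toy_data()

# The annotation after `->` is only stored on the function object;
# Python does not check it when the function is called.
print(load_toy_data.__annotations__["return"])

# True means the interpreter is 3.9+ and tuple[...] would also work here.
print(sys.version_info >= (3, 9))
```

Running `import sys; print(sys.version)` in a notebook cell shows which interpreter the kernel uses, which tells you whether the workaround is needed at all.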