According to here, sklearn cannot handle categorical variables, and one-hot encoding is the suggested way to deal with such features. However, I do not understand how one-hot encoding helps. For example, a feature country that can be USA, China, or England is transformed into indicator features such as 'country==USA', each either true or false. The new feature 'country==USA' is still categorical after all (it can only take the values 0 or 1), so that does not seem to change anything: sklearn still treats 0 and 1 as numerical values.
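To make sure I am describing the transformation correctly, here is a hand-rolled sketch of what I understand one-hot encoding to do to the country column. It is plain Python rather than sklearn's own encoder, and the four sample rows are made up for illustration:

```python
# Made-up sample values for a single categorical feature.
countries = ["USA", "China", "England", "USA"]

# One indicator column per distinct category, in sorted order.
categories = sorted(set(countries))

# Each row gets 1.0 in the column matching its category and 0.0 elsewhere.
encoded = [[1.0 if value == cat else 0.0 for cat in categories]
           for value in countries]

print(categories)
for value, row in zip(countries, encoded):
    print(value, row)
```

Every resulting column, such as 'country==USA', only ever holds 0.0 or 1.0, which is the point I am confused about.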
For a real example, I transformed the data from here:
human,warm-blooded,hair,yes,no,no,yes,no,mammal
python,cold-blooded,scales,no,no,no,no,yes,reptile
salmon,cold-blooded,scales,no,yes,no,no,no,fish
whale,warm-blooded,hair,yes,yes,no,no,no,mammal
frog,cold-blooded,none,no,semi,no,yes,yes,amphibian
komodo dragon,cold-blooded,scales,no,no,no,yes,no,reptile
bat,warm-blooded,hair,yes,no,yes,yes,yes,mammal
pigeon,warm-blooded,feathers,no,no,yes,yes,no,bird
cat,warm-blooded,fur,yes,no,no,yes,no,mammal
leopard shark,cold-blooded,scales,yes,yes,no,no,no,fish
turtle,cold-blooded,scales,no,semi,no,yes,no,reptile
penguin,warm-blooded,feathers,no,semi,no,yes,no,bird
porcupine,warm-blooded,quills,yes,no,no,yes,yes,mammal
eel,cold-blooded,scales,no,yes,no,no,no,fish
salamander,cold-blooded,none,no,semi,no,yes,yes,amphibian
gila monster,cold-blooded,scales,no,no,no,yes,yes,
into
[[1. 0. 1. 0. 0. 0. 1. 0. 1. 0. 1. 1. 0. 0. 0. 1. 0. 0. 0.]
[1. 0. 1. 0. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 1. 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 1. 0. 1. 0. 1. 1. 0. 1. 0. 0. 0. 1. 0. 0. 0.]
[1. 0. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0. 0.]
[1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1.]
[0. 1. 1. 0. 0. 0. 1. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 0. 0.]
[0. 1. 1. 0. 0. 0. 1. 1. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0.]
[1. 0. 1. 0. 0. 0. 1. 0. 1. 0. 1. 1. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 1. 1. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 1. 0. 1. 0. 1. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 1. 0. 0. 1. 1. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0.]
[1. 0. 1. 0. 0. 0. 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 1. 0.]
[1. 0. 0. 0. 1. 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0. 0.]
[1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 1.]]
and built a decision tree like this:
decision tree (click to open)
The split points are still ridiculous (for example, Give Birth=no <= 0.5). So I do not think one-hot encoding helps with categorical data at all.
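A minimal sketch that reproduces a split like Give Birth=no <= 0.5 with a single indicator column follows. It assumes that the fourth field of each data row is the "gives birth" attribute, and it uses only six of the rows (human, python, whale, komodo dragon, bat, turtle). This is not my full pipeline, just the smallest case I could write down:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# One indicator column, "Give Birth=no": 1.0 if the animal does not give birth.
# Rows (assumed reading of the data): human, python, whale,
# komodo dragon, bat, turtle.
X = [[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]]
y = ["mammal", "reptile", "mammal", "reptile", "mammal", "reptile"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Threshold chosen at the root node of the fitted tree.
root_threshold = clf.tree_.threshold[0]

print(export_text(clf, feature_names=["Give Birth=no"]))
print("root threshold:", root_threshold)
```

As far as I can tell, 0.5 is simply the midpoint between the only two values the column takes, so rows with 0 go to one branch and rows with 1 go to the other.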