I have an Excel sheet with two columns:
1. Words
2. Language
Each row contains exactly one word, and that word is labeled with the language it belongs to.
How should I turn these words and languages into data that a machine learning model can accept?
I'm using scikit-learn and considered bag of words, but it seems to me that simply indexing every word wouldn't capture the characteristics of each word.