I have a pandas dataframe with about 3000 columns. The first column lists a category (the values can be repeated).
The second column through the last column contain 1s and 0s (it's essentially an indicator matrix). There are 20 or fewer 1s per row, so I am dealing with a sparse matrix.
I want to create a dictionary that maps each category to the matrix of pairwise cosine distances between all the indicator vectors in that category (with the row order from the data frame preserved). My data has about 100,000 rows, so I'm looking for an efficient way to do this.
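
To make the target concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes the indicator columns are numeric, that the category column is passed by name, and that each category is small enough for its dense n x n distance matrix to fit in memory. The function name `cosine_distances_by_category` and the column names in the usage example are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def cosine_distances_by_category(df, category_col):
    """Map each category to the pairwise cosine-distance matrix of its rows."""
    # Everything except the category column is treated as the indicator matrix.
    indicators = df.drop(columns=[category_col]).to_numpy()
    X = sparse.csr_matrix(indicators, dtype=np.float64)

    # Scale each row to unit length so a dot product equals cosine similarity.
    # All-zero rows keep a divisor of 1 and therefore stay all-zero.
    norms = np.sqrt(np.asarray(X.multiply(X).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0
    X = (sparse.diags(1.0 / norms) @ X).tocsr()

    result = {}
    groups = df.groupby(category_col, sort=False).indices
    for category, rows in groups.items():
        # Sort the positional indices so rows follow the data frame order.
        block = X[np.sort(rows)]
        similarity = (block @ block.T).toarray()
        # Clip to [0, 1] to remove tiny floating-point overshoot; this range
        # is valid because the indicator vectors are non-negative.
        result[category] = np.clip(1.0 - similarity, 0.0, 1.0)
    return result


# Hypothetical toy data: one category column followed by indicator columns.
toy = pd.DataFrame(
    {
        "cat": ["a", "b", "a", "a"],
        "f1": [1, 1, 0, 1],
        "f2": [0, 1, 1, 1],
        "f3": [0, 0, 1, 0],
    }
)
distances = cosine_distances_by_category(toy, "cat")
```

In this sketch an all-zero row gets distance 1 to every row, including itself, because its similarity is taken as 0. Each category with n rows produces a dense n x n matrix, so for very large categories it may be better to compute the matrix on demand instead of storing every one in the dictionary.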
Thanks