I'm new to Python. I'm ultimately trying to create an MDS plot, but I'm running into issues applying MDS to my data. Here's some mock data; my actual data set is much larger:
import pandas as pd
from sklearn.manifold import MDS
df = pd.DataFrame({'Gene1': ['43.5', '14.7', '0', '33.9', '89.7'],
                   'Gene2': ['54.5', '3.7', '77.8', '21.9', '8.7'],
                   'Gene3': ['9.5', '0', '65', '1.5', '87.4'],
                   'Tissue': ['--', 'root', 'leaf', 'leaf', 'seed']})
df.set_index('Tissue')
The index for my data is the Tissue column, which describes tissue types for each gene. Here's how I'm trying to apply the MDS:
mds = MDS(2,random_state=0)
df_2d = mds.fit_transform(df)
I get the error "could not convert string to float: '--'". How can I ignore the index column and run MDS on only the gene columns? Or should I remove the Tissue column and add it back after running MDS on the gene columns?
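For reference, here is a minimal sketch of the first option, under a few assumptions: the gene values are all strings that convert cleanly to floats, and the names genes, coords, MDS1 and MDS2 are purely illustrative. It relies on the fact that set_index returns a new DataFrame rather than modifying df in place, so the result has to be assigned before Tissue actually leaves the columns.

```python
import pandas as pd
from sklearn.manifold import MDS

df = pd.DataFrame({'Gene1': ['43.5', '14.7', '0', '33.9', '89.7'],
                   'Gene2': ['54.5', '3.7', '77.8', '21.9', '8.7'],
                   'Gene3': ['9.5', '0', '65', '1.5', '87.4'],
                   'Tissue': ['--', 'root', 'leaf', 'leaf', 'seed']})

# set_index returns a new DataFrame, so keep the result.
# astype(float) converts the string values in the gene columns to numbers.
genes = df.set_index('Tissue').astype(float)

# Fit MDS on the gene columns only; Tissue is now the index, not a column.
mds = MDS(n_components=2, random_state=0)
coords = mds.fit_transform(genes)

# Reattach the Tissue labels by reusing the index (column names are made up).
df_2d = pd.DataFrame(coords, index=genes.index, columns=['MDS1', 'MDS2'])
print(df_2d)
```

Because the labels travel with the index, df_2d.reset_index() should bring Tissue back as an ordinary column if that is more convenient for plotting.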