I have a DataFrame with a text column and several label columns. When I combined the labels into a single targets column, I converted them with .astype(str). When I try to pass this data to a multi-label machine learning model, I get the error ValueError: too many dimensions 'str'. How can I convert the values in targets back into lists of numbers, or is there a library method for this?
import pandas as pd

train_data = pd.DataFrame({
    'text': [i for i in X_train],
    'target_1': [i for i in y_train["target_1"]],
    'target_2': [i for i in y_train["target_2"]],
    'target_3': [i for i in y_train["target_3"]],
    'target_4': [i for i in y_train["target_4"]],
    'target_5': [i for i in y_train["target_5"]],
    'target_6': [i for i in y_train["target_6"]],
})
train_data['targets'] = train_data[train_data.columns[1:]].apply(lambda x: ', '.join(x.dropna().astype(str)), axis=1)
train_data = train_data.drop(['target_1', 'target_2', 'target_3', 'target_4', 'target_5', 'target_6'], axis=1)
train_data['targets'] = train_data['targets'].str.split(',')
train_data.info()
The resulting DataFrame looks like this:
text targets
0 добрый день, никита. благодарю вас! добрый ден... [5.8, 6.2, 6.3, 5.5, 6.0, 5.0]
1 - добрый. напишите андрею кравцову, что мы об... [6.0, 6.2, 7.0, 5.8, 5.2, 5.0]
2 никита, добрый день. спасибо за доверие и ценн... [6.2, 6.4, 8.0, 5.5, 6.8, 5.5]
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 text 171 non-null object
1 targets 171 non-null object
dtypes: object(2)
This is the error I get when I try to convert the column to float:
train_data["targets"].astype(float)
--> 997 return arr.astype(dtype, copy=True)
998
999 return arr.view(dtype)
ValueError: setting an array element with a sequence.
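For reference, here is a minimal sketch of one possible way around this. It assumes each cell of targets is a list of numeric strings, as str.split(',') above would produce. In that case astype(float) fails because it tries to cast each whole list to a single float, so the conversion has to be done per element instead. The sample frame and the name y are made up for illustration, and the stacking step assumes every row has the same number of labels, which may not hold if dropna() removed values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame in the question after str.split(','):
# each cell of 'targets' is a list of strings, some with a leading space
# (', '.join followed by split(',') leaves that space behind).
train_data = pd.DataFrame({
    "text": ["first message", "second message"],
    "targets": [["5.8", " 6.2", " 6.3"], ["6.0", " 6.2", " 7.0"]],
})

# Convert element by element; float() ignores surrounding whitespace.
train_data["targets"] = train_data["targets"].apply(
    lambda labels: [float(v) for v in labels]
)

# Stack the per-row lists into one 2-D float array.
# Assumption: all rows have the same number of labels.
y = np.array(train_data["targets"].tolist(), dtype=np.float32)
print(y.shape)
```

If the labels are already numeric in y_train, it may be simpler to skip the string round trip entirely and build the array from those columns directly.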