
I computed embeddings of a drug dataset using MACAW (Molecular AutoenCoding Auto-Workaround), an accessible tool for molecular embedding and inverse molecular design. I then converted the embeddings into a pandas DataFrame and saved it as a .csv file that also includes the class labels from the original dataset.

But when I apply SMOTE before training an MLP or Logistic Regression classifier, the classification metrics (precision, recall, F1 score) stay the same, i.e. there is no improvement after applying SMOTE.

So I suspect there is a problem in how I compute the embeddings. Please help.
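For context on the SMOTE step mentioned above: SMOTE balances the classes by creating synthetic minority samples, each placed on the line segment between a minority sample and one of its nearest minority neighbours. It changes only the composition of the training set, so it is normally applied to the training split alone. Below is a minimal plain-Python sketch of that interpolation idea on made-up toy data. The function name `smote_oversample` and the toy points are illustrative assumptions; this is not the imbalanced-learn implementation and not the code used in this question.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    Illustrative sketch only; not the imbalanced-learn implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, sorted by distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy training split (hypothetical): 6 majority (label 0), 3 minority (label 1).
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.1],
           [2.0, 2.0], [2.2, 2.1], [2.1, 2.4]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1, 1]

minority = [x for x, y in zip(X_train, y_train) if y == 1]
n_needed = y_train.count(0) - y_train.count(1)
new_points = smote_oversample(minority, n_needed, k=2)

# Only the training data is extended; the test split would stay untouched.
X_balanced = X_train + new_points
y_balanced = y_train + [1] * len(new_points)
print(y_balanced.count(0), y_balanced.count(1))  # prints "6 6"
```

In practice this step is usually done with `SMOTE().fit_resample(X_train, y_train)` from `imblearn.over_sampling`. Comparing the class counts before and after resampling, as in the last line of the sketch, is a quick way to confirm that the oversampling actually took effect before looking at precision, recall, and F1.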

The code I used, the dataset, and the paper that gave me the idea are given below.

My source code:

import sys

import pandas as pd
from google.colab import files

sys.path.append('../')  # make a local copy of the macaw package importable
import macaw
from macaw import MACAW

print(macaw.__version__)

# Upload BBBP.csv into the Colab session, then load it
files.upload()
df = pd.read_csv("BBBP.csv")
smiles = df["smiles"]
print(len(smiles))

# Fit MACAW on the SMILES strings and compute the embeddings
mcw = MACAW(random_state=42)
mcw.fit(smiles)
BBBP_embedding = mcw.transform(smiles)
print(BBBP_embedding)

# Attach the class labels (column "p_np") and save everything to CSV
embedding_df = pd.DataFrame(BBBP_embedding)
embedding_df = embedding_df.join(df["p_np"])
embedding_df.to_csv("BBBP_embedding.csv")
files.download("BBBP_embedding.csv")

Dataset link: https://moleculenet.org/

Paper link: https://pubs.acs.org/doi/10.1021/acs.jcim.2c00229

I hope someone can spot the mistake in the code and help me correct it. Thanks!

  • You should upload your code as text per [Why should I not upload images of code/data/errors?](https://meta.stackoverflow.com/questions/285551/why-should-i-not-upload-images-of-code-data-errors) – DarrylG Dec 23 '22 at 13:41
  • Please check the meaning of pre-existing tags before using them: `molecule` has nothing to do with chemistry here. Thanks. – Zeitounator Dec 26 '22 at 08:54
  • "F1 score remains the same" please post the scores, is it exactly the same or slightly different? – JoshuaBox Jan 19 '23 at 10:28
