Questions tagged [pubchem]

Free database of chemical structures of small organic molecules and information on their biological activities

PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains 162 million substance descriptions and small molecules. More than 80 database vendors contribute to the growing PubChem database.
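Besides the web interface and FTP, PubChem offers a programmatic interface called PUG REST, which many of the questions under this tag end up using. The following is a minimal sketch, not an official client: it only builds the URL that looks up compound identifiers (CIDs) by name and makes no network request. The helper name `cid_lookup_url` and the constant `PUG_REST_BASE` are illustrative names chosen for this example.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(name: str, output: str = "JSON") -> str:
    """Build the PUG REST URL that lists the CIDs matching a compound name.

    `cid_lookup_url` is a hypothetical helper for illustration only.
    """
    # Percent-encode the name so spaces, parentheses, and slashes
    # stay inside a single path segment.
    encoded = quote(name, safe="")
    return f"{PUG_REST_BASE}/compound/name/{encoded}/cids/{output}"


if __name__ == "__main__":
    print(cid_lookup_url("glucose"))
```

The resulting URL can be fetched with any HTTP client. Percent-encoding matters for systematic names such as "1-(2-Hydroxyphenyl)-2-phenyl ethanone", which contain spaces and parentheses.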

35 questions
7
votes
3 answers

Generate 2d images of molecules from PubChem FTP data

Rather than crawl PubChem's website, I'd prefer to be nice and generate the images locally from the PubChem ftp site: ftp://ftp.ncbi.nih.gov/pubchem/specifications/ The only problem is that I'm limited to OSX and Linux and I can't seem to find a…
zachaysan
  • 1,726
  • 16
  • 32
3
votes
1 answer

Parse a remote xml.gz file of a database without downloading

I need to parse a PubChem database to search for certain clues on the pages of compounds (toxicity codes, to be exact; they look like 'H300'), and then add their CIDs to the corresponding lists. The database is…
Diana
  • 63
  • 4
3
votes
2 answers

CAS registry to Pubchem cid identifier conversion in R

An annoying problem many chemists are faced with is to convert CAS registry numbers of chemical compounds (stored in some commercial database that is not readily accessible) to Pubchem identifiers (openly available). Pubchem kind of supports…
Tom Wenseleers
  • 7,535
  • 7
  • 63
  • 103
2
votes
1 answer

SPARQL query of Turtle data returns no rows when using skos:prefLabel

I'm trying to query PubChem disease data from their Turtle file in GraphDb. The standard query returns all the ?s ?p ?o rows as expected. But I just want the ?s and skos:prefLabel. Example disease data:…
Sam B
  • 119
  • 2
  • 8
2
votes
1 answer

Has anyone used pubchemdb? Any similar API?

Update: The link in the answer is both interesting and useful, but unfortunately does not address the need for a java API, so I am still looking forward to any input. I'm building a database of chemical compounds. I need all the synonyms (IUPAC and…
Aleadam
  • 40,203
  • 9
  • 86
  • 108
2
votes
1 answer

Rvest web scrape returns empty character

I am looking to scrape some data from a chemical database using R, mainly name, CAS Number, and molecular weight for now. However, I am having trouble getting rvest to extract the information I'm looking for. This is the code I have so…
Matthew T
  • 33
  • 7
2
votes
2 answers

How to use httpClient to filter parts of an xml file in Java

I am currently doing a project in which I have to request data from the metabolite database PubChem. I am using Apache's HttpClient. I am doing the following: HttpClient httpclient = new DefaultHttpClient(); HttpGet pubChemRequest = new…
1
vote
2 answers

PubChem database into mysql

I want to download the PubChem substance database and put all information into a MySQL database. Is this possible, and if so how? Is there a script which automatically updates the database?
bladepit
  • 853
  • 5
  • 14
  • 29
1
vote
1 answer

How to programmatically do a fuzzy search on PubChem using compound names

When I manually searched the pubchem web page using the keyword "1-(2-Hydroxyphenyl)-2-phenyl ethanone", I got the following results. Although no compounds exactly matched the above keywords, four compounds were found that partially matched the…
arron
  • 73
  • 5
1
vote
1 answer

How to get the name of an element in PubChemPy?

Can you please tell me if it is possible to get the name of an element in PubChemPy? I found in the documentation that can find the PubChem CID by name: result = pcp.get_compounds('Glucose', 'name') print(result) [Compound(5793)] But I need the…
Kate
  • 133
  • 7
1
vote
0 answers

How can I only get a numerical answer when applying a function to a dataframe?

This is the code I have until now: import pandas as pd import pubchempy import numpy as np df = pd.read_csv("Data.tsv.txt", sep="\t") from pubchempy import get_properties df['CID'] = df['CID'].astype(str).apply(lambda x:…
1
vote
0 answers

How to fix "TypeError: 'float' object is not iterable" when using the df.map function for a dataframe column (entire column)

I'm trying to apply a function to every element in a column but I keep getting this error and I'm not sure how to fix it. Code: import pandas as pd import pubchempy import numpy as np df = pd.read_csv("Data.tsv.txt", sep="\t") . . . df['CID'] =…
1
vote
3 answers

Getting the content of a table on the website with Selenium and Python

When I go to the web address in the code I don't get the contents from "Synonyms" section. It does the selection, but takes it as a list and does not output the text content. synonyms= [] driver= webdriver.Chrome() url =…
1
vote
1 answer

Getting a number from pubchem site with Selenium

I'm doing a search on the pubchem site with the code below. I need to get the "Compound CID:" number from the screen from the search result but I couldn't get it. I need help on this. driver = webdriver.Chrome() url =…
1
vote
1 answer

How to use searchtype option in PubChemPy

I tried to run the code pcp.get_compounds('CC', searchtype='superstructure', listkey_count=3), but it didn't work. This code is exactly the same as the one shown in the documentation…
hiro
  • 87
  • 4