I am trying to importxml from ENSEMBL using xpath copied from Chrome, but it is not working

Question

I want to use Google Sheets' IMPORTXML to extract the gene name (SLC3A1) and Ensembl ID (ENSG00000138079) from this URL: http://asia.ensembl.org/Multi/Search/Results?q=SLC3A1;site=ensembl

I tried copying the XPath from Chrome and also tried deriving it on my own step by step, but I only get #N/A.

My xpath: /html/body/div[1]/div/div[2]/div[1]/div[1]/div[2]/div[2]/div[4]/div/div/div[2]/div/div/div[3]/div[2]/div[1]/div[2]/div/div/div/div[1]/div/div[2]/span

From Chrome: //*[@id="solr_content"]/div/div/div[2]/div/div/div[3]/div[2]/div[1]/div[2]/div/div/div/div[1]/div/a

The idea is to pull the gene name and ID into Google Sheets for any gene name I supply.

score 0 · Answer 1 · answered May 03 '22 at 18:56

This XPath worked in the Firefox console and should work in Chrome too, since it is standard XPath.

$x('//a[ancestor::div[@id="solr_content"] and @class="table_toplink" and .="SLC3A1 (Human Gene)"]/following-sibling::div/span[@class="id"]/text()')

Result

Array [ #text ]

0: #text "ENSG00000138079"

LMC


does not work... – player0 May 03 '22 at 20:24
@player0 sorry for the comment. Deleted. – LMC May 03 '22 at 21:15

score 0 · Answer 2 · answered May 03 '22 at 20:29

You are getting the #N/A error because IMPORTXML (like the other IMPORT formulas) cannot scrape content that is rendered by JavaScript. You can always test this by disabling JavaScript for the site: whatever is still on the page can usually be imported into Google Sheets.
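
The same check can be done without a browser: fetch the page over plain HTTP, so that no scripts run, and see whether the text you want is in the raw HTML. This is only a minimal sketch, assuming Node.js 18+ for the built-in `fetch`. The helper names `isInStaticHtml` and `fetchStaticHtml` are hypothetical, and it is an assumption that this plain fetch approximates what IMPORTXML receives.

```javascript
// Hypothetical helper: true if `needle` appears in the HTML as served,
// i.e. before any JavaScript has run on the page.
function isInStaticHtml(html, needle) {
  return html.includes(needle);
}

// Hypothetical helper: download the raw HTML without executing scripts.
// Assumes Node.js 18+ (global fetch).
async function fetchStaticHtml(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${url}`);
  }
  return response.text();
}

// Example usage (left commented out because it needs network access):
// const url = "http://asia.ensembl.org/Multi/Search/Results?q=SLC3A1;site=ensembl";
// fetchStaticHtml(url).then((html) =>
//   console.log(isInStaticHtml(html, "ENSG00000138079")));
```

If the check comes back false, the element is most likely injected by JavaScript after the page loads, and changing the XPath will not help.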

player0


Thanks for the tip, did not know this. – ANIPON May 04 '22 at 03:58
