Extract email address from view-source of website page

Question

I am trying to extract an email address from the page source of https://www.dice.com/jobs/detail/Process-Engineer-Lead-Kforce-Inc.-San-Antonio-TX-78288/kforcecx/ITWQG1496436?icid=sr1-1p&q=Senior%20Process%20Engineer&l=78288

I have a list of links in column A. In column B I use =importxml(A1, "//a[@href]/text()[contains(.,'@')]"). It only extracts newdicesupport@dice.com, not KJOHNSON@KFORCE.COM or any other personal emails from the page source.

Can you point out which step in my formula is wrong?

score 0 · Answer 1 · edited May 23 '17 at 11:45

0

That e-mail is inside a comment, and your XPath only selects text nodes inside 'a' elements, so it never matches. You need to access the comment node instead (see "Accessing Comments in XML using XPath"). Alternatively, you can look at the 'input' nodes: that email appears twice in the source.
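The idea of looking in comments and 'input' nodes can be sketched outside the spreadsheet as well. The following is a minimal Python sketch using only the standard library, not the formula from the question. The SAMPLE markup, the EmailCollector class, and the extract_emails helper are all hypothetical names made up for illustration; the real structure of the dice.com page is assumed, not verified, and the email pattern is a deliberately simple approximation.

```python
import re
from html.parser import HTMLParser

# Simplified email pattern (an assumption; it does not cover every valid address).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailCollector(HTMLParser):
    """Hypothetical helper: collects email-like strings from comments,
    visible text, and the value attribute of <input> elements."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def _scan(self, text):
        # Keep first-seen order and skip duplicates.
        for match in EMAIL_RE.findall(text or ""):
            if match not in self.emails:
                self.emails.append(match)

    def handle_comment(self, data):
        # Content of <!-- ... -->, which a text() XPath on <a> never sees.
        self._scan(data)

    def handle_data(self, data):
        # Ordinary text nodes, e.g. the link text of a mailto anchor.
        self._scan(data)

    def handle_starttag(self, tag, attrs):
        # Emails stored in <input value="..."> attributes.
        if tag == "input":
            for name, value in attrs:
                if name == "value":
                    self._scan(value)


def extract_emails(html):
    parser = EmailCollector()
    parser.feed(html)
    parser.close()
    return parser.emails


# Made-up markup that mimics the situation described in the answer.
SAMPLE = """
<html><body>
<!-- contact: KJOHNSON@KFORCE.COM -->
<a href="mailto:newdicesupport@dice.com">newdicesupport@dice.com</a>
<input type="hidden" name="email" value="KJOHNSON@KFORCE.COM">
</body></html>
"""

print(extract_emails(SAMPLE))
```

On this sample, the address in the comment is found in addition to the one in the link text, and the repeated copy in the 'input' node is de-duplicated. Fetching the actual page is left out on purpose, since how the site serves its source is not known here.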


answered Feb 26 '16 at 11:20

Fil

