Not able to extract links from the following HTML

Asked Oct 13 '15 at 04:43

Active Oct 13 '15 at 06:56

Viewed 149 times

I need to grab a link from a page so that I can crawl it, but I can't extract the link from the HTML no matter how many times I rewrite the XPath. Please suggest how I can solve this problem.

This is the HTML that I want to extract the link from:

<div class="" id="subject1" datacallname="主题_同类主题" params="{'catid':'12','sid':'336'}" isload="1" style="">
  <ul class="rail-list">
    <li>
      <cite class="start0" style="height:16px;">
      </cite>
      <a href="http://www.gorate.com.my/item-386.html">the Library&nbsp;@&nbsp;Leisure Ma
      </a>
    </li>

How can I extract the link? The XPath I am using is "//*[@id="subject1"]/ul/li[1]/a/@href".

The page I am going to scrape the link from is: http://www.gorate.com.my/item-336.html#.Vhx55BOqqkr

edited Oct 13 '15 at 06:56

LearnAWK

asked Oct 13 '15 at 04:43

user2130368

Do you want to extract all links in the html? – Qbyte Oct 13 '15 at 07:03
Yes @Qbyte. Do you have any idea how to do this? – user2130368 Oct 13 '15 at 08:31
Your xpath looks ok. ([screenshot](https://www.evernote.com/l/AOwS_Nup6ldLXqsAJRpnjwu62sg4Nd3CP5c)) – Eric Aya Oct 13 '15 at 08:43
But I'm not able to retrieve the link from there. – user2130368 Oct 13 '15 at 12:03
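
One way to test the comment that the XPath looks fine is to run an equivalent extraction against the posted markup, outside the crawler. The sketch below is only an illustration, not the asker's code: it uses Python's standard-library `html.parser` rather than an XPath engine, and `SubjectLinkParser`, `extract_links`, and `SAMPLE` are hypothetical names. `SAMPLE` is a trimmed copy of the snippet from the question, with closing `</ul>` and `</div>` tags added as an assumption because the posted snippet is cut off.

```python
from html.parser import HTMLParser


class SubjectLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside <div id="subject1">."""

    def __init__(self, container_id="subject1"):
        super().__init__()
        self.container_id = container_id
        self.div_depth = 0  # greater than 0 while inside the target <div>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.div_depth > 0:
                # Nested <div> inside the container: track its depth.
                self.div_depth += 1
            elif attrs.get("id") == self.container_id:
                # Entering the target container.
                self.div_depth = 1
        elif tag == "a" and self.div_depth > 0 and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1


def extract_links(html):
    """Return the hrefs of all links found inside the container."""
    parser = SubjectLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Trimmed copy of the markup from the question; the closing tags are assumed.
SAMPLE = """
<div class="" id="subject1" isload="1" style="">
  <ul class="rail-list">
    <li>
      <cite class="start0" style="height:16px;"></cite>
      <a href="http://www.gorate.com.my/item-386.html">the Library&nbsp;@&nbsp;Leisure Ma</a>
    </li>
  </ul>
</div>
<a href="http://example.com/outside">outside the container</a>
"""

print(extract_links(SAMPLE))
```

If this prints the `item-386` link for the posted snippet but the same check returns nothing for the HTML the crawler actually downloads, then the downloaded page probably differs from what the browser shows. One possible cause, which is only a guess and not confirmed anywhere in this thread, is that the list is filled in by a script after the page loads.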

0 Answers