I'm trying to scrape data from a website whose tags do not seem to have many classes. I'm still wondering whether it is possible to scrape today's titles using XPath.

That is, so that it only retrieves the titles from 09/4 - 2015?

url: http://www.hltv.org/?pageid=96

Peter Pik

1 Answer

Since the date 10/4 - 2015 is unique, you can locate its b tag node using XPath's contains() function:

//b[contains(., '10/4 - 2015')]

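Then, starting from this node, go to its parent and on to the parent's siblings, something like this (not tested; XPath has no siblings axis, so following-sibling is used here):

//b[contains(., '10/4 - 2015')]/parent::div/following-sibling::div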

Update

Since the current date's items are at the bottom of the page, according to the HTML all the following-sibling nodes belong to this date (search for "xpath following-sibling"):

//b[contains(., '10/4 - 2015')]/parent::div/following-sibling::div[@class='newsItem']
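
As a rough illustration of how such an expression could be run from Python, here is a minimal sketch using the third-party lxml library. The SAMPLE markup is hypothetical (the real page's structure is not reproduced here), and the function name titles_after_date is made up for this example.

```python
from lxml import html  # third-party: pip install lxml

# Hypothetical markup: each date sits in a <b> inside its own div,
# followed by sibling divs with class "newsItem". The real page may differ.
SAMPLE = """
<div id="news">
  <div><b>09/4 - 2015</b></div>
  <div class="newsItem"><a href="/news/1">Older title</a></div>
  <div><b>10/4 - 2015</b></div>
  <div class="newsItem"><a href="/news/2">First title</a></div>
  <div class="newsItem"><a href="/news/3">Second title</a></div>
</div>
""".strip()


def titles_after_date(page_source, date_text):
    """Return link texts of every newsItem div that follows the date's div."""
    tree = html.fromstring(page_source)
    # $date is an XPath variable; lxml binds it from the keyword argument.
    expr = (
        "//b[contains(., $date)]/parent::div"
        "/following-sibling::div[@class='newsItem']//a/text()"
    )
    return [text.strip() for text in tree.xpath(expr, date=date_text)]


print(titles_after_date(SAMPLE, "10/4 - 2015"))
# ['First title', 'Second title']
```

Note that following-sibling does not stop at the next date: calling the function with "09/4 - 2015" on this sample would also return the two later titles. That is only acceptable when the wanted date is the last block on the page.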

See test here. If you want to fetch only the divs in between, then explore this.
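
One possible way to get only the divs between one date and the next is to select each newsItem whose nearest preceding date div carries the wanted date. The sketch below is not the approach linked above, just an assumption-laden illustration: the SAMPLE markup and the name titles_for_date are hypothetical, and it relies on date divs being the only sibling divs with a b child.

```python
from lxml import html  # third-party: pip install lxml

# Hypothetical markup; on the real page the date rows may be marked differently.
SAMPLE = """
<div id="news">
  <div><b>09/4 - 2015</b></div>
  <div class="newsItem"><a href="/news/1">Older title</a></div>
  <div><b>10/4 - 2015</b></div>
  <div class="newsItem"><a href="/news/2">First title</a></div>
  <div class="newsItem"><a href="/news/3">Second title</a></div>
</div>
""".strip()


def titles_for_date(page_source, date_text):
    """Return link texts of newsItem divs belonging to one date block only."""
    tree = html.fromstring(page_source)
    # preceding-sibling::div[b][1] is the closest earlier div that has a <b>
    # child, i.e. the date row this item sits under (by assumption).
    expr = (
        "//div[@class='newsItem']"
        "[preceding-sibling::div[b][1][b[contains(., $date)]]]"
        "//a/text()"
    )
    return [text.strip() for text in tree.xpath(expr, date=date_text)]


print(titles_for_date(SAMPLE, "09/4 - 2015"))
print(titles_for_date(SAMPLE, "10/4 - 2015"))
```

Because each item is matched against its own nearest date row, the result no longer depends on the wanted date being the last block on the page.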

Igor Savinkin
  • It returns the date div, but I can't seem to get the article titles listed under that date. – Peter Pik Apr 11 '15 at 12:54