
I am trying to loop over the results of an XPath expression in Scrapy, which looks like this:

for entry in response.xpath('normalize-space(//div[@id="Content"]//div[@id="programDetails"]//div[@id="selfReportedProgramDetails"]//div[@id="hoursOfOperation"]//span[@class="hoursItem"]//span[@class="times"]/text())'):

   print(entry.get())
   print(len(response.xpath('normalize-space(//div[@id="Content"]//div[@id="programDetails"]//div[@id="selfReportedProgramDetails"]//div[@id="hoursOfOperation"]//span[@class="hoursItem"]//span[@class="times"]/text())')))

The output looks like this:

9:00 AM to 12:00 PM

1

The weird thing is that my browser's inspector tools show 7 child elements, one for each weekday.


So why do I get only one result? I want to extract all weekdays. I don't understand my error, maybe you'll have a hint which brings me the right way.

Cheers!

Edit: After the hint, I now use the following code:

for entry in response.xpath('//div[@id="Content"]//div[@id="programDetails"]//div[@id="selfReportedProgramDetails"]//div[@id="hoursOfOperation"]//span[@class="hoursItem"]'):
   print(entry.xpath('normalize-space(//span[@class="times"])').get())

Now I get 7 results, but every one of them is 9:00 AM to 12:00 PM, which is the first entry.

patrickgerard

1 Answer


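This XPath:

'normalize-space(//div[@id="Content"]//div[@id="programDetails"]//div[@id="selfReportedProgramDetails"]//div[@id="hoursOfOperation"]//span[@class="hoursItem"]//span[@class="times"]/text())'

returns only one result because normalize-space() is a string function: when it is given a node-set, it converts only the first node (in document order) to a string and collapses its whitespace. The whole expression therefore evaluates to a single string instead of a list of nodes.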

So to get the actual text nodes for all of those spans, remove the normalize-space() wrapper from your XPath.

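The second XPath starts with a double slash (//), which means it searches all nodes from the document root, regardless of the entry you call it on. To search from the current context node, start the expression with a dot: normalize-space(.//span[@class="times"]).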

For more information on // vs .//, see this good answer.

Siebe Jongebloed