I'm new to Scrapy. Can somebody help me?

How can I scrape only the first list of elements (the one under the "Prix" heading) from the HTML below? I just want a list of the price ranges and the number of products in each range, but my spider returns everything: prices, brands (I removed that part from the snippet), colors (also removed), stars, etc.

<div id="facetsList" class="mgFacetContent">
 <div class="jsFacetListing mgFacetListing mgFOpen">
  <div class="jsFacetTitle mgFTitle">

#just here --->

   <span>Prix</span>

#<-----

   <span class="mgFIcon"></span>
     </div>
    <div class="mgFAllList">
     <input type="hidden" name="FacetForm.SelectedFacets.Index" value="0" />
     <ul class="mgFList">
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]" value="f/7/[_1200]">
         <span title="&lt;10 € (276)"><10 € (276)</span>
        </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]" value="f/7/[800_2500]">
         <span title="10 &#224; 20 € (314)">10 à 20 € (314)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]"  value="f/7/[1900_5500]">
        <span title="20 &#224; 50 € (404)">20 à 50 € (404)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]"  value="f/7/[4800_10500]">
        <span title="50 &#224; 100 € (232)">50 à 100 € (232)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]"  value="f/7/[9500_21500]">
        <span title="100 &#224; 200 € (259)">100 à 200 € (259)</span>
       </label>
      </li>   
     </ul>
     <ul class="mgFListMore">
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]" value="f/7/[19000_51500]">
        <span title="200 &#224; 500 € (161)">200 à 500 € (161)</span>
       </label>
      </li>
      <li>
       <label><input type="checkbox"  name="FacetForm.SelectedFacets[0]" value="f/7/[48000_110000]">
        <span title="500 &#224; 1000 € (56)">500 à 1000 € (56)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[0]" value="f/7/[90000_]">
        <span title="1000 € et + (22)">1000 € et + (22)</span>
       </label>
      </li>
     </ul>
    </div>
    <div class="mvFLink mgFLinkSeeMore jsFLink">de choix</div>
   </div>
   <div class="jsFacetListing mgFacetListing mgFOpen">
    <div class="jsFacetTitle mgFTitle">
     <span>Avis clients</span>
     <span class="mgFIcon"></span>
    </div>
    <div class="mgFAllList">
     <input type="hidden" name="FacetForm.SelectedFacets.Index" value="3" />
     <ul class="mgFList">
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[3]" value="f/374/[300_500]">
        <span title="3 &#233;toiles et + (77)">3 étoiles et + (77)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[3]" value="f/374/[400_500]">
        <span title="4 &#233;toiles et + (63)">4 étoiles et + (63)</span>
       </label>
      </li>
      <li>
       <label>
        <input type="checkbox"  name="FacetForm.SelectedFacets[3]" value="f/374/[500_500]">
        <span title="5 &#233;toiles (30)">5 étoiles (30)</span>
       </label>
      </li>
     </ul>
     <ul class="mgFListMore"></ul>
    </div>
   </div>

I tried a lot of things with XPath, such as:

        if response.xpath('//div[@class="jsFacetListing mgFacetListing mgFOpen"]/div[@class="mgFAllList"]/ul/li/label/input[@name="FacetForm.SelectedFacets[0]"]'):
          nbproducts = response.xpath('/span/text()').re(r'\u20ac \s*(.*)')
          avgcost = response.xpath('../span/text()').re(r'\s*(.*)')

But I don't think it works like that...

Thanks a lot

P.Postrique

1 Answer

You can use indexes in your XPath expressions:

    response.xpath('(//div[@class="jsFacetTitle mgFTitle"])[1]/span[1]/text()').extract()
    ['Prix']
Granitosaurus
  • Seriously?? So simple and I didn't think of it... Thanks a lot – P.Postrique Jun 21 '17 at 08:08
  • Do you know how .re(r' ') works? I don't know what the r' ' means... – P.Postrique Jun 21 '17 at 08:11
  • @P.Postrique The `r` prefix on a string in Python means a raw string literal; see this related question: https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-in-python-and-what-are-raw-string-l – Granitosaurus Jun 21 '17 at 08:27
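
To make the raw-string point concrete, here is a small sketch using only the standard `re` module, assuming Python 3. It applies the pattern from the question to one facet label, which is roughly what Scrapy's `.re()` does for each selected string. The variable names are illustrative.

```python
import re

# A raw string literal keeps every backslash as a literal character.
print(len('\n'))          # 1: a single newline character
print(len(r'\n'))         # 2: a backslash followed by "n"
print(r'\d+' == '\\d+')   # True: same pattern, the raw form needs no doubling

label = '10 à 20 € (314)'  # one facet label from the question

# The pattern from the question captures everything after the euro sign.
# In Python 3 the re module itself interprets \u20ac as the euro character.
after_euro = re.findall(r'\u20ac \s*(.*)', label)

# A narrower pattern that captures only the digits inside the parentheses.
count = re.findall(r'\((\d+)\)', label)

print(after_euro, count)
# ['(314)'] ['314']
```

So the original pattern does match, but it returns the count with its parentheses; capturing `\d+` between escaped parentheses gives just the number.
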