I am working on a web crawler (using axios and cheerio to fetch and parse a website's HTML). I need to collect the images from each section (A, B, and so on), but the number of images per section is not fixed: sometimes section A contains 2 images, and sometimes section B contains 3. The requirement is to keep the section A images in one variable and the section B images in another. I'm stuck on how to tell the two groups apart.

<div class="container headingWrapper">
   <h3 class="DetailSection_heading ">Section A:</h3>
</div>

<div class="DetailSection_content">
  <div class="container">
    <div class="css-0">
      <div class="row justify-content-center">
        <div class="col-lg-6">
          <img src="https://img_url_1" alt="" class="DetailSectionImage" data-index="0">
          <img src="https://img_url_2" alt="" class="DetailSectionImage" data-index="0">
        </div>
      </div>
    </div>
  </div>
</div>

<div class="container headingWrapper">
   <h3 class="DetailSection_heading ">Section B:</h3>
</div>

<div class="DetailSection_content">
  <div class="container">
    <div class="css-0">
      <div class="row justify-content-center">
        <div class="col-lg-6">
          <img src="https://img_url_3" alt="" class="DetailSectionImage" data-index="0">
          <img src="https://img_url_4" alt="" class="DetailSectionImage" data-index="0">
        </div>
      </div>
    </div>
  </div>
</div>

jQuery code:

const Fetched_imgs = $('div.DetailSection_content img').map(function() {
  return $(this).attr("src")
}).get();
faraz
  • There doesn't seem to be any tags matching `div.DetailSection_content`. Your selector should just be `'img.DetailSectionImage'`. – kmoser Mar 01 '22 at 05:50
  • @kmoser updated jquery code... any solution to keep section A image separate and section B separate. – faraz Mar 01 '22 at 05:53
  • See my answer for an example of how to find the groups separately. – kmoser Mar 01 '22 at 05:59

2 Answers

You can make this dynamic by finding the headings first, then using sibling selectors to find the images:

const images = $(".DetailSection_heading").map((_, h) => [ // 1️⃣
  $(h).closest(".headingWrapper") // move up to the wrapper
    .next(".DetailSection_content") // get the next sibling
    .find("img") // find the images
    .map((_, { src }) => src) // extract the src property
    .get() // get the result array
]).get()

console.log(images)
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.slim.min.js"></script>
<div class="container headingWrapper"> <h3 class="DetailSection_heading ">Section A:</h3></div><div class="DetailSection_content"> <div class="container"> <div class="css-0"> <div class="row justify-content-center"> <div class="col-lg-6"> <img src="https://img_url_1" alt="" class="DetailSectionImage" data-index="0"> <img src="https://img_url_2" alt="" class="DetailSectionImage" data-index="0"> </div></div></div></div></div><div class="container headingWrapper"> <h3 class="DetailSection_heading ">Section B:</h3></div><div class="DetailSection_content"> <div class="container"> <div class="css-0"> <div class="row justify-content-center"> <div class="col-lg-6"> <img src="https://img_url_3" alt="" class="DetailSectionImage" data-index="0"> <img src="https://img_url_4" alt="" class="DetailSectionImage" data-index="0"> </div></div></div></div></div>

This creates a nested array with each set of images grouped by section, e.g. section A at index 0, section B at index 1, and so on.

1️⃣ Wrapping the inner array in an outer array is a workaround for jQuery's .map() automatically flattening any array returned from the callback.

Phil

To find the images in the first group: `$('div.DetailSection_content').eq(0).find('img.DetailSectionImage')`

To find the images in the second group: `$('div.DetailSection_content').eq(1).find('img.DetailSectionImage')`

Example:

// First group:
const Fetched_imgs_1 = $('div.DetailSection_content').eq(0).find('img.DetailSectionImage').map(function() {
  return $(this).attr("src")
}).get();

// Second group:
const Fetched_imgs_2 = $('div.DetailSection_content').eq(1).find('img.DetailSectionImage').map(function() {
  return $(this).attr("src")
}).get();

console.log(Fetched_imgs_1); // Array [ "https://img_url_1", "https://img_url_2" ]
console.log(Fetched_imgs_2); // Array [ "https://img_url_3", "https://img_url_4" ]

Reference: find first occurrence of class in div

kmoser
  • I tried your answer but getting empty array – faraz Mar 01 '22 at 06:00
  • it is not fetching any image src, so getting empty array – faraz Mar 01 '22 at 06:03
  • You must be doing something wrong. I'm trying your HTML with my JS and `console.log(Fetched_imgs_1)` shows `Array [ "https://img_url_1", "https://img_url_2" ]`, and `console.log(Fetched_imgs_2)` shows `Array [ "https://img_url_3", "https://img_url_4" ]`. – kmoser Mar 01 '22 at 06:05
  • You could optimise this by not repeating the `div.DetailSection_content` query – Phil Mar 01 '22 at 06:24
  • @Phil Agreed, you could also put it in a loop from 0 to 1, which would only require writing the selector once. All these implementation details are left as an exercise for the reader :) – kmoser Mar 01 '22 at 06:38