
I'm trying to scrape listing information from Craigslist, but unfortunately I can't seem to get the images since they are in a slideshow.

import requests
from bs4 import BeautifulSoup as soup

url = "https://newyork.craigslist.org/search/sss"
r = requests.get(url)
souped = soup(r.content, 'lxml')

Since the images aren't even in the requested HTML file, do I need to somehow dynamically load the page? If so, can I keep it in pure Python? I don't want any other dependencies. Thanks in advance; I'm pretty new to this, so any help would be appreciated.

Da Jankee
  • As you can see you have the links to the images, I suggest you extract the URLs and then use `requests` to download the image using those URLs. See [this post](https://stackoverflow.com/questions/13137817/how-to-download-image-using-requests) for downloading images with that module – Plopp Feb 06 '19 at 12:55
  • Thanks but I'm not looking to download the images just want the links. I have a loop that gets the title, location, price, etc of listings to a CSV file, I just want it to add the link(s) of the images to it as well. And sorry I'm a noob at python so a simple solution would be helpful. – Da Jankee Feb 06 '19 at 13:09

1 Answer


Look for the `a` tags with the classes `result-image gallery`. Each of those tags has a `data-ids` attribute which holds part of the names of the image files.

<a href="https://newyork.craigslist.org/mnh/fuo/d/new-york-city-3-piece-shaped-ikea-couch/6812749499.html" class="result-image gallery" data-ids="1:00707_iRUU5VKwkWi,1:00H0H_6AIBqK2iQDU">
           ....
</a>
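As a minimal sketch, the `data-ids` attribute can be extracted and split with the same `requests`/`BeautifulSoup` setup you already have. The HTML string below is a stand-in for the live page (assumed structure taken from the snippet above, not live data), and the stdlib `html.parser` is used so the example has no `lxml` dependency:

```python
from bs4 import BeautifulSoup

# Stand-in for the HTML returned by requests.get(url).content
html = '''
<a href="https://newyork.craigslist.org/mnh/fuo/d/new-york-city-3-piece-shaped-ikea-couch/6812749499.html"
   class="result-image gallery"
   data-ids="1:00707_iRUU5VKwkWi,1:00H0H_6AIBqK2iQDU"></a>
'''

souped = BeautifulSoup(html, 'html.parser')

for tag in souped.select('a.result-image.gallery'):
    # Each entry looks like "1:00707_iRUU5VKwkWi"; the part after
    # the colon is the partial image name.
    partial_names = [part.split(':', 1)[1]
                     for part in tag['data-ids'].split(',')]
    print(partial_names)
```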

Now, if you want to get the URLs, first get that attribute and parse out the partial image names (in that example, 00707_iRUU5VKwkWi and 00H0H_6AIBqK2iQDU).

And now you can build the URLs from the host, the size suffix (`_300x300`) and the extension:

https://images.craigslist.org/00707_iRUU5VKwkWi_300x300.jpg
https://images.craigslist.org/00H0H_6AIBqK2iQDU_300x300.jpg
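A hedged sketch of that last step: the `images.craigslist.org` host and the `_300x300` suffix are taken from the examples above and may change on Craigslist's side, so treat them as assumptions rather than a stable API.

```python
# Partial names parsed out of the data-ids attribute
partial_names = ['00707_iRUU5VKwkWi', '00H0H_6AIBqK2iQDU']

# Build the full thumbnail URLs: host + partial name + size suffix + extension
image_urls = [f'https://images.craigslist.org/{name}_300x300.jpg'
              for name in partial_names]

print(image_urls)
```

These links can then be appended to the same row your loop already writes to the CSV, alongside the title, location, and price.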
– arieljuod