-1

I have multiple links that end with either a .png or a .jpg extension.

I want to print, and save to a txt file, only the links that end with the .jpg extension.

I tried this code, but it saves only one result:

for item in soup.find_all('img'):
    hotel_image = (item['src'])
    print(hotel_image)
    file1 = open("myfile.txt", "w")
    file1.writelines(hotel_image)
    file1.close()  # to change file access modes

Example links:

https://cf.bstatic.com/images/hotel/max300/288.jpg

https://cf.bstatic.com/static/img/flags/12/eg.png

https://cf.bstatic.com/images/hotel/max300.jpg

https://cf.bstatic.com/static/img/review.png

What I want:

https://cf.bstatic.com/images/hotel/max300.jpg

https://cf.bstatic.com/images/hotel/max300/288.jpg

Any help would be appreciated.

General Grievance
Abdullah Md

2 Answers

1

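Make sure hotel_image is a string (convert it with str() if it is not), then filter with the endswith method. Open the file once, before the loop, so that each iteration does not overwrite the previous one, and write a newline after each link.

Try this: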

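with open("myfile.txt", "w") as fp:  # open once, before the loop
    for item in soup.find_all('img'):
        hotel_image = str(item.get('src', ''))  # '' if the tag has no src
        if hotel_image.endswith('.jpg'):
            print(hotel_image)
            fp.write(hotel_image + "\n")  # one link per line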
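
The loop in the question reopens the file in "w" mode on every iteration, which truncates it each time, so only one link survives. Below is a minimal sketch of the filter-and-save step on its own. It uses the example links from the question in place of the soup.find_all('img') results, and the helper names filter_jpg and save_links, as well as the temp-directory output path, are made up for illustration:

```python
import os
import tempfile


def filter_jpg(links):
    """Return only the links that end with .jpg (case-insensitive)."""
    return [link for link in links if link.lower().endswith(".jpg")]


def save_links(links, path):
    """Write one link per line; the file is opened once, not per link."""
    with open(path, "w") as fp:
        for link in links:
            fp.write(link + "\n")


# Example links from the question, standing in for the item['src'] values.
links = [
    "https://cf.bstatic.com/images/hotel/max300/288.jpg",
    "https://cf.bstatic.com/static/img/flags/12/eg.png",
    "https://cf.bstatic.com/images/hotel/max300.jpg",
    "https://cf.bstatic.com/static/img/review.png",
]

jpg_links = filter_jpg(links)

# Hypothetical output location; the question writes to "myfile.txt".
out_path = os.path.join(tempfile.gettempdir(), "myfile.txt")
save_links(jpg_links, out_path)

print("\n".join(jpg_links))
```

Lower-casing before the comparison is an extra assumption so that links ending in .JPG are kept too; drop it if the match should be case-sensitive.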
Amit Nanaware
0

Try passing a regex to the find_all method, with the dot escaped and the pattern anchored to the end of the src value, as in:

import re
soup.find_all('img', alt="", src=re.compile(r"\.jpg$"))
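
A quick sketch, using only the re module, of why escaping the dot and anchoring with $ can matter when filtering by extension. An unescaped "." matches any character, and without an anchor the pattern can match in the middle of a string. The third link is a made-up edge case, not one from the question:

```python
import re

loose = re.compile(".jpg")      # '.' matches any character, no end anchor
strict = re.compile(r"\.jpg$")  # literal dot, must sit at the very end

links = [
    "https://cf.bstatic.com/images/hotel/max300/288.jpg",
    "https://cf.bstatic.com/static/img/flags/12/eg.png",
    "https://example.com/img/my.jpg.png",  # hypothetical tricky link
]

# search() looks for the pattern anywhere in the string.
loose_hits = [link for link in links if loose.search(link)]
strict_hits = [link for link in links if strict.search(link)]

print(loose_hits)
print(strict_hits)
```

Here the loose pattern also accepts the made-up .jpg.png link, while the strict one keeps only the link that really ends in .jpg.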