I am trying to pass a website URL as a route parameter in Flask. It works if the URL does not have a "/" in it. For example, http://192.168.1.156:2434/www.cookinglight.com scrapes Cooking Light for all the images on its page; however, if I pass in http://192.168.1.156:2434/https://www.cookinglight.com/recipes/chicken-apple-butternut-squash-soup, then I get an invalid response. Here is my current code:
import json

from flask import Flask, render_template

from imagescraper import image_scraper

app = Flask(__name__)


@app.route("/", methods=['GET'])
def home():
    return render_template('index.html')


@app.route("/<site>", methods=['GET'])
def get_image(site):
    return json.dumps(image_scraper(site))


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=2434, debug=True)
# imagescraper.py
import requests
from bs4 import BeautifulSoup


def image_scraper(site):
    """Scrape the user-supplied URL for all images on a website.

    :param site: HTTP URL, e.g. https://www.cookinglight.com
    :return: dictionary; key: alt text, value: source link
    """
    search = site.strip()
    search = search.replace(' ', '+')
    website = 'https://' + search
    response = requests.get(website)
    soup = BeautifulSoup(response.text, 'html.parser')
    img_tags = soup.find_all('img')
    # create dictionary to add image alt tag and source link
    images = {}
    for img in img_tags:
        try:
            name = img['alt']
            link = img['src']
            images[name] = link
        except KeyError:
            # skip images that lack an alt or src attribute
            pass
    return images
I tried urllib but did not have any success. Any help would be greatly appreciated! I am a student, so I am still learning!
UPDATE:
I believe this is the issue described in the Stack Overflow post
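If the cause is that Flask's default URL converter does not match a "/", a minimal sketch of a route that accepts slashes could look like the following. It uses Flask's built-in `path` converter. The helper `normalize_site` is hypothetical (it is not in my code above) and only avoids adding `https://` a second time. The view echoes the URL instead of calling `image_scraper`, so the sketch runs without network access. I have not checked how every Werkzeug version treats the `//` after the scheme inside the path, so treat this as a starting point rather than a confirmed fix.

```python
from flask import Flask, jsonify

app = Flask(__name__)


def normalize_site(site):
    """Hypothetical helper: add https:// only when no scheme is present."""
    site = site.strip()
    if site.startswith(("http://", "https://")):
        return site
    return "https://" + site


# The `path` converter matches slashes; the default converter does not.
@app.route("/<path:site>", methods=["GET"])
def get_image(site):
    # The real view would return image_scraper(...) results here;
    # echoing the normalized URL keeps this sketch self-contained.
    return jsonify({"site": normalize_site(site)})


if __name__ == "__main__":
    client = app.test_client()
    resp = client.get("/www.cookinglight.com/recipes/chicken-apple-butternut-squash-soup")
    print(resp.get_json())
```

Note that anything after a "?" in the requested URL is not part of the matched path, so a target URL with its own query string would need different handling.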