I'm building a web-scraping application with the Django framework, and I need some tips on how to speed it up. Right now the page takes almost a minute to load while parsing just 3 URLs, which is a problem: I want to parse up to 10 URLs per page, so it needs to run a lot faster. As you can see, my code targets only one div, which I suspect is why the application runs so slowly. I'm thinking I could target multiple divs to narrow down my "soup", but I've had difficulty with that in the past, so I'm hoping to get some pointers.

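import requests
from bs4 import BeautifulSoup
from django.shortcuts import redirect, render

from .models import Stock, User  # assumption: models live in this app's models.py


def stats(request):
    # Send visitors who are not logged in back to the landing page.
    if 'user_id' not in request.session:
        return redirect('/')
    this_user = User.objects.filter(id=request.session['user_id'])
    this_stock = Stock.objects.filter(user_id=request.session['user_id'])
    progress_dict = []  # a list despite the name; kept because the template uses this key
    for stock in this_stock:  # renamed from `object`, which shadows a builtin
        # Each request blocks until the page has fully downloaded.
        page = requests.get(stock.nasdaq_url, timeout=10)
        soup = BeautifulSoup(page.content, 'html.parser')
        progress = soup.find_all('div', class_='ln0Gqe')
        for number in progress:
            progress_dict.append(number.text)
    context = {
        "current_user": this_user[0].first_name,
        "progress_dict": progress_dict,
        "this_stock": this_stock,
    }
    return render(request, "nasdaq.html", context)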

1 Answer

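You can use threading to scrape multiple pages simultaneously. Most of the load time is likely spent waiting on each HTTP response, so fetching the URLs in parallel instead of one after another should cut it down considerably. Look here and here for more information about it.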
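As a rough illustration of the threading idea, here is a minimal sketch using the standard library's ThreadPoolExecutor. The helper names fetch_progress and fetch_all are made up for this example and are not part of the question's code; the div class and parser are copied from the question, and the 10-second timeout is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_progress(url):
    """Download one page and return the text of every matching div."""
    # Third-party imports are kept local so that fetch_all below can be
    # reused (or tried out) with any other fetch function.
    import requests
    from bs4 import BeautifulSoup

    page = requests.get(url, timeout=10)  # timeout value is an assumption
    soup = BeautifulSoup(page.content, 'html.parser')
    return [div.text for div in soup.find_all('div', class_='ln0Gqe')]


def fetch_all(urls, fetch=fetch_progress, max_workers=10):
    """Call fetch on every URL in parallel threads, keeping the input order."""
    urls = list(urls)
    if not urls:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as urls, so the
        # flattened list matches what the sequential loop would build.
        results = pool.map(fetch, urls)
        return [text for texts in results for text in texts]
```

In the view, the loop over this_stock could then be replaced with something like progress_dict = fetch_all(stock.nasdaq_url for stock in this_stock). The total wait should then be closer to the slowest single request than to the sum of all of them, though the actual gain depends on the site being scraped.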

Using lxml as the parser can also speed up your web scraping. You can check here for more information.
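Switching parsers is a one-argument change in BeautifulSoup. The sketch below assumes lxml has been installed separately (pip install lxml) and falls back to the built-in parser otherwise; extract_progress is a hypothetical helper name, and the div class is the one from the question.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def extract_progress(html, parser='lxml'):
    """Return the text of each div with class 'ln0Gqe' in the given HTML."""
    try:
        soup = BeautifulSoup(html, parser)
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is not
        # installed; fall back to the slower built-in one in that case.
        soup = BeautifulSoup(html, 'html.parser')
    return [div.text for div in soup.find_all('div', class_='ln0Gqe')]
```

In the question's view this corresponds to passing 'lxml' instead of 'html.parser' when building the soup. Parsing is usually a small share of the time compared with the network requests, so it is worth measuring how much this helps on its own.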

imxitiz