I have written code for a Django web page that has a form for user input. When the user enters text into the form and clicks the submit button, a Celery task which runs a Scrapy spider needs to be started. The form takes the name of a band, which is to be passed as an argument to the spider and concatenated to the start URL. So far, whenever I run the command python manage.py celery worker --loglevel=info or python manage.py runserver, the log for the Scrapy spider starts to execute, but it never actually shows the web pages being crawled as it normally does. However, when I submit the form, the Scrapy spider is not run. What is the proper way to run the Celery task when the submit button is clicked? I was following the solution from this SO post, but Scrapy and Celery have since been updated and that solution no longer seems to work. The code for the relevant files is below:
tasks.py
from celery import shared_task
from celery.registry import tasks
from celery.task import Task
from django.template.loader import render_to_string
from django.utils.html import strip_tags
from django.core.mail import EmailMultiAlternatives
from ticket_city_scraper.ticket_city_scraper.spiders.tc_spider import spiderCrawl

@shared_task
def crawl():
    return spiderCrawl()
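For context, the band name from the form is supposed to end up concatenated to the spider's start URL. A minimal sketch of that concatenation (the base URL and helper name here are my assumptions, not the real spider's values):

```python
from urllib.parse import quote

def build_start_url(band_name, base="https://www.ticketcity.com/search?q="):
    # Percent-encode the user-supplied band name so spaces and
    # punctuation survive being appended to the start URL.
    return base + quote(band_name)

print(build_start_url("Foo Fighters"))
# https://www.ticketcity.com/search?q=Foo%20Fighters
```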
Edit:
As can be seen in the views file, the crawl method is only called in the choice view, but every time a new page is visited, the spider log starts
views.py
from django.shortcuts import render
from .forms import ContactForm, SignUpForm, BandForm
from tasks import crawl

def choice(request):
    title = 'Welcome'
    form = SignUpForm(request.POST or None)
    context = {
        "title": title,
        "form": form,
    }
    if form.is_valid():
        instance = form.save(commit=False)
        full_name = form.cleaned_data.get("full_name")
        if not full_name:
            full_name = "New full name"
        instance.full_name = full_name
        # if not instance.full_name:
        #     instance.full_name = "A name"
        instance.save()
        context = {
            "title": "Thank you",
        }
        crawl.delay()
    return render(request, "home.html", context)
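The spider log appearing on every page visit is consistent with module-level side effects: any code at the top level of tc_spider.py runs as soon as the import chain (views → tasks → tc_spider) pulls it in, not when crawl() is called. A minimal, self-contained illustration of that mechanism (the module source here is invented for the demo):

```python
import io
import types
import contextlib

# Simulate a module whose top level configures logging, the way a
# Scrapy spider module might set things up at import time.
source = (
    "print('spider log configured')\n"  # runs at import time
    "def crawl():\n"
    "    print('crawling')\n"           # runs only when called
)

module = types.ModuleType("tc_spider_demo")
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(source, module.__dict__)  # "importing" executes the top level

# The side effect fired even though crawl() was never called.
print(buf.getvalue())
# spider log configured
```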
Terminal window when running the server:
-------------- celery@elijah-VirtualBox v3.1.18 (Cipater)
---- **** -----
--- * *** * -- Linux-3.13.0-54-generic-x86_64-with-Ubuntu-14.04-trusty
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: default:0x7faaebc80410 (djcelery.loaders.DjangoLoader)
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: database
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. comparison.tasks.crawl
[2015-08-21 23:15:21,076: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2015-08-21 23:15:21,186: INFO/MainProcess] mingle: searching for neighbors
[2015-08-21 23:15:22,244: INFO/MainProcess] mingle: all alone
/home/elijah/Desktop/trydjango18/trydjango18/local/lib/python2.7/site-packages/djcelery/loaders.py:136: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '
[2015-08-21 23:15:22,331: WARNING/MainProcess] /home/elijah/Desktop/trydjango18/trydjango18/local/lib/python2.7/site-packages/djcelery/loaders.py:136: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '
[2015-08-21 23:15:22,333: WARNING/MainProcess] celery@elijah-VirtualBox ready.
[2015-08-21 23:15:24,294: INFO/MainProcess] Received task: comparison.tasks.crawl[d930a0e8-7d63-4d55-ba85-53bb174f98f4]
[2015-08-21 23:15:24,296: INFO/MainProcess] Received task: comparison.tasks.crawl[37187368-cfd1-4b9e-9a2e-8e14266947ef]
[2015-08-21 23:15:24,298: INFO/MainProcess] Received task: comparison.tasks.crawl[d5aa8448-2ee5-47f9-8b6e-5112201665ef]
[2015-08-21 23:15:24,300: INFO/MainProcess] Received task: comparison.tasks.crawl[d8ae8663-3fe1-484b-b43b-d54f173fd85e]
[2015-08-21 23:15:24,301: INFO/MainProcess] Received task: comparison.tasks.crawl[1eb42061-ec5a-4697-9df8-9b07c62f04f9]
[2015-08-21 23:15:24,302: INFO/MainProcess] Received task: comparison.tasks.crawl[d3a7619f-2fcc-4105-93f8-b2ac9004593b]
[2015-08-21 23:15:24,303: INFO/MainProcess] Received task: comparison.tasks.crawl[2b06afd0-24ab-4198-a49e-b32dfe0ca804]
[2015-08-21 23:15:24,505: ERROR/MainProcess] Task comparison.tasks.crawl[37187368-cfd1-4b9e-9a2e-8e14266947ef] raised unexpected: NameError("global name 'MySpider' is not defined",)