
I am working on a project where URLs are stored in a Django model called `UrlItems`. The `models.py` file containing `UrlItems` is located in the `home` app. I ran `scrapy startproject scraper` in the same directory as the `models.py` file. Please see this image to better understand my Django project structure.

I understand how to create new `UrlItems` from my scraper, but what if my goal is to fetch my Django project's existing `UrlItems` and iterate over them inside my spider's `start_requests(self)` method?

What I have tried:

1) I followed the marked solution in this question to see whether the `DjangoItem` I created already had the `UrlItems` loaded. I tried calling `UrlItemDjangoItem.objects.all()` in my spider's `start_requests` method and realized that I could not retrieve my Django project's `UrlItems` this way.

2) In my spider I tried to import my model with `from ...models import UrlItem` and received this error: `ValueError: attempted relative import beyond top-level package`.
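
For context on attempt 2, a commonly used alternative to the relative import is to put the Django project root on `sys.path`, set `DJANGO_SETTINGS_MODULE`, and call `django.setup()` before importing any models. The following is only a sketch of that idea: the helper names `prepare_django_env` and `setup_django` are hypothetical, and the project root path and settings module name must be supplied by the caller because they depend on the actual project layout, which is not shown in the text.

```python
import os
import sys
from pathlib import Path


def prepare_django_env(project_root, settings_module):
    """Make the Django project importable and point Django at its settings.

    Both arguments are assumptions supplied by the caller: the directory
    that contains the Django apps, and the dotted path of the settings module.
    """
    root = str(Path(project_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    # setdefault keeps any value that is already set in the environment.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return root


def setup_django(project_root, settings_module):
    """Call once (for example from the Scrapy settings file) before importing models."""
    prepare_django_env(project_root, settings_module)
    import django  # imported lazily, after the path and settings are in place

    django.setup()
```

If this works for the project, an absolute import such as `from home.models import UrlItem` placed inside `start_requests` (after `setup_django(...)` has run) should avoid the `ValueError`, and the resulting queryset from `UrlItem.objects.all()` could then be iterated to yield requests.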

Update

After some consideration, I may end up having the Scrapy spider query my Django application's API to receive a list of the existing Django objects as JSON.
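
A rough sketch of the parsing side of that idea follows. It assumes the API returns a JSON array of objects and that each object stores its address under a `url` key; both the response shape and the field name are assumptions, as is the helper name `urls_from_payload`.

```python
import json


def urls_from_payload(payload, field="url"):
    """Return the non-empty URL strings from a JSON array of objects.

    `payload` is the raw JSON text returned by the (assumed) API endpoint,
    and `field` is the assumed name of the key holding each URL.
    """
    items = json.loads(payload)
    # Skip objects where the field is missing or empty.
    return [item[field] for item in items if item.get(field)]
```

In the spider, `start_requests` could yield a single `scrapy.Request` for the API endpoint, and its callback could then yield one `scrapy.Request` per entry in `urls_from_payload(response.text)`.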

kas
  • Could you post `tree` structure of your project directory? – Granitosaurus Dec 07 '16 at 15:31
  • Here is an image of the entire tree structure of my project and `home` app. The `settings` folder contains the project settings Python files, and the `scraper` folder contains the Scrapy scraper project: http://i.imgur.com/cKTIayM.png – kas Dec 07 '16 at 15:46
  • @Granitosaurus Added an update – kas Dec 07 '16 at 17:50
