I have a project deployed on Scrapinghub, I do not have any copy of that code at all.
How can I download the whole project's code on my localhost from Scrapinghub?
I was able to download the project code using:
shub fetch-eggs project_id_here
where project_id_here
can be grabbed from the browser URL when the project is open.
The resulting file will be a *.egg
— just extract it like a ZIP file using WinRAR or any other archive tool.
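Since an .egg is just a ZIP archive, you can also extract it cross-platform with Python's standard zipfile module instead of WinRAR. A minimal sketch (the file and directory names here are illustrative stand-ins, not real fetched output — a real egg comes from `shub fetch-eggs <project_id>`):

```python
import os
import tempfile
import zipfile

# Build a stand-in .egg so the extraction step can run anywhere;
# with a real project you would point egg_path at the fetched file.
tmp = tempfile.mkdtemp()
egg_path = os.path.join(tmp, "project.egg")
with zipfile.ZipFile(egg_path, "w") as zf:
    zf.writestr("myproject/spiders/example.py", "# spider source\n")

# The extraction step -- identical for a real fetched egg:
out_dir = os.path.join(tmp, "project_src")
with zipfile.ZipFile(egg_path) as zf:
    zf.extractall(out_dir)

print(os.path.exists(os.path.join(out_dir, "myproject", "spiders", "example.py")))  # True
```

zipfile.extractall recreates the nested package directories, so the extracted tree mirrors your original project layout.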
Additional note: shub does not have user-friendly error messages. I was once logged into shub with one account while trying to download a project that belonged to a different account, and the failure was not obvious. Make sure you are logged into the same Scrapinghub account that owns the project you are trying to download.
As far as I know, there's currently no public API for retrieving your project source code on Scrapy Cloud. (Correct me if wrong.)
But it's indeed possible to retrieve your project source code without additional privileges.
When a job is running, the project-related files are located in the /app
directory:
job-<some-job-id>:/app$ ls -la /app
total 48
drwxr-xr-x 5 root root 4096 Jul 27 17:13 .
drwxr-xr-x 82 root root 4096 Jul 28 04:09 ..
-rw-r--r-- 1 root root 26695 Jul 27 17:13 __main__.egg
drwxr-xr-x 2 nobody nogroup 4096 May 23 07:34 addons_eggs
drwxr-xr-x 2 nobody nogroup 4096 Jul 24 14:27 python
-rw-r--r-- 1 root root 14 Jul 24 14:27 requirements.txt
The file __main__.egg
contains all of your project's source code.
Thus you may send the .egg
file somewhere you can retrieve it later, e.g. curl http://IP-address-of-your-own-server:8888/retrieve-file --data-binary @/app/__main__.egg
(assuming you have prepared a service for receiving the data). Alternatively, I suppose you could always contact Scrapinghub support for help.
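The curl command assumes a service on your own server that accepts the POSTed egg. A minimal sketch of such a receiver using Python's standard http.server — the /retrieve-file path mirrors the curl example, while the port handling and output filename are illustrative assumptions:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class RetrieveFileHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw request body (the egg bytes sent by curl --data-binary)
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        with open("__main__.egg", "wb") as f:  # save the uploaded egg
            f.write(body)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # keep the demo quiet
        pass

# Bind to an OS-chosen free port for this demo; in practice you would
# use a fixed, reachable port (8888 in the curl example above).
server = HTTPServer(("127.0.0.1", 0), RetrieveFileHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simulate the upload that curl would perform from the job container:
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/retrieve-file",
    data=b"fake egg bytes",
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # 200
server.shutdown()
```

Once the egg has landed on your server, it extracts like any ZIP archive.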