What I would like to do is pretty basic, I think, but I couldn't find a way to implement it.
I am trying to use the FilesPipeline in Scrapy to download a file (e.g. Image1.jpg) and save it under a path derived from the item that issued the request in the first place (e.g. item.name).
It is pretty similar to this question here, though I want to pass item.name (or some other item field) as an argument, so that each file is saved in a custom path depending on item.name.
The path is defined in the persist_file function, but that function does not have access to the item itself, just the file request and response.
def get_media_requests(self, item, info):
    return [Request(x) for x in item.get(self.FILES_URLS_FIELD, [])]
I can also see above that the requests that feed the files into the pipeline are created here, but is there a way to pass an extra argument so that I can use it later in the file_downloaded function and afterwards in the persist_file function?
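To make the goal concrete, here is a minimal sketch of the path I would like to end up with. The helper name item_relative_path and the "item_name" key are hypothetical names of mine, not part of Scrapy, and the plain dict only stands in for whatever extra data could be attached to the request (I am assuming something like the request's meta dict could carry it):

```python
import posixpath
from urllib.parse import unquote, urlparse


def item_relative_path(item_name, file_url):
    """Return '<item_name>/<file name from URL>' as a storage-relative path."""
    # Use the last segment of the URL path as the file name (query string ignored).
    file_name = posixpath.basename(unquote(urlparse(file_url).path))
    # Replace path separators so the item name maps to a single folder.
    safe_name = item_name.strip().replace("/", "_").replace("\\", "_")
    return posixpath.join(safe_name, file_name)


# Hypothetical stand-in for the extra data travelling with each download request.
meta = {"item_name": "my item"}
print(item_relative_path(meta["item_name"], "http://example.com/files/Image1.jpg"))
```

If the pipeline could see that extra value at the point where it decides the path, calling a helper like this there would be all I need.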
As a last resort, it would be pretty simple to rename or move the file after it has been downloaded, in one of the later pipelines, but that seems sloppy, doesn't it?
I am using the code implemented here as a custom pipeline.
Can anyone help please? Thank you in advance :)