1

I am using Scrapy to gather images. I would like to simulate a POST to a PHP script with multiple files, similar to when someone uploads 10 files and they are processed by a PHP script using $_FILES['name']. I would also like to pass $_POST data.

Here is my Python.

  post_array={
   'parse':'listing'
  }

  files_array=response.xpath(root+'/photos//url/text()').extract()

  returned=requests.post('php-script.php',data=post_array,files=files_array).text
  pprint(returned)

So this is supposed to create a $_POST variable and a $_FILES variable with multiple files. How can I convert the list of URLs in files_array into a $_FILES array in php-script.php?

Python data input:

  post_array={
   'parse':'listing'
  }
  files_array=['https://example.co/123.jpg','https://example.co/124.jpg','https://example.co/125.jpg']

into PHP data output inside php-script.php (desired result):

$_POST=['parse'=>'listing'];
$_FILES=['images'=>[
[0] => Array
    (
        [name] => 123.jpg
        [type] => image/jpeg
        [tmp_name] => /tmp/php/php6hst32
        [error] =>
        [size] => 98174
    )
[1] => Array
    (
        [name] => 124.jpg
        [type] => image/jpeg
        [tmp_name] => /tmp/php/php6hst32
        [error] =>
        [size] => 98174
    )
[2] => Array
    (
        [name] => 125.jpg
        [type] => image/jpeg
        [tmp_name] => /tmp/php/php6hst32
        [error] =>
        [size] => 98174
    )
]];

I have also tried this:

returned=requests.post(triggers,data=post_array,files={'images':[requests.get(url).content for url in files_array]}).text
pprint(returned)
Maciek Semik

2 Answers

1

The only way to turn a list of URLs into a $_FILES array in a PHP script is to actually upload the files (via a POST request with enctype="multipart/form-data").

Here is how you can do it with requests:

files_array = [('images', ('123.jpg', open('123.jpg', 'rb'), 'image/jpeg')),
               ('images', ('124.jpg', open('124.jpg', 'rb'), 'image/jpeg')),
               ('images', ('125.jpg', open('125.jpg', 'rb'), 'image/jpeg'))]
r = requests.post(url, data=post_array, files=files_array)

You can find a detailed example in the Advanced Usage section of the Requests documentation.
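When the number of images is not known in advance, the same list of tuples can be built in a loop. The following is a minimal sketch, not a definitive implementation. The names `build_files_list`, `fetch` and `upload` are hypothetical. The field name `images[]` is an assumption: PHP normally collects repeated fields into an array only when the name ends in `[]`, so adjust it to whatever the receiving script expects. The file bytes are downloaded into memory instead of being opened from disk.

```python
import mimetypes
import posixpath
from urllib.parse import urlparse


def build_files_list(urls, fetch, field="images[]"):
    """Build the list that requests accepts as files=.

    Each entry is a two-item tuple: (field_name, (filename, content, mime_type)).
    `fetch` is any callable that takes a URL and returns the file's bytes.
    """
    files = []
    for url in urls:
        # Use the last path segment of the URL as the upload filename.
        filename = posixpath.basename(urlparse(url).path) or "file"
        # Guess the MIME type from the extension, with a generic fallback.
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        files.append((field, (filename, fetch(url), mime)))
    return files


def upload(urls, post_url, data):
    """Download every URL, then send all files in one multipart POST."""
    import requests  # third-party; imported here so the helper above needs only the stdlib

    files = build_files_list(urls, lambda u: requests.get(u, timeout=30).content)
    return requests.post(post_url, data=data, files=files, timeout=60).text
```

Every entry has to be a two-item tuple. An entry of any other shape (for example a bare URL string or a one-item tuple) makes requests fail while unpacking the list. Note also that with a field named `images[]`, PHP groups the data by attribute, so the first filename is read as `$_FILES['images']['name'][0]` instead of `$_FILES['images'][0]['name']`.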

krlv
  • Perfect. Is it possible to create files_array dynamically? I don't know how many images are in the array. Something like a [for image in images]. – Maciek Semik Mar 21 '19 at 17:21
  • I am creating a dynamic list of files, but when I make the POST request I get this error: `ValueError: not enough values to unpack (expected 2, got 1)` – sam Jul 14 '21 at 06:11
0

Scrapy does not yet have file upload support, so you have to build such requests manually, which might not be trivial for you.

Adding file upload support to Scrapy has been requested, and there is an unfinished implementation that you could try out, or even try to finish.

Whatever approach you decide to follow, mind that you won’t be able to build such requests from file URLs alone. To upload a file, you must have it on your computer; if you do not have it, you must download it first.
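As a rough illustration of building such a request manually, the sketch below encodes a multipart/form-data body with the standard library only and wraps it in a plain `scrapy.Request`. The names `encode_multipart` and `make_upload_request` are hypothetical, the file contents are assumed to be already downloaded bytes, and filenames are assumed not to contain quotes or line breaks. It is not the unfinished Scrapy implementation mentioned above.

```python
import uuid


def encode_multipart(fields, files, boundary=None):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict mapping field names to string values.
    files:  list of (field_name, filename, content_bytes, mime_type) tuples.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    for field, filename, content, mime in files:
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'.encode(),
            f"Content-Type: {mime}".encode(),
            b"",
            content,
        ]
    # The closing delimiter ends the body.
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def make_upload_request(url, fields, files, callback=None):
    """Wrap the encoded body in a Scrapy POST request."""
    import scrapy  # third-party; imported here so the encoder needs only the stdlib

    body, content_type = encode_multipart(fields, files)
    return scrapy.Request(
        url,
        method="POST",
        body=body,
        headers={"Content-Type": content_type},
        callback=callback,
    )
```

In a spider, the image bytes would typically come from the responses of earlier requests, and the request returned by `make_upload_request` would be yielded from a callback like any other request.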

Gallaecio