
I am trying to store a page's HTML in a variable called `response` using `cmdline.execute`, as shown in the code below, but nothing gets stored and the program stops at the Scrapy shell. Can anyone tell me how to store the raw HTML in a variable?

import scrapy

from scrapy import cmdline

linkedinnurl = "https://stackoverflow.com/users/5597065/adnan-s?tab=profile"

response = cmdline.execute(["scrapy", "shell", linkedinnurl])

print(response)

  • Possible duplicate of [Saving response from Requests to file](https://stackoverflow.com/questions/31126596/saving-response-from-requests-to-file) – vezunchik May 16 '19 at 11:55
  • @vezunchik Clearly not a duplicate. The linked question seeks to store the value of `requests.post`, whereas this question seeks to store the result of an operation initiated by `cmdline.execute`. Completely different scenario. – Brian Warshaw May 16 '19 at 11:58
  • Hm, yes, my fault. Thank you. – vezunchik May 16 '19 at 12:00

1 Answer

You can save the raw HTML from inside a spider's `parse` callback, for example by writing the response body to a file:

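    import scrapy

    class MySpider(scrapy.Spider):
        def parse(self, res):
            # dynamic_file_name_function is your own helper that maps a URL to a file name
            # res.body is raw bytes, so open the file in binary mode ('wb')
            with open(dynamic_file_name_function(res.url), 'wb') as f:
                f.write(res.body)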

If you don't need a dynamic file name, just use a fixed path:

    class MySpider(scrapy.Spider):
        def parse(self, res):
            # res.body is raw bytes, so open the file in binary mode ('wb')
            with open(your_file_path, 'wb') as f:
                f.write(res.body)
Amrit
  • That was a function to create a dynamic file name. You can remove it if you don't need a dynamic file name. I have updated the answer. – Amrit May 17 '19 at 05:40