0

When I try to save the responses to a file, the actual response is not saved even though it shows in the console. What ends up in the file is None. See the example below.

from concurrent.futures import ThreadPoolExecutor
import requests
#from timer import timer


#########  create test file

URLsTest = '''
https://en.wikipedia.org/wiki/NBA
https://en.wikipedia.org/wiki/NFL
'''.strip()

with open('input.txt', 'w') as f:
    f.write(URLsTest)
    
####################

with open('input.txt', 'r') as f:
    urls=f.read().split('\n')    # url list

def fetch(tt):  # received tuple
    session, url = tt
    print('Processing')
    with session.get(url) as response:
        print(response.text)

#@timer(1, 5)
def main():
    with ThreadPoolExecutor(max_workers=100) as executor:
        with requests.Session() as session:  # for now, just one session
            results = executor.map(fetch, [(session, u) for u in urls])  # tuple list (session, url), each tuple passed to function
            executor.shutdown(wait=True)
    # write all results to text file
    with open('output.txt', 'w') as f2:
        for r in results:  # tuple (url, html)
            f2.write("%s\n" % r)
            
main()

Response file - output.txt

None    
None
mjbaybay7
  • 1
    What is `fetch` returning? – Carcigenicate Aug 19 '20 at 23:45
  • 1
    Just to say though, for opening input.txt you don't have to use `urls=f.read().split('\n')` you can use `f.readlines()` and it returns a list of lines – hedy Aug 19 '20 at 23:46
  • 1
    `executor.map` is (semi) lazy, and the results it computes may be tied to the lifetime of the executor. Try putting the `with open` block inside the scope of the `with ThreadPoolExecutor` block (and don't shut it down at all; the `with` block is handling that for you anyway). – ShadowRanger Aug 19 '20 at 23:47
  • @Carcigenicate It is returning the response of the URL, in this case the whole contents of each page that is called – mjbaybay7 Aug 19 '20 at 23:50
  • @ShadowRanger can you please provide an example? I'm not understanding – mjbaybay7 Aug 19 '20 at 23:51
  • 3
    @mjbaybay7 `fetch` isn't returning anything though; it's printing. If you don't return anything from a function, `None` is automatically returned... and that `None` is ending up in `results`. You need to explicitly `return` to return a value from a function. – Carcigenicate Aug 19 '20 at 23:51
  • @Hedy so just urls=f.readlines() will work here? – mjbaybay7 Aug 19 '20 at 23:51
  • 1
    Yes if what you are trying to get is a list of lines – hedy Aug 19 '20 at 23:52
  • @Carcigenicate I see. That's why the console is showing that response being printed. How can I modify this so that the results are printed into a file? – mjbaybay7 Aug 19 '20 at 23:56
  • 1
    `return response.text` after the existing `print` in the function. – Carcigenicate Aug 19 '20 at 23:57
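
To illustrate the two points raised in the comments (a function with no `return` yields `None`, and the results of `executor.map` can be consumed inside the `with ThreadPoolExecutor` block), here is a minimal sketch. The two `fetch_*` functions and the example.com URLs are hypothetical stand-ins that do no network I/O; they are not from the question.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_print_only(url):
    # Hypothetical stand-in: prints but has no return statement,
    # so every call evaluates to None.
    print("Processing", url)


def fetch_with_return(url):
    # Hypothetical stand-in: returns a value, which executor.map collects.
    return "body of " + url


urls = ["https://example.com/a", "https://example.com/b"]  # placeholder URLs

with ThreadPoolExecutor(max_workers=2) as executor:
    # Consume the results inside the with block; no explicit shutdown()
    # is needed because leaving the block shuts the executor down.
    printed = list(executor.map(fetch_print_only, urls))
    returned = list(executor.map(fetch_with_return, urls))

print(printed)   # [None, None]
print(returned)  # ['body of https://example.com/a', 'body of https://example.com/b']
```

`executor.map` yields results in the same order as the input, so the lists line up with `urls`.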

2 Answers

1

First of all, since you are saving the output to a file, you can skip printing the HTML. That avoids spending resources on console output.

Second, your fetch is not returning anything, so the results contain only None. Replace the print with a return, so that fetch returns response.text:

# print(response.text)
return response.text
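
Putting this together, a corrected version of the question's script might look like the sketch below. `save_pages` and its `max_workers=10` default are a name and a value chosen here for illustration, not part of the original code. The session is passed in as a parameter, so any object with a requests-style `get` method works.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(tt):
    """Download one page and return its HTML; tt is a (session, url) tuple."""
    session, url = tt
    with session.get(url) as response:
        # Returning (not printing) is what puts the text into the results.
        return response.text


def save_pages(session, urls, out_path, max_workers=10):
    """Fetch all urls concurrently and write one response per line block."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() collects every result while the executor is still open.
        results = list(executor.map(fetch, [(session, u) for u in urls]))
    with open(out_path, "w", encoding="utf-8") as f2:
        for html in results:
            f2.write("%s\n" % html)
    return results


# Usage with the question's setup (needs the third-party requests package):
#   import requests
#   with open("input.txt") as f:
#       urls = f.read().split()
#   with requests.Session() as session:
#       save_pages(session, urls, "output.txt")
```

With this change output.txt should contain the page HTML instead of None.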
kachus22
-2

The ideal practice is not to print the html, it is so as you have to save the work or output to a file which disables you from printing the entire results in their original shape.

  • 1
    This sounds sorta like you might know the problem, but it's failing to actually make clear what they need to do to fix it, and it's too ambiguous describing the problem for them to figure it out. – ShadowRanger Aug 20 '20 at 02:16