import time
import urllib.request


def fetch_html(url):
    # Assuming UTF-8 is a bad idea; better to read the charset from the response header
    try:
        fp = urllib.request.urlopen(url)
        fpbytes = fp.read()
        html = fpbytes.decode("utf8")
        fp.close()
        print("Success! {} chars found".format(len(html)))
        return html

    except:
        print("Failed to extract html, retrying again in a few seconds")
        time.sleep(3.5)
        fetch_html(url)


url = "https://i.reddit.com/r/AskReddit/top/.compact?sort=top&t=day"
html = fetch_html(url)
print(html)

`html` is still `None` even though `len(html)` inside the function reports 70000 characters. What gives? I tried switching the order, placing `fp.close()` after `return html`, but I still get the same result.

I have searched for this on Google, but the issue in the questions I found comes from not using `return` on the values, which is different from this question.

SOLVED: https://gist.github.com/carcigenicate/ff1523fa66602a1c47b7c5ae4d6f1e92

def fetch_html(url):
    while True:
        try:
            fp = urllib.request.urlopen(url)
            fpbytes = fp.read()
            html = fpbytes.decode("utf8")
            fp.close()
            print("Success! {} chars found".format(len(html)))
            return html

        except:
            print("Failed to extract html, retrying again in a few seconds")
            time.sleep(3.5)
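
The loop above retries forever. As a rough sketch only, not part of the linked gist, the same idea can be capped at a fixed number of attempts. The name `fetch_html_bounded` and the parameters `max_attempts`, `delay`, and `opener` are hypothetical, introduced here for illustration; `opener` exists only so the function can be tried without network access.

```python
import time
import urllib.request


def fetch_html_bounded(url, max_attempts=5, delay=3.5,
                       opener=urllib.request.urlopen):
    """Retry in a loop, giving up after max_attempts failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            # The response object closes itself when the with-block exits.
            with opener(url) as fp:
                return fp.read().decode("utf8")
        except OSError as exc:
            # urllib.error.URLError (and HTTPError) are subclasses of OSError.
            print("Attempt {} failed: {}".format(attempt, exc))
            if attempt < max_attempts:
                time.sleep(delay)
    raise RuntimeError("gave up after {} attempts".format(max_attempts))
```

Catching `OSError` rather than using a bare `except:` is a design choice in this sketch: it lets `KeyboardInterrupt` through, so the loop can still be stopped with Ctrl-C.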
  • "I have searched for this in google, though their issue comes from not using return on their values, which is different in this question." Actually, it's the same problem. You aren't returning here in the recursive case. – Carcigenicate Apr 30 '20 at 15:01
  • Does this answer your question? [Why does my recursive function return None?](https://stackoverflow.com/questions/17778372/why-does-my-recursive-function-return-none) – Carcigenicate Apr 30 '20 at 15:02
  • there's no `return` in the exception handler. If that code is hit the function will return `None`. – AbbeGijly Apr 30 '20 at 15:02
  • Yeah, but in the `except` it will call the function again instead of ending the function. – rn0mandap Apr 30 '20 at 15:04
  • Yes, but you need to return in the recursive case as well. `return html` returns `html` to the last recursive call, but then the recursive calls need to return `html` up to other, previous recursive calls, and ultimately the caller. You need to return all the way back up the recursive call chain. – Carcigenicate Apr 30 '20 at 15:05
  • What I am trying to do is get the HTML from a website. It keeps saying "too many requests", so I made the function recursive so that it keeps calling itself until it gets the HTML... I don't get it, can you elaborate again? It only returns if it gets the HTML, in the try block. Yes, I used print on the html, and it showed the HTML but then None many times. – rn0mandap Apr 30 '20 at 15:11
  • What do I do to get my HTML and avoid the multiple Nones being returned? Can you please show me some example code? :)) – rn0mandap Apr 30 '20 at 15:12
  • Again, just `return fetch_html(url)`. You need to return in all cases. – Carcigenicate Apr 30 '20 at 15:13
  • I'll note though, recursion is a bad idea here. Just sticking everything inside a `while` loop would be significantly better, like [this](https://gist.github.com/carcigenicate/ff1523fa66602a1c47b7c5ae4d6f1e92). With recursion, your code will eventually consume all available stack space and crash if it fails too many times. – Carcigenicate Apr 30 '20 at 15:16
  • Not working, it just loops. It found the HTML, but ran the function again instead of returning. – rn0mandap Apr 30 '20 at 15:17
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/212870/discussion-between-rn0mandap-and-carcigenicate). – rn0mandap Apr 30 '20 at 15:17
  • If you post an answer with the code and it works, I will mark it as solved :)) – rn0mandap Apr 30 '20 at 15:18
  • Wow, thanks, it worked! – rn0mandap Apr 30 '20 at 15:24
  • @rn0mandap I'm glad it worked. I'm not going to post an answer though because this is a duplicate of the post I linked to. You can mark it as solved by accepting (and reading) that answer. The answer to your original question was to return in all cases, which the duplicate answers. – Carcigenicate Apr 30 '20 at 15:31
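
The point made in the comments, that every recursive call has to return its result, can be reproduced without any network access. The sketch below is an illustration only: `make_flaky_fetch`, `retry_without_return`, and `retry_with_return` are hypothetical names, and the stand-in fetcher simply fails a set number of times in place of `urllib.request.urlopen`.

```python
def make_flaky_fetch(failures):
    """Return a stand-in fetcher that raises `failures` times, then succeeds."""
    state = {"left": failures}

    def flaky_fetch():
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError("simulated 'too many requests' failure")
        return "<html>ok</html>"

    return flaky_fetch


def retry_without_return(fetch):
    try:
        return fetch()
    except OSError:
        # The inner call may succeed, but its result is discarded here,
        # so this outer call falls off the end and returns None.
        retry_without_return(fetch)


def retry_with_return(fetch):
    try:
        return fetch()
    except OSError:
        # Hand the inner call's result back to whoever called us.
        return retry_with_return(fetch)


broken = retry_without_return(make_flaky_fetch(failures=1))
fixed = retry_with_return(make_flaky_fetch(failures=1))
print(broken)  # None
print(fixed)   # <html>ok</html>
```

In the broken version the success message would still be printed by the inner call, which matches the symptom in the question: the length is reported, yet the outermost call returns `None`.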
