Using the following code, I'm unable to break out of the loop:

import requests
from bs4 import BeautifulSoup


link = "http://www.wuxiaworld.com/"

html = requests.get(link)
soup = BeautifulSoup(html.content, "html.parser")

all_aside = soup.find_all("aside", id="recent-posts-2")
for aside in all_aside:
    for li in aside.find_all("li"):
        for a in li.find_all("a"):
            if "ATG" in a.text:
                print(a.text)
                break

I wanted to make a script that would notify me when a new chapter of a story is released. I've just begun writing it, but I got stuck here (it prints all the new releases containing the string "ATG" within the recent posts).

I did some research and found that I should break out of all the loops, yet when I try this bit of code, nothing gets printed to the console:

import requests
from bs4 import BeautifulSoup


link = "http://www.wuxiaworld.com/"

html = requests.get(link)
soup = BeautifulSoup(html.content, "html.parser")

all_aside = soup.find_all("aside", id="recent-posts-2")
for aside in all_aside:
    for li in aside.find_all("li"):
        for a in li.find_all("a"):
            if "ATG" in a.text:
                print(a.text)
                break
        break
    break

Thank you.

Omar El Atyqy

4 Answers

Actually, the break statements are working exactly as designed. That's not necessarily the same as working the way some people may think they should work :-)

The break statements for the outer and middle loops will be executed regardless of whether the inner break has happened. In other words, they are unconditional, so those loops will only execute once.
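
To see why nothing gets printed, here is a minimal sketch that swaps the parsed page for plain nested lists. The `asides` data and the `first_pass_only` helper are made up for illustration and are not taken from the site. It mimics the second snippet from the question and records which link texts are actually examined:

```python
# Hypothetical stand-in for the parsed page: asides -> li items -> link texts.
asides = [
    [["Home", "About"], ["ATG Chapter 1", "Other"]],
    [["ATG Chapter 2"]],
]


def first_pass_only(asides):
    """Mimic the question's second snippet and record what it examines."""
    examined, printed = [], []
    for aside in asides:
        for li in aside:
            for text in li:
                examined.append(text)
                if "ATG" in text:
                    printed.append(text)
                    break
            break  # unconditional: the li loop stops after its first item
        break      # unconditional: the aside loop stops after its first item
    return examined, printed


examined, printed = first_pass_only(asides)
print(examined)  # ['Home', 'About']
print(printed)   # []
```

Only the links inside the first `li` of the first `aside` are ever looked at, so unless one of those contains "ATG", the output is empty.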

You can get around this with something like:

foundIt = False
for aside in all_aside:
    for li in aside.find_all("li"):
        for a in li.find_all("a"):
            if "ATG" in a.text:
                print(a.text)
                foundIt = True
                break
        if foundIt: break
    if foundIt: break

Another possibility (possibly "cleaner") would be to refactor it into a function:

def printOne(asides):
    for aside in asides:
        for li in aside.find_all("li"):
            for a in li.find_all("a"):
                if "ATG" in a.text:
                    print(a.text)
                    return

and call it with:

printOne(all_aside)
paxdiablo

In your first case, the break only leaves the innermost for loop, so the enclosing loops carry on with their next iteration. In your second case, the outer breaks run unconditionally after the first iteration of each loop, so you leave all the loops at once, whether or not a match was found.

Another answer mentions using a flag with a series of if statements for the breaks. You can also use exception handling to perform multilevel breaks.

class BreakException(Exception):
    pass
try:
    for aside in all_aside:
        for li in aside.find_all("li"):
            for a in li.find_all("a"):
                if "ATG" in a.text:
                    print(a.text)
                    raise BreakException
except BreakException:
    pass

See here for more ways to break out of multiple levels.
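
One further option, which is my own suggestion rather than something from the answers so far, is to flatten the three loops into a single generator expression and take only its first result with the built-in `next()`. The sketch below assumes you only care about the first match, and it uses made-up nested lists (`asides`) in place of the BeautifulSoup results:

```python
# Hypothetical stand-in for the parsed page: asides -> li items -> link texts.
asides = [
    [["Home", "About"], ["ATG Chapter 1", "ATG Chapter 2"]],
    [["ATG Chapter 3"]],
]

# A lazy generator: nothing is scanned until next() asks for a value.
matches = (
    text
    for aside in asides
    for li in aside
    for text in li
    if "ATG" in text
)

# next() stops at the first match; the default None covers "no match at all".
first = next(matches, None)
if first is not None:
    print(first)  # ATG Chapter 1
```

Because the generator is lazy, the remaining items are never visited, so there is nothing to break out of.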

Matthew Ciaramitaro

A bit of an ugly hack (... okay, fine, a complete and utter abomination), for all but the outermost loop:

for ...:
    ...
else:
    continue
break
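
Spelled out on concrete data, the pattern might look like the sketch below. The nested lists and the `first_match` name are stand-ins of mine for the BeautifulSoup results. The `else` clause of a `for` loop runs only when the loop finished without hitting `break`, so `continue` skips the trailing `break` in that case:

```python
# Hypothetical stand-in for the parsed page: asides -> li items -> link texts.
asides = [
    [["Home", "About"], ["ATG Chapter 1", "ATG Chapter 2"]],
    [["ATG Chapter 3"]],
]

first_match = None
for aside in asides:
    for li in aside:
        for text in li:
            if "ATG" in text:
                first_match = text
                break
        else:
            continue  # inner loop found nothing: move on to the next li
        break         # inner loop hit break: leave the li loop too
    else:
        continue      # li loop found nothing: move on to the next aside
    break             # li loop hit break: leave the aside loop too

print(first_match)  # ATG Chapter 1
```
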
Ignacio Vazquez-Abrams

The answers above may well be right, but there is another, rarer possibility.

If you are debugging in PyCharm and have edited the code during the session, the lines may have shifted up or down visually, but the edited code will not be interpreted or executed until the debugging session is restarted.

So, apparently, restarting the debugging session does the trick.

Abdullah