
I am building a web scraper to scan the inventory of webpages. The script first reads a CSV file and imports the items that were scraped in previous sessions. It then parses the webpages in an infinite loop until I need it to stop, which is not known beforehand. I have written a `while True` loop to achieve this. After the loop exits, the script should append any newly scraped items to the existing CSV and save it.

However, I have not found a way to exit this infinite loop cleanly without losing the data it has scraped. I have tried everything suggested in the question "Ending an infinite while loop", but none of it has worked in my case.

I run my program in PowerShell, and whenever I press Ctrl+C, it does not respond. Ctrl+Pause/Break seems to work, but it stops the whole script immediately without saving anything.

Edit: Added code. The code runs just fine; I just can't exit it.

2nd Edit: Added minimal version of code.

from threading import Thread


def looping():
    while True:
        try:
            print("looping")
        except KeyboardInterrupt:
            print("Interrupted")
            break


threads = []

for i in range(3):
    t = Thread(target=looping)
    t.start()
    threads.append(t)

for thread in threads:
    thread.join()
Vin
mtedu
  • Are you using a bare `except`? IIRC, a bare `except` will catch `KeyboardInterrupt` if I'm not mistaken. – ddejohn Feb 25 '21 at 23:02
  • With the given indentation, the code will not run – Thomas Weller Feb 25 '21 at 23:05
  • Thank you for pointing that out. It somehow got copied without the first level of indentation. – mtedu Feb 25 '21 at 23:07
  • @blorgon, I have tried that as well, it does not work. MarkM mentions in the other thread the following: the try-except only defines a way to handle a KeyboardInterrupt once it managed to actually interrupt the program. But the program can still be blocking and not register your interrupt signal. However, I have tried his way as well which is not working either. – mtedu Feb 25 '21 at 23:11
  • Do you know which call is blocking to stop the interrupt? Could it be the time.sleep()? How long is that set to? Did you try breaking it down into a loop of many smaller sleeps? It's difficult to replicate the error without a [minimal, complete example](https://stackoverflow.com/help/minimal-reproducible-example). This code is using plenty of other functions and globals not shown here. – Paul Rooney Feb 25 '21 at 23:27
  • @Paul Rooney, thank you for pointing out those guidelines. I have created a minimal version of my code. I found out that the threads are blocking the interrupt, but I have no idea why. – mtedu Feb 25 '21 at 23:42
  • Anecdotally, I've often found designing your program to allow setting threads to be daemons (`daemon=True`) and never joining them will make your life easier! – ti7 Feb 26 '21 at 06:26
  • This doesn’t look very similar to your original code. What you could use is a message queue between the main thread and the child threads. The main thread needs to catch the KeyboardInterrupt and then send messages to the child threads to stop. – Paul Rooney Feb 26 '21 at 06:32
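
The suggestion in the last comment (the main thread catches `KeyboardInterrupt` and tells the child threads to stop) could look roughly like the sketch below. Python delivers `KeyboardInterrupt` to the main thread only, so the `except` inside `looping()` never fires in a worker. This sketch swaps the message queue for a `threading.Event`, which is a simpler way to send a single "stop" signal. The names `worker`, `run_until_interrupted`, and `run_for` are made up for illustration, and the counter stands in for the real scraping work. The main thread sleeps in short intervals instead of calling a bare `join()`, on the assumption that this keeps it responsive to Ctrl+C.

```python
import threading
import time


def worker(stop_event, results, worker_id):
    """Hypothetical worker: loops until the main thread sets stop_event."""
    count = 0
    while not stop_event.is_set():
        count += 1             # stand-in for scraping one page
        stop_event.wait(0.01)  # pause that ends early once a stop is requested
    results[worker_id] = count  # hand the collected data back before exiting


def run_until_interrupted(num_workers=3, run_for=None):
    """Run workers until Ctrl+C (or until run_for seconds pass, if given)."""
    stop_event = threading.Event()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(stop_event, results, i))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()

    deadline = None if run_for is None else time.monotonic() + run_for
    try:
        # Sleep in short slices so the main thread can notice Ctrl+C.
        while deadline is None or time.monotonic() < deadline:
            time.sleep(0.05)
    except KeyboardInterrupt:
        print("Interrupted, stopping workers")
    finally:
        stop_event.set()  # ask every worker to leave its loop
        for t in threads:
            t.join()      # workers exit promptly, so this returns quickly
    return results
```

Calling `run_until_interrupted()` with no arguments loops until Ctrl+C is pressed and then returns whatever the workers collected, which is the point where new items could be written to the CSV. The optional `run_for` argument is only there so the sketch can be tried without a keyboard interrupt, for example `run_until_interrupted(run_for=0.2)`.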

0 Answers