I am a beginner in Python. I have a txt file that contains a list of URLs, and when I run my script to scan the file, I get an error at the end that says KeyboardInterrupt. This is my code:
import datetime
import re
import sys

if __name__ == "__main__":

    # Directory that contains panafapi.py
    SCRIPT_DIRECTORY = "/Users/kiya/Downloads/panpython/bin"

    RES_DIRECTORY = "/Users/kiya/Desktop/result/"
    TODAY = datetime.date.today().strftime("%Y%m%d")

    CHECK_DATE, NOW_DATE = "2018-01-01", "2018-01-31"
    try:
        url_file = sys.argv[1]
    except IndexError:
        # No file name was given on the command line
        print("Usage: python3 {} [url file]".format(sys.argv[0]))
        sys.exit()

    # Read every line, replacing bytes that are not valid UTF-8
    with open(url_file, "r", encoding="utf-8", errors="replace") as df:
        data = df.readlines()
    urlPattern = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')

    url_list = []
    for url in data:
        # Keep the first URL found on each line, if any
        dt = re.findall(urlPattern, url.strip())
        if dt:
            url_list.append(dt[0])
    print("Read {} Url(s)".format(len(url_list)))
    group_i = 0
  • Don't construct your command in the middle of the `check_output()` call. Build the string before the call, e.g. `cmd = "....".format(....)`, then call `check_output(cmd)`. Why? Because when the command fails, you can print `cmd` and try the printed command on the command line to see whether it works at all. – 7stud Feb 18 '18 at 06:10
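A minimal sketch of that suggestion follows. The question does not show the real `panafapi.py` command, so a harmless stand-in command is used here as an assumption, and `run_command` is a hypothetical helper name. The command is passed as a list rather than a formatted string, which is a design choice of this sketch, not something taken from the original script.

```python
import subprocess
import sys


def run_command(cmd):
    """Print the command before running it, so a failing call can be retried by hand."""
    print("Running:", " ".join(cmd))
    return subprocess.check_output(cmd, universal_newlines=True)


# Stand-in command (an assumption): the real script would build the
# panafapi.py call here, outside of check_output().
cmd = [sys.executable, "-c", "print('ok')"]
output = run_command(cmd)
print(output)
```

Because `cmd` exists as its own variable, the printed line can be copied into a terminal to check whether the command itself is at fault.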

1 Answer

I think the issue is that you redirect the output to a path that doesn't exist. The URL is part of the filename, so its slashes are treated as directory separators:

/Users/kiya/Desktop/result/result767_http://blogimg.goo.ne.jp/.json

Try a simple filename such as:

/Users/kiya/Desktop/result/result767_blogimg.goo.ne.jp.json
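One way to produce such a filename is sketched below. The `result{index}_{name}.json` pattern is inferred from the paths above, and `result_path` is a hypothetical helper, not part of the original script.

```python
import os
import re

RES_DIRECTORY = "/Users/kiya/Desktop/result/"


def result_path(index, url, directory=RES_DIRECTORY):
    """Build a result file path whose file name contains no '/' or ':' from the URL."""
    # Drop the scheme and any leading or trailing slashes.
    name = re.sub(r"^https?://", "", url).strip("/")
    # Replace every run of characters other than letters, digits, '.', '-' with '_'.
    name = re.sub(r"[^A-Za-z0-9.-]+", "_", name)
    return os.path.join(directory, "result{}_{}.json".format(index, name))


print(result_path(767, "http://blogimg.goo.ne.jp/"))
```

Replacing the unsafe characters instead of dropping them keeps the file name readable and still tied to the URL it came from.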
alingo
  • My suspicion as well. It's pretty obvious the url got tacked onto the end of the real path. – 7stud Feb 18 '18 at 06:24