I'm writing a script to get information about buildings in NYC. I know the core of my code works and returns what I'd like it to: when I was entering addresses manually, everything worked. Now I'm trying to have it read addresses from a text file and request the website with that information, and I'm getting this error:
urllib.error.HTTPError: HTTP Error 400: Bad Request
I believe it has something to do with the website not liking repeated requests from something that isn't a browser. I've heard about User-Agent headers but don't know how to use them (I've put a rough attempt at the bottom of this post). Here is my code:
from bs4 import BeautifulSoup
import urllib.request

f = open("FILE PATH GOES HERE")

def getBuilding(link):
    r = urllib.request.urlopen(link).read()
    soup = BeautifulSoup(r, "html.parser")
    print(soup.find("b", text="KEYWORDS IM SEARCHING FOR GO HERE:").find_next("td").text)

def main():
    for line in f:
        num, name = line.split(" ", 1)
        newName = name.replace(" ", "+")
        link = "LINK GOES HERE (constructed from num and newName variables)"
        getBuilding(link)
    f.close()

if __name__ == "__main__":
    main()
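From what I've read, it sounds like you can attach headers by building a urllib.request.Request object and passing it to urlopen instead of passing the URL directly. This is my rough, untested attempt at doing that for my getBuilding function; the User-Agent string is just one I copied from an example, so I don't know if it's the right thing to send:

import urllib.request
from bs4 import BeautifulSoup

# Just a guess at a browser-like User-Agent string; not sure what the site expects.
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

def getBuilding(link):
    # Wrap the URL in a Request so the headers get sent, then open it as before.
    req = urllib.request.Request(link, headers=HEADERS)
    r = urllib.request.urlopen(req).read()
    soup = BeautifulSoup(r, "html.parser")
    print(soup.find("b", text="KEYWORDS IM SEARCHING FOR GO HERE:").find_next("td").text)

Is this the right way to use a User-Agent here, or is the 400 error coming from something else entirely?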