I am a beginner at Python and I have developed a program that is meant to crawl a website (one that sells things) and print out the frequency of the different words in the titles of the items on sale.
There are three functions in my program:
1) A function that takes the text of the website and refines it into a string
2) A function that takes that string and cleans it up, getting rid of things like brackets, commas, asterisks, etc.
3) A function that takes the cleaned string and sorts the words by how many times they appear on the website
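For illustration, the second and third steps might look roughly like the sketch below. The function names `clean_words` and `count_words`, the `SYMBOLS` set, and the sample titles are all hypothetical stand-ins, and the crawling step is replaced by a hard-coded list so the example runs without a network connection.

```python
from collections import Counter

# Symbols to strip from each word; an assumed set, not the exact one in the program.
SYMBOLS = "!@#$%^&*()_+{}[]|:;\"'<>?,./\\"


def clean_words(words):
    """Remove unwanted symbols from each word and drop words that end up empty."""
    table = str.maketrans("", "", SYMBOLS)
    cleaned = (word.translate(table) for word in words)
    return [word for word in cleaned if word]


def count_words(words):
    """Return (word, count) pairs, most frequent first."""
    return Counter(words).most_common()


# Hard-coded titles stand in for the text scraped from the site.
titles = ["Wine glasses (set of 6)", "Crystal wine glasses, new"]
words = " ".join(titles).lower().split()
for word, count in count_words(clean_words(words)):
    print(word, count)
```

`Counter.most_common()` does the sorting by frequency, so no separate sort step is needed in this sketch.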
I had an error in this program with my BeautifulSoup4 module, and this other post helped me get rid of it: "How to get rid of BeautifulSoup user warning?". However, the fix led to two more errors in my program:
1) An error with the link I put into the first function:
File "/Users/lowryj1/PycharmProjects/untitled2/Jaer.py", line 39, in <module>
start('https://hongkong.asiaxpat.com/classifieds/glassware/')
And this is the code that is wrong (The link is the website I'm crawling):
start('https://hongkong.asiaxpat.com/classifieds/glassware/')
2) An error on the line in the first function where I try to make all of the characters lowercase and split the string into words:
File "/Users/lowryj1/PycharmProjects/untitled2/Jaer.py", line 11, in start
words = content.lower().split()
AttributeError: 'NoneType' object has no attribute 'lower'
And this is the code that is wrong:
words = content.lower().split()
This is the part of the code where the error occurs (url is the address of the website I'm crawling):
import requests
from bs4 import BeautifulSoup

def start(url):
    word_list = []
    source_code = requests.get(url).text
    soup = BeautifulSoup(source_code, "html5lib")
    for post_text in soup.findAll('a', {'target': '_blank'}):
        content = post_text.string
        words = content.lower().split()  # the AttributeError is raised on this line
I have tried my best to solve these problems, but most of the solutions I tried only made the issues worse. Please help me solve these errors, as I was unable to find an adequate solution through research.
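The second traceback says that `content` was `None` when `.lower()` was called. According to the BeautifulSoup documentation, a tag's `.string` is `None` when the tag contains more than one child, so some of the links on the page may not consist of a single piece of text. Below is a minimal sketch of a guard for that case. The helper name `title_words` is hypothetical, and `None` in the sample list stands in for such a link, so the example runs without the website.

```python
def title_words(content):
    """Lowercase and split a link's text; return [] when there is no text."""
    if content is None:  # assumed cause of the AttributeError in the traceback
        return []
    return content.lower().split()


# None stands in for a link whose .string is None.
for content in ["Wine Glasses", None, "Crystal Vase"]:
    print(title_words(content))
```

Calling `post_text.get_text()` instead of reading `post_text.string` may be another way to avoid the `None` case, though that is untested against this site.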