
I have been using Python with BeautifulSoup to scrape data, and so far it has worked. However, I am stuck on the following website.

Target Site: LyricsHindiSong

My goal is to scrape song lyrics from this website, but every time I get either a blank result or a "'NoneType' object has no attribute" kind of error.

I have been struggling with this for the last 15 days and have not been able to figure out where the problem is or how to fix it.

Following is the code I am using.

import pymysql
import requests
from bs4 import Beautifulsoup

r=requests.get("https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html")
soup=Beautifulsoup(r.content,'html5lib')
pageTitle=soup.find('h1').text.strip()
targetContent=soup.find('div',{'style':'margin:25px; color:navy;font-size:18px;'})
print(pageTitle)
print(targetContent.text.strip())

It fails with a "'NoneType' object has no attribute 'text'" error. When I check in the browser's inspect window, both elements are present, so I cannot understand where the problem is. At the very least it should have printed the page title.

Hope you understand my requirement. Please guide me. Thanks.

1 Answer


You misspelled the class name imported from bs4 (it is `BeautifulSoup`, not `Beautifulsoup`) and used the `find` method instead of `find_all`.

Full code:

import requests
from bs4 import BeautifulSoup


url = "https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html"
response = requests.get(url)

soup = BeautifulSoup(response.content,'html5lib')

title = soup.find('h1').text.strip()
content = soup.find_all('div',{'style':'margin:25px; color:navy;font-size:18px;'})

print(title)

for line in content:
    print(line.text.strip())

Result:

python answer.py
Chnda Re Chnda Re Chhupe Rahana
चंदा रे, चंदा रे, छुपे रहनासोये मेरी मैना, लेके मेरी निंदिया रे
फूल चमेली धीरे महको, झोका ना लगा जाये नाजुक डाली कजरावाली सपने में मुस्काये लेके मेरी निंदिया रे
हाथ कहीं है, पाँव कहीं है, लागे प्यारी प्यारी ममता गाए, पवन झुलाये, झूले राजकुमारी लेके मेरी निंदिया रे  
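
For reference, here is a small offline sketch of how `find` and `find_all` differ when nothing matches. It assumes a made-up HTML snippet rather than the real page, uses the built-in `html.parser` instead of `html5lib`, and the `text_or_default` helper is a hypothetical name introduced only for illustration. `find` returns `None` on a miss (which is where a "'NoneType' object has no attribute 'text'" error comes from), while `find_all` returns an empty list.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, NOT the real page from the question.
html = """
<h1> Sample Title </h1>
<div style="margin:25px; color:navy;font-size:18px;">first verse</div>
<div style="margin:25px; color:navy;font-size:18px;">second verse</div>
"""

soup = BeautifulSoup(html, "html.parser")


def text_or_default(tag, default=""):
    """Return the stripped text of a tag, or a default if find() gave None."""
    return tag.get_text(strip=True) if tag is not None else default


# find() returns the first match, or None when there is no match.
title = text_or_default(soup.find("h1"))
missing = text_or_default(soup.find("h2"), default="(not found)")

# find_all() returns a list of every match, or an empty list.
style = "margin:25px; color:navy;font-size:18px;"
verses = [div.get_text(strip=True)
          for div in soup.find_all("div", {"style": style})]
no_match = soup.find_all("div", {"style": "color:red;"})

print(title)
print(missing)
print(verses)
print(len(no_match))
```

Guarding the result of `find` this way lets the script keep going and report which element was missing instead of stopping at the first `None`.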
cl0wzed
  • Hi there, many thanks for this nice solution, which is pretty convincing. I get back `Chnda Re Chnda Re Chhupe Rahana Traceback (most recent call last): File "/home/martin/dev/vscode/hindisong.py", line 16, in print(line.text.strip()) UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)[Finished in 8.26s]` This is due to the different character set; in my Atom setup this encoding is not set. Nevertheless, many thanks for your solution. – zero Apr 11 '20 at 22:51
  • Change `line.text.strip()` to `line.text.encode('utf-8').strip()`. Check this answer: https://stackoverflow.com/a/9942822/13287399 – cl0wzed Apr 12 '20 at 20:17
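
In Python 3, printing the result of `.encode('utf-8')` shows a `b'...'` byte literal rather than the characters themselves, so another option is to make the output stream use UTF-8. The sketch below is only an illustration under that assumption: it writes to an in-memory buffer instead of the real console, and the sample text is taken from the lyrics above.

```python
import io

text = "चंदा रे"

# Encoding to ASCII fails in the same way as the traceback in the comment.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# A text stream opened with encoding="utf-8" accepts the same string.
# An in-memory byte buffer stands in for the console here.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8")
writer.write(text)
writer.flush()
written = buffer.getvalue()

print(ascii_ok)
print(written.decode("utf-8") == text)
```

In a real script, the equivalent would likely be calling `sys.stdout.reconfigure(encoding="utf-8")` (Python 3.7+) before printing, or setting the `PYTHONIOENCODING=utf-8` environment variable for the editor's build command.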