I'm working on a web-scraping task and I can already collect the data in a very rudimentary way.
Basically, I need a function that collects a list of songs and artists from AllMusic.com and then stores the data in a pandas DataFrame (df). In this example, I use this link: https://www.allmusic.com/mood/tender-xa0000001119/songs
So far I have accomplished most of the objective; however, I had to write two separate functions (get_song() and get_performer()).
If possible, I would like a way to combine these two functions into one.
The code I'm using is below:
import requests
import pandas as pd
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0'}
link = "https://www.allmusic.com/mood/tender-xa0000001119/songs"
# Function to collect songs (title)
songs = []
def get_song():
    url = link
    source_code = requests.get(url, headers=headers)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')
    for td in soup.find_all('td', {'class': 'title'}):
        a = td.find('a')  # the first link in the cell holds the song title
        if a is not None:
            songs.append(a.string)
# Function to collect performers
performers = []
def get_performer():
    url = link
    source_code = requests.get(url, headers=headers)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')
    for td in soup.find_all('td', {'class': 'performer'}):
        for a in td.find_all('a'):
            performer = a.string
            performers.append(performer)
# Here I call the two functions, but the goal, if possible, is to use one function.
get_song()
get_performer()
df = pd.DataFrame(list(zip(songs, performers)), columns=['song', 'performer'])  # df creation
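
To make the goal concrete, here is a rough sketch of the kind of single function I have in mind. It is untested against the live site and rests on an assumption: that each song sits in a table row (tr) containing both a td with class "title" and a td with class "performer". The names parse_songs and get_songs_df are just placeholders I made up. The idea is to request the page only once and read both cells from the same row, so the song and performer cannot get out of step.

```python
from bs4 import BeautifulSoup


def parse_songs(html):
    """Return a list of (song, performer) tuples found in the page HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Assumption: every song row is a <tr> holding a td.title and a td.performer.
    for tr in soup.find_all("tr"):
        title_td = tr.find("td", class_="title")
        performer_td = tr.find("td", class_="performer")
        if title_td is None or performer_td is None:
            continue  # skip header rows and unrelated tables
        title_link = title_td.find("a")
        song = (title_link or title_td).get_text(strip=True)
        # Join several performer links so each song still produces one row.
        names = [a.get_text(strip=True) for a in performer_td.find_all("a")]
        performer = ", ".join(names) if names else performer_td.get_text(strip=True)
        rows.append((song, performer))
    return rows


def get_songs_df(url, headers=None):
    """Fetch the page once and build the DataFrame (hypothetical helper)."""
    # Imported here so parse_songs can be used without network dependencies.
    import requests
    import pandas as pd

    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    return pd.DataFrame(parse_songs(response.text), columns=["song", "performer"])
```

With something like this, the call would be df = get_songs_df(link, headers=headers), and the two global lists would no longer be needed.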