0

I've adopted the code from this stackoverflow.com link, How can I download music files from websites using #Python, to work with Starcraft2 sounds from this following website: https://nuclearlaunchdetected.com/. When I execute the code, all the files turn out to be 360bytes and corrupt. But when downloading the files manually, they are fine. Here is the code I'm working with currently. Any help would be appreciated!

import requests
from bs4 import BeautifulSoup
import os

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}

url = 'https://nuclearlaunchdetected.com/'
response = requests.get(url, verify=False)
soup = BeautifulSoup(response.text, 'html.parser')
links = soup.find_all('a')
download_dir = 'C:/Users/SC2 Sounds/'
counter = 0


for link in links:
    if link['href'].endswith('.mp3'):
        file_url = url + '/' + link['href']
        file_name = os.path.basename(file_url)
        print('Downloading:', file_name)
        with open(download_dir + file_name, 'wb') as f:
            f.write(requests.get(file_url, headers=headers).content)
        counter += 1
        if counter == 10:
            break

I ran the code above fine but the files turned out corrupt. I was expecting non-corrupt playable fines similar to what is to be had when downloading them manually. I also set it to only download the first 10 files for testing purposes

Victor
  • 1

1 Answers1

0

The file_url part was incorrect. It should be file_url = link['href'] and not file_url = url + '/' + link['href']. The url should not be appended to the file_url. It helped debugging using print(file_url). Here is the modified code:

import requests
from bs4 import BeautifulSoup
import os

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}

url = 'https://nuclearlaunchdetected.com/'
response = requests.get(url, verify=False)
soup = BeautifulSoup(response.text, 'html.parser')
links = soup.find_all('a')
download_dir = 'C:/Users/SC2 Sounds/'
counter = 0


for link in links:
    if link['href'].endswith('.mp3'):
        file_url = link['href']
        file_name = os.path.basename(file_url)
        print('Downloading:', file_name)
        with open(download_dir + file_name, 'wb') as f:
            f.write(requests.get(file_url, headers=headers).content)
        counter += 1
        if counter == 10:
            break
Victor
  • 1