Hi, I am a complete beginner in Python. I want to scrape the singer names together with the song names from my own favorites list using Selenium (find_elements_by_css_selector).

Website: https://www.xiami.com/favorite/88955424

However, everything I try fails: the list returned by each selector is empty, and I don't know why.

The music website loads its content via AJAX.

Below is the empty result as printed in the console:

[]
[]
[]
situation(song amount)(singer amount)(album amount): 0 0 0

This is my original script:

from selenium import webdriver
import mysql.connector
import time

class xiami():
   def __init__(self):
       self.url='https://www.xiami.com/favorite/88955424'

   def turn_on_url(self):
       self.browser = webdriver.Chrome()
       self.browser.get(self.url)
       self.browser.maximize_window()
       self.browser.implicitly_wait(8)

   def get_page_data(self):#get infos of singers and songs and albums

       self.song_names=self.browser.find_elements_by_css_selector('div[class="song-name em"] a[data-spm-anchor-id="a2oj1.12028340.0.0"]')#song name
       self.singers=self.browser.find_elements_by_css_selector('div[class="singers"] a[data-spm-anchor-id="a2oj1.12028340.0.0"]')
       self.albums=self.browser.find_elements_by_css_selector('div[class="album"] a[data-spm-anchor-id="a2oj1.12028340.0.0"]')
       print(self.song_names)
       print(self.singers)
       print(self.albums)
       print('situation(song amount)(singer amount)(album amount):',len(self.song_names),len(self.singers),len(self.albums))

if __name__=='__main__':
   xiami=xiami()
   xiami.turn_on_url()
   xiami.get_page_data()


undetected Selenium
Super-ilad

1 Answer

To scrape the singer names together with the song names and albums from your favorites list with Selenium, you can wait for the table rows to render with WebDriverWait and then locate each column by XPath:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("start-maximized") 
    driver = webdriver.Chrome(options=chrome_options)
    driver.get("https://www.xiami.com/favorite/88955424")
    song_names = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table//tbody//tr[@class='odd' or @class='even']//div[contains(@class, 'song-name')]/a")))]
    singers = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table//tbody//tr[@class='odd' or @class='even']//div[@class='singers']/a")))]
    albums = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table//tbody//tr[@class='odd' or @class='even']//div[@class='album']/a")))]
    for a,b,c in zip(song_names, singers, albums):
        print("Song {} is by {} from {} album.".format(a, b, c))
    
  • Console Output:

    Song Reckless is by Arin Ray from Platinum Fire (Deluxe) album.
    Song Grey Area is by Jerry Paper from Like a Baby album.
    Song Open Up the Door is by Weyes Blood from Truelove's Gutter album.
    Song Looking For Your Love is by Richard Hawley from Looking For Your Love album.
    Song Blue Lips is by HUM?NIGHTM?RE from Invitation to Her's album.
    Song Nicolo Paganini: Introduction and Variations on Nel cor piu non mi sento from Paisiello's La molinar is by Her's from Paganini: In cor più non mi sento; 3 Duetti; Divertimenti carnevaleschi album.
    Song Layin Low is by Niccolò Paganini from MFSB album.
    Song Don't Start Givin' Up is by Stefan Milenkovic from Flashes Of Life album.
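
Note that the three lists above are collected independently, so `zip` pairs them by position only. A row with more than one singer link shifts every later pairing, which appears to be what happens in the output above (for example, "Layin Low" is paired with Niccolò Paganini). Below is a minimal sketch that reads each row on its own instead, assuming the same table markup as the XPaths above. `read_row`, `scrape_favorites` and `print_favorites` are hypothetical helper names, not part of Selenium.

```python
XPATH = "xpath"  # the string value behind selenium's By.XPATH

# Same row locator as in the answer above (assumed to match the page markup).
ROWS = "//table//tbody//tr[@class='odd' or @class='even']"


def _first_text(elems):
    """Text of the first element, or an empty string if there is none."""
    return elems[0].text if elems else ""


def read_row(row):
    """Return (song, [singers], album) for a single table row element."""
    songs = row.find_elements(XPATH, ".//div[contains(@class, 'song-name')]/a")
    singers = row.find_elements(XPATH, ".//div[@class='singers']/a")
    albums = row.find_elements(XPATH, ".//div[@class='album']/a")
    return _first_text(songs), [s.text for s in singers], _first_text(albums)


def scrape_favorites(url, timeout=20):
    """Open the page, wait for the rows, and read them one by one."""
    # Selenium is imported here because only this function drives a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        rows = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, ROWS)))
        return [read_row(row) for row in rows]
    finally:
        driver.quit()


def print_favorites(url):
    """Print one line per row, joining multiple singers with commas."""
    for song, singers, album in scrape_favorites(url):
        print("Song {} is by {} from {} album.".format(
            song, ", ".join(singers), album))
```

Calling `print_favorites("https://www.xiami.com/favorite/88955424")` would then print one line per row, with all singers of a song kept on the same line as that song.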
    
undetected Selenium
  • Hi, I later discovered a problem with your code: if you run it now, the singer list fails with `selenium.common.exceptions.TimeoutException: Message: `. It times out and I don't know why. Could you please check it again? By the way, I don't really understand 'innerHTML'. Why do you pass it to the 'get_attribute' method? Thanks! – Super-ilad Aug 05 '19 at 07:43
  • @Super-ilad Check out the updated answer and let me know the status. – undetected Selenium Aug 05 '19 at 09:46
  • Sorry, it still doesn't work. The problem may be my computer rather than the code, but I can't be sure, so I have posted the error from my console (below). Thanks – Super-ilad Aug 05 '19 at 10:48
  • Oh, yeah, I tried replacing 'visibility' with 'presence' in the singer line, and then it succeeded! Well, anyway, thanks a lot! Good man! – Super-ilad Aug 05 '19 at 10:56
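
The fix reported in the last comment can be sketched roughly as follows: wait for the links to be present in the DOM rather than visible, and read `textContent` instead of `.text`, since `.text` returns an empty string for elements that are not displayed. This is also why a DOM read such as `get_attribute('innerHTML')` can help: it returns the content regardless of visibility. The sketch assumes that the singer links exist in the DOM but that some of them are not rendered visibly, which is only a guess at why the visibility wait timed out. `row_xpath` and `texts_when_present` are hypothetical helper names.

```python
def row_xpath(div_class):
    """Build the XPath for links inside a div with exactly this class value.

    The match is exact, so the song column needs the full class string
    from the question ("song-name em").
    """
    return ("//table//tbody//tr[@class='odd' or @class='even']"
            "//div[@class='{}']/a".format(div_class))


def texts_when_present(driver, xpath, timeout=20):
    """Wait until matching elements are in the DOM and return their text."""
    # Selenium is imported here because only this function needs a live driver.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    elems = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.XPATH, xpath)))
    # textContent is read from the DOM, so it also works for hidden elements.
    return [(e.get_attribute("textContent") or "").strip() for e in elems]
```

With a driver that has already loaded the page, `texts_when_present(driver, row_xpath("singers"))` would stand in for the singer line of the answer.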