I use the following block of code to scrape a website:

driver = webdriver.Chrome(executable_path=r'C:/Users/USER/Downloads/chromedriver_win32/chromedriver.exe')
url = 'https://mamikos.com/cari/ugm/all/bulanan/0-15000000'
driver.get(url)

kamar = driver.find_elements_by_class_name('kost-rc__content')

for desc in kamar :
    nama = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[1]').text
    kecamatan = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[2]').text
    harga = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[4]/div/div[2]/div/span[1]').text
    print(nama, kecamatan, harga)

After running it, the output seems to contain only the first result on the page. I tried changing the XPath to this:

for desc in kamar :
    nama = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[1]').text
    kecamatan = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[2]').text
    harga = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[4]/div/div[2]/div/span[1]').text
    print(nama, kecamatan, harga)

But that only raises an error. Please help.

Side note: I'm using Google Chrome version 95.0.4638.69 (Official Build) (64-bit) with ChromeDriver 95.0.4638.69.

2 Answers

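To scrape the name, location and price of every listing, wait until all matching elements are visible (WebDriverWait with visibility_of_all_elements_located()) and read their text using the following XPath locator strategies: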

Code Block:

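from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# `driver` is the webdriver.Chrome instance created in the question.
driver.get("https://mamikos.com/cari/ugm/all/bulanan/0-15000000")
# Wait up to 20 seconds for each group of spans to become visible, then read their text.
names = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='kost-rc__info']//span[contains(@class, 'rc-info__name')]")))]
infos = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='kost-rc__info']//span[contains(@class, 'rc-info__location')]")))]
prices = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='rc-price__real']//span[contains(@class, 'rc-price__text')]")))]
# Print one line per listing; "Title" holds the location (kecamatan).
for i, j, k in zip(names, infos, prices):
    print(f"Name:{i} Title:{j} Price:{k}")
driver.quit()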

Console Output:

Name:Kost Singgahsini Sakura Karanggayam Sleman Yogyakarta Title:Kecamatan Depok Price:Rp1.370.000
Name:Kost Singgahsini Granada UGM Yogyakarta Title:Kecamatan Depok Price:Rp1.790.000
Name:Kost Kurnia Terban Tipe A UGM Yogyakarta RMZ Title:Kecamatan Gondokusuman Price:Rp606.000
Name:Kost Singgahsini Maleo UGM Kaliurang Yogyakarta Title:Kecamatan Depok Price:Rp1.973.000
Name:Kost AB-AE Tipe B Gejayan Yogyakarta RMZ Title:Depok Price:Rp1.710.000
Name:Kost AB-AE Tipe A Gejayan Yogyakarta RMZ Title:Depok Price:Rp1.425.000
Name:Kost Pogung Familia Tipe C Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.900.000
Name:Kost Pogung Familia Tipe B Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.710.000
Name:Kost Pogung Familia Tipe A Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.425.000
Name:Kost Hanung Tipe B UGM Yogyakarta RMZ Title:Mlati Price:Rp736.000
Name:Kost Apik Tapak Dara Tipe B Deresan Yogyakarta Title:Depok Price:Rp1.620.000
Name:Kost Singgahsini Putri Maoni Tipe A Gejayan Yogyakarta Title:Depok Price:Rp1.520.000
Name:Kost Singgahsini Omah Khiar Tipe F Karang Gayam Yogyakarta Title:Depok Price:Rp1.720.000
Name:Kost Apik Tapak Dara Tipe C Deresan Yogyakarta Title:Kecamatan Depok Price:Rp2.205.000
Name:Kost Singgahsini Putri Maoni Tipe B Gejayan Yogyakarta Title:Depok Price:Rp1.720.000
Name:Kost Wisma Yudhistira Tipe C Mlati Sleman Yogyakarta Title:Mlati Price:Rp2.250.000
Name:Kost Pondok Bugenvil 3 Caturtunggal Depok Sleman Title:Depok Price:Rp1.800.000
Name:Kost Pranasmara 34C Tipe B Depok Sleman Title:Depok Price:Rp1.200.000
Name:Kost Pondok Bugenvil 2 Caturtunggal Depok Sleman Yogyakarta Title:Depok Price:Rp1.800.000
Name:Kost Rahayu Residence Tipe C Depok Sleman Yogyakarta Title:Depok Price:Rp1.150.000
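
One caveat with collecting three separate lists and zipping them: zip() stops at the shortest list, so if any listing lacked one of the spans, rows would be dropped or shifted without any error. The following is only a minimal sketch of a length check, not part of the original answer. The helper name check_aligned is hypothetical, the sample values are copied from the console output above, and the shortened prices list simulates a hypothetical card whose price span did not match.

```python
from itertools import zip_longest

# Sample values copied from the console output above. The `prices` list is
# deliberately one item short to simulate a hypothetical missing price span.
names = [
    "Kost Singgahsini Granada UGM Yogyakarta",
    "Kost Hanung Tipe B UGM Yogyakarta RMZ",
    "Kost Pranasmara 34C Tipe B Depok Sleman",
]
infos = ["Kecamatan Depok", "Mlati", "Depok"]
prices = ["Rp1.790.000", "Rp736.000"]


def check_aligned(*columns):
    """Raise ValueError if the scraped columns differ in length."""
    lengths = [len(column) for column in columns]
    if len(set(lengths)) != 1:
        raise ValueError(f"column lengths differ: {lengths}")


# zip() silently stops at the shortest list, so the third listing disappears.
rows_zip = list(zip(names, infos, prices))

# zip_longest() keeps every row and pads the missing values with None.
rows_full = list(zip_longest(names, infos, prices, fillvalue=None))
```

Calling check_aligned(names, infos, prices) before the print loop would raise here instead of printing a truncated table. It cannot tell which row is off, only that the lists no longer line up; reading all fields from one card at a time avoids the problem altogether.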
undetected Selenium
  • Hi, thanks for the code; it helped me a lot. Quick question: do you have any recommendations on where to start learning web scraping with Selenium? I'm new to this and would like to learn the basics before going any further. Thank you in advance. – Christo Ray Nov 13 '21 at 04:11
  • The best way is to start with the [Frequent 'selenium' Questions](https://stackoverflow.com/questions/tagged/selenium?tab=Frequent) on Stack Overflow. – undetected Selenium Nov 20 '21 at 07:08

Here is complete code in C# that solves your problem. You can adapt it to your language, especially the XPath part.

// All listing cards; `driver` is an existing IWebDriver (OpenQA.Selenium).
var els = driver.FindElements(By.XPath("//div[@class='kost-rc__content']"));

foreach (var el in els)
{
    // The leading "." makes each XPath relative to the current card.
    var nama = el.FindElement(By.XPath(".//span[@class='rc-info__name bg-c-text bg-c-text--title-4 ']"));
    Console.WriteLine("nama:" + nama.Text);

    var kecamatan = el.FindElement(By.XPath(".//span[@class='rc-info__location bg-c-text bg-c-text--body-1 ']"));
    Console.WriteLine("kecamatan:" + kecamatan.Text);
}
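
For reference, a rough Python adaptation of the same idea might look like the sketch below. Some background on why the question's attempts behave as they do: an XPath that starts with // is evaluated from the document root even when it is called on an element, which is why every iteration returned the first listing. The .//*[@id="app"] variant searches for the #app element inside each card, where it does not exist, which is presumably what raised the error. The function names read_card and scrape are hypothetical, the price class name is borrowed from the other answer, and the sketch assumes that the price span sits inside the kost-rc__content card and that chromedriver can be found on PATH.

```python
# Relative XPaths: the leading "." scopes each lookup to the current card.
# contains(@class, ...) is used so extra classes or stray spaces do not matter.
NAME_XPATH = ".//span[contains(@class, 'rc-info__name')]"
LOCATION_XPATH = ".//span[contains(@class, 'rc-info__location')]"
# Assumption: the price span lives inside the same card element.
PRICE_XPATH = ".//span[contains(@class, 'rc-price__text')]"


def read_card(card):
    """Return (name, location, price) text for one listing card."""
    # "xpath" is the string value of selenium's By.XPATH constant.
    return tuple(
        card.find_element("xpath", xpath).text
        for xpath in (NAME_XPATH, LOCATION_XPATH, PRICE_XPATH)
    )


def scrape(url):
    """Open the page, wait for the cards, and return one tuple per card."""
    # Selenium is imported here so that read_card can be tried without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes chromedriver is discoverable on PATH
    try:
        driver.get(url)
        # Wait up to 20 seconds for the listing cards to be present in the DOM.
        cards = WebDriverWait(driver, 20).until(
            EC.presence_of_all_elements_located((By.CLASS_NAME, "kost-rc__content"))
        )
        return [read_card(card) for card in cards]
    finally:
        driver.quit()
```

Calling scrape() with the URL from the question and printing each returned tuple would give one nama, kecamatan, harga line per card. Because all three fields are read from the same card, they cannot get out of step with each other.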
Herahadi An