0

I'm trying to put data in a table and convert it to an excel file but i can't get the table i want, the excel file contains just the last hotel in my list if anyone can help or explain what's the cause of it, this is the final output i want to get

enter image description here

this the data i'm printing

Byzance Nabeul : Chambre Double - {'All Inclusive soft': ('208', '166', '25%'), 'Demi Pension': 138}
Palmyra Club Nabeul Nabeul : Double Standard - {'All Inclusive soft': ('225', '180', '25%')}

and here is my code

#!/usr/bin/env python
# coding: utf-8
import time
from time import sleep
import ast
import xlwt
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.common.exceptions import StaleElementReferenceException, NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd


driver = webdriver.Chrome("C:\\Users\\marketing2\\Documents\\chromedriver.exe")
driver.get('https://tn.tunisiebooking.com/')

def exists(xpath):
        try:
            driver.find_element_by_id(xpath);
        except NoSuchElementException:
            return "false"
        else:
            return "true"

# params to select
params = {
    'destination': 'Nabeul',
    'date_from': '11/09/2021',
    'date_to': '12/09/2021',
    'bedroom': '1'
}

# select destination
destination_select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, 'ville_des'))))
destination_select.select_by_value(params['destination'])

# select bedroom
bedroom_select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'select_ch'))))
bedroom_select.select_by_value(params['bedroom'])

# select dates
script = f"document.getElementById('checkin').value ='{params['date_from']}';"
script += f"document.getElementById('checkout').value ='{params['date_to']}';"
script +=  f"document.getElementById('depart').value ='{params['date_from']}';"
script += f"document.getElementById('arrivee').value ='{params['date_to']}';"
driver.execute_script(script)

# submit form
btn_rechercher = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="boutonr"]')))
btn_rechercher.click()
    
#----------   
if (exists('plus_res')=="true"):
   
    btn_plus = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'plus_res')))
    btn_plus.click()
    sleep(10)
else :
    pass
    
urls = []    
records = []
hotels = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'produit_affair')]")))

for hotel in hotels:
    link = hotel.find_element_by_xpath(".//span[@class='tittre_hotel']/a").get_attribute("href")
    urls.append(link)

for url in urls:
    driver.get(url)
       
    def existsElement(xpath):
        try:
            driver.find_element_by_id(xpath);
        except NoSuchElementException:
            return "false"
        else:
            return "true"
   
    if (existsElement('result_par_arrangement')=="false"):
   
        btn_t = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="moteur_rech"]/form/div/div[3]/div')))

        btn_t.click()
        sleep(10)
    else :
        pass
               
    
    try:
        name = str(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='bloc_titre_hotels']/h2"))).text)
        arropt = driver.find_element_by_xpath("//div[contains(@class,'line_result')][1]")
        opt = str(arropt.find_element_by_tag_name("b").text)
        num = len(arropt.find_elements_by_tag_name("option"))
        optiondata = {}
        achats = {}
        marges= {}
        selection = Select(driver.find_element_by_id("arrangement"))

        for i in range(num):
            try:
                selection = Select(driver.find_element_by_id("arrangement"))
                selection.select_by_index(i)
                time.sleep(2)

                arr = driver.find_element_by_xpath("//select[@id='arrangement']/option[@selected='selected']").text
                prize = driver.find_element_by_id("prix_total").text

                optiondata[arr] = (int(prize))

                btn_passe = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="resultat"]/div/form/div/div[2]/div[1]/div[2]/div[2]/div')))
                btn_passe.click()



                # params to select
                params = {
                            'civilite_acheteur': 'Mlle',
                            'prenom_acheteur': 'test',
                            'nom_acheteur': 'test',
                            'e_mail_acheteur': 'test@gmail.com',
                            'portable_acheteur': '22222222',
                            'ville_acheteur': 'Test',
                        }

                # select civilite
                civilite_acheteur = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, 'civilite_acheteur'))))
                civilite_acheteur.select_by_value(params['civilite_acheteur'])

                # saisir prenom 
                script  = f"document.getElementsByName('prenom_acheteur')[0].value ='{params['prenom_acheteur']}';"
                script += f"document.getElementsByName('nom_acheteur')[0].value ='{params['nom_acheteur']}';"
                script += f"document.getElementsByName('e_mail_acheteur')[0].value ='{params['e_mail_acheteur']}';"
                script += f"document.getElementsByName('portable_acheteur')[0].value ='{params['portable_acheteur']}';"
                script += f"document.getElementsByName('ville_acheteur')[0].value ='{params['ville_acheteur']}';"
                driver.execute_script(script)

                # submit form
                btn_agence = driver.find_element_by_id('titre_Nabeul')
                btn_agence.click()

                btn_continuez = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'boutonr')))
                btn_continuez.click()

                achat = str(int(driver.find_element_by_xpath('/html/body/header/div[2]/div[1]/div[1]/div[4]/div[2]/div[2]').text.replace(' TND', '')))

                achats[arr]=achat

                marge =str(int(((float(prize) - float(achat)) / float(achat)) * 100))+"%";
                marges[arr]=marge
                optiondata[arr]=prize,achat,marge
                
                
                driver.get(url)
                btn_display = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="moteur_rech"]/form/div/div[3]/div')))

                btn_display.click()
                sleep(10)
               

            except StaleElementReferenceException:
                pass

            

    except NoSuchElementException:
        pass
    

    s="{} : {} - {}".format(name, opt, optiondata)
    

    ds = []

    for l in s.splitlines():
            d = l.split("-")

            if len(d) > 1:
                df = pd.DataFrame(ast.literal_eval(d[1].strip()))
                ds.append(df)
    for df in ds:
            df.reset_index(drop=True, inplace=True)

    df = pd.concat(ds, axis= 1)

    cols = df.columns

    cols = [((col.split('.')[0], col)) for col in df.columns]

    df.columns=pd.MultiIndex.from_tuples(cols)

    print(df.T)
    df.to_excel("v.xlsx")

it's displaying "Palmyra club Nabeul Nabeul" only How can i solve the probleme please this the final and most important part and thank you in advance.

enter image description here

HiFAR
  • 48
  • 1
  • 13
  • 1
    You're printing a transposed DataFrame with `df.T`, but then writing the non-transposed DataFrame to excel. Use `df.T.to_excel()` to write the version you're printing. – gshpychka Aug 30 '21 at 13:03

1 Answers1

0

Bonjour HiFAR, Answer below:

#!/usr/bin/env python
# coding: utf-8

import pandas as pd


driver = webdriver.Chrome("C:\\Users\\marketing2\\Documents\\chromedriver.exe")
driver.get('https://tn.tunisiebooking.com/')

def exists(xpath):
        try:
            driver.find_element_by_id(xpath);
        except NoSuchElementException:
            return "false"
        else:
            return "true"

# params to select
params = {
    'destination': 'Nabeul',
    'date_from': '11/09/2021',
    'date_to': '12/09/2021',
    'bedroom': '1'
}

# select destination
destination_select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, 'ville_des'))))
destination_select.select_by_value(params['destination'])

# select bedroom
bedroom_select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'select_ch'))))
bedroom_select.select_by_value(params['bedroom'])

# select dates
script = f"document.getElementById('checkin').value ='{params['date_from']}';"
script += f"document.getElementById('checkout').value ='{params['date_to']}';"
script +=  f"document.getElementById('depart').value ='{params['date_from']}';"
script += f"document.getElementById('arrivee').value ='{params['date_to']}';"
driver.execute_script(script)

# submit form
btn_rechercher = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="boutonr"]')))
btn_rechercher.click()
    
#----------   
if (exists('plus_res')=="true"):
   
    btn_plus = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'plus_res')))
    btn_plus.click()
    sleep(10)
else :
    pass
    
urls = []    
records = []
ds = []

hotels = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'produit_affair')]")))

for hotel in hotels:
    link = hotel.find_element_by_xpath(".//span[@class='tittre_hotel']/a").get_attribute("href")
    urls.append(link)

for url in urls:
    driver.get(url)
       
    def existsElement(xpath):
        try:
            driver.find_element_by_id(xpath);
        except NoSuchElementException:
            return "false"
        else:
            return "true"
   
    if (existsElement('result_par_arrangement')=="false"):
   
        btn_t = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="moteur_rech"]/form/div/div[3]/div')))

        btn_t.click()
        sleep(10)
    else :
        pass
               
    
    try:
        name = str(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='bloc_titre_hotels']/h2"))).text)
        arropt = driver.find_element_by_xpath("//div[contains(@class,'line_result')][1]")
        opt = str(arropt.find_element_by_tag_name("b").text)
        num = len(arropt.find_elements_by_tag_name("option"))
        optiondata = {}
        achats = {}
        marges= {}
        selection = Select(driver.find_element_by_id("arrangement"))

        for i in range(num):
            try:
                selection = Select(driver.find_element_by_id("arrangement"))
                selection.select_by_index(i)
                time.sleep(2)

                arr = driver.find_element_by_xpath("//select[@id='arrangement']/option[@selected='selected']").text
                prize = driver.find_element_by_id("prix_total").text

                optiondata[arr] = (int(prize))

                btn_passe = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="resultat"]/div/form/div/div[2]/div[1]/div[2]/div[2]/div')))
                btn_passe.click()



                # params to select
                params = {
                            'civilite_acheteur': 'Mlle',
                            'prenom_acheteur': 'test',
                            'nom_acheteur': 'test',
                            'e_mail_acheteur': 'test@gmail.com',
                            'portable_acheteur': '22222222',
                            'ville_acheteur': 'Test',
                        }

                # select civilite
                civilite_acheteur = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, 'civilite_acheteur'))))
                civilite_acheteur.select_by_value(params['civilite_acheteur'])

                # saisir prenom 
                script  = f"document.getElementsByName('prenom_acheteur')[0].value ='{params['prenom_acheteur']}';"
                script += f"document.getElementsByName('nom_acheteur')[0].value ='{params['nom_acheteur']}';"
                script += f"document.getElementsByName('e_mail_acheteur')[0].value ='{params['e_mail_acheteur']}';"
                script += f"document.getElementsByName('portable_acheteur')[0].value ='{params['portable_acheteur']}';"
                script += f"document.getElementsByName('ville_acheteur')[0].value ='{params['ville_acheteur']}';"
                driver.execute_script(script)

                # submit form
                btn_agence = driver.find_element_by_id('titre_Nabeul')
                btn_agence.click()

                btn_continuez = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'boutonr')))
                btn_continuez.click()

                achat = str(int(driver.find_element_by_xpath('/html/body/header/div[2]/div[1]/div[1]/div[4]/div[2]/div[2]').text.replace(' TND', '')))

                achats[arr]=achat

                marge =str(int(((float(prize) - float(achat)) / float(achat)) * 100))+"%";
                marges[arr]=marge
                optiondata[arr]=prize,achat,marge
                
                
                driver.get(url)
                btn_display = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="moteur_rech"]/form/div/div[3]/div')))

                btn_display.click()
                sleep(10)
               

            except StaleElementReferenceException:
                pass

            

    except NoSuchElementException:
        pass
    

    s="Byzance Nabeul : Chambre Double - {'All Inclusive soft': ('208', '166', '25%'), 'Demi Pension': 138}\n Palmyra Club Nabeul Nabeul : Double Standard - {'All Inclusive soft': ('225', '180', '25%')}"
    


    cols = []
    for l in s.splitlines():
            d = l.split("-")

            if len(d) > 1:
                df = pd.DataFrame(ast.literal_eval(d[1].strip()))
                cols = df.columns
                cols = [((d[0], col)) for col in df.columns]
                df.columns=pd.MultiIndex.from_tuples(cols)
                ds.append(df)

        
for df in ds:
    df.reset_index(drop=True, inplace=True)

df = pd.concat(ds, axis= 1)

#df.columns=pd.MultiIndex.from_tuples(cols)

print(df.T)
df.T.to_csv(r"v.csv")

This provides the following output:excel screenshot with the data below:

hotel1 = "Byzance Nabeul : Chambre Double - {'All Inclusive soft': ('208', '166', '25%'), 'Demi Pension': 138}\n Palmyra Club Nabeul Nabeul : Double Standard - {'All Inclusive soft': ('225', '180', '25%')}"
hotel2 = "Hotel Fancy Nabeul : Chambre Ten - {'All Inclusive soft': ('221', '150', '25%'), 'Demi Pension': 138}\n Palmyra Club Germany Nabeul : Double single - {'All Inclusive soft': ('221', '180', '25%')}"

Gregory

GPRSmith
  • 299
  • 1
  • 7
  • thank you for your response , this the data : - Byzance Nabeul | Chambre Double : {'All Inclusive soft': ('208', '166', '25%'), 'Demi Pension': ('138', '110', '25%'), 'Petit Dejeuner': ('103', '82', '25%'), 'Pension Complete': ('173', '138', '25%')} – HiFAR Aug 28 '21 at 10:33
  • 1
    Thanks HiFAR - can you also share the values you have for: name, opt, optiondata and ast please? :) – GPRSmith Aug 28 '21 at 10:42
  • Hi HiFAR, I have amended and this should resolve your issue. Please note the string s="" was used to make this reproduceable. – GPRSmith Aug 30 '21 at 19:17
  • Hi @HiFAR - can you run the code and confirm the answer meets your ask? :) – GPRSmith Aug 31 '21 at 09:08
  • Hi @GPRSBirkbeck ! it's working just when we put the the string " s "manually but when i run the code it's still showing the last hotel only i'm trying to fix the problem to put the correct answer and confirm yours too – HiFAR Aug 31 '21 at 09:17
  • Hi @HiFAR - can you please add the following line of code after: " s="{} : {} - {}".format(name, opt, optiondata)": print(s) -> that should help debug, as I can't run your earlier code. If you share the output we can get to the bottom of this. – GPRSmith Aug 31 '21 at 09:56
  • thank you i really need help i'm supposed to give the work to my manager today :( i've added image to the output , to explain more i'm printing "s" in a foor loop every time it's showing an hotel so when i open a xlsx file it copies just the last one in the loop – HiFAR Aug 31 '21 at 10:27
  • Hi HiFAR, move the final dataframe (df) outside the for loop - so add the creation of df=[] above or below urls = [] or records = []. Then - unindent df.to_excel("v.xlsx") so it's not in the for loop - do you need me to type this up in the answer? – GPRSmith Aug 31 '21 at 10:51
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/236599/discussion-between-gprsbirkbeck-and-hifar). – GPRSmith Aug 31 '21 at 11:05