2

I'm making my first experience with Selenium and doing a few tutorials on a well-known video platform. It works quite reliable most of the time. However, I had a problem with a few pages that the CSV is created but no export of the data is made. The CSV is "touched", but it does not export the data that is displayed in a normal print.

Can anyone help me where the problem is with this script?

#_*_coding: utf-8_*_


from selenium import webdriver
import selenium
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import csv
import os

os.chdir("C:\Selenium")
PATH = "chromedriver.exe"
driver = webdriver.Chrome(PATH)

driver.get("https://twitter-trends.iamrohit.in/")

try: 
    main = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "panel-body"))
    )
 
    main = (main.text)
 
    f = open('twitter.csv', 'wb')
    print(main, file = f)
    f.close()
    
    #print(main)

except:
    driver.quit()

driver.quit()

Python Version 3.7.4, Selenium Versionm 3.141.0, Windows 10

Forelli
  • 19
  • 3

3 Answers3

1

Debugging your code you are getting the data correctly.

   main = (main.text)
   print(main)
   f = open('twitter.csv', 'wb')

So the error is when you are writing to the output file. Replacing your code for

main = (main.text)

with open('twitter.txt', 'wb', encoding='utf-8') as file1:
# Writing data to a file
    file1.writelines(main)

will work, if you check the print you have Chinese characters that will make fail the writing in the output file.

Enrique Benito Casado
  • 1,914
  • 1
  • 20
  • 40
1

To scrape the Twitter Trends - Worldwide table you can use DataFrame from Python Pandas and write it to a csv file using the following Locator Strategies:

Code Block:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

driver.get("https://twitter-trends.iamrohit.in/")
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//b[text()='Note:']"))))
headers = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "thead > tr > th")))]
ranks = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody//tr//descendant::th[1]")))]
topics = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody/tr//descendant::th[2]/a")))]
volumes = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody/tr//descendant::th[3]")))]
df = pd.DataFrame(data=list(zip(ranks, topics, volumes)), columns=headers)
df.to_csv(r'C:\Data_Files\output_files\twitter.csv', index=False)
driver.quit()

CSV Snapshot:

twitter


References

You can find a couple of relevant detailed discussions in:

undetected Selenium
  • 183,867
  • 41
  • 278
  • 352
0

from selenium import webdriver from selenium.webdriver.chrome.options import Options from selenium.webdriver.chrome.service import Service from selenium.webdriver.support.ui import WebDriverWait from selenium.webdriver.common.by import By from selenium.webdriver.support import expected_conditions as EC import pandas as pd

driver.get("https://twitter-trends.iamrohit.in/") driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//b[text()='Note:']")))) headers = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "thead > tr > th")))] ranks = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody//tr//descendant::th1")))] topics = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody/tr//descendant::th[2]/a")))] volumes = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='twitter-trends']//tbody/tr//descendant::th[3]")))] df = pd.DataFrame(data=list(zip(ranks, topics, volumes)), columns=headers) df.to_csv(r'C:\Data_Files\output_files\twitter.csv', index=False) driver.quit() May this useful Twitter Trend

  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Aug 22 '22 at 12:25
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/late-answers/32532399) – Leonid Pavlov Aug 25 '22 at 19:06