
I am trying to go back as far as I can in the tweet history of a Twitter account (a technical blogger's account that I would like to read from its inception).

For that I have two options:

- buy access to the Search APIs from Twitter (NO!!)

- use Selenium to scroll down through the tweets of that account, collect the messages in a file, and read them later

I did read the question "StaleElementReference Exception in PageFactory".

Below is the code. My issue is that I get a StaleElementReferenceException, which I understand is caused by the page changing (refreshing).

Since I am scrolling down, I am not sure how I can prevent that from happening. Any suggestions on how I can improve the code while still achieving what I want?

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

driver = webdriver.Chrome('c:/Utils/ChromeDriver/chromedriver.exe')
driver.get("https://twitter.com/realpython/with_replies")
driver.implicitly_wait(0) 

time.sleep(10)  #wait for the chrome window to show up

SCROLL_PAUSE_TIME = 1.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
tweets = []
tweets_file = open("tweets.txt", 'a', encoding="utf-8")
i = 0  # iteration counter
while True:
    # Give the page more time to load in the first iteration
    if i == 0:
        SCROLL_PAUSE_TIME = 3
    else:
        SCROLL_PAUSE_TIME = 1

    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Write the text of every tweet currently in the DOM
    elements = driver.find_elements_by_tag_name("article")
    for element in elements:
        tweets_file.write(element.text)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
    i += 1


tweets_file.close()
0m3r
MiniMe

1 Answer


Try increasing the scroll pause time, so that the page has more time to finish loading before the elements are read:

SCROLL_PAUSE_TIME = 3
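A longer pause may make the exception less frequent, but it probably cannot rule it out, because the page can still replace an article node between `find_elements` and reading `.text`. As a rough sketch (not tested against Twitter itself), one option is to skip elements that go stale and pick them up on the next pass, while de-duplicating by text so tweets are not written twice. The names `collect_new_texts`, `scroll_and_collect`, `run_with_selenium`, and the `max_idle` parameter are made up for this illustration; the Selenium calls assume a version where `driver.find_elements(By.TAG_NAME, ...)` is available and where chromedriver can be located without an explicit path.

```python
import time


def collect_new_texts(find_elements, seen, stale_exceptions):
    """Return texts not seen before, skipping elements that went stale."""
    new_texts = []
    for element in find_elements():
        try:
            text = element.text  # raises if the node was replaced in the DOM
        except stale_exceptions:
            continue  # skip it; a later pass re-finds it if it still exists
        if text and text not in seen:
            seen.add(text)
            new_texts.append(text)
    return new_texts


def scroll_and_collect(driver, find_elements, out_file, stale_exceptions,
                       pause=1.5, max_idle=3):
    """Scroll to the bottom repeatedly and write each distinct text once.

    Stops after `max_idle` consecutive scrolls that leave the page height
    unchanged (a hypothetical stopping rule, tune as needed).
    """
    seen = set()
    idle = 0
    last_height = driver.execute_script("return document.body.scrollHeight")
    while idle < max_idle:
        for text in collect_new_texts(find_elements, seen, stale_exceptions):
            out_file.write(text + "\n\n")
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        idle = idle + 1 if new_height == last_height else 0
        last_height = new_height
    return len(seen)


def run_with_selenium(url, path="tweets.txt"):
    """Example wiring for Selenium; imports are local so the helpers above
    stay usable without Selenium installed."""
    from selenium import webdriver
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: chromedriver can be found
    try:
        driver.get(url)
        with open(path, "a", encoding="utf-8") as out_file:
            return scroll_and_collect(
                driver,
                lambda: driver.find_elements(By.TAG_NAME, "article"),
                out_file,
                (StaleElementReferenceException,),
            )
    finally:
        driver.quit()
```

Calling `run_with_selenium("https://twitter.com/realpython/with_replies")` would then append each distinct article text to `tweets.txt`. Writing inside the loop, before each scroll, matters because the page may drop older tweets from the DOM as new ones load.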
0m3r