All Twitter topics can be found at this link. I would like to scrape all of them, including every subcategory nested inside each one.
BeautifulSoup doesn't seem to be useful here. I tried using Selenium, but I don't know how to match the XPaths of the elements that appear after clicking a main category.
from selenium import webdriver
from selenium.common import exceptions
url = 'https://twitter.com/i/flow/topics_selector'
driver = webdriver.Chrome('absolute path to chromedriver')
driver.get(url)
driver.maximize_window()
main_topics = driver.find_elements_by_xpath('/html/body/div[1]/div/div/div[1]/div[2]/div/div/div/div/div/div[2]/div[2]/div/div/div[2]/div[2]/div/div/div/div/span')
topics = {}
for main_topic in main_topics[2:]:
    print(main_topic.text.strip())
    topics[main_topic.text.strip()] = {}
I know I can click a main category using main_topics[3].click(), but I don't know how to click through the categories recursively until I reach only the topics that have a Follow button on the right.
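To make the goal concrete, here is a rough sketch of the recursion I have in mind, written against a hypothetical in-memory tree rather than the real page. The `fake_expand` callback and the `FAKE_TREE` sample data are made up for illustration. With Selenium, the callback would presumably click a category and return the names of the entries that appear beneath it, returning an empty list for entries that only have a Follow button.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the page: each name maps to the entries that
# appear after clicking it; an empty list means a leaf (a "Follow" item).
FAKE_TREE: Dict[str, List[str]] = {
    "Sports": ["Football", "Tennis"],
    "Football": ["Premier League"],
    "Tennis": [],
    "Premier League": [],
    "Music": [],
}


def fake_expand(name: str) -> List[str]:
    """Pretend to click `name` and return the entries revealed beneath it."""
    return FAKE_TREE.get(name, [])


def collect_topics(names: List[str], expand: Callable[[str], List[str]]) -> dict:
    """Recursively build {topic: {subtopic: {...}}}; leaves map to {}."""
    result = {}
    for name in names:
        children = expand(name)
        # Recurse only when clicking revealed something; leaves stay as {}.
        result[name] = collect_topics(children, expand) if children else {}
    return result


topics = collect_topics(["Sports", "Music"], fake_expand)
print(topics)
```

The part I am missing is a real `expand` that clicks an element and locates the newly revealed entries, which is where the XPath matching comes in.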