
I'm trying to get articles under the same date from three different tabs ('Corp', 'FIG', 'SSA'). I need to click one, go back, and click the next, but the XPath for the elements is the same on every tab, so I'm wondering: is there some 'smart' way to do that instead of copying the same code again and again?

I also want the browser to go back if there are no articles on the page. Should I use 'try'?

Surprisingly, I get every result twice in the CSV file, like aabb... no idea why.

from selenium import webdriver
import pandas as pd

driver = webdriver.Chrome()  # assuming a local Chrome driver
driver.get('https://www.globalcapital.com/Asia/Bonds')
Corp = driver.find_element_by_link_text("Corp")
Corp.click()
driver.implicitly_wait(10)
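# collect the hrefs and titles of articles dated 28 Jan 2021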
links=[link.get_attribute('href') for link in driver.find_elements_by_xpath("//div[contains(text(),'28 Jan 2021')]/preceding::a[2]")]
titles = [link.text for link in driver.find_elements_by_xpath("//div[contains(text(),'28 Jan 2021')]/preceding-sibling::h3/a")]
for link in links:
    for title in titles:
        dataframe = pd.DataFrame({'col1':title,'col2':link},index=[0])
        dataframe.to_csv('hi.csv',mode='a+',header=False,index=False,encoding='utf-8-sig')
driver.back()
FIG = driver.find_element_by_link_text("FIG")
FIG.click()
driver.implicitly_wait(10)
links=[link.get_attribute('href') for link in driver.find_elements_by_xpath("//div[contains(text(),'28 Jan 2021')]/preceding::a[2]")]
titles = [link.text for link in driver.find_elements_by_xpath("//div[contains(text(),'28 Jan 2021')]/preceding-sibling::h3/a")]
for link in links:
    for title in titles:
        dataframe = pd.DataFrame({'col1':title,'col2':link},index=[0])
        dataframe.to_csv('hi.csv',mode='a+',header=False,index=False,encoding='utf-8-sig')
driver.back()
SSA = driver.find_element_by_link_text("SSA")
SSA.click()
driver.implicitly_wait(10) 
Joyce

1 Answer


You're iterating over all the titles once for each link (nested loops), which is why every row is written to the CSV more than once. You need to iterate over (link, title) pairs instead:

for link, title in zip(links, titles):
    dataframe = pd.DataFrame({'col1':title,'col2':link},index=[0])
    dataframe.to_csv('hi.csv',mode='a+',header=False,index=False,encoding='utf-8-sig')
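As for your other two questions: since the steps are identical for each tab, you can factor them into a function and loop over the tab names instead of copy-pasting. Here is a minimal sketch of that idea (`scrape_tab` is just an illustrative name, and it keeps your `find_element_by_*` style calls); note that when a page has no articles the lists come back empty, so nothing is written and no `try` is needed:

def scrape_tab(driver, tab_name, date_text='28 Jan 2021'):
    # open the tab, collect (link, title) pairs for the given date, then go back
    driver.find_element_by_link_text(tab_name).click()
    driver.implicitly_wait(10)
    links = [a.get_attribute('href') for a in driver.find_elements_by_xpath(
        "//div[contains(text(),'%s')]/preceding::a[2]" % date_text)]
    titles = [a.text for a in driver.find_elements_by_xpath(
        "//div[contains(text(),'%s')]/preceding-sibling::h3/a" % date_text)]
    driver.back()
    return list(zip(links, titles))

for tab in ['Corp', 'FIG', 'SSA']:
    for link, title in scrape_tab(driver, tab):  # empty pages yield no pairs
        dataframe = pd.DataFrame({'col1': title, 'col2': link}, index=[0])
        dataframe.to_csv('hi.csv', mode='a+', header=False,
                         index=False, encoding='utf-8-sig')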
JaSON
  • Thanks Jason, it works! Would it be possible to reduce the repeated steps in this question? – Joyce Feb 03 '21 at 01:06
  • Hi Jason, somehow the href returns `javascript:` instead of the actual hrefs. How does that happen? – Joyce Feb 03 '21 at 07:12
  • @Cathy `@href` might not contain a URL. It can be something like [`javascript:void(0)`](https://stackoverflow.com/questions/1291942/what-does-javascriptvoid0-mean) – JaSON Feb 03 '21 at 08:46
  • I used `driver.get('http://www.chinamoney.com.cn/chinese/zjfxzx/?tbnm=%E6%9C%80%E6%96%B0&tc=null&isNewTab=1')` and `links=[link.get_attribute('href') for link in driver.find_elements_by_xpath("//a[contains(@title,'中期票据') and not(contains(@title,'申购说明')) and not(contains(@title,'公告'))]")]`. I think those links do contain an href, but somehow it returns some javascript – Joyce Feb 03 '21 at 09:15
  • @Cathy I can't check it - I can't find any node with a title that contains `"中期票据"` on the provided page – JaSON Feb 03 '21 at 09:20
  • Let me change it to `links=[link.get_attribute('href') for link in driver.find_elements_by_xpath("//a[contains(@title,'同业存单') and not(contains(@title,'申购说明')) and not(contains(@title,'公告'))]")]` – Joyce Feb 03 '21 at 09:25
  • Er right... is it possible to get the href link at all? Or, if I click into it, would it be possible to return the link? – Joyce Feb 03 '21 at 09:42
  • @Cathy I don't think it's possible. You can only try to click it – JaSON Feb 03 '21 at 09:56
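Following up on the last comment, a minimal sketch of the click-and-read approach (it assumes each click navigates within the same tab, so `driver.current_url` and `driver.back()` apply; if the site opens a new tab instead, you would have to switch window handles):

# when @href is javascript:void(0), click the link and read the resulting URL
xpath = ("//a[contains(@title,'同业存单') and not(contains(@title,'申购说明'))"
         " and not(contains(@title,'公告'))]")
urls = []
for i in range(len(driver.find_elements_by_xpath(xpath))):
    # re-locate on every pass: going back invalidates the old element references
    driver.find_elements_by_xpath(xpath)[i].click()
    urls.append(driver.current_url)  # the URL the javascript: link navigated to
    driver.back()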