I'm trying to crawl Instagram comments through Selenium.

To load all the comments, I have to click the "답글 보기" ("View replies") buttons.

So I tried to locate the button with XPath, but it fails.

Here are my attempts:

from selenium.webdriver.common.by import By

# driver is an already-started Selenium WebDriver with the post page loaded
reply_links = driver.find_elements(By.XPATH, "//button[@class='_a9yi']")
reply_links_v2 = driver.find_elements(By.XPATH, '//button[@class="_acan _acao _acas _aj1- _a9yi"]')
reply_links_v3 = driver.find_elements(By.XPATH, '//button[contains(text(),"답글 보기")]')
reply_links_v4 = driver.find_elements(By.XPATH, '//button[@class="_acan _acao _acas _aj1-"]')

Output:

[]
[]
[]
[<selenium.webdriver.remote.webelement.WebElement (session="4322025fb4271b9a32de3eb510cde52c", element="431ff43b-673a-40d7-9f42-dc91d3ed09e2")>, <selenium.webdriver.remote.webelement.WebElement (session="4322025fb4271b9a32de3eb510cde52c", element="39f44102-6c24-47eb-b8fa-94d61bb4cdc3")>, <selenium.webdriver.remote.webelement.WebElement (session="4322025fb4271b9a32de3eb510cde52c", element="a8a4dc3b-fce0-4633-bc39-27200f4ea40a")>]

The first three XPath expressions fail to locate the element and return empty lists. The last one returns elements, and clicking them works, but it also matches buttons I don't want (for example, the follow button).

Here is the button's HTML:

<div class="_ab8w  _ab94 _ab99 _ab9f _ab9k _ab9p _abcm">
<button class="_acan _acao _acas _aj1-" type="button">
<div class="_a9yh">
</div><span class="_a9yi">답글 보기(1개)</span></button></div>

It's my first time posting a question on Stack Overflow, so I looked it up, but I don't know if I wrote it correctly.

Please let me know if there is anything missing. Thank you!

Jebi
  • it looks like you want the span: `//span[contains(text(),"답글 보기")]` – pguardiario Feb 27 '23 at 04:59
  • omg... I had assumed that selecting the button tag would also bring in the span underneath it. I think my question was too easy because of my poor understanding of HTML. Thank you! It's working. – Jebi Feb 27 '23 at 05:45
  • Yes, it will, but the span's text won't be part of the button's text() in the XPath expression (that will be empty). – pguardiario Feb 27 '23 at 05:51
  • I thought I left a comment, but it's gone! Thanks for your kind reply, I understand what the problem was, and the code is working. – Jebi Feb 27 '23 at 12:43
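
The point made in the comments above (the label belongs to the span, not to the button's own text) can be checked offline. The sketch below is only an illustration: it parses the HTML from the question with Python's standard-library `xml.etree.ElementTree` instead of a browser, and the second "Follow" button is a hypothetical addition standing in for the unwanted buttons that share the same classes.

```python
import xml.etree.ElementTree as ET

# Button markup from the question (it happens to be well-formed XML).
# The second button is hypothetical: it stands in for the follow button
# that shares the same class attribute.
SNIPPET = """
<div class="_ab8w _ab94 _ab99 _ab9f _ab9k _ab9p _abcm">
<button class="_acan _acao _acas _aj1-" type="button">
<div class="_a9yh">
</div><span class="_a9yi">답글 보기(1개)</span></button>
<button class="_acan _acao _acas _aj1-" type="button">Follow</button>
</div>
""".strip()

root = ET.fromstring(SNIPPET)
buttons = root.findall(".//button")
reply_button = buttons[0]

# Text sitting directly inside the button, before its first child element.
# This is the node that contains(text(), ...) looks at, and it is only whitespace.
own_text = (reply_button.text or "").strip()

# Text of the button together with all of its descendants.
full_text = "".join(reply_button.itertext()).strip()

# Match the span that carries the label, then step up to its parent button.
reply_buttons = root.findall(".//span[@class='_a9yi']/..")

print(repr(own_text))      # empty: the label is not the button's own text
print(full_text)           # the label, found one level down in the span
print(len(buttons), len(reply_buttons))
```

With Selenium, the same idea would be an XPath such as `//button[span[contains(text(),"답글 보기")]]`, which selects the button itself rather than the span. That expression is an assumption based on the HTML shown in the question and has not been tested against the live page.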

1 Answer

I don't know how to solve this with Selenium's own locators, but Selenium can execute JavaScript, so you can use JavaScript instead. First, get the outer div element with var outterdiv = document.querySelector("div[class='_ab8w _ab94 _ab99 _ab9f _ab9k _ab9p _abcm']"). Then get the button with var button = outterdiv.querySelector("button"). Finally, call the button's click function: button.click().

It looks like those classes change every time you refresh the page. If so, you can evaluate an XPath from JavaScript and use it in place of the first JavaScript statement. See: Is there a way to get element by XPath using JavaScript in Selenium WebDriver?

I hope my answer helps.

Gray-Ice
  • Thanks for your response! (I'm not familiar with JavaScript, so I may be misunderstanding it.) The problem is that there are several buttons with the class "_acan _acao _acas _aj1-", and they appear differently from time to time (maybe the same goes for "_ab8w _ab94 _ab99 _ab9f _ab9k _ab9p _abcm"). Last time, I called click on the buttons of that class from the first index onward (to exclude the follow button), but in some cases there was another button before the follow button. So I'm looking for a way to target one specific button (for example, one that includes the _a9yh class or something else distinguishable). – Jebi Feb 27 '23 at 04:22
  • @Jebi This is difficult to solve if you are not familiar with JavaScript. In JavaScript you can first locate the parent element of that button and then get all the buttons under that parent, which narrows things down to your target button. Maybe check whether Selenium offers the same operation. As for the second problem ("in some cases there was another button before the follow button"): if the number of other buttons is fixed, you can write Python code to check whether the button count has grown by one, and if it has, add one to the index. – Gray-Ice Feb 27 '23 at 09:35
  • @Jebi I used to work as an RPA programmer writing automation tools. I ran into this problem many times, and in the end I found there is almost no simple way to get the target button. So I suggest approaching the problem from another direction. If you do find a simple solution, I hope you can tell me. Thank you~ – Gray-Ice Feb 27 '23 at 09:41
  • As you can see in the comments above, my problem was that I tried to match the text on the outer tag (a button containing the text "답글 보기"). It was solved by using the same expression on the inner span tag: reply_links = driver.find_elements(By.XPATH, '//span[contains(text(),"답글 보기")]') – Jebi Feb 27 '23 at 13:18
  • @Jebi This solution is really cool. Thank you for telling me! – Gray-Ice Feb 28 '23 at 01:08