
I'm using Python + Selenium + Splinter + Firefox to create an interactive web crawler.

The Python script presents the options, then Selenium opens Firefox and sends it commands.

Right now, I need a way to tell the Python script which web element the user wants to interact with.

The method I currently use is:

Right-click the item on the page in Firefox, choose 'Inspect Element', then in the Firefox inspector choose 'Copy HTML', and paste the result manually into the script, which can then continue.

For obvious reasons, this process is far from ideal.

I know nothing about JavaScript, but after reading other questions I get the feeling that JavaScript could be the solution.

Splinter can run JavaScript and return the resulting values to the Python script, so, theoretically:

Would it be possible to run JavaScript code that returns the HTML of the next element the user clicks? That way, the whole procedure above would be reduced to clicking the desired element.


Clarification for Amey's comment:

The Python script opens a Firefox window and retains control of it. With Splinter, JavaScript code can be executed and the script can wait for it to complete or return information. This means the Python script can ask the user to click or right-click in the Firefox window it owns, so the aim is to launch JavaScript that "catches" whichever element the user clicks on.

Would that be enough for JavaScript to catch the desired element?

I want badges
  • JavaScript would still need a way to locate the "next desired element" in order to return identifiers that you could use with Selenium. From what I understand of your question, you could retrieve the entire HTML of a page (using JS or Selenium), parse it with an HTML parser, and scrape what you need. – Amey Jan 22 '14 at 22:42
  • Thanks for your answer. I expanded the description, since I'm not sure the situation was clear. If I retrieved the entire HTML and parsed the resulting code, I wouldn't be able to do what I want: catch a user click on an element and use that code directly, without having to search for it specifically by ID, name, CSS, etc. – I want badges Jan 22 '14 at 22:57
  • This looks similar, although it is used in a different context: http://stackoverflow.com/questions/17157342/pure-js-detect-if-im-clicking-an-element-within-an-element Is that the sort of solution I should adapt to my code? – I want badges Jan 23 '14 at 00:06
  • I do see what you are trying to achieve, but I don't see why. For a web crawler I would expect minimal user interaction, but I am sure you have your reasons. With that in mind, I personally do not see a better way than the link you provided: basically a click listener, plus a click handler that returns the desired output. – Amey Jan 23 '14 at 00:12

1 Answer


This was an interesting question. My strategy was to use JavaScript to add listeners to the elements you're targeting. Since you didn't specify which types of elements, I used links. This could easily be adapted, though.

When an element is clicked, the listener creates a new page element with an ID you specify and sets its value attribute to the relevant information (for a link, its URL).

Then, assuming you've set driver.implicitly_wait, looking up that element by ID simply blocks until it appears.

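from selenium.webdriver.common.by import By

# On each link click, append a marker element whose 'value' is the link's href.
driver.execute_script("""
for (var i = 0; i < document.links.length; i++) {
    document.links[i].onclick = function clicked() {
        var e = document.createElement('a');
        e.setAttribute('id', 'myUniqueID');
        e.setAttribute('value', this.href);
        document.body.appendChild(e);
    };
}
""")

# With implicitly_wait set, this blocks until the marker appears.
clicked = driver.find_element(By.ID, 'myUniqueID').get_attribute('value')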
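
If you would rather not depend on the implicit wait, one alternative is to poll for the marker element yourself. The following is only a minimal sketch: wait_for_click_value and READ_MARKER_JS are hypothetical names introduced here, and driver is assumed to be a Selenium WebDriver (anything exposing execute_script would do). It only reads back the 'value' attribute that the listener above sets.

```python
import time

# JavaScript that reads the marker element's 'value' attribute, or returns
# null if the marker has not been created yet. arguments[0] is the marker ID.
READ_MARKER_JS = (
    "var e = document.getElementById(arguments[0]);"
    "return e ? e.getAttribute('value') : null;"
)


def wait_for_click_value(driver, marker_id="myUniqueID", timeout=30.0, poll=0.25):
    """Poll until the marker element exists, then return its 'value' attribute.

    `driver` is assumed to be a Selenium WebDriver. Raises TimeoutError if no
    click has been recorded before `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = driver.execute_script(READ_MARKER_JS, marker_id)
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("no click captured within %.1f s" % timeout)
        time.sleep(poll)
```

Usage would be clicked = wait_for_click_value(driver) right after installing the listeners. The explicit timeout makes the "user never clicked" case visible as an exception instead of a failed element lookup.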
irrelephant
  • Amazing!! Tested and it works, solving exactly what I asked! My intention is to use it for any item on the website, so that I can reproduce those actions later (I do things once, and the scraper is created automatically =) So I need to match all the items in the document, but that adaptation will be trivial compared to writing the script you just shared. Thanks! – I want badges Jan 23 '14 at 01:02
  • I edited the script you provided to include all the elements in a document, but it fails to register clicks on some of them. Could you please give me a pointer on what is happening? It's here: http://stackoverflow.com/questions/21316003/how-to-capture-any-element-where-the-user-clicked-with-javascript – I want badges Jan 23 '14 at 23:36
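
For the follow-up case of capturing a click on any element rather than only links, one possible approach is a single capture-phase listener on document instead of one handler per element. The sketch below is not taken from the answer or the linked question: capture_next_click and CAPTURE_NEXT_CLICK_JS are hypothetical names, and driver is assumed to be a Selenium WebDriver. It uses execute_async_script, which hands the script a callback as its last argument and returns whatever that callback is called with.

```python
# JavaScript run through execute_async_script: Selenium passes a callback as
# the last argument, and the Python call returns once the callback is invoked.
CAPTURE_NEXT_CLICK_JS = """
var done = arguments[arguments.length - 1];
function onClick(event) {
    // Stop the page from acting on the click (for example, following a link).
    event.preventDefault();
    event.stopPropagation();
    document.removeEventListener('click', onClick, true);
    done(event.target.outerHTML);
}
// Passing true registers the listener for the capture phase, so it runs
// before handlers attached to the clicked element itself.
document.addEventListener('click', onClick, true);
"""


def capture_next_click(driver, timeout=60):
    """Block until the user clicks in the page; return that element's outer HTML.

    `driver` is assumed to be a Selenium WebDriver. `timeout` is the number of
    seconds the async script is allowed to wait for a click.
    """
    driver.set_script_timeout(timeout)
    return driver.execute_async_script(CAPTURE_NEXT_CLICK_JS)
```

Because the listener sits on document, it does not need to be attached to every element up front. Clicks inside iframes would not reach it, which may be one reason some elements appear to be skipped.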