3

My task is to get a list of named functions from any web-page using Python.

I have a script written using JavaScript. It does what I need.

When page is loaded I can run the script from JS console (e.g. from dev-tools in GoogleChrome). I have the array of names of the functions as the result. Well, but I go to the page and execute the script from browser manually. But the question is to do the same from Python. It can look something like this:

def get_named_functions_list(url):
    myscript = settings.get_js_code()  # here I get script that I told above

    tool.open(url)

    while not tool.document.READY: # here I wait while the page will completely loaded
        pass

    js_result = tool.execute_from_console(myscript)

    return list(js_result.values())

So, is there a tool in Python that helps to solve the problem automatically?

UPDATE: To be more clear I can divide the task to the list of subtasks (in Python):

  1. Request to the given url
  2. Waiting for document.ready(function...) will finished.
  3. Execute my JS-code (like in browser).
  4. Getting of result the JS-code returns.
Bogdan
  • 558
  • 2
  • 8
  • 22
  • Your question is unclear. may be you need scrapping – binu.py May 16 '17 at 19:39
  • For this task, you'll probably need to use an [HTML parser](http://stackoverflow.com/questions/2782097/python-is-there-a-built-in-package-to-parse-html-into-dom) and a [JavaScript parser](http://stackoverflow.com/questions/390992/javascript-parser-in-python). – Anderson Green May 16 '17 at 19:53
  • @binu.py, I've updated the topic to be more clear. Maybe it will help. As for scrapping, I does not need to get data from page. The key task is to execute JS in browser scope. I think, it should works like a simple python non-GUI browser or something like this. – Bogdan May 16 '17 at 19:57
  • If you want to do it in backend i do not think it is possible. If you want to execute some function on load you may need to check with the templating language you are using. If this is for testing you will need python selenium to do your task – binu.py May 16 '17 at 20:06
  • @binu.py, for example, I have [google.com](https://www.google.com/), [facebook](https://www.facebook.com/) for checking. I want to get information about what named JS functions the domain uses. So, I run my script, it makes request to the urls above and gives me two lists of strings. Every list contains from names of functions that available in JS scope. – Bogdan May 16 '17 at 20:30

1 Answers1

9

I have solved the problem with the using of selenium.

Then I have downloaded the PhantomJS driver to use selenium without a browser window and added it to PATH.

Finally, I use the following Python script:

from selenium import webdriver
    
myscript = settings.get_js_code() # here I get content of *.js file
driver = webdriver.PhantomJS()
driver.get(url)
result = driver.execute_script(myscript)
driver.quit()

Note: your script have to return something to get the result.

Bogdan
  • 558
  • 2
  • 8
  • 22
  • 1
    Hi, I search for a way to use selenium to run a js file script from a URL, can you explain please what this line does: "myscript = settings.get_js_code()", thank you in advance – Moez Ben Rebah Jul 28 '22 at 15:40
  • It only reads file content like this `with open(path) as file: content = file.read()`, where content is a Javascript function which returns something. – Bogdan Aug 03 '22 at 08:38