
I'm wondering whether it is possible, in Java, to detect whether an HTML file would open an alert dialog if it were opened in a browser, preferably headlessly. For example, if a file with the contents below were parsed, the check would return true.

<html><script>alert("hey")</script></html>

The following would also return true:

<html><iframe src="javascript:alert(1)" onload="alert(2)"></iframe></html>

But the following would return false, because it would not open an alert dialog if it were opened in a browser: the code inside the script tag is not syntactically valid, and the part that is valid isn't inside a tag that would execute it.

<html><script>alert;,(123w)</script>alert(1)</html>

I have thought of a way to approach this problem, but it is flawed. Basically, you check whether the string alert(1) (and similar) appears in the file. The problem is that this gives the wrong answer when that code isn't inside script tags or another construct that makes it execute. For example, the following would return true even though it wouldn't actually open a popup: <html>alert(1)</html>.

This isn't Android by the way. Appreciate your help!

Aaron Esau

1 Answer


You need to verify not only that the alert function is present but also that the JavaScript containing it would actually run. For example, a script might call alert inside a function that is never invoked. The alert call would be there, but it would never execute, so a text search would give a false positive. In the best case, you should run the JavaScript in some way, both to validate the code and to see whether alert is ever called.

As Louis pointed out in the comments, Option 2 is better in this case, because you need to account for both the DOM's and JavaScript's behaviour: both can affect whether the alert function runs and how it runs.

Option 1: Run the JavaScript with ScriptEngine

You would need some way of separating the JavaScript from the HTML, but once you have that you can use this method.
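As a rough illustration of that separation step, here is a minimal sketch that pulls the bodies of inline script elements out of an HTML string with a regular expression. The class name `InlineScripts` and the method `extract` are made up for this example, and a regex is only an approximation of what a real HTML parser does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class InlineScripts {
    // Matches <script ...>body</script>, case-insensitively and across line breaks.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b[^>]*>(.*?)</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text between each pair of script tags, in document order.
    static List<String> extract(String html) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SCRIPT.matcher(html);
        while (m.find()) {
            bodies.add(m.group(1));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // First example from the question.
        System.out.println(extract("<html><script>alert(\"hey\")</script></html>"));
        // Text outside a script tag is not extracted.
        System.out.println(extract("<html>alert(1)</html>"));
    }
}
```

Note that this only sees inline script elements. Code in event handler attributes such as onload, or in javascript: URLs (as in the iframe example from the question), is not picked up, which is one reason this option is limited.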

You can run JavaScript from Java using ScriptEngine: https://docs.oracle.com/javase/8/docs/technotes/guides/scripting/prog_guide/api.html

If you read the API documentation, you will see there is a way to create variables and pass values between your Java program and the JavaScript you are running.

To capture a call to alert, you can define a custom JavaScript function that overrides the alert function. Inside this custom function, you can send the function's arguments back to your Java program.
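A minimal sketch of that idea, assuming a JavaScript engine is registered with `javax.script` (Nashorn shipped with Java 8 but was removed in JDK 15, so on newer JDKs the lookup returns null unless an engine is added to the classpath). The class name `AlertProbe`, the method `capturedAlerts`, and the binding name `sink` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper, not part of any library.
final class AlertProbe {
    // Runs the script and returns the arguments passed to alert().
    // Returns null when no JavaScript engine is available on this JVM.
    static List<String> capturedAlerts(String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        if (engine == null) {
            return null;
        }
        List<Object> sink = new ArrayList<>();
        engine.put("sink", sink); // expose the Java list to the script
        try {
            // Define alert so it records its argument instead of opening a dialog.
            engine.eval("function alert(m) { sink.add(String(m)); }");
            engine.eval(script);
        } catch (ScriptException e) {
            // Syntax or runtime error: keep whatever was recorded before it.
        }
        List<String> messages = new ArrayList<>();
        for (Object o : sink) {
            messages.add(String.valueOf(o));
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(capturedAlerts("alert(\"hey\")"));
        System.out.println(capturedAlerts("alert;,(123w)"));
    }
}
```

Because the engine provides no `window` or `document`, a script such as `window.alert("foo")` fails with an error here and records nothing, even though it would raise an alert in a browser. A function that contains alert but is never called also records nothing, which is the false positive a plain text search would miss.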

Option 2: Headless Browser

You can also try a headless browser such as jBrowserDriver. Its documentation shows an Alert interface with a getText function. As for asynchronous alerts, the headless browser waits a default amount of time for the script to complete. If that default is not enough, you can change it with setScriptTimeout. http://machinepublishers.github.io/jBrowserDriver/

Michael Warner
  • Thanks a lot for this answer! I'm going to try this out. I'll mark it as the answer if it works. I'm going to wait a bit before awarding the bounty though, just in case there are other good answers. Appreciate it! – Aaron Esau Dec 08 '16 at 23:11
  • The first option will not work with anything more than absolutely trivial scenarios. Following the instructions above, these won't work in `ScriptEngine`: `window.alert("foo")`, `document.defaultView.alert("foo")`. They'd work in a browser, though. Then there are more complicated cases, like pages that raise an alert only if some state of affairs exists in the DOM. Since `ScriptEngine` does not provide a DOM, it won't work in these cases either. The 2nd option is better but shows no awareness of how to handle asynchronous alerts. (They do happen.) – Louis Dec 12 '16 at 12:21
  • You are right, Louis. I will say option 2 is the best in this case. As for asynchronous alerts, headless browsers normally either wait until X has finished or wait X amount of time before throwing an error. jBrowserDriver has a timeout setting, setScriptTimeout, which you can find on the index page of its docs. The docs were not made for the web and don't change the URI when you navigate within them, so I can't send you a direct URL. – Michael Warner Dec 12 '16 at 13:46
  • As this answer specifies, you have to execute the code to know if an alert will appear. The script could be Unicode-escaped, so "\u0061\u006c\u0065\u0072\u0074\u0028\u0022\u0068\u0065\u0079\u0022\u0029\u003b" is another way of writing alert("hey");. It is also possible to hold the code as a string and run it through eval(). Executing is the only way to know what a script will do. I'm wondering why you want to know about alert specifically. If this is security related, you need to find another approach, because alert is the least of your problems. – Klaus Groenbaek Dec 13 '16 at 09:33
  • If you are wondering about security you could request a different program to run this check in a virtual machine or something like it and return only the answer. – Michael Warner Dec 13 '16 at 10:32
  • @MichaelWarner Nice, you got 123 reputation from posting an answer! This post helped me with my project, thanks! I think option 2 is the most reliable, so I'll probably use that. If you can think of any other options, I'd love to hear them! :) – Aaron Esau Dec 14 '16 at 07:33
  • No problem, man, and I am shocked at how much I got from this answer. I'm glad I could help everyone. – Michael Warner Dec 14 '16 at 12:47