2

I've got a UI automation framework that launches tests using TestNG and runs through pages using Selenium/WebDriver. Oftentimes the pages I'm testing make AJAX calls that modify the DOM upon returning. In these cases I use Selenium explicit waits to declare a DOM condition that I want to be met before the automation can proceed (IE: some button gets enabled).

Internally Selenium's FluentWait.until method handles this by polling the DOM for my ExpectedCondition every 500ms and calling Thread.sleep() in-between these checks.

When I run two tests back to back in a TestNG suite this works perfectly fine for the first test, but starts to fail with an InterruptedException about halfway through each subsequent test. This is consistent. The exceptions look like this:

Associated Throwable Type: class org.openqa.selenium.WebDriverException Associated Throwable Message: java.lang.InterruptedException: sleep interrupted

The strange thing is that there's no multi-threading going on here. I've disabled Selenium Grid, BrowserMob Proxy, and every other bit of code that could be conflicting. I've read both of these questions:

https://stackoverflow.com/questions/24495176/why-is-thread-sleep-being-interrupted - Closed for not providing enough detail, but one of the proposed answers states that one should override the Thread.interrupt method for debugging.

Who interrupts my thread? - Accepted answer also states that one should override the Thread.interrupt method for debugging.

My problem with this solution is that placing a breakpoint inside the existing Thread.interrupt method does not reveal any calls around the time that the thread is interrupted. This includes calls from all of my third party dependencies (IE: TestNG and Selenium). Whatever is calling this thread interrupt appears to be external to my framework.

I've also tried calling Thread.currentThread.isInterrupted() at every point prior to the FluentWait.until call and it consistently returns false. I've even used IntelliJ's evaluate function to check for isInterrupted inside the Selenium code itself. This thread is only being interrupted once the Thread.sleep call occurs inside FluentWait.until.

I've seen this happen on multiple Windows build servers as well as on my Macbook, so this does not appear to be machine specific.

I thought for a while that this might be caused by a TestNG timeout, but reducing the TestNG timeout in my suite yielded a different behavior than these interruptions.

Currently I'm working around this issue with the following code which swallows the exception and resumes the explicit wait:

public static boolean waitForElementStatus(Stuff)
{

  /* snip - setup for ExpectedCondition (change) */

  long startSeconds = new Date().getTime() / 1000;
  long currentSeconds = startSeconds;
  long remainingSeconds = maxElementStatusChangeSeconds;
  WebDriverWait waitForElement = new WebDriverWait(driver, maxElementStatusChangeSeconds);
  boolean changed = false;
  boolean firstWait = true; // If specified time is 0 we still want to check once.
  out:while(firstWait || remainingSeconds > 0)
  {
    firstWait = false;
    Boolean exceptionThrown = false;
    try
    {
      waitForElement.until(change);
    }
    catch(Throwable t)
    {
      exceptionThrown = true;
      if(t.getCause()) != null
      {
        t = t.getCause(); // InterruptedException is wrapped inside a WebDriverException
      }
      if(t.getClass().equals(InterruptedException.class))
      {
        Thread.interrupted(); // clear interrupt status for this thread
        currentSeconds = new Date().getTime() / 1000;
        remainingSeconds = startSeconds + maxElementStatusChangeSeconds - currentSeconds;
        if(remainingSeconds > 0)
        {
          String warning = String.format("Caught unidentified interrupt inside Selenium " +
          "FluentWait.until call.  Swallowing interrupt and repeating call with [%s] seconds " + 
          "remaining.", remainingSeconds);
          CombinedLogger.warn(warning);
          waitForElement = new WebDriverWait(driver, remainingSeconds);
        }
        else
        {
          // If a timeout exception would have been thrown instead of the interruption then
          // we'll allow the WebDriverWait to execute one last time so it can throw the
          // timeout instead.
          waitForElement = new WebDriverWait(driver, 0);
        }
      }
      else if(haltOnFailure) // for any other exception type such as TimeoutException
      {
        CombinedLogger.error(stuff + "...FAILURE(HALTING)", t);
        break out;
      }
      else // for any other exception type such as TimeoutException
      {
        CombinedLogger.info(stuff + "...failure(non-halting)");
        break out;
      }
    }
    if(!exceptionThrown)
    {
      changed = true;
      CombinedLogger.info(stuff + "...success ");
      break out;
    }
  }
  return changed;
}

This workaround does function, and fortunately these mystery interrupts are only occurring sporadically afterwards (they don't happen repeatedly), so the tests are able to proceed. However, I understand that swallowing InterruptedException is bad form. If possible, I'd like to determine where and why these interrupts are taking place so that I can put an end to them instead of using this hack.

Simply propagating the exceptions is not an option since these tests need to continue running instead of obediently crashing.

Are there any known utilities, JVM arguments, or libraries that I could use which would help me track down Java thread interruptions that are caused by code which is out of my control?

Update 12/10/2014: I've captured two thread dumps. One is from immediately before the interrupt and one is from immediately after it. The only difference between the two is the line number of the interrupted thread (it goes from the try block to the catch block after being interrupted). Not sure what this tells me, but here's the data:

Full thread dump (immediately before interrupt)

"TestNG@1359" prio=5 tid=0xc nid=NA runnable
  java.lang.Thread.State: RUNNABLE
   at org.openqa.selenium.support.ui.FluentWait.until(FluentWait.java:232)
   /* snip - company stuff */
   at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
   at org.testng.internal.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:46)
   at org.testng.internal.InvokeMethodRunnable.run(InvokeMethodRunnable.java:37)
   at org.testng.internal.MethodInvocationHelper.invokeWithTimeoutWithNoExecutor(MethodInvocationHelper.java:240)
   at org.testng.internal.MethodInvocationHelper.invokeWithTimeout(MethodInvocationHelper.java:229)
   at org.testng.internal.Invoker.invokeMethod(Invoker.java:724)
   at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
   at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
   at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
   at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
   at org.testng.TestRunner.privateRun(TestRunner.java:767)
   at org.testng.TestRunner.run(TestRunner.java:617)
   at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
   at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
   at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
   at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)

"main@1" prio=5 tid=0x1 nid=NA waiting
  java.lang.Thread.State: WAITING
   at sun.misc.Unsafe.park(Unsafe.java:-1)
   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
   at java.util.concurrent.FutureTask.get(FutureTask.java:199)
   at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:289)
   at org.testng.internal.thread.ThreadUtil.execute(ThreadUtil.java:72)
   at org.testng.SuiteRunner.runInParallelTestMode(SuiteRunner.java:367)
   at org.testng.SuiteRunner.privateRun(SuiteRunner.java:308)
   at org.testng.SuiteRunner.run(SuiteRunner.java:254)
   at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
   at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
   at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224)
   at org.testng.TestNG.runSuitesLocally(TestNG.java:1149)
   at org.testng.TestNG.run(TestNG.java:1057)
   at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
   at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
   at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
   at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:125)

"Thread-8@2432" daemon prio=5 tid=0x15 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
   at java.io.FileInputStream.readBytes(FileInputStream.java:-1)
   at java.io.FileInputStream.read(FileInputStream.java:272)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
   - locked <0xe08> (a java.lang.UNIXProcess$ProcessPipeInputStream)
   at java.io.FilterInputStream.read(FilterInputStream.java:107)
   at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
   at java.lang.Thread.run(Thread.java:745)

"Thread-7@2431" daemon prio=5 tid=0x14 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
   at java.io.FileInputStream.readBytes(FileInputStream.java:-1)
   at java.io.FileInputStream.read(FileInputStream.java:272)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
   - locked <0xe09> (a java.lang.UNIXProcess$ProcessPipeInputStream)
   at java.io.FilterInputStream.read(FilterInputStream.java:107)
   at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
   at java.lang.Thread.run(Thread.java:745)

"Thread-6@2424" prio=5 tid=0x13 nid=NA waiting
  java.lang.Thread.State: WAITING
   at java.lang.Object.wait(Object.java:-1)
   at java.lang.Object.wait(Object.java:503)
   at java.lang.UNIXProcess.waitFor(UNIXProcess.java:261)
   at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:347)
   at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:46)
   at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:188)

"process reaper@2008" daemon prio=10 tid=0x10 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
   at java.lang.UNIXProcess.waitForProcessExit(UNIXProcess.java:-1)
   at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
   at java.lang.UNIXProcess$4.run(UNIXProcess.java:225)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)

"ReaderThread@645" prio=5 tid=0xb nid=NA runnable
  java.lang.Thread.State: RUNNABLE
   at java.net.SocketInputStream.socketRead0(SocketInputStream.java:-1)
   at java.net.SocketInputStream.read(SocketInputStream.java:152)
   at java.net.SocketInputStream.read(SocketInputStream.java:122)
   at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
   at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
   at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
   - locked <0xe0b> (a java.io.InputStreamReader)
   at java.io.InputStreamReader.read(InputStreamReader.java:184)
   at java.io.BufferedReader.fill(BufferedReader.java:154)
   at java.io.BufferedReader.readLine(BufferedReader.java:317)
   at java.io.BufferedReader.readLine(BufferedReader.java:382)
   at org.testng.remote.strprotocol.BaseMessageSender$ReaderThread.run(BaseMessageSender.java:245)

"Finalizer@2957" daemon prio=8 tid=0x3 nid=NA waiting
  java.lang.Thread.State: WAITING
   at java.lang.Object.wait(Object.java:-1)
   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"Reference Handler@2958" daemon prio=10 tid=0x2 nid=NA waiting
  java.lang.Thread.State: WAITING
   at java.lang.Object.wait(Object.java:-1)
   at java.lang.Object.wait(Object.java:503)
   at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

"Signal Dispatcher@2956" daemon prio=9 tid=0x4 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
Community
  • 1
  • 1
Alex Jansen
  • 1,455
  • 5
  • 18
  • 34
  • "The strange thing is that there's no multi-threading going on here." Well clearly *that's* not true, since your sleep is being interrupted. Your first step should be to try to isolate your code and replicate the failure with as few moving parts as possible. First and foremost, if you write a simple `main()` method that calls your `@Test` methods directly (bypassing TestNG) do you still get interrupted? – dimo414 Dec 10 '14 at 14:05
  • That's a fair point. There's always some degree of multi-threading going on. Isolating this failure will probably take me a few days. I'll see if I can do this soon. – Alex Jansen Dec 10 '14 at 23:25

2 Answers2

0

There is not much that can be inferred from the thread dump as in what caused it. But in reality you cannot rely on Thread.sleep() too much ,it might be interrupted for known/unknown reason.OS might be the reason in the later case.

Thread.sleep() is one of the few methods which takes interrupt seriously. As a thread cannot handle InterruptedException while it is sleeping ,you need to handle it.

What you are doing right now might not be a workaround but a way to go in such cases,where we cannot do without Thread.sleep().

abhati
  • 309
  • 1
  • 6
  • My options were: #1-This workaround (already in place). #2-Abandon my assigned tasks to chase down mysterious interrupts with no proper tools or logging available to determine why they're happening... Unsurprisingly I chose #1. – Alex Jansen Jan 21 '15 at 03:50
0

A bit outdated but I have similar problem and with the help of your previously posted link (https://stackoverflow.com/a/2476246) I put a breakpoint into the Thread.interrupt() method.

It reveals that the interruption was made by StoryManager.waitUntilAllDoneOrFailed() method that triggers future.cancel() method after the timeout set on whole story.

My whole setup is:

page.getPageObject().withTimeoutOf(convertDuration(duration)).waitFor(by);

where duration is about 60 secs. (the minute is due to some async stuff)

and

configuredEmbedder().embedderControls().useStoryTimeouts("30");

And the stackTrace is:

at java.util.concurrent.FutureTask.cancel(FutureTask.java:174)
      at org.jbehave.core.embedder.StoryManager.waitUntilAllDoneOrFailed(StoryManager.java:184)
      at org.jbehave.core.embedder.StoryManager.performStories(StoryManager.java:121)
      at org.jbehave.core.embedder.StoryManager.runStories(StoryManager.java:107)

and that interrupts later Thread.sleep() method in ThucydidesFluentWait.doWait() (basically in the underneath Sleeper instance method sleep())

Increasing the story timeout or proper setup of waitFor(...) timeout vs. story timeout solves the problem on my side.

Community
  • 1
  • 1
jan
  • 11
  • 1