In my Android application, I use a SurfaceView to draw things. It has been working fine on thousands of devices -- except that users have now started reporting ANRs on the following devices:

  • LG G4
    • Android 5.1
    • 3 GB RAM
    • 5.5" display
    • 2560 x 1440 px resolution
  • Sony Xperia Z4
    • Android 5.0
    • 3 GB RAM
    • 5.2" display
    • 1920 x 1080 px resolution
  • Huawei Ascend Mate 7
    • Android 5.1
    • 3 GB RAM
    • 6.0" display
    • 1920 x 1080 px resolution
  • HTC M9
    • Android 5.1
    • 3 GB RAM
    • 5.0" display
    • 1920 x 1080 px resolution

So I got an LG G4 and was indeed able to verify the problem. It's directly related to the SurfaceView.

Now guess what fixed the issue after hours of debugging? Replacing ...

mSurfaceHolder.unlockCanvasAndPost(c);

... with ...

mSurfaceHolder.unlockCanvasAndPost(c);
System.out.println("123"); // THIS IS THE FIX

How can this be?

The following code is my render thread that has been working fine except for the mentioned devices:

import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    private volatile boolean mRunning = false;

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    public void setRunning(boolean run) {
        mRunning = run;
    }

    @Override
    public void run() {
        Canvas c;
        while (mRunning) {
            c = null;
            try {
                c = mSurfaceHolder.lockCanvas();
                if (c != null) {
                    mSurface.doDraw(c);
                }
            }
            finally { // when exception is thrown above we may not leave the surface in an inconsistent state
                if (c != null) {
                    try {
                        mSurfaceHolder.unlockCanvasAndPost(c);
                    }
                    catch (Exception e) { }
                }
            }
        }
    }

}

The code is, in parts, from the LunarLander example in the Android SDK, more specifically LunarView.java.

Updating the code to match the improved example from Android 6.0 (API level 23) yields the following:

import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    /** Handle to the surface manager object that we interact with */
    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    /** Used to signal the thread whether it should be running or not */
    private boolean mRunning = false;
    /** Lock for `mRunning` member */
    private final Object mRunningLock = new Object();

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    /**
     * Used to signal the thread whether it should be running or not
     *
     * @param running `true` to run or `false` to shut down
     */
    public void setRunning(final boolean running) {
        // do not allow modification while any canvas operations are still going on (see `run()`)
        synchronized (mRunningLock) {
            mRunning = running;
        }
    }

    @Override
    public void run() {
        while (mRunning) {
            Canvas c = null;

            try {
                c = mSurfaceHolder.lockCanvas(null);
                synchronized (mSurfaceHolder) {
                    // do not allow flag to be set to `false` until all canvas draw operations are complete
                    synchronized (mRunningLock) {
                        // stop canvas operations if flag has been set to `false`
                        if (mRunning) {
                            mSurface.doDraw(c);
                        }
                    }
                }
            }
            // if an exception is thrown during the above, don't leave the view in an inconsistent state
            finally {
                if (c != null) {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
            }
        }
    }

}

But still, this class does not work on the mentioned devices. I get a black screen and the application stops responding.

The only thing (that I have found) that fixes the problem is adding the System.out.println("123") call. And adding a short sleep time at the end of the loop turned out to provide the same results:

try {
    Thread.sleep(10);
}
catch (InterruptedException e) { }

But these are no real fixes, are they? Isn't that strange?

Depending on what changes I make to the code, I'm also able to see an exception in the error log. There are many developers with the same problem, but unfortunately none of them provides a solution for my (device-specific) case.

Can you help?

caw
  • Creating the thread from `surfaceCreated()` should work fine. See e.g. https://github.com/google/grafika/blob/master/src/com/android/grafika/HardwareScalerActivity.java . For some additional notes on SurfaceView and Activity interaction, see https://source.android.com/devices/graphics/architecture.html#activity – fadden Dec 17 '15 at 01:09
  • Thanks! Yes, I've read all those already. This is why I can eliminate most of the possible causes. Must be some weird race condition or issue with the thread locking that doesn't occur on all the other devices. I thought it could be the high display resolution of these devices. But on tablets, these high resolutions have been common already, and it's working on devices such as the Nexus 7 (2013) which has a high resolution as well. Might be a higher pixel density then, though, if this makes a difference. Maybe because I don't have any of the used drawables in `xxhdpi` or `xxxhdpi`. – caw Dec 17 '15 at 06:41
  • My render thread's `run()` loop is executing continuously in the background and the called `doDraw()` method is running about every 40ms. So there really doesn't seem to be anything running too long or keeping the `Canvas` busy for too long. Still, the main thread is freezing and the screen is black. – caw Dec 17 '15 at 06:51

3 Answers

Here is what currently works for me, although it only fights the symptoms superficially rather than fixing the cause of the problem:

1. Removing Canvas operations

My render thread calls the custom method doDraw(Canvas canvas) on the SurfaceView subclass.

In that method, if I remove all calls to Canvas.drawBitmap(...), Canvas.drawRect(...) and other operations on the Canvas, the app does not freeze anymore.

A single call to Canvas.drawColor(int color) may be left in the method. And even costly operations like BitmapFactory.decodeResource(Resources res, int id, Options opts) and reading from/writing to my internal Bitmap cache are fine. No freezes.

Obviously, without any drawing, the SurfaceView is not really helpful.

2. Sleeping 10ms in the render thread's run loop

The method that my render thread executes:

@Override
public void run() {
    Canvas c;
    while (mRunning) {
        c = null;
        try {
            c = mSurfaceHolder.lockCanvas();
            if (c != null) {
                mSurface.doDraw(c);
            }
        }
        finally {
            if (c != null) {
                try {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
                catch (Exception e) { }
            }
        }
    }
}

Simply adding a short sleep time within the loop (e.g. at the end) fixes all freezing on the LG G4:

while (mRunning) {
    ...

    try { Thread.sleep(10); } catch (Exception e) { }
}

But who knows why this works and if this really fixes the problem (on all devices).

3. Printing something to System.out

The same thing that worked with Thread.sleep(...) above does also work with System.out.println("123"), strangely.

4. Delaying the start of the render thread by 10ms

This is how I start my render thread from within the SurfaceView:

@Override
public void surfaceCreated(SurfaceHolder surfaceHolder) {
    mRenderThread = new MyThread(getHolder(), this);
    mRenderThread.setRunning(true);
    mRenderThread.start();
}

When wrapping these three lines inside the following delayed execution, the app does not freeze anymore:

new Handler().postDelayed(new Runnable() {

    @Override
    public void run() {
        ...
    }

}, 10);

This seems to be because the presumed deadlock occurs only right at the beginning. Once it has been cleared (by the delayed execution), no further deadlock occurs. The app runs just fine after that.

But when leaving the Activity, the app freezes again.

5. Just use a different device

Apart from the LG G4, Sony Xperia Z4, Huawei Ascend Mate 7, HTC M9 (and probably a few other devices), the app is working fine on thousands of devices.

Could this be a device-specific glitch? One would surely have heard about this ...


All these "solutions" are hacky. I wish there was a better solution -- and I bet there is!

caw

Look at the ANR trace. Where does it appear to be stuck? ANRs mean the main UI thread is failing to respond, so what you're doing on the renderer thread is irrelevant unless the two are fighting over a lock.

The symptoms you're reporting sound like a race. If your main UI thread is stuck on, say, mRunningLock, it's conceivable that your renderer thread is only leaving it unlocked for a very short window. Adding the log message or sleep call gives the main thread an opportunity to wake up and do work before the renderer thread grabs it again.

(This doesn't actually make sense to me -- your code looks like it should be stalled waiting for lockCanvas() while awaiting the display refresh -- so you need to look at the thread trace in the ANR.)
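
The race hypothesized above can be illustrated in plain Java, outside Android. This is only a sketch under stated assumptions: `LockContentionSketch`, `sharedLock` and the 2 ms of busy work are hypothetical stand-ins, and a non-fair `ReentrantLock` is assumed to play the role of whatever lock the two threads are fighting over. It measures how long a "main" thread waits for a lock that a "renderer" thread re-acquires in a tight loop, with and without a pause outside the locked region.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the situation described above; no Android classes involved.
class LockContentionSketch {

    /** Stand-in for a lock shared by the render thread and the main thread (non-fair by default). */
    private final ReentrantLock sharedLock = new ReentrantLock();
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final long pauseMillis;

    LockContentionSketch(long pauseMillis) {
        this.pauseMillis = pauseMillis;
    }

    /** Simulated render loop: hold the lock while "drawing", then optionally pause outside it. */
    Thread startRenderer() {
        Thread t = new Thread(() -> {
            while (running.get()) {
                sharedLock.lock();
                try {
                    busyWork();
                } finally {
                    sharedLock.unlock();
                }
                if (pauseMillis > 0 && !sleepQuietly(pauseMillis)) {
                    return; // interrupted
                }
            }
        }, "renderer");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Simulated main thread: milliseconds until the lock was obtained, or -1 on timeout. */
    long measureAcquireMillis(long timeoutMillis) {
        long start = System.nanoTime();
        boolean acquired;
        try {
            acquired = sharedLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1L;
        }
        if (!acquired) {
            return -1L;
        }
        try {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            sharedLock.unlock();
        }
    }

    void stop() {
        running.set(false);
    }

    /** Spins for roughly 2 ms to imitate drawing while the lock is held. */
    private static void busyWork() {
        long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(2);
        while (System.nanoTime() < end) {
            // spin
        }
    }

    private static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        for (long pause : new long[] {0L, 10L}) {
            LockContentionSketch sketch = new LockContentionSketch(pause);
            sketch.startRenderer();
            sleepQuietly(50); // let the renderer get going first
            long waited = sketch.measureAcquireMillis(2000);
            sketch.stop();
            System.out.println("pause=" + pause + " ms -> waited " + waited + " ms (-1 = timed out)");
        }
    }
}
```

The numbers vary by machine and scheduler, and without the pause the waiting thread may or may not be starved noticeably. The sketch shows only the mechanism, not what actually happens inside SurfaceView on the affected devices.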

FWIW, you don't need to synchronize on mSurfaceHolder. An early example did that, and every example since then has cloned it.

Once you get this sorted out, you may want to read about game loops.
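
One idea that game-loop write-ups commonly discuss is pacing frames to a target interval instead of re-locking the canvas as fast as possible. The following is a minimal sketch of that idea only: `FramePacer` is a hypothetical helper, and the 40 ms target in the usage note is an assumption borrowed from the draw interval mentioned in the question's comments. It is not claimed to cure the freeze.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: computes how long to sleep so a frame lasts at least a target interval. */
class FramePacer {

    private final long targetFrameNanos;

    FramePacer(long targetFrameMillis) {
        this.targetFrameNanos = TimeUnit.MILLISECONDS.toNanos(targetFrameMillis);
    }

    /**
     * @param frameStartNanos System.nanoTime() taken before the frame was drawn
     * @param nowNanos        System.nanoTime() taken after the canvas was unlocked
     * @return milliseconds left in the frame budget, or 0 if the frame already overran it
     */
    long remainingMillis(long frameStartNanos, long nowNanos) {
        long remaining = targetFrameNanos - (nowNanos - frameStartNanos);
        return remaining <= 0L ? 0L : TimeUnit.NANOSECONDS.toMillis(remaining);
    }

    public static void main(String[] args) {
        FramePacer pacer = new FramePacer(40L);
        long start = System.nanoTime();
        // ... a frame would be drawn here ...
        long sleepFor = pacer.remainingMillis(start, System.nanoTime());
        System.out.println("would sleep for " + sleepFor + " ms");
    }
}
```

In a render loop, the result would be passed to `Thread.sleep(...)` after `unlockCanvasAndPost(c)`, so that the sleep happens while the canvas is not locked.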

fadden
  • Thank you very much! Your document on Android's graphics architecture is very interesting. When the app freezes, my main thread is stuck on `java.lang.Object.wait(Native Method), java.lang.Thread.parkFor(Thread.java:1220), ..., java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:256), android.view.SurfaceView.updateWindow(SurfaceView.java:521), android.view.SurfaceView$3.onPreDraw(SurfaceView.java:176)` As you can see in my first example, I didn't originally synchronize on `mSurfaceHolder`, but the *latest* "Lunar Lander" example did. Please see my edit at the end of the question. – caw Dec 16 '15 at 23:09
  • The code at https://android.googlesource.com/platform/frameworks/base/+/marshmallow-release/core/java/android/view/SurfaceView.java looks close to your trace. It appears to be getting stuck on `mSurfaceLock.lock()`; that lock is also held while the Canvas is locked. So it appears your renderer thread is keeping the Canvas locked for an extended period. – fadden Dec 17 '15 at 01:08
  • Thanks, that really looks close. But how could this be an error in my code, e.g. my thread keeping the `Canvas` locked for too long? (1) I've changed my code to match the "Lunar Lander" example but it still doesn't work. (2) It works on thousands of devices but not on these few (maybe dozens). (3) As noted, when I delay the starting of the thread, there are no problems anymore, except when shutting down. These three points could speak for a glitch on these devices, right? But I didn't find anything about problems with these devices, and one would surely have heard about this. – caw Dec 17 '15 at 06:33
  • On the other hand, when I reduce my `doDraw()` method to painting a black background only, the issues are gone. So it could really be the `doDraw()` method taking too long and then the `Canvas` being locked for too long. But why doesn't this happen on all the other (older) devices? The only reason could be the high resolution? Then again, I've tried to "reduce" it via "setFixedSize()". Didn't work. Moreover, the simple `Thread.sleep(10);` at the end of the `while (mRunning) { }` loop does seem to fix the issue, too. But it's hacky and I can verify this only on a single device, the LG G4. – caw Dec 17 '15 at 06:35
  • You could add some `System.nanoTime()` calls at the start and end of the loop and watch how long the lock is held. Maybe keep track of the maximum duration and log it once every 3 seconds (then reset the max) to avoid flooding the log. A much better approach is to add custom systrace events with android.os.Trace and capture a trace while it's in the unhappy state -- see e.g. http://bigflake.com/systrace/ . That'll give you a nice visual representation of the various threads and their interaction. – fadden Dec 17 '15 at 18:45
  • Thanks! I now saved `System.nanoTime()` always *before* the call to `lockCanvas()` and *after* the call to `unlockCanvasAndPost(c)`. The delta was logged to `System.out` every *4s*. Results: (1) The app did always freeze right in the beginning. (2) After 4s (i.e. logging), app started being responsive and stayed so until the end. (3) Maximum lock time of 125ms was reached in the beginning, and was never surpassed. (4) During shutdown, the app froze again. Conclusion: The presumable deadlock does only occur once in the beginning and once in the end. Each time, when cleared once, it's gone. – caw Dec 19 '15 at 05:29
  • Furthermore, no exception is ever thrown (during locking/unlocking). And every lock that has been acquired is also released again later. Strangely, in both intervals 0s...4s and 4s...8s, the method `lockCanvas()` is called approximately 24 times/second. Apparently, the render thread is always running fine. And I can verify that `doDraw()` in `MySurfaceView` is running as well. So it's probably just the `Activity` that's in a deadlock. But it must be waiting on some objects or methods from the render thread or `SurfaceView`. Otherwise, the `Thread.sleep()` in the render thread wouldn't help. – caw Dec 19 '15 at 05:30
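
The timing approach suggested in the comments above (record `System.nanoTime()` around the locked region, keep the maximum, log it every few seconds, then reset) could be sketched roughly as follows. `LockDurationTracker` is a hypothetical helper, not part of the Android SDK. Timestamps are passed in as parameters so the logic can be checked without a device.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: tracks the longest lock hold time within each reporting window. */
class LockDurationTracker {

    private final long reportIntervalNanos;
    private long windowStartNanos;
    private long maxHeldNanos;

    LockDurationTracker(long reportIntervalNanos, long startNanos) {
        this.reportIntervalNanos = reportIntervalNanos;
        this.windowStartNanos = startNanos;
    }

    /**
     * Records one lock/unlock pair.
     *
     * @return the maximum hold time (in nanoseconds) of the window that just ended,
     *         or -1 if the current window is still open and nothing should be logged
     */
    long record(long lockedAtNanos, long unlockedAtNanos) {
        maxHeldNanos = Math.max(maxHeldNanos, unlockedAtNanos - lockedAtNanos);
        if (unlockedAtNanos - windowStartNanos < reportIntervalNanos) {
            return -1L;
        }
        long result = maxHeldNanos;
        maxHeldNanos = 0L; // reset the maximum for the next window
        windowStartNanos = unlockedAtNanos;
        return result;
    }

    public static void main(String[] args) {
        long interval = TimeUnit.SECONDS.toNanos(3);
        LockDurationTracker tracker = new LockDurationTracker(interval, System.nanoTime());
        long lockedAt = System.nanoTime();
        // ... lockCanvas(), doDraw(c) and unlockCanvasAndPost(c) would run here ...
        long max = tracker.record(lockedAt, System.nanoTime());
        if (max >= 0L) {
            System.out.println("max lock time: " + TimeUnit.NANOSECONDS.toMillis(max) + " ms");
        }
    }
}
```

In the render loop, `record(...)` would be called once per iteration with the timestamps taken before `lockCanvas()` and after `unlockCanvasAndPost(c)`, and a log line written only when it returns a non-negative value.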

I faced the same problem on a Xiaomi Mi 5 running Android 6. The Activity with the canvas froze on start and for some time after exiting the Activity. I solved the problem by using lockHardwareCanvas() instead of lockCanvas(). This method is not directly available in Android 6, so I called sHolder.getSurface().lockHardwareCanvas(); and sHolder.getSurface().unlockCanvasAndPost(canvas); instead. Now no delays are needed and everything works normally.