7

I manage an open source project and have a user reporting a situation which I think is impossible according to Java's order of initialization of static variables in classes. The value of a static final class variable is incorrect, apparently resulting from different results of a dependency's static method based on its own static final variable.

I'd like to understand what's happening in order to figure the best workaround. At the moment, I am baffled.

The problem

The main entry point for my project is the class SystemInfo which has the following constructor:

public SystemInfo() {
    if (getCurrentPlatform().equals(PlatformEnum.UNKNOWN)) {
        throw new UnsupportedOperationException(NOT_SUPPORTED + Platform.getOSType());
    }
}

When run by itself, the problem doesn't reproduce; but when run as part of many tests being executed a larger build (mvn install) it is consistently reproducible, implying the problem is likely associated with multithreading or multiple forks. (To clarify: I mean the simultaneous initialization of static members in two different classes, and the various JVM-internal locking/synchronization mechanisms associated with this process.)

They receive the following result:

java.lang.UnsupportedOperationException: Operating system not supported: JNA Platform type 2

This exception implies two things are true when SystemInfo instantiation begins:

  • The result of getCurrentPlatform() is the enum value PlatformEnum.UNKNOWN
  • The result of Platform.getOSType() is 2

However, this situation should be impossible; a value of 2 would return WINDOWS, and unknown would return a value other than 2. Since both variables are both static and final they should never simultaneously reach this state.

(User's) MCRE

I have tried to reproduce this on my own and failed, and am relying on a report from a user executing tests in their Kotlin-based (kotest) framework.

The user's MCRE simply invokes this constructor as part of a larger number of tests, running on the Windows operating system:

public class StorageOnSystemJava {
    public StorageOnSystemJava(SystemInfo info) {
    }
}

class StorageOnSystemJavaTest {
    @Test
    void run() {
        new StorageOnSystemJava(new SystemInfo());
    }
}

Underlying code

The getCurrentPlatform() method simply returns the value of this static final variable.

public static PlatformEnum getCurrentPlatform() {
    return currentPlatform;
}

This is a static final variable populated as the very first line in the class (so it should be the first thing initialized):

private static final PlatformEnum currentPlatform = queryCurrentPlatform();

where

private static PlatformEnum queryCurrentPlatform() {
    if (Platform.isWindows()) {
        return WINDOWS;
    } else if (Platform.isLinux()) {
        // other Platform.is*() checks here
    } else {
        return UNKNOWN; // The exception message shows the code reaches this point
    }
}

This means that during class initialization, all of the Platform.is*() checks returned false.

However, as indicated above this should not have happened. These are calls to JNA's Platform class static methods. The first check, which should have returned true (and does, if called in the constructor or anywhere in code after instantiation) is:

public static final boolean isWindows() {
    return osType == WINDOWS || osType == WINDOWSCE;
}

Where osType is a static final variable defined thus:

public static final int WINDOWS = 2;

private static final int osType;

static {
    String osName = System.getProperty("os.name");
    if (osName.startsWith("Linux")) {
        // other code
    }
    else if (osName.startsWith("Windows")) {
        osType = WINDOWS; // This is the value being assigned, showing the "2" in the exception
    }
    // other code
}

From my understanding of the order of initialization, Platform.isWindows() should always return true (on a Windows OS). I do not understand how it could possibly return false when called from my own code's static variable initialization. I've tried both the static method, and a static initialization block immediately following the variable declaration.

Expected order of initialization

  1. User calls the SystemInfo constructor
  2. SystemInfo class initialization begins ("T is a class and an instance of T is created.")
  3. The static final currentPlatform variable is encountered by the initializer (first line of class)
  4. The initializer calls the static method queryCurrentPlatform() to obtain a result (same result if the value is assigned in a static block immediately following the static variable declaration)
  5. The Platform.isWindows() static method is called
  6. The Platform class is initialized ("T is a class and a static method of T is invoked.")
  7. The Platform class sets the osType value to 2 as part of initialization
  8. When Platform initialization is complete, the static method isWindows() returns true
  9. The queryCurrentPlatform() sees the true result and sets the currentPlatform variable value (This is not happening as expected!)
  10. After SystemInfo class initialization is complete, its constructor executes, showing the conflicting values and throwing the exception.

Workarounds

Some workarounds stop the problem, but I don't understand why they do:

  • Performing the Platform.isWindows() check anytime during the instantiation process (including the constructor) properly returns true and assigns the enum appropriately.

    • This includes lazy instantiation of the currentPlatform variable (removing the final keyword), or ignoring the enum and directly calling JNA's Platform class.
  • Moving the first call to the static method getCurrentPlatform() out of the constructor.

These workarounds imply a possible root cause is associated with executing static methods of multiple classes during class initialization. Specifically:

  • During initialization, the Platform.isWindows() check apparently returns false because code reaches the else block
  • After initialization (during instantiation), the Platform.isWindows() check returns true. (Since it is based on a static final value it should not ever return different results.)

Research

I've thoroughly reviewed multiple tutorials about Java clearly showing the initialization order, as well as these other SO questions and the linked Java Language Specs:

Daniel Widdis
  • 8,424
  • 13
  • 41
  • 63
  • Moving static initializer block on top of class doesn't work as a solution? – zubergu Jul 16 '21 at 21:24
  • Maybe should have asked that first but... is `private static final PlatformEnum currentPlatform = queryCurrentPlatform();` in the same class as `private static final int osType;` and above its static initializer block? If yes then I'm wondering how you managed to assigned anything else than 0 as `osType` in the first place. – zubergu Jul 16 '21 at 22:04
  • I'd investigate this by adding debug logging to all classes involved, and look at the logfile produced by a failing run (to add logging output to compiled classes, I'd execute the code in a debugger and abuse conditional breakpoints by providing a "condition" that writes to `System.out` before returning true). That way, it's easy to verify whether this a multithreading problem, or mutually dependent classes seeing each other before they are fully initialized, or something else altogether. – meriton Jul 16 '21 at 22:36
  • (Of course, if the failure is not timing specific, you could simply execute the test suite in a debugger, add a few breakpoints, and step through the code to see what is going on) – meriton Jul 16 '21 at 22:52
  • Can it be some of the tests change the `os.name` system property? – yegodm Jul 17 '21 at 01:09
  • @zubergu Initial test report had a static initialization block immediately below the static final variable declaration. My first attempt to try to solve it was based on a suggestion that methods might work where blocks didn't. Fully history [here](https://github.com/oshi/oshi/pull/1671) and [here](https://github.com/oshi/oshi/pull/1680). – Daniel Widdis Jul 17 '21 at 03:46
  • @zubergu the `currentPlatform` variable is in one class (`SystemInfo`) at the very top of the source code, the first thing initialized. The `osType` is in a totally different class in a dependency (`Platform` class in JNA). – Daniel Widdis Jul 17 '21 at 03:47
  • @meriton Nice suggestions and I probably would already have done that if all this was under my control, but half of the problem here is in a dependency and I'm relying on a third party user with proprietary code running all the tests. – Daniel Widdis Jul 17 '21 at 03:50
  • Ah, I took the MCRE to mean that you're able to reproduce the bug. If only the user can, I'd give him a patched library (including patched libraries for the dependencies). Or ask if you can remote debug the problem on their system. Or ask *them* to debug the issue and to report their findings. – meriton Jul 17 '21 at 14:36

2 Answers2

5

It's not multithreading, because the JVM prevents other threads from accessing a class while it is being initialized. This behavior is mandated by the Java Language Specification, section 12.4.2, step 2:

If the Class object for C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this step.

It is exceedingly unlikely for a JVM to have a bug in this area, since it would cause repeated initializer execution, which would be very noticeable.

However, a static final field can appear to have a changing value if:

  • there is a cyclic dependency among initializers

    Same section, step 3 writes:

    If the Class object for C indicates that initialization is in progress for C by the current thread, then this must be a recursive request for initialization. Release LC and complete normally.

    Therefore, a recursive initialization may allow a thread to read a static final field before it is assigned. This can only happen if class initializers create a cyclic dependency among initializers.

  • somebody (ab)uses reflection to reassign the static final field

  • the class is loaded by more than one class loader

    In this case, each class has its own copy of the static field, and may initialize it differently.

  • if the field is a compile time constant expression, and the code was compiled at different times

    The spec mandates that compile time constant expressions are inlined by the compiler. If different classes are compiled at different times, the value being inlined may have been different. (In your case, the expression is not compile time constant; I only mention this possiblity for the sake of future visitors).

From the evidence you have given, it is impossible to say which of these apply. That's why I recommend further investigation.

meriton
  • 68,356
  • 14
  • 108
  • 175
  • OK, so after some back and forth with the user, it turns out they were using reflection in another test. Since your answer explicitly listed reflection as a possibility, I'm marking it as accepted. – Daniel Widdis Jul 19 '21 at 15:38
3

DISCLAIMER: I'm writing this as an answer because I don't know how to make it fit into a comment. If it doesn't help you let me know, and I'll delete it.


Let's start with a small recap which given the question quality, I'm sure you're aware of already:

  • A field which is static to a class, means that it only exists once for any instance. No matter how many instances of the class you create, the field will always point to the same memory address.
  • A field which is final means that once initialized, its value cannot change any longer.

So, when you mix these two into a static final field, it means:

  • You only have one value for that field, no matter how many instances
  • Once the value is assigned, it doesn't change any longer

So, my suspect is not that there is any thread-safety issue (I don't think you're running your tests in parallel, so I guess no two-threads would simultaneously work on these objects, right?), but rather that a previous test of your test suite has initialized the variables differently and, being they run into the same JVM, they don't change any longer their values.

Take this very simple test example.

I have one very basic class:

public final class SomeClass {

    private static final boolean FILE_EXISTS;

    static {
        FILE_EXISTS = new File("test").exists();
    }

    public SomeClass() {
        System.out.println("File exists? " + FILE_EXISTS);
    }

}

The above class simply has a static final boolean saying whether a certain file named test exists into the working directory. As you can see, the field is initialized once (final) and will be the same for each instance.

Now, let's run these two very simple tests:

@Test
public void test_some_class() throws IOException {
    System.out.println("Running test_some_class");
    File testFile = new File("test");
    if (testFile.exists()) {
        System.out.println("Deleting file: " + testFile.delete());
    } else {
        System.out.println("Could create the file test: " + testFile.createNewFile());
    }
    SomeClass instance1 = new SomeClass();
}

@Test
public void other_test_some_class() {
    System.out.println("Running other_test_some_class");
    SomeClass instance2 = new SomeClass();
}

In the first test, I check if file test exists. If it does exists, I'll delete it. Else, I'll create it. Then, I'll initialize a new SomeClass().

In the second test, I simply initialize a new SomeClass().

This is the output of my tests run together:

Running other_test_some_class //<-- JUnit picks the second test to start
File exists? false //<-- The constructor of SomeClass() prints the static final variable: file doesn't exist
Running test_some_class //<-- JUnit continues running the first test
Could create the file test: true //<-- it is able to create the file
File exists? false //<-- yet, the initializer of new SomeClass() still prints false

The reason why it prints false, even though we clearly created the test file before initializing new SomeClass(), is that the field FILE_EXISTS is static (hence shared across all instances) and final (hence initialized once, lasting forever).

So if you wonder why private static final int osType; has a value which returns you UNKNOWN when you run mvn install but not when you run the single test, I'd simply look what test, in your full test suite, has already initialized it with a value that you don't expect.

Solution

There are two types of solution for this, and they depend on your production code.

It may be that, functionally, you actually need this field to be maybe final to the instance of the class, but not static. If that's the case, you should just declare it final to the class (once initialized, it doesn't change, but still you have one different value per instance).

Or, it may be that you really need the field to be static final in production, but not during tests as you initialize a new test context each time. If that's the case, you should configure your test plugin to reuseForks = false (meaning a fresh JVM fork is created for each test class, and that guarantees you that each test class will start with a fresh memory for your static final fields):

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <version>${maven.surefire.plugin.version}</version>
            <configuration>
                <forkCount>1</forkCount>
                <reuseForks>false</reuseForks>
            </configuration>
        </plugin>
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
Matteo NNZ
  • 11,930
  • 12
  • 52
  • 89
  • Thanks for the detailed suggestions. However, the "nested" nature of the initialization means that both classes (my `SystemInfo` and JNA's `Platform`) would equally share the same one-time-only static final results. The `SystemInfo` class begins initializing first, then `Platform` begins and ends, and then `SystemInfo` ends. Still a useful debugging question to inquire whether there is any non-Windows initialization as part of the testing. – Daniel Widdis Jul 17 '21 at 03:43
  • @DanielWiddis have you tried to run the tests on a single fork without reusing it just to potentially exclude this possible cause? – Matteo NNZ Jul 17 '21 at 05:51
  • This is a third party user on a private repo doing the tests. They report executing the test by itself does not reproduce the error. – Daniel Widdis Jul 17 '21 at 06:16
  • @DanielWiddis I don't mean that. I mean, did you try to configure the test plugin as I showed you above and run the full build (mvn install) again? – Matteo NNZ Jul 17 '21 at 09:55
  • Thanks. I do understand the intent here and will be asking the user for more information. I still think on any given fork /JVM instance, there is an exactly 1:1 relationship with the two initialized classes. I could understand if something initialized Platform first with unknown, but then the static final `osType` value would have been "stuck" at a different value. The only possible situation I can think of is some very low level race condition during the initialization phase of the two classes. – Daniel Widdis Jul 17 '21 at 15:22