
I'm using CLion with a C++ project (CMake) that starts a JVM. The Java part is built with Gradle. The project works, but I'm having a problem with debugging.

When I start the JVM, I immediately get a SIGSEGV. I understand that it's normal and there's no workaround except ignoring SIGSEGV. A bit annoying but not too bad as it only happens once per session.

BUT, after that, I continue debugging, and I get constant SIGBUS signals.

<unknown> 0x000000011f108385
<unknown> 0x000000011761dca7
<unknown> 0x000000011761dca7
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
StackWalk::fetchFirstBatch(BaseFrameStream&, Handle, long, int, int, int, objArrayHandle, Thread*) 0x000000010c227cac
StackWalk::walk(Handle, long, int, int, int, objArrayHandle, Thread*) 0x000000010c2278fc
JVM_CallStackWalk 0x000000010bfb14a2
<unknown> 0x0000000117623950
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
InstanceKlass::call_class_initializer(Thread*) 0x000000010bf22af7
InstanceKlass::initialize_impl(Thread*) 0x000000010bf2244f
Reflection::invoke_constructor(oopDesc*, objArrayHandle, Thread*) 0x000000010c1ebdbb
JVM_NewInstanceFromConstructor 0x000000010bfc14f6
<unknown> 0x0000000117623950
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761dae2
<unknown> 0x000000011761dcec
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) 0x000000010bf7e2af
jni_CallStaticVoidMethodV 0x000000010bf81c69
JNIEnv_::CallStaticVoidMethod(_jclass*, _jmethodID*, ...) jni.h:1521
main main.cpp:80
start 0x00007fff6f6563d5
start 0x00007fff6f6563d5

It doesn't stop in my code. I don't understand why this is happening, or if it's possible to avoid them, aside from ignoring all SIGBUSs.

I minimized my code and created the simplest example that reproduces the issue. Basically, I created a C++ project which starts a JVM through JNI with org/junit/platform/console/ConsoleLauncher as the main class (JUnit 5), and it runs one simple test. The SIGBUS still happens, and it happens before my test even runs.

I suspect something within JUnit, but not sure. Any way to get to the root cause?

Sample project for reproduction is here: https://github.com/tallavi/sigbus-reproduction

If I run it, you can see that the code stops running after the call to the java part, no "after call", no "CppMainEnd":

CppMainStart
current_path: /Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out
Loading JAR: jars/junit-platform-console-standalone-1.5.2.jar
Loading JAR: jars/.DS_Store
Loading JAR: jars/junit-platform-console-standalone-1.6.0-M1.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
Before call
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

.
+-- JUnit Jupiter [OK]
| '-- FirstTest [OK]
|   '-- myTest() [OK]
'-- JUnit Vintage [OK]

Test run finished after 154 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]


Process finished with exit code 0

If I just change the main from JUnit5 to my main and run the same code, everything works:

CppMainStart
current_path: /Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out
Loading JAR: jars/junit-platform-console-standalone-1.5.2.jar
Loading JAR: jars/.DS_Store
Loading JAR: jars/junit-platform-console-standalone-1.6.0-M1.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
Before call
main START
main END
After call
CppMainEnd

Process finished with exit code 0

I managed handling signals by @Oo.oO's advice, but it doesn't fix the issue of course. The java code finishes, but if I try to access that JVM, for example, destroying it, it hangs! [Image: The stack trace of the hang]

But if I let it run (not trying to debug it), it crashes with a different error:

main(31549,0x1177515c0) malloc: *** error for object 0x7ffee6360628: pointer being freed was not allocated
main(31549,0x1177515c0) malloc: *** set a breakpoint in malloc_error_break to debug

With this trace:

[Image: When destroying jvm after handling the signal]

Note that the SIGBUS doesn't always happen, but the code after that JVM call stops running 100% of the time.

I hope this makes sense to someone.

UPDATE: this is how it looks in lldb:

MyComputer:out tal$ lldb main
(lldb) target create "main"
Current executable set to 'main' (x86_64).
(lldb) r
Process 57274 launched: '/Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out/main' (x86_64)
CppMainStart
Process 57274 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x000000010b33f51b
->  0x10b33f51b: movl   (%rsi), %eax
    0x10b33f51d: leaq   0x30(%rbp), %rsi
    0x10b33f521: movl   $0x10000, %eax            ; imm = 0x10000
    0x10b33f526: andl   0x4(%rsi), %eax
Target 0: (main) stopped.
(lldb) c
Process 57274 resuming
CreateVM:       JVM loaded successfully!
Before call
Process 57274 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGBUS
    frame #0: 0x0000000112e263ff
->  0x112e263ff: testl  %eax, (%r10)
    0x112e26402: retq
    0x112e26403: nop
    0x112e26404: nop
Target 0: (main) stopped.
(lldb) c
Process 57274 resuming
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 2740 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

After call
before destroying
after destroying
CppMainEnd
Process 57274 exited with status = 0 (0x00000000)
TalL
  • Hi. Yep, could you please create a simple project to reproduce it – Alexey Dec 16 '19 at 20:42
  • While there are many possibilities, the most common causes of a SIGBUS (bus error) are unaligned data access (usually someone did a dodgy cast) or a write to a constant (usually someone did a dodgy cast). I can't speak for your case, but trying to continue after a SIGSEGV (segfault) is usually the wrong solution. Segfault means the program is horribly broken and accessing illegal storage. It needs to be fixed because the program is now unstable. Who can say what has been smashed? – user4581301 Dec 16 '19 at 21:03
  • @Alexey - here you go: https://github.com/tallavi/sigbus-reproduction If it doesn't raise a SIGBUS (after the SIGSEGV), I suggest renaming the test class over and over again. I don't know why, but it helps me reproduce the SIGBUSes. Hope you'll get it. – TalL Dec 16 '19 at 21:05
  • @user4581301 - yeah, that's why I want to get to the root cause even if it's not happening in my own code. I want to leave ignoring as a last resort. Although, if it's a fatal error, wouldn't you expect it to happen all the same on normal running without debugging? Why would it only happen on debug? It's like those are ghost signals or something. – TalL Dec 16 '19 at 21:08
  • Almost certainly some lurking Undefined Behaviour. Debugging changes things and in this case changes them enough that the bad code visibly blows up. Personally I consider this a win. Code that runs flawlessly (as far as you can tell) when debugging but doesn't outside the debugger is usually MUCH harder to find and kill. – user4581301 Dec 16 '19 at 21:12
  • I'm not that expert in C++ (even C is hard to write for me), but my educated guess would be to check `env->NewStringUTF` with just strings without `.c_str()`. As far as I understand that returns pointer to `std::string` instead of pointer to char array and can be the cause. – Alexey Dec 16 '19 at 21:54
  • Have you considered the possibility that these SIGBUS signals are integral to the working of your third party library? I know of at least one library that uses it to trap writes into mmapped memory and mark pages as dirty – Botje Dec 17 '19 at 07:14
  • @Alexey - thank you for your response. But do note that there's NOTHING in my code which causes this. If there were, you would see some of my methods in the stack trace. It's like it is caused by junit or the jvm or something, but I don't even have a lead. Have you tried to reproduce? – TalL Dec 17 '19 at 09:52
  • @Botje - thank you for your response. That's what I'm afraid of: that I have no choice other than ignoring all SIGBUSes, and by that perhaps masking actual issues. If at least I had a way of ignoring SIGBUSes coming from specific libraries but not from the java or c++ of my own code, that would be great. Currently I use 'pro hand -p true -s false SIGBUS'. Does anyone know if it's possible to limit the ignoring to a specific library? – TalL Dec 17 '19 at 09:55
  • If the third-party library is well-behaved it should restore the original SIGBUS handler when it exits. If that is the case, you can install a SIGBUS handler of your own and put a breakpoint in it. – Botje Dec 17 '19 at 09:59
  • @Botje - I tried a few examples of catching SIGBUS, but it doesn't get caught.. For example https://stackoverflow.com/questions/13834643/catch-sigbus-in-c-and-c – TalL Dec 17 '19 at 13:55
  • @Botje - Also, I'm not sure how this would help. Catching the SIGBUS at best would put me in the same position I'm already in - I see the stacktrace but can't make anything of it because it's not my code which throws it. – TalL Dec 17 '19 at 14:06
  • Assuming the library installs its own SIGBUS handler on entry and restores yours on exit, any remaining (java or your code) SIGBUS should go to *your* handler. – Botje Dec 17 '19 at 14:09
  • So it doesn't do that. The debugger stops on the signal, but my handler doesn't get called. – TalL Dec 17 '19 at 14:44

2 Answers


It might be hard to find without knowing exactly what env you have. There are multiple factors here:

  • boost version
  • Java version
  • compiler version
  • etc.

If I take your sample, strip it to bare minimum (like this)

# Linux

> g++ -o obj/main \
  -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux/ \
  -L${JAVA_HOME}/jre/lib/amd64/server -ljvm \
  -L${BOOST_LIB} -lboost_system -lboost_filesystem \
  -I$BOOST_INC src/main/cpp/main.cpp

> javac -cp jars/junit-platform-console-standalone.jar \
  -d target src/main/java/FirstTest.java

> jar cf jars/sigbus-reproduction.jar -C target .

> ./obj/main

or, slightly modified on macOS

# macOS

> g++ -std=c++11 -o obj/main \
  -I${JAVA_HOME}/include -I${JAVA_HOME}/include/darwin/ \
  -L${JAVA_HOME}/lib/server -rpath ${JAVA_HOME}/lib/server -ljvm \
  -L${BOOST_LIB} -rpath ${BOOST_LIB} -lboost_system -lboost_filesystem \
  -I$BOOST_INC src/main/cpp/main.cpp

it simply works as expected. Also, there are neither SIGSEGV nor SIGBUS inside gdb, lldb

> ./obj/main
CppMainStart
current_path: /Users/michalo/tmp/sigbus-reproduction
Loading JAR: jars/junit-platform-console-standalone.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5061 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

I guess, it may take time and effort to find somebody who can reproduce your issue.

Calling JUnit as method

#include <iostream>
...
...
...

int main(int argc, char **argv) {

  // make sure to store the original stdout
  // JVM (JUnit) will mess with it
  int old_stdout = dup(1);

  std::cout << "CppMainStart" << std::endl;

...
...
...

  env->SetObjectArrayElement(argsArray, 0, env->NewStringUTF("--class-path"));
  env->SetObjectArrayElement(argsArray, 1, env->NewStringUTF(V2X_FILE_NAME.c_str()));
  env->SetObjectArrayElement(argsArray, 2, env->NewStringUTF((std::string("--scan-classpath")).c_str()));

// instead of calling main, you can call execute

  jclass system_class     = env->FindClass( "java/lang/System");
  jfieldID field_id_out   = env->GetStaticFieldID(system_class, "out", "Ljava/io/PrintStream;");
  jobject field_id_out_v  = env->GetStaticObjectField(system_class, field_id_out);

  jfieldID field_id_err   = env->GetStaticFieldID(system_class, "err", "Ljava/io/PrintStream;");
  jobject field_id_err_v  = env->GetStaticObjectField(system_class, field_id_err);

  jmethodID execMethod = env->GetStaticMethodID(mainClass,
    "execute",
    "(Ljava/io/PrintStream;Ljava/io/PrintStream;[Ljava/lang/String;)Lorg/junit/platform/console/ConsoleLauncherExecutionResult;");

  jobject result = env->CallStaticObjectMethod(mainClass, execMethod, field_id_out_v, field_id_err_v, argsArray);

  jvm->DestroyJavaVM();

  // restore original stdout
  FILE *fp2 = fdopen(old_stdout, "w");
  *stdout = *fp2;

  std::cout  << "CppMainEnd" << std::endl << std::flush;

  return 0;
}

and here you go. There is CppMainEnd at the end.

> ./obj/main
CppMainStart
current_path: /Users/michalo/tmp/sigbus-reproduction
Loading JAR: jars/junit-platform-console-standalone.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5060 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

CppMainEnd

I'd suggest minimising your code. Make it as essential as possible. Otherwise, it will be hard for you to find the source of the issue.

If I run this kind of code (which is really close to the essence of JNI calls).

#include <cstdio>    // FILE, fdopen
#include <iostream>
#include <jni.h>
#include <unistd.h>  // dup

int main(int argc, char **argv) {

  int old_stdout = dup(1);

  std::cout << "Cpp Start" << std::endl;

  JavaVM *jvm;
  JNIEnv *env;
  JavaVMInitArgs vm_args;
  JavaVMOption* options = new JavaVMOption[1];

  options[0].optionString = const_cast<char *>("-Djava.class.path=jars/junit-platform-console-standalone.jar:jars/sigbus-reproduction.jar");
  vm_args.version = JNI_VERSION_1_6;
  vm_args.nOptions = 1;
  vm_args.options = options;
  vm_args.ignoreUnrecognized = false;

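  jint status = JNI_CreateJavaVM(&jvm, (void**)&env, &vm_args);
  if (status != JNI_OK) {
    // without a JVM, every env-> call below would dereference garbage
    std::cerr << "JNI_CreateJavaVM failed: " << status << std::endl;
    delete[] options;
    return 1;
  }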

  jclass mainClass = env->FindClass("org/junit/platform/console/ConsoleLauncher");

  jclass stringClass = env->FindClass("java/lang/String");

  jobject emptyStringObject = env->NewStringUTF("");

  jobjectArray argsArray = env->NewObjectArray(3, stringClass, emptyStringObject);

  env->SetObjectArrayElement(argsArray, 0, env->NewStringUTF("--class-path"));
  env->SetObjectArrayElement(argsArray, 1, env->NewStringUTF("jars/sigbus-reproduction.jar"));
  env->SetObjectArrayElement(argsArray, 2, env->NewStringUTF("--scan-classpath"));

  jclass system_class     = env->FindClass( "java/lang/System");
  jfieldID field_id_out   = env->GetStaticFieldID(system_class, "out", "Ljava/io/PrintStream;");
  jobject field_id_out_v  = env->GetStaticObjectField(system_class, field_id_out);

  jfieldID field_id_err   = env->GetStaticFieldID(system_class, "err", "Ljava/io/PrintStream;");
  jobject field_id_err_v  = env->GetStaticObjectField(system_class, field_id_err);

  jmethodID execMethod = env->GetStaticMethodID(mainClass,
    "execute",
    "(Ljava/io/PrintStream;Ljava/io/PrintStream;[Ljava/lang/String;)Lorg/junit/platform/console/ConsoleLauncherExecutionResult;");

  jobject result = env->CallStaticObjectMethod(mainClass, execMethod, field_id_out_v, field_id_err_v, argsArray);

  jvm->DestroyJavaVM();

  // restore original stdout
  FILE *fp2 = fdopen(old_stdout, "w");
  *stdout = *fp2;

  std::cout  << "CppMainEnd" << std::endl << std::flush;

  delete[] options;

  return 0;
}

there is nothing strange in the lldb

lldb obj/main
(lldb) target create "obj/main"
Current executable set to 'obj/main' (x86_64).
(lldb) run
Process 921 launched: '.../main' (x86_64)
Cpp Start
Process 921 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x000000010b33f51b
->  0x10b33f51b: movl   (%rsi), %eax
    0x10b33f51d: leaq   0x30(%rbp), %rsi
    0x10b33f521: movl   $0x10000, %eax            ; imm = 0x10000
    0x10b33f526: andl   0x4(%rsi), %eax
Target 0: (main) stopped.
(lldb) cont
Process 921 resuming
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5060 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

CppMainEnd
Process 921 exited with status = 0 (0x00000000)

Running multiple times

No matter how many times I run the code, there is no SIGBUS :(

You can easily run the code (thousands of times) like this:

--- 8< --- CUT HERE --- lldb_run --- 8< --- CUT HERE ---

target create main
break set -n main -C "process handle --pass true --stop false SIGSEGV" -C "continue"
run
script import os; os._exit(0)

--- 8< --- CUT HERE --- lldb_run --- 8< --- CUT HERE ---

and then, running it in the loop: for i in {1..100}; do lldb --source ./lldb_run; done

Oo.oO
  • Thanks for the reply. Have you tried debugging? Everything works perfectly when running without debug. No SIGSEGV, no SIGBUS. I understand the SIGSEGV when starting a JVM is a known issue, but the SIGBUS is of course not a known issue. – TalL Dec 18 '19 at 17:05
  • Source for the known issue of SIGSEGV when starting a JVM: https://stackoverflow.com/questions/13132669/strange-sigsegv-while-calling-java-code-from-c-through-jni – TalL Dec 18 '19 at 17:07
  • `SIGSEGV` is used by JVM to throw exceptions. When it comes to "catching" `SIGSEGV/SIGBUS` you can try this approach: http://jnicookbook.owsiak.org/recipe-No-015/ - note that this is more like surviving nasty stuff from third party libraries (keeping JVM alive for a while) rather than turning it into a general approach. As for your issue, I can't reproduce it in either `gdb` or `lldb` – Oo.oO Dec 18 '19 at 19:00
  • Well, I guess it's some specific combination then. My versions, if you have any way to try and reproduce: MacOS 10.14.6 Clion 2019.3 Java 11.0.4 2019-07-16 LTS JUnit versions I tried: 1.5.1, 1.5.2, 1.6.0-M1 lldb-1001.0.13.3 – TalL Dec 18 '19 at 20:11
  • Sorry mate :( No chance for me :( I am already at 10.15. Please, also note that you are using boost as well. Question is whether it was compiled for your macOS version, whether it's most recent one, etc. I suggest to install boost from sources: http://www.owsiak.org/installing-boost-at-macos/ – Oo.oO Dec 18 '19 at 20:38
  • Thanks @Oo.oO, but I noticed something interesting! Notice that your message has CppMainStart but no CppMainEnd! It did crash. I added more messages (before call, after call). I see the before call, but don't see the after call! It just stops running my code. You can change a single thing - just change the main in the start of the file to FirstTest instead of JUnit and see that everything works! – TalL Dec 19 '19 at 12:45
  • I updated the questions with lots more information on what's happening and how you can see the problem for yourself. – TalL Dec 19 '19 at 12:58
  • It has nothing to do with crash :) Please note that you are calling method `main` inside class `org/junit/platform/console/ConsoleLauncher` where you have `System.exit(exitCode);` at the very end of `main`. It means your code never gets back to `C++` part after calling tests. If you want to get back to your `C++` code you need to call `execute` in this class, instead. – Oo.oO Dec 19 '19 at 13:17
  • Wow, I wasn't aware of that. I updated the project. You might be surprised to know that even with execute instead of main, the code doesn't return to the c++ side after finishing the java. It doesn't even print the log line which is after the junit call (still in java)! It's like it crashes silently after the tests are done. You can change CALL_METHOD to a method without junit to see that without it, it returns to cpp. Also, I made a main in the java, to show that if you run the same code without starting a jvm from c++, it does finish and prints the log line at the end. Something is fishy. – TalL Dec 19 '19 at 21:23
  • Also, note that linking with shared libraries on macOS might be tricky. Especially with `Java`. You may end up with different shared library being loaded at the end. Double check that you are using proper one: `DYLD_PRINT_LIBRARIES=1 ./main` and take a look at your binary - to what it is linked to: `otool -L ./main` – Oo.oO Dec 20 '19 at 05:37
  • Checked, nothing too weird there. And also to refute the theory that boost somehow causes this, I removed boost (it was used to build the class path, I just wrote it explicitly). Same problem. Have you tried my latest version? I believe it shows without doubt that JUnit is the problem. You can see there that running from java works (runs code after the test), but if you run in c++ it doesn't run the code after the test, BUT if you run from c++ the same exact code, just without JUnit, it also runs! – TalL Dec 20 '19 at 20:07
  • Just to be clear - now that I removed boost, there are no more c++ dependencies aside from JDK, and in the java side the only dependency is JUnit. – TalL Dec 20 '19 at 20:08
  • If you think this is JUnit who is responsible for the bug, try different version, make sure it fails with a different release as well. Also, note that you are deleting `options` quite early in the process. To be honest, I am not quite sure whether options are copied inside JVM or not - when it is created. Anyway, mate, good luck with this one :) – Oo.oO Dec 20 '19 at 21:50
  • Wow, thanks for all the tweaks! I tried them all. Moving the delete option to the end, calling execute directly from the c++ (before, I called a main method in java which did this, because I'm bad at writing JNI wrappers for methods with lots of parameters). I also stripped out boost and explicitly wrote the classpath. I tried 3 different versions of junit + compiling the source on my own. No change. The part about restoring the stdout did work! I don't know why I need to do it, but it does indeed write the end logs now. – TalL Dec 22 '19 at 09:41
  • I updated my question with my output of lldb. I see you also got the SIGSEGV as this is apparently expected behaviour, but I also get SIGBUS after this. Note that I get the SIGBUS ONLY SOMETIMES! If you don't mind trying 10 times and see if you get the SIGBUS at least once, that would be great.. – TalL Dec 22 '19 at 09:43

You are incorrectly assuming that signals such as SIGSEGV or SIGBUS indicate a problem in Java. You are also likely breaking things such as null pointer detection.

Why am I seeing SIGSEGV when I strace a Java application on Linux?!

Main Article

Most people that have used Unix for any amount of time are familiar with occasionally seeing "Segmentation Fault (core dumped)" from poorly written programs. If that's all you knew about Unix and you looked at the output of strace on a Java process you'd think something was seriously wrong ("Wow, look at all these segfaults. Those guys at Sun/Oracle must be terrible programmers and they don't know what the hell they're doing!").

The real story is quite different - SIGSEGV in a Java process is almost always perfectly normal and completely safe.

...

The JVM is a multi-threaded process and so under the covers it's using signals to do OS level threading. ...

...

Signal Description

  • SIGSEGV, SIGBUS, SIGFPE, SIGPIPE, SIGILL Used in the implementation for implicit null check, and so forth.
  • SIGQUIT Thread dump support: To dump Java stack traces at the standard error stream. (Optional.)

...

Table stolen wholesale from http://download.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/signals.html

Per that link:

6.1 Signal Handling on Solaris OS and Linux

The HotSpot Virtual Machine installs signal handlers to implement various features and to handle fatal error conditions. For example, in an optimization to avoid explicit null checks in cases where java.lang.NullPointerException will be thrown rarely, the SIGSEGV signal is caught and handled, and the NullPointerException is thrown.

In general there are two categories of situations where signal/traps arise.

  • Situations in which signals are expected and handled. Examples include the implicit null handling cited above. Another example is the safepoint polling mechanism, which protects a page in memory when a safepoint is required. Any thread that accesses that page causes a SIGSEGV, which results in the execution of a stub that brings the thread to a safepoint.

  • Unexpected signals. This includes a SIGSEGV when executing in VM code, JNI code, or native code. In these cases the signal is unexpected, so fatal error handling is invoked to create the error log and terminate the process.

If you need to handle fatal signals, see Signal Handling on Linux when using Java/JNI.

Andrew Henle
  • Hi Andrew, thanks for the reply. Your two points at the beginning are contradicting each other, aren't they? Either the SIGBUSes are not a problem, or I'm breaking stuff like null referencing. How can it be both? Anyway, I understand that the jvm uses signals not just for critical issues, and the SIGBUS indeed doesn't crash the process, but it's impossible to debug this way unless I ignore all SIGBUSes using 'pro hand -p true -s false SIGBUS', and I'm wondering if that's the correct approach or if I'm just sweeping it under the rug here. – TalL Dec 22 '19 at 09:36
  • @TalL They are not contradictory. `SIGBUS` in the JVM is not necessarily a problem - in fact, the JVM installs signal handlers for those signals. Read the quote: "The HotSpot Virtual Machine installs signal handlers to implement various features and to handle fatal error conditions." When you install your own signal handlers, you break the JVM. You've already seen that: "I managed handling signals by @Oo.oO's advice, but it doesn't fix the issue of course. The java code finishes, but if I try to access that JVM, for example, destroying it, it hangs!" – Andrew Henle Dec 22 '19 at 12:03
  • What I don't get is that if the signals are handled, why lldb also stops on them and how to prevent this behaviour. – TalL Dec 22 '19 at 15:54