In short, this is what I am trying to accomplish: I want to effectively manage a very large number of threads (billions over the run of the program). Each one finishes quickly, but I am producing arrays to process at a high rate, and if I don't hand the data to a thread right away, the array grows so large that it causes segfaults. The threads pass data to JNI if necessary, and they need to be stored in a vector.

I've been facing two problems:

The first is that if I try to spawn more than around 45 threads that all run JNI simultaneously, Java crashes. If they aren't all running at the same time, it works fine, but I get lots of complaints from the GC that it isn't getting enough memory, although that doesn't seem to affect anything.

The second is that if I spawn threads at my current rate, the vector I use to track and later join them gets too large.

So, in conclusion, I need a fast way to keep track of the threads I am creating, with as little time overhead as possible.

//g++ -std=c++11 -I/usr/lib/jvm/java-8-openjdk/include -I/usr/lib/jvm/java-8-openjdk/include/linux cpptest/Test.cpp -L/usr/lib/jvm/java-8-openjdk/jre/lib/amd64/server -ljvm -lpthread
#include <jni.h>
#include <iostream>
#include <thread>
#include <string.h>
#include <vector>
#include <chrono>
#include <mutex>
#include <fstream>
#include <algorithm>

jclass cls;
jmethodID mid;
JNIEnv* env;
JavaVM* jvm;
std::mutex m;

typedef struct {
    long seed;
    int chunkX;
    int chunkZ;
    int eyes;
} Stronghold;

void ThreadFunc(Stronghold strhld, std::ofstream *outfile) {
  jvm->AttachCurrentThread((void**)&env, NULL);
  jlongArray rt = (jlongArray)env->CallStaticLongMethod(cls, mid, (jlong)strhld.seed, (jint)strhld.chunkX, (jint)strhld.chunkZ, (jint)strhld.eyes);
  jsize size = env->GetArrayLength(rt);
  std::vector<long> rtVec(size);
  env->GetLongArrayRegion(rt, 0, size, &rtVec[0]);
  jvm->DetachCurrentThread();
  std::string write;
  m.lock();
  for (long &element : rtVec) {
    write = std::to_string(element) + "; ";
    *outfile << write;
  }
  *outfile << std::endl;
  m.unlock();
}

int main(int argc, char* argv[]) {
  std::ofstream outfile("./new.txt", std::ofstream::binary);
  std::vector<std::thread> threads;

  const int kNumOptions = 3;
  JavaVMOption options[kNumOptions] = {
    { const_cast<char*>("-Xmx512m"), NULL },
    { const_cast<char*>("-verbose:gc"), NULL },
    { const_cast<char*>("-Djava.class.path=/home/jewe37/Desktop/"), NULL }
  };

  JavaVMInitArgs vm_args;
  vm_args.version = JNI_VERSION_1_8;
  vm_args.options = options;
  vm_args.nOptions = sizeof(options) / sizeof(JavaVMOption);

  env = NULL;
  jvm = NULL;
  JNI_CreateJavaVM(&jvm, reinterpret_cast<void**>(&env), &vm_args);

  const char* kClassName = "Processor";
  cls = env->FindClass(kClassName);
  if (cls == NULL) {
    std::cerr << "FINDCLASS" << std::endl;
    return 1;
  }

  const char* kMethodName = "ProcessSeed";
  mid = env->GetStaticMethodID(cls, kMethodName, "(JIII)[J");
  if (mid == NULL) {
    std::cerr << "FINDMETHOD" << std::endl;
    return 1;
  }

  Stronghold strhld;

  for (int i = 0; i < std::stoi(argv[1]); i++) {
    strhld = {i, i*2, i*3, i*4};
    threads.emplace_back(ThreadFunc, strhld, &outfile);
    std::this_thread::sleep_for(std::chrono::microseconds(50));
  }

  std::cout << threads.size() << std::endl;

  for (std::thread &thread : threads) if (thread.joinable()) thread.join();

  jvm->DestroyJavaVM();

  outfile.close();
  return 0;
}
JeWe37
  • It sounds like you're looking for a thread pool. Creating a new thread is comparatively expensive, and you're doing an insane amount of that. Moreover, having more concurrent threads than you have execution resources with which to run them is not usually helpful, and can be harmful. Anyway, I don't see the value of getting so far ahead of actual execution of your tasks. Put them in a bounded queue, so that whatever thread is generating them blocks when it gets too far ahead. That could have the additional benefit of freeing up resources with which to run the actual jobs. – John Bollinger May 30 '17 at 15:41
  • @JohnBollinger Having this many threads seemed like the best option to deal with the output of the other computations coming in every ~350us. I definitely do need to use threads, though, since I need to separate this out so the rest can keep going. I'll see if a thread pool is going to work in my specific case. – JeWe37 May 30 '17 at 15:52
  • That your vector is growing so large suggests that execution of your tasks is not keeping up with their generation. It is possible that removing the overhead of creating and joining threads (by using a thread pool) will help with that, but if that's not enough then you have a deeper problem. – John Bollinger May 30 '17 at 16:00
  • @JeWe37 There is no way that you have 1 billion threads all running at the same time (because that would take up at least 4TB of RAM). Rather than starting and stopping a bunch of threads, you should keep a fixed number (e.g. 32) of threads running and distribute work to them. – Tavian Barnes May 30 '17 at 20:31
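
The thread pool and bounded queue suggested in the comments could look roughly like the sketch below. It is not code from the question: `BoundedThreadPool`, `Submit`, and `RunDemo` are hypothetical names, the pool size of 4 and the queue capacity of 16 are arbitrary, and the capacity is assumed to be at least 1. `Submit` blocks the producer while the queue is full, which is what keeps the backlog from growing without limit.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool with a bounded job queue (capacity >= 1 assumed).
class BoundedThreadPool {
public:
    BoundedThreadPool(std::size_t workers, std::size_t capacity)
        : capacity_(capacity) {
        for (std::size_t i = 0; i < workers; ++i) {
            workers_.emplace_back([this] { WorkerLoop(); });
        }
    }

    // Lets the workers drain the remaining jobs, then joins every worker.
    ~BoundedThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    BoundedThreadPool(const BoundedThreadPool&) = delete;
    BoundedThreadPool& operator=(const BoundedThreadPool&) = delete;

    // Blocks the caller while the queue is full (back-pressure on the producer).
    void Submit(std::function<void()> job) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return jobs_.size() < capacity_; });
        jobs_.push(std::move(job));
        lock.unlock();
        not_empty_.notify_one();
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                not_empty_.wait(lock, [this] { return closed_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // closed and fully drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            not_full_.notify_one();  // a slot is free again
            job();                   // run the job outside the lock
        }
    }

    std::size_t capacity_;
    bool closed_ = false;
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::condition_variable not_full_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;  // declared last so it starts last
};

// Submits `jobs` trivial tasks and returns how many of them ran.
int RunDemo(int jobs) {
    std::atomic<int> done{0};
    {
        BoundedThreadPool pool(4, 16);
        for (int i = 0; i < jobs; ++i) {
            pool.Submit([&done] { ++done; });
        }
    }  // the destructor waits for all queued jobs
    return done.load();
}
```

In the question's program, each worker would presumably attach to the JVM once at the start of `WorkerLoop` and detach once before returning, rather than attaching for every job, and the lambda passed to `Submit` would make the JNI call.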

2 Answers

You cannot share JNIEnv between threads. It must be per-thread. Make env local to ThreadFunc(). This question is answered thoroughly here. Also do not forget to detach your native thread before it quits.

Wheezil

You can always use a mutex to make sure the JVM is accessed by only one thread at a time:

http://jnicookbook.owsiak.org/recipe-no-027/
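
As a rough illustration of that idea (not the code from the linked recipe), the sketch below funnels every call through one `std::mutex`. `jvm_mutex`, `WithJvmLock`, and `RunSerialized` are hypothetical names, and the counter increment is only a stand-in marking where the JNI call would go. Serializing the calls this way gives up any parallelism inside the JVM, and each thread still needs its own `JNIEnv`.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex jvm_mutex;  // hypothetical: one lock guarding every call into the JVM

// Runs fn while holding the lock; the lock is released even if fn throws.
template <typename Fn>
auto WithJvmLock(Fn&& fn) -> decltype(fn()) {
    std::lock_guard<std::mutex> lock(jvm_mutex);
    return fn();
}

// Stand-in workload: the increment marks where the JNI call would go.
int RunSerialized(int threads, int calls_per_thread) {
    int counter = 0;  // plain int: safe only because every increment holds the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, calls_per_thread] {
            for (int i = 0; i < calls_per_thread; ++i) {
                WithJvmLock([&counter] { ++counter; });
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return counter;
}
```

Using `std::lock_guard` instead of paired `lock()` and `unlock()` calls means the mutex cannot stay locked if the guarded code exits early.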

Oo.oO