2

I have a fairly complex project to migrate from C++ (Linux) to Java. Currently, the C++ version is distributed as a shared library (.so) accompanied by a header file for the top-level interface class. The implementation details are fully hidden from the end user.

This question is not about porting the C++ code to Java, but rather about creating a similar distribution package.

Let's assume I have a very simple 'public' class in C++, topapi.h:

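#include <string>

class TopApi
{
public:
  // `do` is a reserved keyword in C++, so the method is named doWork() here.
  void doWork( const std::string& v );
};

The actual implementation is hidden from the API user. The actual project may contain another 100 files/classes that doWork() will call.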

The distribution will contain 2 files: topapi.so and topapi.h

Users will #include "topapi.h" in their code, and link their applications with topapi.so.

The questions are:
1. How can I achieve a similar effect in Java (hide the IP-related code)?
2. How do I show the public methods to the user? (This is not related to code protection; I just want a Java version of the header file above.)

pjs
  • 3
    *"how can I achieve a similar effect in Java?"* The same way you do with any other proprietary product: an NDA and anti-reverse-engineering clause in the license agreement. – cdhowie Mar 31 '17 at 20:10
  • 2
    Is this really a question about obfuscation? Or do you simply want to expose only the public API to the user? – Jorn Vernee Mar 31 '17 at 20:29
  • 1
    Great question! Actually, it's both. And while I did get the answer on the first one (obfuscation), the second one remains open. I will try to update the question. – Oleg Erlikh Mar 31 '17 at 20:32

2 Answers

2

Check out ProGuard. It will at least obfuscate the jar file, which otherwise is basically human readable. It's not absolutely safe from reverse engineering, but I guess neither is a .so file.

I'm not an expert with Java, but this is what we have done to protect implementations in the past.

I don't know exactly what the motivations are for a Java port, but if it is just to support a Java end user, you could consider a JNI wrapper. I guess this probably isn't the case, but I thought I would mention it.

As for exposing interface code to the user, you could write a Java interface (like a pure virtual abstract C++ class) and simply exclude that interface from ProGuard obfuscation.
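A minimal sketch of that idea, assuming nothing about the real project: the names `TopApiSketch`, `TopApiImpl`, `create` and `doWork` are hypothetical (`do` is a reserved word in Java, so the method is called `doWork` here). Everything is nested in one class only so that the example fits in a single file. In a real library the interface and the factory would be public top-level types, and a ProGuard rule along the lines of `-keep public interface com.example.topapi.TopApi { *; }` (package name made up) would leave them readable while the rest is obfuscated.

```java
/** Single-file sketch; a real library would split these types into separate files in one package. */
final class TopApiSketch {

    private TopApiSketch() {}

    /** Public contract: the Java counterpart of topapi.h. */
    public interface TopApi {
        void doWork(String v);
    }

    /** Factory: the only way for callers to get a TopApi instance. */
    public static TopApi create() {
        return new TopApiImpl();
    }

    /** Hidden implementation; callers never see this type by name. */
    private static final class TopApiImpl implements TopApi {
        @Override
        public void doWork(String v) {
            // The proprietary computation would go here (placeholder).
        }
    }
}
```

A caller would write `TopApiSketch.TopApi api = TopApiSketch.create(); api.doWork("input");` and would only ever compile against the interface.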

Mitchell Kline
  • 1
    Thank you. This is for a Java end user, and we did consider the JNI option. The problem is that performance is critical in our case. The existing code (C++ at the moment) would be called from a Java environment, do some heavy calculations, and return massive results to the caller. I'm afraid that in this case JNI will completely kill the performance. By the way, a .so file is *relatively* safe from reverse engineering if generated with the corresponding flags and, obviously, with debug info removed. – Oleg Erlikh Mar 31 '17 at 20:17
  • I just ported a bunch of performance-critical numerical computations from Java to C++. Again, I'm not a Java expert, but the JNI layer may not be your only problem. In my case, I'm working with an EKF with moderately large matrices (20-40 elements squared). I'm pretty sure there are ways to make Java competitive, but there were other reasons we would have liked the C++ implementation anyway. – Mitchell Kline Mar 31 '17 at 20:41
1

To answer the question of how to show public methods to the user: this is usually done by declaring the internal classes without an access modifier, which makes them accessible only from within the same package, and by not documenting them. Don't depend on the access restriction for protection, though. It is easy to circumvent, but it does send the message to the user that those classes are internal.
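As a rough illustration of both halves of that point, the sketch below declares a class with no access modifier and then reaches it through reflection anyway. `InternalEngine`, `ReflectiveCaller`, `compute` and `callCompute` are hypothetical names. The sketch assumes plain classpath code (no named modules); in this single file the two classes share a package, so `setAccessible` is not strictly needed here, but the same calls are what code in a foreign package would use.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

/** Internal helper: no access modifier, so it is visible only inside its own package. */
class InternalEngine {
    InternalEngine() {}

    String compute(String input) {
        return "computed:" + input;
    }
}

/** Shows that reflection can still reach package-private members. */
class ReflectiveCaller {
    static String callCompute(String className, String input) {
        try {
            // Look the class up by name, as code outside the package would have to.
            Class<?> hidden = Class.forName(className);
            Constructor<?> ctor = hidden.getDeclaredConstructor();
            ctor.setAccessible(true); // suppress the normal access check
            Object engine = ctor.newInstance();
            Method compute = hidden.getDeclaredMethod("compute", String.class);
            compute.setAccessible(true);
            return (String) compute.invoke(engine, input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection failed", e);
        }
    }
}
```

So the missing modifier keeps honest callers out of the internals at compile time, but it is a convention rather than a lock.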

Java 9 adds modules, which allow you to encapsulate entire packages, but it's not here yet, and you would still be able to circumvent the encapsulation.
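For orientation, a module declaration might look roughly like the fragment below. The module and package names are hypothetical, and this is only the `module-info.java` file, not a complete project: it compiles only alongside sources that actually contain the exported package.

```java
// module-info.java (sketch; names are made up for illustration)
module com.example.topapi {
    // Only this package is visible to code outside the module.
    exports com.example.topapi.api;
    // Packages that are not exported, e.g. com.example.topapi.internal, stay encapsulated.
}
```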

One side effect of ahead-of-time compilation (usually the case with C++) is that the distributed code is already optimized and contains no metadata, so it's harder to reverse engineer. Java is distributed in an intermediate language (bytecode), and the actual machine code is generated at runtime (JIT compilation). The intermediate language is practically unoptimized, so it's easier to reverse engineer. Java also merges the idea of header files and source files: a .class file contains all the metadata you need to use it.

Jorn Vernee
  • Thank you, I will look into that (declaring internal classes). I think this is exactly what I need. I realize it will not really protect the code, but it will at least create a relatively 'clean' visible layer for the end user. – Oleg Erlikh Mar 31 '17 at 20:59