I need to find out, from within a Java program, what the default file encoding is on a remote Java VM.

Is there a way to execute Charset.defaultCharset() on the remote vm and get its value back... without altering the program running on the remote jvm?

Update:

I am trying to find out what the default Charset is for a WebLogic 11g or WebLogic 12c server... that I did not start, cannot restart and I do not have the 'right' to deploy code onto it.

I also need to be able to determine the default Charset of the server process from inside a Java program that I am writing. It may execute on the same machine as the server... or not. It is very doubtful that the server and my program will start with the same environment.

I would prefer a method which depends on very few assumptions... so that usually means more code...

I probably cannot execute Charset.defaultCharset() on the server... so I should not have said 'execute Charset.defaultCharset()'. Sorry about that folks. I need to do something that will provide the answer that is as correct as executing Charset.defaultCharset() from inside the server process.

vkraemer
  • Why can't you modify the program? – Alex W Feb 17 '13 at 16:02
  • @alex-w : I do not have the source for it – vkraemer Feb 17 '13 at 16:06
  • @Jayan : VisualVM has an API that I can use in my Java program? – vkraemer Feb 17 '13 at 16:14
  • @vkraemer : It comes with a lot of plugins; I'm not sure how to add it to a Java program. There may be something useful to you. What is the platform? – Jayan Feb 17 '13 at 16:17
  • @Jayan : jdk 6, Hotspot VM... it would be nice if the answer would apply to JRockit, but that is not a primary requirement. – vkraemer Feb 17 '13 at 16:28
  • I think you need to add more information to your question about the constraints for a solution. Is this a problem you have with a single deployment, or do you need a generic and scalable solution for many deployments? Do you have access to the machine running the application? If yes, what can/can't you do and what OS is it? Is the app running inside a container of some form (servlet, EJB, OSGi)? How strict is 'without altering the program', eg. can you start the JVM process with an altered CLASSPATH? Can you influence the startup of the application in any way at all? etc... – unthought Feb 17 '13 at 19:14

3 Answers

3

Edit: After writing my answer, I discovered that it's at least partially based on a faulty assumption, in that Charset.defaultCharset() isn't guaranteed to always return the same value. Some of the approaches below should still work provided that they're tried on the same host as the target application, but I certainly recommend also reading the first two answers of this question for more background.

In particular it might be easier to forcibly override file.encoding instead of trying to figure out what it actually is.


As the javadoc of defaultCharset states:

The default charset is determined during virtual-machine startup and typically depends upon the locale and charset of the underlying operating system.

This means that defaultCharset() is read-only inside the JVM process and will return the same charset for all JVM processes started on the same machine, unless their environment has been changed explicitly prior to starting the process (eg. a wrapper/launcher script starting the JVM and setting a different locale for the current process and its children). If you're sure that the two processes are started in the same way, then Charset.defaultCharset() in your own process should return the same Charset as in the application you're asking about.

With that as a backdrop, and in increasing order of annoyance/effort:

  1. If your host is running Unix/Linux, try procfs. Eg. /proc/<vmpid>/environ and /proc/<vmpid>/cmdline (on Linux) would be great places to start because they show you how the process was actually started without the obfuscation of a wrapper script. This solution also gets bonus points because it doesn't need you to restart/alter the application for inspection. Things to look out for: LANG and LC_* variables (intro to locale on Linux) and JVM command line parameters affecting the locale. Other operating systems will likely also have some form of process inspection that you can use to show this information.

  2. Next up: Compile and run this on the particular host/JVM:

    import java.nio.charset.Charset;
    
    public class DumpCharset {
      public static void main(String[] args) {
        System.out.println(Charset.defaultCharset().displayName());
      }
    }
    

    As mentioned, if the processes are started in the same way, Charset.defaultCharset() should return the same value (on the same host). To get very close, you could even replace the application's jar containing the main method with a jar containing the above code temporarily (make sure the class names match).

  3. If that doesn't give you information you need (it should), try launching the process so it accepts a debugger, attach the debugger, and then drill down into the locale, and/or execute expressions similar to the above code.

  4. If that still doesn't give you the info you need, then you can go radical and use dynamic bytecode weaving at class-load time. This could be achieved with an existing AOP framework based on load time weaving (eg. AspectJ), or directly with ASM 4 and the java.lang.instrument API. Be aware that there are pitfalls to make this work, so it's hard to judge whether this is going to be reasonably straightforward or not in your case. But expect it to be (much?) more work than the above methods.
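The procfs approach from item 1 can be sketched in Java roughly as follows. This is only an illustration under several assumptions: the inspecting program runs on the same Linux host as the target JVM, it has permission to read the target's /proc entries, and it runs on Java 8 or later. ProcEncodingHints and its method names are hypothetical, not part of any library. It only collects the entries that can influence the default charset (LANG, LC_*, JAVA_TOOL_OPTIONS and a -Dfile.encoding argument); interpreting them is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ProcEncodingHints {

  // /proc/<pid>/environ and /proc/<pid>/cmdline are NUL-separated lists.
  static List<String> splitNul(byte[] raw) {
    List<String> entries = new ArrayList<>();
    for (String s : new String(raw, StandardCharsets.ISO_8859_1).split("\0")) {
      if (!s.isEmpty()) {
        entries.add(s);
      }
    }
    return entries;
  }

  // Keeps only the entries that can influence the JVM's default charset.
  static List<String> hints(List<String> entries) {
    List<String> relevant = new ArrayList<>();
    for (String e : entries) {
      if (e.startsWith("LANG=") || e.startsWith("LC_")
          || e.startsWith("JAVA_TOOL_OPTIONS=")
          || e.startsWith("-Dfile.encoding=")) {
        relevant.add(e);
      }
    }
    return relevant;
  }

  // Linux only; fails with UncheckedIOException if the files are unreadable.
  static List<String> forPid(long pid) {
    try {
      List<String> all = new ArrayList<>();
      all.addAll(splitNul(Files.readAllBytes(Paths.get("/proc/" + pid + "/environ"))));
      all.addAll(splitNul(Files.readAllBytes(Paths.get("/proc/" + pid + "/cmdline"))));
      return hints(all);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling ProcEncodingHints.forPid(pid) with the server's process id returns the raw hints. If the list is empty, the JVM most likely fell back to whatever the operating system reports without any explicit locale setting.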

unthought
  • It looks like all of these approaches hinge on the same "if": the 'local' Java program and the 'remote' Java program are started the same way. – vkraemer Feb 17 '13 at 18:25
  • The last two don't, and I was always talking about the remote process, as in 'the same host'. – unthought Feb 17 '13 at 18:39
0

I suggest using System.getProperty("os.name") and System.getProperty("os.arch") to identify the remote architecture.

The default Charset may be useful too:

    java.nio.charset.Charset cs = java.nio.charset.Charset.defaultCharset();

Aubin
0

Here is what I ended up doing... (roughly)

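  import java.lang.management.ManagementFactory;
  import javax.management.MBeanServerConnection;
  import javax.management.ObjectName;
  import javax.management.openmbean.CompositeData;
  import javax.management.openmbean.TabularData;

  // conn is an already-connected javax.management.remote.JMXConnector
  MBeanServerConnection mbs = conn.getMBeanServerConnection();
  ObjectName runtime = new ObjectName(ManagementFactory.RUNTIME_MXBEAN_NAME);
  // SystemProperties arrives as tabular data: one row per system property
  TabularData props = (TabularData) mbs.getAttribute(runtime, "SystemProperties");
  String retVal = null;
  for (Object row : props.values()) {
    CompositeData cd = (CompositeData) row;
    // each row carries the items "key" and "value"
    if ("file.encoding".equals(cd.get("key"))) {
      retVal = String.valueOf(cd.get("value"));
      break;
    }
  }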

I connected to the MBeanServer and then worked through the SystemProperties to find the file.encoding for the process on the other end of the connection.
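The same lookup can probably be written with a typed proxy instead of walking the open-type data by hand. The following is a minimal sketch, not the code I actually ran: RemoteFileEncoding and its method names are made up for illustration, the service URL is a placeholder that depends on how the server exposes JMX, and it assumes Java 8 or later on the client and that the platform MXBeans are reachable over that connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for illustration only.
class RemoteFileEncoding {

  // Reads file.encoding through a proxy for the RuntimeMXBean on the other
  // end of the connection; returns null if the property is not set there.
  static String fileEncoding(MBeanServerConnection mbs) {
    try {
      RuntimeMXBean runtime = ManagementFactory.newPlatformMXBeanProxy(
          mbs, ManagementFactory.RUNTIME_MXBEAN_NAME, RuntimeMXBean.class);
      return runtime.getSystemProperties().get("file.encoding");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // serviceUrl is an assumption: its exact form depends on the server setup.
  static String fileEncodingAt(String serviceUrl) {
    try (JMXConnector conn = JMXConnectorFactory.connect(new JMXServiceURL(serviceUrl))) {
      return fileEncoding(conn.getMBeanServerConnection());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Note that this reports the file.encoding system property of the remote process, which is the value the JVM normally uses to pick its default charset at startup. It is not literally the result of calling Charset.defaultCharset() over there.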

vkraemer