
I've got a rather large Java EE application with a huge classpath that does a lot of XML processing. Currently I am trying to speed up some of my functions and locate slow code paths with sampling profilers.

One thing I noticed is that the parts of our code that make calls like TransformerFactory.newInstance(...) are extremely slow. I tracked this down to the FactoryFinder method findServiceProvider always creating a new ServiceLoader instance. In the ServiceLoader Javadoc I found the following note about caching:

Providers are located and instantiated lazily, that is, on demand. A service loader maintains a cache of the providers that have been loaded so far. Each invocation of the iterator method returns an iterator that first yields all of the elements of the cache, in instantiation order, and then lazily locates and instantiates any remaining providers, adding each one to the cache in turn. The cache can be cleared via the reload method.

So far so good. This is part of OpenJDK's FactoryFinder#findServiceProvider method:

private static <T> T findServiceProvider(final Class<T> type)
        throws TransformerFactoryConfigurationError
{
    try {
        return AccessController.doPrivileged(new PrivilegedAction<T>() {
            public T run() {
                final ServiceLoader<T> serviceLoader = ServiceLoader.load(type);
                final Iterator<T> iterator = serviceLoader.iterator();
                if (iterator.hasNext()) {
                    return iterator.next();
                } else {
                    return null;
                }
            }
        });
    } catch (ServiceConfigurationError e) {
        ...
    }
}

Every call to findServiceProvider calls ServiceLoader.load, which creates a new ServiceLoader each time. Because the provider cache belongs to the ServiceLoader instance, it seems that ServiceLoader's caching mechanism is never used at all: every call scans the classpath again for the requested service provider.

What I've already tried:

  1. I know you can set a system property like javax.xml.transform.TransformerFactory to specify a specific implementation. This way FactoryFinder skips the ServiceLoader lookup and it's super fast. Sadly, this is a JVM-wide property and affects the other applications running in the same JVM. For example, my application ships with Saxon and should use com.saxonica.config.EnterpriseTransformerFactory. I've got another application which does not ship with Saxon. As soon as I set the system property, that other application fails to start, because there is no com.saxonica.config.EnterpriseTransformerFactory on its classpath. So this does not seem to be an option for me.
  2. I already refactored every place in my own code where TransformerFactory.newInstance is called so that the TransformerFactory is cached. But there are various places in my dependencies where I cannot refactor the code.
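For illustration, the caching described in point 2 might look roughly like the sketch below. It assumes a per-thread cache, because a TransformerFactory implementation is not guaranteed to be thread-safe. The class name CachedTransformerFactory is hypothetical and does not come from the JDK or from my code base.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical helper (name invented for this sketch): pays the
// ServiceLoader lookup once per thread instead of once per call.
final class CachedTransformerFactory {

    // One factory per thread, since implementations are not
    // guaranteed to be safe for concurrent use.
    private static final ThreadLocal<TransformerFactory> FACTORY =
            ThreadLocal.withInitial(TransformerFactory::newInstance);

    private CachedTransformerFactory() {
    }

    // Returns the factory cached for the calling thread.
    static TransformerFactory get() {
        return FACTORY.get();
    }
}
```

Call sites would then use CachedTransformerFactory.get().newTransformer() instead of TransformerFactory.newInstance().newTransformer(). One caveat: in a servlet container, ThreadLocal values held by pooled threads can keep a web application's class loader alive after a redeploy, so a cache with an explicit lifecycle may be a better fit there.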

My questions are: Why does FactoryFinder not reuse a ServiceLoader? Is there a way to speed up this whole ServiceLoader process other than using system properties? Couldn't this be changed in the JDK so that a FactoryFinder reuses a ServiceLoader instance? Also, this is not specific to a single FactoryFinder. The behaviour is the same for all FactoryFinder classes in the javax.xml package I have looked at so far.

I am using OpenJDK 8/11. My applications are deployed in a Tomcat 9 instance.

Edit: Providing more details

Here is the call stack for a single XMLInputFactory.newInstance call:

[image: flame graph of the call stack]

Most of the time is spent in ServiceLoader$LazyIterator.hasNextService. This method calls getResources on the ClassLoader to read the META-INF/services/javax.xml.stream.XMLInputFactory file. That call alone takes about 35 ms each time.

Is there a way to instruct Tomcat to better cache these files so they are served faster?

Wagner Michael
  • I agree with your assessment of FactoryFinder.java. It looks like it should be caching the ServiceLoader. Have you tried downloading the OpenJDK source and building it? I know that sounds like a large task, but it might not be. Also, it might be worth filing an issue against FactoryFinder.java to see if somebody picks it up and offers a solution. – djhallx Oct 18 '19 at 03:11
  • Have you tried to set property using `-D` flag to your `Tomcat` process? For example: `-Djavax.xml.transform.TransformerFactory=.` It should not override properties for other apps. Your post is well described and probably you have tried it but I would like to confirm. See [How to set Javax.xml.transform.TransformerFactory system property](https://stackoverflow.com/questions/53629927/how-to-set-javax-xml-transform-transformerfactory-system-property), [How to set HeapMemory or JVM Arguments in Tomcat](https://www.middlewareinventory.com/blog/set-heapmemory-jvm-arguments-tomcat/) – Michał Ziober Oct 22 '19 at 17:18

2 Answers


35 ms sounds like disk access time is involved, which points to a problem with OS caching.

If there are any directory (non-jar) entries on the classpath, that can slow things down. So can the resource not being present at the first location that is checked.

ClassLoader.getResources (the method ServiceLoader calls) can be overridden if you can set the thread context class loader, either through configuration (I haven't touched Tomcat for years) or just with Thread.setContextClassLoader.
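A rough sketch of that idea, under several assumptions: the class name ServiceCachingClassLoader and its counter are invented for illustration, only META-INF/services/ lookups are cached, and the classpath is assumed not to change while the loader is in use. It is not something Tomcat or the JDK provides.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: remembers the URLs found for service
// configuration files so repeated lookups skip the classpath scan.
final class ServiceCachingClassLoader extends ClassLoader {

    private static final String SERVICES_PREFIX = "META-INF/services/";

    private final ConcurrentMap<String, List<URL>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger delegatedLookups = new AtomicInteger();

    ServiceCachingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    public Enumeration<URL> getResources(String name) throws IOException {
        if (!name.startsWith(SERVICES_PREFIX)) {
            // Everything else goes straight to the normal lookup.
            return super.getResources(name);
        }
        List<URL> urls = cache.get(name);
        if (urls == null) {
            // First request for this name: do the real (slow) lookup once.
            delegatedLookups.incrementAndGet();
            urls = Collections.list(super.getResources(name));
            cache.putIfAbsent(name, urls);
        }
        return Collections.enumeration(urls);
    }

    // Number of service lookups that actually reached the parent.
    int delegatedLookups() {
        return delegatedLookups.get();
    }
}
```

It could be installed with Thread.currentThread().setContextClassLoader(new ServiceCachingClassLoader(Thread.currentThread().getContextClassLoader())). Classes are still loaded by the wrapped loader through normal parent delegation. Where to install it in a Tomcat deployment, and how to drop the cache on redeploy, is left open here.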

Tom Hawtin - tackline

I found another 30 minutes to debug this and looked at how Tomcat does resource caching.

In particular, CachedResource.validateResources (which can be found in the flame graph above) was of interest to me. It returns true if the CachedResource is still valid:

protected boolean validateResources(boolean useClassLoaderResources) {
    long now = System.currentTimeMillis();
    if (this.webResources == null) {
        ...
    }

    // TTL check here!!
    if (now < this.nextCheck) {
        return true;
    } else if (this.root.isPackedWarFile()) {
        this.nextCheck = this.ttl + now;
        return true;
    } else {
        return false;
    }
}

So a CachedResource has a time to live (TTL). Tomcat does let you configure cacheTtl, but you can only increase this value. The resource caching configuration does not seem to be very flexible.

So my Tomcat has the default value of 5000 ms configured. This tricked me while doing performance testing because I had a little more than 5 seconds between my requests (looking at graphs and stuff). That's why all my requests basically ran without cache and triggered this heavy ZipFile.open every time.
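To make the timing effect concrete, here is a toy model of a TTL check. It is not Tomcat's code, and the class TtlCacheModel and its methods are invented for this sketch. It only counts how many requests find the entry expired and would therefore trigger the expensive lookup, for a given TTL and a fixed gap between requests.

```java
// Simplified model of a TTL-based cache entry: an access after the TTL
// has elapsed counts as an expensive lookup and restarts the TTL.
final class TtlCacheModel {

    private final long ttlMillis;
    private long nextCheck = 0L;      // entry starts out expired
    private int expensiveLookups = 0;

    TtlCacheModel(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Simulates one request arriving at time 'now' (in milliseconds).
    void access(long now) {
        if (now >= nextCheck) {
            expensiveLookups++;
            nextCheck = now + ttlMillis;
        }
    }

    // Sends 'requests' accesses spaced 'intervalMillis' apart and returns
    // how many of them had to do the expensive lookup.
    static int simulate(long ttlMillis, long intervalMillis, int requests) {
        TtlCacheModel model = new TtlCacheModel(ttlMillis);
        for (int i = 0; i < requests; i++) {
            model.access(i * intervalMillis);
        }
        return model.expensiveLookups;
    }
}
```

In this model, simulate(5000, 6000, 10) returns 10, so with a 5000 ms TTL and 6 seconds between requests every single request is slow. simulate(5000, 1000, 10) returns 2, and simulate(60000, 6000, 10) returns 1.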

As I am not very experienced with Tomcat configuration, I am not yet sure what the right solution is here. Increasing cacheTtl keeps the caches around longer but does not fix the problem in the long run.

Summary

I think there are actually two culprits here.

  1. FactoryFinder classes not reusing a ServiceLoader. There might be a valid reason why they do not reuse them - I cannot really think of one though.

  2. Tomcat evicting caches after a fixed time for web application resources (files on the classpath, like a ServiceLoader configuration file).

Combine this with not having defined the system property for the factory implementation, and you get a slow FactoryFinder call whenever the cache TTL (cacheTtl, in milliseconds) has expired.

For now I can live with increasing cacheTtl to a longer time. I might also take a look at Tom Hawtin's suggestion of overriding ClassLoader.getResources, even if I think this is a rather harsh way of getting rid of this performance bottleneck. It might be worth looking at, though.

Wagner Michael