1

I'm running a legacy Spring Boot (1.4.4) application with Thymeleaf (2.1.4) in production on Java 8 (1.8.0_161-b12 / HotSpot 64 25.161-b12). When the application experiences moderate load (1K reqs/min), all app requests experience latency (> 1min), and CPU on the app instances max out.

How can I pinpoint exactly where this performance bottleneck is occurring? If it is a code / design issue how can I be certain, or if I just need to allocate more resources?

Things I've tried:

  • Scaling up the DB and increase connection pool size ( CPU is negligible )
  • Disabled the template/expression (Thymeleaf) and query cache (Hibernate/couchbase) and experienced same latency.
  • Scale up cluster (3x t2.large)

Other things to note:

  • JVM heap is not configured, but doesn't appear to approach max heap size.JVM during load
  • JVM monitoring shows regular (Allocation Failed) garbage collection during load w/ ParallelGCFailedAllocation occurring frequentlyParallelGCFailedAllocation
  • There's a few Full GC events occurring that take b/w 1-3 seconds.
  • A heap dump during peak latency reveals over 172 blocked threads that appear to be waiting on Thymeleaf cache PUT requests?!
"http-nio-8080-exec-168" - Thread t@488
   java.lang.Thread.State: BLOCKED
        at org.thymeleaf.cache.StandardCache$CacheDataContainer.putWithoutTracing(StandardCache.java:384)
        - waiting to lock <4e409b1c> (a [Ljava.lang.Object;) owned by "http-nio-8080-exec-170" t@490
        at org.thymeleaf.cache.StandardCache$CacheDataContainer.put(StandardCache.java:368)
        at org.thymeleaf.cache.StandardCache.put(StandardCache.java:141)
        at org.thymeleaf.spring4.expression.SpelVariableExpressionEvaluator.getExpression(SpelVariableExpressionEvaluator.java:184)
        at org.thymeleaf.spring4.expression.SpelVariableExpressionEvaluator.evaluate(SpelVariableExpressionEvaluator.java:129)
        at org.thymeleaf.standard.expression.VariableExpression.executeVariable(VariableExpression.java:154)
        at org.thymeleaf.standard.expression.SimpleExpression.executeSimple(SimpleExpression.java:59)
        at org.thymeleaf.standard.expression.Expression.execute(Expression.java:103)
        at org.thymeleaf.standard.expression.Expression.execute(Expression.java:133)
        at org.thymeleaf.standard.expression.Expression.execute(Expression.java:120)
        at org.thymeleaf.standard.processor.attr.AbstractStandardLocalVariableDefinitionAttrProcessor.getNewLocalVariables(AbstractStandardLocalVariableDefinitionAttrProcessor.java:89)
        at org.thymeleaf.processor.attr.AbstractLocalVariableDefinitionAttrProcessor.processAttribute(AbstractLocalVariableDefinitionAttrProcessor.java:62)
        at org.thymeleaf.processor.attr.AbstractAttrProcessor.doProcess(AbstractAttrProcessor.java:87)
        at org.thymeleaf.processor.AbstractProcessor.process(AbstractProcessor.java:212)
        at org.thymeleaf.dom.Node.applyNextProcessor(Node.java:1017)
        at org.thymeleaf.dom.Node.processNode(Node.java:972)
        at org.thymeleaf.dom.NestableNode.computeNextChild(NestableNode.java:695)
        at org.thymeleaf.dom.NestableNode.doAdditionalProcess(NestableNode.java:668)
        at org.thymeleaf.dom.Node.processNode(Node.java:990)
        at org.thymeleaf.dom.NestableNode.computeNextChild(NestableNode.java:695)
        at org.thymeleaf.dom.NestableNode.doAdditionalProcess(NestableNode.java:668)
        at org.thymeleaf.dom.Node.processNode(Node.java:990)
        at org.thymeleaf.dom.NestableNode.computeNextChild(NestableNode.java:695)
        at org.thymeleaf.dom.NestableNode.doAdditionalProcess(NestableNode.java:668)
        at org.thymeleaf.dom.Node.processNode(Node.java:990)
        at org.thymeleaf.dom.NestableNode.computeNextChild(NestableNode.java:695)
        at org.thymeleaf.dom.NestableNode.doAdditionalProcess(NestableNode.java:668)
        at org.thymeleaf.dom.Node.processNode(Node.java:990)
        at org.thymeleaf.dom.NestableNode.computeNextChild(NestableNode.java:695)
        at org.thymeleaf.dom.NestableNode.doAdditionalProcess(NestableNode.java:668)
        at org.thymeleaf.dom.Node.processNode(Node.java:990)
        at org.thymeleaf.dom.Document.process(Document.java:93)
        at org.thymeleaf.TemplateEngine.process(TemplateEngine.java:1155)
        at org.thymeleaf.TemplateEngine.process(TemplateEngine.java:1060)
        at org.thymeleaf.TemplateEngine.process(TemplateEngine.java:1011)
        at org.thymeleaf.spring4.view.ThymeleafView.renderFragment(ThymeleafView.java:335)
        at org.thymeleaf.spring4.view.ThymeleafView.render(ThymeleafView.java:190)
        at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1271)

Since the application has many layers, is there a way to nail down the culprit?

UPDATE:

OK so after digging some more, a few more things:

  1. Another thread dump during another point in the load testing shows a bunch of blocked threads on hibernate calls to the database.
 java.lang.Thread.State: BLOCKED
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:543)
        - waiting to lock <3d1c45eb> (a java.lang.Object) owned by "http-nio-8080-exec-270" t@823
        at sun.security.provider.NativePRNG$RandomIO.access$400(NativePRNG.java:331)
        at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:220)
        at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
        at sun.security.ssl.CipherBox.createExplicitNonce(CipherBox.java:1015)
        at sun.security.ssl.OutputRecord.encrypt(OutputRecord.java:248)
        at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:869)
        at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:857)
        at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
        - locked <7721ca63> (a sun.security.ssl.AppOutputStream)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        - locked <2135117a> (a java.io.BufferedOutputStream)
        at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3725)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2506)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2677)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
        - locked <5c36dc6f> (a com.mysql.jdbc.JDBC4Connection)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
        - locked <5c36dc6f> (a com.mysql.jdbc.JDBC4Connection)
        at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
        - locked <5c36dc6f> (a com.mysql.jdbc.JDBC4Connection)
        at sun.reflect.GeneratedMethodAccessor254.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.github.gquintana.metrics.proxy.MethodInvocation.proceed(MethodInvocation.java:75)
        at com.github.gquintana.metrics.sql.PreparedStatementProxyHandler.execute(PreparedStatementProxyHandler.java:53)
        at com.github.gquintana.metrics.sql.AbstractStatementProxyHandler.invoke(AbstractStatementProxyHandler.java:50)
        at com.github.gquintana.metrics.proxy.ProxyHandler.invoke(ProxyHandler.java:56)
        at com.sun.proxy.$Proxy143.executeQuery(Unknown Source)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:79)
        at org.hibernate.loader.Loader.getResultSet(Loader.java:2062)
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1859)
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1838)
        at org.hibernate.loader.Loader.doQuery(Loader.java:906)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:348)
        at org.hibernate.loader.Loader.doList(Loader.java:2550)
        at org.hibernate.loader.Loader.doList(Loader.java:2536)
        at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2366)
        at org.hibernate.loader.Loader.list(Loader.java:2361)
        at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:495)
        at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:357)
        at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:198)
        at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1230)
        at org.hibernate.internal.QueryImpl.list(QueryImpl.java:101)
        at org.hibernate.ejb.QueryImpl.getResultList(QueryImpl.java:268)
        at org.hibernate.ejb.criteria.CriteriaQueryCompiler$3.getResultList(CriteriaQueryCompiler.java:254)
        at org.broadleafcommerce.common.i18n.dao.TranslationDaoImpl.readTranslation(TranslationDaoImpl.java:262)
        at org.broadleafcommerce.common.i18n.service.TranslationServiceImpl.getTranslatedValue(TranslationServiceImpl.java:194)
        at org.broadleafcommerce.common.i18n.service.DynamicTranslationProvider.getValue(DynamicTranslationProvider.java:47)
        at org.broadleafcommerce.core.catalog.domain.CategoryImpl.getName(CategoryImpl.java:489)
        at org.<myorg>.CatalogService.lambda$getSeries1ActiveTechniques$11(CatalogService.java:95)

Normally this call getSeries1ActiveTechniques is behind a hibernate query cache, but with it disabled, there's a lot more traffic to the SQL database.

While I'm aware this design needs refactored, I'm still unclear as to whether this slowdown is due specifically from load via overall CPU, overall GC impacting CPU, or are these two separate contentions causing GC and high CPU?

  1. CPU sampling during the test shows:

Hotspots

Hotspots show high CPU for certain calls; jumping out is org.apache.catalina.webresources.JarWarResourceSet.getArchiveEntries which might be a known issue with embedded tomcat?

But how do I know if these CPU times are actually representative or if these methods are just suffering from greater general slowdown from GC or threadpool limiting at apache/jdbc level?

eschie
  • 194
  • 1
  • 12
  • I don't have any thymeleaf experience, but what is the thread which has the cache lock doing? – tgdavies Mar 02 '22 at 07:34
  • Have you disabled caching in Thymeleaf? I would also suggest to upgrade Spring Boot to at least the latest 1.4.x or even better an even higher version that still is supported (but that might be a bridge to far for now). So upgrade to the latest 1.4.x and suggest to upgrade to the latest 1.8.x JDK as well. – M. Deinum Mar 02 '22 at 08:09
  • @tgdavies it appears all the blocked threads are waiting on the same locked thymeleaf thread to finish processing. 'Waiting to lock `4e409b1c`' where 4e409b1c is ```"http-nio-8080-exec-170" - Thread t@490 java.lang.Thread.State: RUNNABLE at org.thymeleaf.cache.StandardCache$CacheDataContainer.putWithoutTracing(StandardCache.java:384) - locked <4e409b1c> (a [Ljava.lang.Object;) at org.thymeleaf.cache.StandardCache$CacheDataContainer.put(StandardCache.java:368)``` – eschie Mar 03 '22 at 18:26
  • @M.Deinum Disabling the cache, takes over 60 seconds to process a template. We are using a lot of thymeleaf templates and fragments, and I'm sure they can be refactored to be more performant. Upgrading is not off the table but I would like to gain better understanding into what is the primary cause. – eschie Mar 07 '22 at 21:14
  • 1
    Well 1 thing you get with upgrades is better performance, a lot has been done to improve that. A lot of work has gone into polishing the classloading in the special Spring Boot jar. Another thing is a lot of performance improvements in Thymeleaf 3 which really speedup the thymeleaf processing. So it might not be your code but the framework (and the fact you are using older versions). – M. Deinum Mar 07 '22 at 21:20
  • The later thread dump you shared looks like it's stuck at `SecureRandom.nextBytes`. I wonder if this could be related to a very old issue with the secure random seed source (/dev/random vs /dev/urandom). Can you check your JVM args and if you are able to restart can you try with `-Djava.security.egd=file:/dev/./urandom`? Or just write a small code that generates a secure random and run it on the same host to see if it is slow/blocking. This is a long shot so take it with a grain of salt :) Upgrading JVM minor version may help as well. For more info https://www.baeldung.com/java-security-egd. – murtiko Mar 17 '22 at 20:52

1 Answers1

0

You’ll find that you’ll get better performance with newer versions of Java for free, which will help with some of the GC delays you are reporting. If latency is an issue then you should consider upgrading to either Java 11, or Java 17.

However the set of locks you have highlighted is the real problem. The Thymeleaf cache looks like it’s holding a global (or at least highly contended) lock per request, And everything is serialising behind that. It’s not clear if there is an upgrade to the library you depend on which might have fixed the issue (2.1.6 is latest, but there’s a 3.x series as well); alternatively they may have different cache implementations that you can choose/configure. It looks like the StandardCache is the culprit; maybe there an alternative you can use?

You could try and disable caching to e see if that removes the locks, though having a cache does make sense.

https://www.thymeleaf.org/doc/tutorials/2.1/usingthymeleaf.html#template-cache

AlBlue
  • 23,254
  • 14
  • 71
  • 91
  • Although agree on the Java upgrades, the current Spring (Boot) versions in use don't support that. – M. Deinum Mar 02 '22 at 08:06
  • @AlBlue I am definitely interested in upgrading all of the systems. But I am still curious, what specifically is causing high CPU.... simply load? GC? GC as a function of load? inefficient code? thread locking due to connection pool limiting? and how best do I go about determining that if not one by one as I have been trying here? – eschie Mar 07 '22 at 21:19