Could something besides Spring's @Transactional
annotation be causing this massive 15-minute slowdown?
Edited to add information at the end.
I have a web app that is occasionally performing extremely slowly. Parts of the site are driven by dynamic json. Most of the time the site works great. I want to stress this fact: the site isn't always slow. The site is always running but it doesn't see much use. Occasionally the site seems to fall asleep and doesn't display correctly because the json request isn't returning. If I open additional tabs/windows to the site I get quick responses for requests of static resources. Any requests that need database data hang. If I leave the browser open and go to lunch - by the time I get back the site is back to normal. Once the site "wakes up" it will continue to work at normal speed for the rest of the day.
In an effort to identify the cause of this strange slow behavior I've begun utilizing the Dropwizard/Codahale Metrics library.
I've added metrics to much of the site's back-end. I recently observed the slow behavior and I'm trying to pinpoint the cause.
I believe Spring's @Transactional
annotation is occasionally adding an additional 931 seconds to a method call. I realize that, if true, the slowdown isn't due to a bug or poor coding on Spring's part but probably a mis-configuration on my part. The numbers just seem so far out-there (15 minutes?) that I need stackoverflow's assistance.
4 questions:
- Is there another way these data-points can be interpreted?
- Is there something really stupid I'm doing that could be causing this slowdown?
- Is there something else I should add metrics to in order to pinpoint the cause of the slowdown?
- Finally, if all the time is being spent in
@Transactional
, do you have any suggestions regarding how to fix it?
I captured these metrics this morning but the slowdown event happened several days ago. The "max" values are what I'm especially concerned about.
"example.web.controller.CategoryGroupController.listJsonGroups":{"count":76,
"max":931812.646,"mean":79.981,"min":79.981,"p50":79.981,"p75":79.981,"p95":79.981,
"p98":79.981,"p99":79.981,"p999":79.981,"stddev":0.0,"m15_rate":4.121759882047415E-4,
"m1_rate":5.778884146644216E-9,"m5_rate":1.7016008277662782E-4,"mean_rate":1.4308967477371374E-4,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.service.CategoryGroupServiceImpl.findByCategoryGroupId":{"count":76,
"max":931812.091,"mean":79.86399999999999,"min":79.86399999999999,"p50":79.86399999999999,
"p75":79.86399999999999,"p95":79.86399999999999,"p98":79.86399999999999,"p99":79.86399999999999,
"p999":79.86399999999999,"stddev":0.0,"m15_rate":4.121759882047415E-4,
"m1_rate":5.778884146644216E-9,"m5_rate":1.7016008277662782E-4,"mean_rate":1.4499753290885513E-4,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.service.CategoryGroupServiceImpl.getDao":{"count":0,
"max":0.0,"mean":0.0,"min":0.0,"p50":0.0,"p75":0.0,"p95":0.0,"p98":0.0,"p99":0.0,"p999":0.0,
"stddev":0.0,"m15_rate":0.0,"m1_rate":0.0,"m5_rate":0.0,"mean_rate":0.0,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.dao.CategoryGroupDao.findByCategoryGroupId":{"count":76,
"max":10.404,"mean":6.916000069494177,"min":6.9159999999999995,"p50":6.9159999999999995,
"p75":6.9159999999999995,"p95":6.9159999999999995,"p98":6.9159999999999995,
"p99":6.9159999999999995,"p999":6.9159999999999995,"stddev":7.267429203455992E-5,
"m15_rate":4.8689221587514396E-6,"m1_rate":7.154074307612488E-40,"m5_rate":1.1717927914268928E-10,
"mean_rate":1.4546355576766556E-4,"duration_units":"milliseconds","rate_units":"calls/second"}
The Controller layer calls into a Service layer which then calls into the Dao layer.
From the metrics above, the controller layer is slow, along with the service layer but the Dao layer is fast.
The problem is that the service layer doesn't really do anything - it makes very little sense for it to be slow unless the @Transactional
annotation is slowing it down.
@M. Deinum explained that the getDao() method has a 0 count because the Spring AOP proxy only intercepts external calls.
Controller layer:
@Controller
public class CategoryGroupController extends GenericController<CategoryGroup> {
public static final String CATEGORY = "Awesome Groups";
@Value("${spring.datasource.office}")
private String dsOfficeId = null;
public CategoryGroupController() {
super(CategoryGroup.class);
}
@Autowired
private CategoryGroupService _service;
@Override
protected CategoryGroupService getService() {
return _service;
}
@Timed
@RequestMapping(value = "/json/groups/")
public @ResponseBody
FoundGroups listJsonGroups() {
List<CategoryGroup> found = listJsonGroups(dsOfficeId, CATEGORY);
found = applyOverrides(found);
List<FoundGroup> jsonGroups = buildGroup(found);
return new FoundGroups(jsonGroups);
}
public List<CategoryGroup> listJsonGroups(String dbOfficeId, String categoryId) {
CategoryGroupService service = getService();
CategoryGroupId template = new CategoryGroupId(dbOfficeId, categoryId, null);
List<CategoryGroup> found = service.findByCategoryGroupId(template);
return found;
}
// other methods...
}
Service layer:
@Service
public class CategoryGroupServiceImpl extends GenericService<CategoryGroup> implements CategoryGroupService {
@Autowired
private ICategoryGroupDao _dao;
public CategoryGroupServiceImpl() {
super(CategoryGroup.class);
}
@Override
@Timed
protected ICategoryGroupDao getDao() {
return _dao;
}
@Override
@Timed
@Transactional(readOnly = true)
public List<CategoryGroup> findByCategoryGroupId(CategoryGroupId idCategoryGroup) {
return getDao().findByCategoryGroupId(idCategoryGroup);
}
// other methods ...
}
Dao Layer:
@Repository
public class CategoryGroupDao extends GenericDao<CategoryGroup> implements ICategoryGroupDao {
public CategoryGroupDao() {
super(CategoryGroup.class);
}
@Timed
@Override
public List<CategoryGroup> findByCategoryGroupId(CategoryGroupId idCategoryGroup) {
List<CategoryGroup> retval = null;
SessionFactory sessionFactory = getSessionFactory();
Session currentSession = sessionFactory.getCurrentSession();
Criteria crit = currentSession.createCriteria(CategoryGroup.class).setReadOnly(true);
if (idCategoryGroup != null) {
addEqRestriction(idCategoryGroup.getDbOfficeId(), "id.dbOfficeId", crit);
addEqRestriction(idCategoryGroup.getCategoryId(), "id.categoryId", crit);
addEqRestriction(idCategoryGroup.getGroupId(), "id.groupId", crit);
}
retval = crit.list();
return retval;
}
// other methods...
}
Spring config:
<bean id="dataSource" class = "org.apache.commons.dbcp2.BasicDataSource" >
<property name="driverClassName" value="${jdbc.driverClassName}" />
<property name="url" value="${jdbc.databaseurl}" />
<property name="username" value="${jdbc.username}" />
<property name="password" value="${jdbc.password}" />
<property name="accessToUnderlyingConnectionAllowed" value="true" />
</bean>
<tx:annotation-driven />
<bean id="transactionManager" class="org.springframework.orm.hibernate4.HibernateTransactionManager">
<property name="sessionFactory" ref="sessionFactory" />
</bean>
<bean id="sessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="packagesToScan" value="example.db.entity" />
<property name="hibernateProperties">
<props>
<prop key="hibernate.dialect">${jdbc.dialect}</prop>
<prop key="hibernate.show_sql">false</prop>
<prop key="hibernate.generate_statistics" >false</prop>
</props>
</property>
</bean>
JDBC:
jdbc.driverClassName=oracle.jdbc.OracleDriver
jdbc.dialect=org.hibernate.dialect.Oracle10gDialect
jdbc.username=auser
jdbc.databaseurl=jdbc:oracle:thin:@192.168.0.23:1521:MYDB01
jdbc.password=apassword
New details:
Based on the suggestion by M. Deinum I changed my connection pool to use HikariCP. I increased the maximum pool size to 30 connections.
I also created the following InstrumentedTransactionManager class to add timers to the TransactionManager used by Springs @Transactional:
@Component
public class InstrumentedTransactionManager extends org.springframework.orm.hibernate4.HibernateTransactionManager {
private static final Logger logger = Logger.getLogger(InstrumentedTransactionManager.class.getName());
MetricRegistry metricRegistry;
SessionFactory sessionFactory;
private Map<String, Timer> timers = new HashMap<>();
public InstrumentedTransactionManager() {
}
public InstrumentedTransactionManager(SessionFactory sessionFactory) {
super(sessionFactory);
this.sessionFactory = sessionFactory;
}
@Autowired
public void setSessionFactory(SessionFactory sessionFactory) {
super.setSessionFactory(sessionFactory);
this.sessionFactory = sessionFactory;
}
@Autowired
public void setMetricRegistry(MetricRegistry metricRegistry) {
this.metricRegistry = metricRegistry;
updateTimers();
}
private static String[] methodNames = {"doBegin", "doCommit", "doResume", "doRollback", "getDataSource", "doGetTransaction", "doSuspend"};
private void updateTimers() {
timers.clear();
if (metricRegistry != null) {
for (String methodName : methodNames) {
Timer aTimer = metricRegistry.timer(metricRegistry.name(InstrumentedTransactionManager.class.getName(), methodName));
timers.put(methodName, aTimer);
}
}
}
public Timer.Context getTimerContext(String name) {
Timer.Context context = null;
Timer aTimer = timers.get(name);
if (aTimer != null) {
context = aTimer.time();
}
return context;
}
@Override
protected void doBegin(Object transaction, TransactionDefinition definition) {
Timer.Context context = getTimerContext("doBegin");
try {
super.doBegin(transaction, definition);
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
protected void doCommit(DefaultTransactionStatus status) {
Timer.Context context = getTimerContext("doCommit");
try {
super.doCommit(status);
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
protected void doResume(Object transaction, Object suspendedResources) {
Timer.Context context = getTimerContext("doResume");
try {
super.doResume(transaction, suspendedResources);
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
protected void doRollback(DefaultTransactionStatus status) {
Timer.Context context = getTimerContext("doRollback");
try {
super.doRollback(status);
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
public DataSource getDataSource() {
Timer.Context context = getTimerContext("getDataSource");
try {
return super.getDataSource();
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
protected Object doSuspend(Object transaction) {
Timer.Context context = getTimerContext("doSuspend");
try {
return super.doSuspend(transaction);
} finally {
if (context != null) {
context.stop();
}
}
}
@Override
protected Object doGetTransaction() {
Timer.Context context = getTimerContext("doGetTransaction");
try {
return super.doGetTransaction();
} finally {
if (context != null) {
context.stop();
}
}
}
}
Finally I added metrics to the Executor used by Springs @Async annotation in an attempt to identify threadpool issues.
<task:annotation-driven executor="myTaskExecutor" />
<bean id="myTaskExecutor" class="org.springframework.scheduling.concurrent.ConcurrentTaskExecutor">
<constructor-arg type="java.util.concurrent.Executor" ref="meteredExecutor"/>
</bean>
<bean id="meteredExecutor" class="com.codahale.metrics.InstrumentedExecutorService">
<constructor-arg type="java.util.concurrent.ExecutorService" ref="es"/>
<constructor-arg type="com.codahale.metrics.MetricRegistry" ref="metricRegistry"/>
</bean>
<bean id="es" class="java.util.concurrent.ThreadPoolExecutor">
<constructor-arg type="int" value="0"/>
<constructor-arg type="int" value="7"/>
<constructor-arg type="long" value="60"/>
<constructor-arg type="java.util.concurrent.TimeUnit" value="SECONDS"/>
<constructor-arg type="java.util.concurrent.BlockingQueue" ref="bq"/>
<constructor-arg type="java.util.concurrent.ThreadFactory" ref="instrumentedThreadFactory"/>
</bean>
<bean id="bq" class="java.util.concurrent.LinkedBlockingQueue"/>
<bean id="instrumentedThreadFactory" class="com.codahale.metrics.InstrumentedThreadFactory" >
<constructor-arg type="java.util.concurrent.ThreadFactory" ref="threadFactory"/>
<constructor-arg type="com.codahale.metrics.MetricRegistry" ref="metricRegistry"/>
</bean>
<bean id="threadFactory" class="java.util.concurrent.Executors" factory-method="defaultThreadFactory" >
</bean>
I have not observed the slowness in production since making the above changes but absence of evidence is not evidence of absence...