Connection pooling in multi tenant app. Shared pool vs pool per tenant

Question

I'm building a multi tenant REST server application with Spring 2.x, Hibernate 5.x, Spring Data REST, Mysql 5.7. Spring 2.x uses Hikari for connection pooling.

I'm going to use a DB per tenant approach, so every tenant will have his own database.

I created my MultiTenantConnectionProvider in this way:

@Component
@Profile("prod")
public class MultiTenantConnectionProviderImpl implements MultiTenantConnectionProvider {
    private static final long serialVersionUID = 3193007611085791247L;
    private Logger log = LogManager.getLogger();

    private Map<String, HikariDataSource> dataSourceMap = new ConcurrentHashMap<String, HikariDataSource>();

    @Autowired
    private TenantRestClient tenantRestClient;

    @Autowired
    private PasswordEncrypt passwordEncrypt;

    @Override
    public void releaseAnyConnection(Connection connection) throws SQLException {
        connection.close();
    }

    @Override
    public Connection getAnyConnection() throws SQLException {
        Connection connection = getDataSource(TenantIdResolver.TENANT_DEFAULT).getConnection();
        return connection;

    }

    @Override
    public Connection getConnection(String tenantId) throws SQLException {
        Connection connection = getDataSource(tenantId).getConnection();
        return connection;
    }

    @Override
    public void releaseConnection(String tenantId, Connection connection) throws SQLException {
        log.info("releaseConnection " + tenantId);
        connection.close();
    }

    @Override
    public boolean supportsAggressiveRelease() {
        return false;
    }

    @Override
    public boolean isUnwrappableAs(Class unwrapType) {
        return false;
    }

    @Override
    public <T> T unwrap(Class<T> unwrapType) {
        return null;
    }

    public HikariDataSource getDataSource(@NotNull String tentantId) throws SQLException {
        if (dataSourceMap.containsKey(tentantId)) {
            return dataSourceMap.get(tentantId);
        } else {
            HikariDataSource dataSource = createDataSource(tentantId);
            dataSourceMap.put(tentantId, dataSource);
            return dataSource;
        }
    }

    public HikariDataSource createDataSource(String tenantId) throws SQLException {
        log.info("Create Datasource for tenant {}", tenantId);
        try {
            Database database = tenantRestClient.getDatabase(tenantId);
            DatabaseInstance databaseInstance = tenantRestClient.getDatabaseInstance(tenantId);
            if (database != null && databaseInstance != null) {
                HikariConfig hikari = new HikariConfig();
                String driver = "";
                String options = "";
                switch (databaseInstance.getType()) {
                case MYSQL:
                    driver = "jdbc:mysql://";
                    options = "?useLegacyDatetimeCode=false&serverTimezone=UTC&useUnicode=yes&characterEncoding=UTF-8&useSSL=false";
                    break;

                default:
                    driver = "jdbc:mysql://";
                    options = "?useLegacyDatetimeCode=false&serverTimezone=UTC&useUnicode=yes&characterEncoding=UTF-8&useSSL=false";
                }

                hikari.setJdbcUrl(driver + databaseInstance.getHost() + ":" + databaseInstance.getPort() + "/" + database.getName() + options);
                hikari.setUsername(database.getUsername());
                hikari.setPassword(passwordEncrypt.decryptPassword(database.getPassword()));

                // MySQL optimizations, see
                // https://github.com/brettwooldridge/HikariCP/wiki/MySQL-Configuration
                hikari.addDataSourceProperty("cachePrepStmts", true);
                hikari.addDataSourceProperty("prepStmtCacheSize", "250");
                hikari.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
                hikari.addDataSourceProperty("useServerPrepStmts", "true");
                hikari.addDataSourceProperty("useLocalSessionState", "true");
                hikari.addDataSourceProperty("useLocalTransactionState", "true");
                hikari.addDataSourceProperty("rewriteBatchedStatements", "true");
                hikari.addDataSourceProperty("cacheResultSetMetadata", "true");
                hikari.addDataSourceProperty("cacheServerConfiguration", "true");
                hikari.addDataSourceProperty("elideSetAutoCommits", "true");
                hikari.addDataSourceProperty("maintainTimeStats", "false");
                hikari.setMinimumIdle(3);
                hikari.setMaximumPoolSize(5);

                hikari.setIdleTimeout(30000);
                hikari.setPoolName("JPAHikari_" + tenantId);
                // mysql wait_timeout 600seconds
                hikari.setMaxLifetime(580000);
                hikari.setLeakDetectionThreshold(60 * 1000);

                HikariDataSource dataSource = new HikariDataSource(hikari);


                return dataSource;

            } else {
                throw new SQLException(String.format("DB not found for tenant %s!", tenantId));
            }
        } catch (Exception e) {
            throw new SQLException(e.getMessage());
        }
    }

}

In my implementation I read tenantId and I get information about the database instance from a central management system. I create a new pool for each tenant and I cache the pool in order to avoid to recreate it each time.

I read this interesting question, but my question is quite different. I'm thinking to use AWS (both for server instance, and RDS db instance).

Let's hypothesize a concrete scenario in which I've 100 tenants. The application is a management/point of sale software. It will be used just from agents. Let's say each tenants has an average of 3 agents working concurrently in each moment.

With that numbers in mind and according to this article, the first thing I realize is that it seems hard to have a pool for each tenant.

For 100 tenants I would like to think that a db.r4.large (2vcore, 15,25GB RAM and fast disk access ) with Aurora should be enough (about 150€/month).

According to the formula to size a connection pool:

connections = ((core_count * 2) + effective_spindle_count)

I should have 2core*2 + 1 = 5 connections in the pool.

From what I got, this should be the max connections in the pool to maximise performance on that DB instance.

1st solution

So my first question is pretty simple: how can I create a separate connection pool for each tenant seen that I should only use 5 connection in total?

It seems not possible to me. Even if I assign 2 connections to each tenant, I would have 200 connections to the DBMS!!

According to this question, on a db.r4.large instance I could have at max 1300 connections, so seems the instance should face quite well the load. But according the article I mentioned before, seems a bad practice use hundreds connections to the db:

If you have 10,000 front-end users, having a connection pool of 10,000 would be shear insanity. 1000 still horrible. Even 100 connections, overkill. You want a small pool of a few dozen connections at most, and you want the rest of the application threads blocked on the pool awaiting connections.

2nd solution

The second solution I have in mind is to share a connection pool for tenants on the same DMBS. This means that all 100 tenants will use the same Hikari pool of 5 connections (honestly it seems quite low to me).

Should this the right way to maximize performance and redure the response time of the application?

Do you have a better idea of how to manage this scenario with Spring, Hibernate, Mysql (hosted on AWS RDS Aurora)?

https://github.com/brettwooldridge/HikariCP/issues/856 – ASharma7 Jun 24 '19 at 14:30 — ASharma7, Jun 24 '19 at 14:30

Sid Malani · Answer 1 · 2018-11-21T04:15:29.540

5

Most definitely opening connection per tenant is a very bad idea. All you need is a pool of connections shared across all users.

So first step would be to find the load or anticipate what it would be based on some projections.
Decide how much latency is acceptable, what is the burst peak time traffic etc
Finally come to number of connections you will need for this and decide on number of instances required. For instance if your peak time usage is 10k per s and each query takes 10ms then you will need 100 open connections for latency of 1s.
Implement it without any bindings to user. i.e. the same pool shared across all. Unless you have a case to group say premium/basic users to say have set of two pools etc
Finally as you are doing this in AWS if you need more than 1 instance based on point 3 - see if you can autoscale up/down based on load to save costs.

Check these out for some comparison metrics

This one is probably most interesting in terms of spike demand

https://github.com/brettwooldridge/HikariCP/blob/dev/documents/Welcome-To-The-Jungle.md

Some more...

https://github.com/brettwooldridge/HikariCP

https://www.wix.engineering/blog/how-does-hikaricp-compare-to-other-connection-pools

edited Nov 21 '18 at 04:15

answered Nov 17 '18 at 13:01

Sid Malani

2,078
1
13
13

Thanks for your reply. Could you explain why using a pool connection for tenant is a bad idea? Thanks – drenda Nov 19 '18 at 17:54
1

Because your connections will not be fully utilised. If you just think how many times do you hit the database during the usage of your application - it will only be a small fraction. Having unnecessary connections open is an added load as well. So you may find even with a few connections you could be running really smoothly because you will be distributing its usage. Most queries will for example take only few ms to 10s of ms rest of the time those connections will be idle. – Sid Malani Nov 20 '18 at 01:24
To me it looks this doesn't answer the original question: solution for one DB per tenant. as far as I know, at least in JDBC, the connections are DB specific (when creating the connections, we have to specify which DB and it's credentials), and as such, the same connections cannot be share between DBs; and when requesting a connection from a connection pool, the pool management logic do not differentiate between the connections within the pool, it would just pick up one and pass it over. (1/2) – nabeel Aug 21 '22 at 07:24
So in a 'one DB per tenant' scenario we can't have same connection pool sharing connections of multiple DBs. So, in a 'one DB per tenant' scenario, if there are too many tenants, what should happen is that the DBs should be distributed across multiple VMs (horizontal scaling). (2/2) – nabeel Aug 21 '22 at 07:25

score 1 · Answer 2 · edited Jun 20 '20 at 09:12

1

Follow previous Q&A the selected strategy for multi tenant environment will be (surprisingly) using connection pool per tenant

Strategy 2 : each tenant have it's own schema and it's own connection pool in a single database

strategy 2 is more flexible and safe : every tenant cannot consume more than a given amount of connection (and this amount can be configured per tenant if you need it)

I suggest put the HikariCP's formula aside here, and use less tenants number as 10 (dynamic size? ) with low connection pool size as 2.

Be more focus on the traffic you expect, notice that 10 connection pool size comment in HikariCP Pool Size maybe should suffice:

10 as a nice round number. Seem low? Give it a try, we'd wager that you could easily handle 3000 front-end users running simple queries at 6000 TPS on such a setup.

See also comment indicates that 100 instances are too much

, but it would have to be a massive load to require 100s.

By @EssexBoy

edited Jun 20 '20 at 09:12

Community

1
1

answered Nov 19 '18 at 13:53

Ori Marko

56,308
23
131
233

Thanks for the reply. For how my application is made I've to use the Strategy 1. Is your reply still valid in that case? – drenda Nov 19 '18 at 17:54
@drenda it different db, but still different connection pool, so yes – Ori Marko Nov 19 '18 at 17:59
Having 100 tenants I would have min 200 connections and max 1000 connections to the database. Considering that a r4.large instance can handle max 1300 connections, the costs per tenant seems quite high. If I had 1000 tenants how should I manage that load? 10 DB instances? Do u have some useful links to some case-study of real multi-tenants app? Thanks – drenda Nov 19 '18 at 18:06
1

@drenda I add this question to Hikari group https://groups.google.com/forum/#!topic/hikari-cp/gJpB4gPtnAs – Ori Marko Nov 20 '18 at 06:29
1

@drenda brettwooldridge direct to answer: https://github.com/brettwooldridge/HikariCP/issues/1023#issuecomment-345592654 – Ori Marko Nov 20 '18 at 13:09

Connection pooling in multi tenant app. Shared pool vs pool per tenant

2 Answers2