
I am using a VMware environment to compare the performance of Postgres-XL 9.5 and PostgreSQL 9.5.

I built the Postgres-XL cluster following the instructions in "Creating a Postgres-XL cluster".

Physical HW:
    M/B: Gigabyte H97M-D3H
    CPU: Intel i7-4790 @ 3.60GHz
    RAM: 32GB DDR3 1600
    HD: 2.5" Seagate SSHD ST1000LM014 1TB
Infra:
    VMware ESXi 6.0
VM:
    DB00~DB05:
        CPU: 1 core, limited to 2000MHz
        RAM: 2GB, limited to 2GB
        HD: 50GB
        Advanced CPU Hyperthread mode: any
        OS: Ubuntu 16.04 LTS x64 (all packages upgraded to the current versions with apt update; apt upgrade)
        PostgreSQL 9.5+173 on DB00
        Postgres-XL 9.5r1.2 on DB01~DB05

    userver: (for running pgbench)
        CPU: 2 cores
        RAM: 4GB
        HD: 50GB
        OS: Ubuntu 14.04 LTS x64
Role:
    DB00: single PostgreSQL
    DB01: GTM
    DB02: coordinator master
    DB03~DB05: datanode masters dn1~dn3
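
For reference, with these roles the data nodes end up registered on the coordinator roughly as follows (a minimal sketch; the host names match the VMs above, but the port numbers are placeholders for whatever your setup actually uses):

    -- run on the coordinator (db02)
    CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = 'db03', PORT = 5432);
    CREATE NODE dn2 WITH (TYPE = 'datanode', HOST = 'db04', PORT = 5432);
    CREATE NODE dn3 WITH (TYPE = 'datanode', HOST = 'db05', PORT = 5432);
    SELECT pgxc_pool_reload();  -- reload the connection pool so the new nodes are picked up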

postgresql.conf on DB01~DB05 (Postgres-XL nodes):

shared_buffers = 128MB
dynamic_shared_memory_type = posix  
max_connections = 300
max_prepared_transactions = 300
hot_standby = off
# Others are default values
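
Besides the values above, each Postgres-XL node also needs to know how to reach the GTM on DB01; these settings are normally filled in by the cluster setup, roughly like this (the port and node name below are assumptions for illustration, not values taken from the actual configuration):

    gtm_host = 'db01'        # DB01 runs the GTM
    gtm_port = 6666          # default GTM port
    pgxc_node_name = 'dn1'   # unique name per node (coord1, dn1, dn2, ...)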

postgresql.conf on DB00 (single PostgreSQL):

max_connections = 300
shared_buffers = 128MB
max_prepared_transactions = 300
dynamic_shared_memory_type = sysv
#Others are default values

On userver:

pgbench -h db00 -U postgres -i -s 10 -F 10 testdb;
pgbench -h db00 -U postgres -c 30 -t 60 -j 10 -r testdb;

pgbench -h db02 -U postgres -i -s 10 -F 10 testdb;
pgbench -h db02 -U postgres -c 30 -t 60 -j 10 -r testdb;

I confirmed that all pgbench_* tables are evenly distributed among dn1~dn3 in Postgres-XL.
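
(In case anyone wants to reproduce that check, something along these lines works when run on the coordinator; the node names dn1~dn3 match the roles above:)

    -- distribution strategy of the pgbench tables ('H' = hash)
    SELECT pcrelid::regclass AS relation, pclocatortype FROM pgxc_class;

    -- row counts per data node
    EXECUTE DIRECT ON (dn1) 'SELECT count(*) FROM pgbench_accounts';
    EXECUTE DIRECT ON (dn2) 'SELECT count(*) FROM pgbench_accounts';
    EXECUTE DIRECT ON (dn3) 'SELECT count(*) FROM pgbench_accounts';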

pgbench results:

Single PostgreSQL 9.5: (DB00)

    starting vacuum...end.
    transaction type: TPC-B (sort of)
    scaling factor: 10
    query mode: simple
    number of clients: 30
    number of threads: 10
    number of transactions per client: 60
    number of transactions actually processed: 1800/1800
    tps = 1263.319245 (including connections establishing)
    tps = 1375.811566 (excluding connections establishing)
    statement latencies in milliseconds:
            0.001084        \set nbranches 1 * :scale
            0.000378        \set ntellers 10 * :scale
            0.000325        \set naccounts 100000 * :scale
            0.000342        \setrandom aid 1 :naccounts
            0.000270        \setrandom bid 1 :nbranches
            0.000294        \setrandom tid 1 :ntellers
            0.000313        \setrandom delta -5000 5000
            0.712935        BEGIN;
            0.778902        UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
            3.022301        SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
            3.244109        UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
            7.931936        UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
            1.129092        INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
            4.159086        END;


Postgres-XL 9.5: (DB02)

    starting vacuum...end.
    transaction type: TPC-B (sort of)
    scaling factor: 10
    query mode: simple
    number of clients: 30
    number of threads: 10
    number of transactions per client: 60
    number of transactions actually processed: 1800/1800
    tps = 693.551818 (including connections establishing)
    tps = 705.965242 (excluding connections establishing)
    statement latencies in milliseconds:
            0.003451        \set nbranches 1 * :scale
            0.000682        \set ntellers 10 * :scale
            0.000656        \set naccounts 100000 * :scale
            0.000802        \setrandom aid 1 :naccounts
            0.000610        \setrandom bid 1 :nbranches
            0.000553        \setrandom tid 1 :ntellers
            0.000536        \setrandom delta -5000 5000
            0.172587        BEGIN;
            3.540136        UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
            0.631834        SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
            6.741206        UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
            17.539502       UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
            0.974308        INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
            10.475378       END;

My question is: why are Postgres-XL's TPS and per-statement latencies (such as the INSERTs and UPDATEs) so much worse than PostgreSQL's? I thought Postgres-XL's performance should be better than that of PostgreSQL, shouldn't it?

wureka

4 Answers


Postgres-XL is designed to run on multiple physical nodes. Running it on VMware is a good educational exercise, but it should not be expected to show any performance gain: you are adding virtualization overhead on top of the overhead of the clustering software. The test on the web page from joyeu's answer used 4 physical machines. Assuming the performance increase quoted over a single node is based on the same machines, you would read this as 4 times the hardware for a 2.3x increase in performance.

anonxxxx

Maybe you should try a larger "scale" value. I got a similar result to yours, and then I found this page on the Postgres-XL official site: http://www.postgres-xl.org/2016/04/postgres-xl-9-5-r1-released/
It says:

Besides proving its mettle on Business Intelligence workloads, Postgres-XL has performed remarkably well on OLTP workloads when running pgBench (based on TPC-B) benchmark. In a 4-Node (Scale: 4000) configuration, compared to PostgreSQL, XL gives up to 230% higher TPS (-70% latency comparison) for SELECT workloads and up to 130% (-56% latency comparison) for UPDATE workloads. Yet, it can scale much, much higher than even the largest single node server.

So I guess Postgres-XL performs better with larger data sizes, and I will run a test to confirm this.
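
For example, a run closer to the quoted configuration would look roughly like this (scale 4000 generates on the order of 60 GB of data spread across the data nodes, so check disk space first; the duration and client counts are just an illustration):

    pgbench -h db02 -U postgres -i -s 4000 testdb
    pgbench -h db02 -U postgres -c 30 -j 10 -T 300 -r testdb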

joyeu

Postgres-XL is a clustered server. Individual transactions will always be somewhat slower on it, but because it can scale out to massive clusters it can process much more data concurrently, which lets it work through large data sets much faster.

Also, performance varies widely depending on which configuration options you use.
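
For instance, the 128MB shared_buffers from the question is small even for a 2GB VM; a few of the settings people usually tune first would be something like this (illustrative values only, not a recommendation for this exact setup):

    shared_buffers = 512MB
    effective_cache_size = 1GB
    checkpoint_completion_target = 0.9
    synchronous_commit = off    # trades durability for commit throughput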

BrianC

From your test specs:

Physical HW: M/B: Gigabyte H97M-D3H CPU: Intel i7-4790 @ 3.60GHz RAM: 32GB DDR3 1600 HD: 2.5" Seagate SSHD ST1000LM014 1TB <-----

Using a single disk will likely introduce a bottleneck and slow down your performance: you are effectively dividing the same read/write bandwidth by 4, considering that the GTM, the coordinator, and the data nodes are all going to access/spool data on it.
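
You can check this while pgbench is running by watching the disk from inside the VMs, for example with iostat from the sysstat package (a quick sanity check, not a full methodology):

    # 5-second samples; watch %util and await on the database volume
    iostat -x 5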

Despite what people say about performance gaps introduced by the hypervisor, databases are disk-intensive applications, not memory/CPU-intensive ones; this makes them perfectly fine candidates for virtualization, on the condition that the workload is distributed appropriately between disk groups. Obviously, use a preallocated (thick-provisioned) disk, or you will really slow down the inserts.