I built a real test case with the csv provided with 20M records. For my proof of concept I will compare two options in Oracle. Oracle In-Memory Option vs Oracle Parallel Query. The idea is to see whether the results match your expectations.
- In Memory Option is license fee, so you have to pay additional costs
- Parallel query is a feature included with the Enterprise Edition of Oracle database.
Lab: Linux Red Hat 7
Server: 16 CPUs and 32 GB RAM
In Memory section = 4GB
VM Virtual Server on VMware using IBM ESX Hardware Series
Elements for the Test Case
SQL> create tablespace tbtest datafile '/bbdd_odcgrc1r/datos/test.dbf' size 2g autoextend on next 100m maxsize 10g ;
Tablespace created.
SQL> create user test_perf identified by "Oracle_1" ;
User created.
SQL> grant connect to test_perf ;
Grant succeeded.
SQL> grant create table to test_perf ;
Grant succeeded.
SQL> alter user test_perf quota unlimited on tbtest ;
User altered.
SQL>
Oracle Parallel Query without in memory option
I loaded the csv file using direct path sql loader:
Table TEST_PERFORMANCE:
20000000 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 1
Total logical records read: 20000000
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 4112
Total stream buffers loaded by SQL*Loader load thread: 0
Run began on Sat Jul 25 00:57:23 2020
Run ended on Sat Jul 25 00:57:34 2020
Obviously loading a file into the database is not the same ( at all ) than loading the file in panda. As Panda does not need to load the data into anything ( in this case a database data file )
SQL> desc test_perf.test_performance
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NUMBER
DATE_ID DATE
INSTANCE_ID NUMBER
TERRITORY_ID VARCHAR2(10 CHAR)
CODE VARCHAR2(10 CHAR)
PRICE NUMBER
CURRENCY_CODE_ID VARCHAR2(10 CHAR)
PRICE_IN_USD NUMBER
I collect statistics of the table and now let's see how it behaves
SQL> exec dbms_stats.gather_table_stats ( 'TEST_PERF' , 'TEST_PERFORMANCE' , block_sample => true );
PL/SQL procedure successfully completed.
Group by by currency_code_id ( without sorting )
SQL> select count(*) , currency_code_id from test_perf.test_performance group by currency_code_id
90 rows selected.
Elapsed: 00:00:00.35
Group by currency_code_id but sorting by counter
SQL> select count(*) , currency_code_id from test_perf.test_performance group by currency_code_id order by count(*) desc ;
90 rows selected.
Elapsed: 00:00:00.70
Sorting all records and showing them by sqlplus consume a lot of time just for creating the output, but the query itself is ultra fast
SQL> select * from test_perf.test_performance order by PRICE_IN_USD desc ;
20000000 rows selected.
Elapsed: 00:01:31.48
Execution Plan
----------------------------------------------------------
Plan hash value: 1897103579
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20M| 972M| | 5940 (1)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | | |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 20M| 972M| | 5940 (1)| 00:00:01 | Q1,01 | P->S | QC (ORDER) |
| 3 | SORT ORDER BY | | 20M| 972M| 1376M| 5940 (1)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 20M| 972M| | 435 (1)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND RANGE | :TQ10000 | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | P->P | RANGE |
| 6 | PX BLOCK ITERATOR | | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | PCWC | |
| 7 | TABLE ACCESS FULL| TEST_PERFORMANCE | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is 32 because of table property
Statistics
----------------------------------------------------------
364 recursive calls
3 db block gets
36963 consistent gets
45558 physical reads
2860 redo size
703698256 bytes sent via SQL*Net to client
14667271 bytes received via SQL*Net from client
1333335 SQL*Net roundtrips to/from client
64 sorts (memory)
1 sorts (disk)
20000000 rows processed
Real time consumed by the query itself is
SQL> select * from test_perf.test_performance order by PRICE_IN_USD desc ;
Elapsed: 00:00:02.03
Execution Plan
----------------------------------------------------------
Plan hash value: 1897103579
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20M| 972M| | 5940 (1)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | | |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 20M| 972M| | 5940 (1)| 00:00:01 | Q1,01 | P->S | QC (ORDER) |
| 3 | SORT ORDER BY | | 20M| 972M| 1376M| 5940 (1)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 20M| 972M| | 435 (1)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND RANGE | :TQ10000 | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | P->P | RANGE |
| 6 | PX BLOCK ITERATOR | | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | PCWC | |
| 7 | TABLE ACCESS FULL| TEST_PERFORMANCE | 20M| 972M| | 435 (1)| 00:00:01 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is 32 because of table property
Select the distinct territory_id of the table
SQL> select distinct territory_id from test_perf.test_performance order by territory_id desc ;
136 rows selected.
Elapsed: 00:00:00.58
For the pivot I chose this simple example
SQL> select * from test_perf.test_performance
pivot ( count(*) for TERRITORY_ID in
(
'ZW',
'ZM',
'ZA',
'VN',
'VG',
'VE',
'UZ',
'UY',
'US',
'UG',
'UA',
'TZ',
'TW',
'TT',
'TR',
'TM',
'TJ',
'TH',
'TG',
'SZ',
'SV',
'SN',
'SK',
'SI',
'SG',
'SE',
'SA',
'RW',
'RU',
'QA',
'PY',
'PT',
'PL',
'PH',
'PG',
'PE',
'PA',
'OM',
'NZ',
'NP',
'NO',
'NL',
'NI',
'NE',
'NA',
'MZ',
'MY',
'MX',
'MU',
'MT',
'MO',
'MN',
'ML',
'MK',
'MD',
'LV',
'LU',
'LT',
'LK',
'LB',
'LA',
'KZ',
'KY',
'KW',
'KR',
'KN',
'KH',
'KG',
'JP',
'JO',
'JM',
'IT',
'IS',
'IN',
'IL',
'IE',
'ID',
'HU',
'HT',
'HR',
'HN',
'HK',
'GW',
'GT',
'GR',
'GM',
'GH',
'GD',
'GB',
'GA',
'FR',
'FM',
'FJ',
'FI',
'ES',
'EG',
'EE',
'EC',
'DO',
'DM',
'DK',
'DE',
'CZ',
'CY',
'CV',
'CR',
'CO',
'CL',
'CI',
'CH',
'CA',
'BZ',
'BY',
'BW',
'BS',
'BR',
'BO',
'BN',
'BM',
'BJ',
'BH',
'BG',
'BF',
'BE',
'BA',
'AZ',
'AW',
'AU',
'AT',
'AR',
'AO',
'AM',
'AL',
'AI',
'AG',
'AE'
) )
order by id
Elapsed: 00:00:04.74
Oracle In Memory Option
I configured an In memory area of 4GB, which is not so much.
Total System Global Area 1.2885E+10 bytes
Fixed Size 12192520 bytes
Variable Size 5184161016 bytes
Database Buffers 3388997632 bytes
Redo Buffers 4583424 bytes
In-Memory Area 4294967296 bytes
Database mounted.
Database opened.
SQL> SHOW PARAMETER INMEMORY
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
inmemory_adg_enabled boolean TRUE
inmemory_clause_default string
inmemory_expressions_usage string ENABLE
inmemory_force string DEFAULT
inmemory_max_populate_servers integer 8
inmemory_query string ENABLE
inmemory_size big integer 4G
inmemory_trickle_repopulate_servers_ integer 1
percent
inmemory_virtual_columns string MANUAL
optimizer_inmemory_aware boolean TRUE
SQL> ALTER TABLE TEST_PERF.TEST_PERFORMANCE INMEMORY PRIORITY HIGH;
Table altered.
SQL> select segment_name
, partition_name
2 3 , inmemory_size / 1024 / 1024 as inmemory_size_mb
, bytes / 1024 / 1024 as bytes_mb
, populate_status
4 5 6 , trunc(bytes / inmemory_size, 1) * 100 as compression_ratio
from v$im_segments
7 8 order by segment_name, partition_name;
SEGMENT_NAME
--------------------------------------------------------------------------------
PARTITION_NAME
--------------------------------------------------------------------------------
INMEMORY_SIZE_MB BYTES_MB POPULATE_STAT COMPRESSION_RATIO
---------------- ---------- ------------- -----------------
TEST_PERFORMANCE
362.25 514.046875 COMPLETED 140
SQL> select count(*),length(date_id) from test_perf.test_performance group by length(date_id)
Execution Plan
----------------------------------------------------------
Plan hash value: 3227171220
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 121 | 29 (59)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 11 | 121 | 29 (59)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 11 | 121 | 29 (59)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 11 | 121 | 29 (59)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 11 | 121 | 29 (59)| 00:00:01 | Q1,00 | P->P | HASH |
| 6 | HASH GROUP BY | | 11 | 121 | 29 (59)| 00:00:01 | Q1,00 | PCWP | |
| 7 | PX BLOCK ITERATOR | | 20M| 209M| 14 (15)| 00:00:01 | Q1,00 | PCWC | |
| 8 | TABLE ACCESS INMEMORY FULL| TEST_PERFORMANCE | 20M| 209M| 14 (15)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is 32 because of table property
Let's test some queries.
Group by territory
SQL> select count(*),TERRITORY_ID from test_perf.test_performance group by TERRITORY_ID ;
136 rows selected.
Elapsed: 00:00:00.24
Group by instance_id
SQL> select count(*) , INSTANCE_ID from test_perf.test_performance group by INSTANCE_ID
11251 rows selected.
Elapsed: 00:00:00.27
Group by 2 fields
SQL> select count(*), instance_id, territory_id from test_perf.test_performance group by instance_id, territory_id ;
278269 rows selected.
Elapsed: 00:00:00.84
As you can , obviously when I have one table in memory, the performance of these operations increase, but you would notice more improvement if you start applying analytics to the queries.
Several statistics functions and group by
SQL> select territory_id,sum(to_number(price)),avg(to_number(price)),max(to_number(price)),min(to_number(price))
2 from test_perf.test_performance group by territory_id ;
Elapsed: 00:00:00.57
However, PIVOT is usually a bad idea for inmemory, because of the column storage used for the tables.
The same query with pivot executed before takes
Elapsed: 00:00:15.93
As you can see in the examples above, I did not change the PARALLEL property of the table, so let me show you how behaves the query when parallel is not enable for an object which has INMEMORY option
SQL> alter table test_perf.TEST_PERFORMANCE noparallel ;
Table altered.
SQL> select count(*) , currency_code_id from test_perf.test_performance group by currency_code_id
2 ;
90 rows selected.
Elapsed: 00:00:02.14
Execution Plan
----------------------------------------------------------
Plan hash value: 151279035
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 90 | 450 | 909 (62)| 00:00:01 |
| 1 | HASH GROUP BY | | 90 | 450 | 909 (62)| 00:00:01 |
| 2 | TABLE ACCESS INMEMORY FULL| TEST_PERFORMANCE | 20M| 95M| 411 (16)| 00:00:01 |
------------------------------------------------------------------------------------------------
IMPQ
For the last part, I leave the best of all the options, a combination of both parallel and inmemory, called IMPQ or In Memory Parallel Query. This feature provides the best of both worlds, and in order be kick in you have to enable parallel for your table , have the table in the inmemory area and define the parameter PARALLEL_DEGREE_POLICY to AUTO.
An example of this would be
SQL> alter session set parallel_degree_policy=auto ;
Session altered.
SQL> set autotrace traceonly explain
SQL> select count(*),territory_id from test_perf.test_performance group by territory_id
Execution Plan
----------------------------------------------------------
Plan hash value: 3227171220
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 136 | 408 | 78 (59)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 136 | 408 | 78 (59)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 136 | 408 | 78 (59)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 136 | 408 | 78 (59)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 136 | 408 | 78 (59)| 00:00:01 | Q1,00 | P->P | HASH |
| 6 | HASH GROUP BY | | 136 | 408 | 78 (59)| 00:00:01 | Q1,00 | PCWP | |
| 7 | PX BLOCK ITERATOR | | 20M| 57M| 38 (16)| 00:00:01 | Q1,00 | PCWC | |
| 8 | TABLE ACCESS INMEMORY FULL| TEST_PERFORMANCE | 20M| 57M| 38 (16)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------------------------
Note
-----
- automatic DOP: Computed Degree of Parallelism is 12
Check in the above statement the change in DOP from 32 to 12. That indicates that IMPQ has identify that the best degree for the statement is 12, instead of 32 which was derived of the CPU_COUNT * 2 ( as the table was created with degree default automatic ).
This query below now with IMPQ runs in only 0.14 seconds, instead of 0.35.
SQL> select count(*),territory_id from test_perf.test_performance group by territory_id
2 ;
136 rows selected.
Elapsed: 00:00:00.14
This other query now takes 0.14 seconds when before took 0.70 seconds
SQL> select count(*) , currency_code_id from test_perf.test_performance group by currency_code_id order by count(*) desc ;
90 rows selected.
Elapsed: 00:00:00.14
Summary
If you need times lower than a second, and you are using analytics mostly, Oracle in-memory option might be a good idea. If that is the case, you might want also enable IMPQ to obtain the best results. There will be a work to do in regards which columns should or not be included, how the table is populated into the inmemory area, etc. The greater the complexity in the analytics the greater the performance you will notice.
Nevertheless, if you can live with times between 1 and 5 seconds, Parallel query without indexes might be a solution free of cost and quite easy to configure.
Feel free to comment anything or ask for any clarification regarding the test case.