0

I've following 2 tables in postgresql database -

table1

  • col1_id (integer)
  • col2 (text)
  • col3 (text)
  • col4 (text)
  • col5 (numeric(1,0))
  • col6 (numeric(9,0))
  • col7_created_date (timestamp),
  • col8 (text)
CREATE UNIQUE INDEX col1_pkey ON table1 USING btree (col1_id),  

table2

  • col1_id (integer)
  • tab2_col2 (text)
  • tab3_col3 (text)
  • tab4_col4 (integer)
  • tab5_col5 (text)
CREATE UNIQUE INDEX table2_pkey ON table2 USING btree (col1_id, tab3_col3)

FOREIGN KEY (col1_id) REFERENCES table1(col1_id) ON UPDATE RESTRICT ON DELETE RESTRICT

Note: From above tables definition, you would have understood that col1_id from table1 is not only a foreign key in table2 but also a part of primary key along with tab3_col3 in table2 to support 1:Many relationship.

The problem here is table1 have 10,00,000 rows & table2 has 50,00,000 records. So my queries are taking min 5 seconds in executing, Here are my queries -

Query1:

SELECT * 
FROM table1 t1  
WHERE (col6 >= ?)  
  AND col5 IN (?)  
  AND (t1.col8 LIKE ? OR t1.col8 LIKE ?) 
ORDER BY col7_created_date DESC 
LIMIT 50

Query2 :

SELECT COUNT(*) 
FROM table1 t1 
LEFT JOIN table2 t2 ON t2.col1_id = t1.col1_id 
WHERE t1.col7_created_date > ? 
  AND t2.tab4_col4 = ? 
  AND t2.tab3_col3 IN (?, ?) 
  AND a.tab2_col2 IN (?) 

Key Points to understand:

  1. table2 should have separate id column as a primary key to have a good performance because composite key can slow performance with a hig volume of data in tables. I can work on this.
  2. I can also work in bringing down some old records from both table1 & table2 but as of now this is not an option for me.
  3. In Query1 I can't remove "col7_created_date DESC" since I want top 50 records only, with pagination controlled from business logic.
  4. The "LIKE" search in Query1 are created dynamically, it can be 1 or 2..

Questions:

  1. If I create an index on t1.col8 (Query1) will it improve performance? If so, is there an efficient way to create an index? (NOTE: t1.col8 are comma separated text max up to 4)
  2. What I should do to improve performance for Query1 & Query2, I am open to change a DDL. Any suggestion(s)?
Sam
  • 23
  • 5

3 Answers3

0

Your tables have 1mln and 5 mln records, is this right? It really is not that big a deal for an index access.

The index does nothing for you in query1. Think about it, in order to select a record, the optimiser has to access col5 and col6 and col8 - there is no way to avoid accessing the table, because the index just does not have them. Not sure where you got this stuff about composite indexes - you create whatever indexes your queries dictate, and if you query wants col5 and col6 and col8 and col1 and col2 - this is what you have to do. Because whatever index you have is much better than full scanning the table, which is what you are doing now. Think which column offers the best selectivity and put it first in your composite index.

Is your Query2 giving you trouble too? What is the execution plan for it?

Bird
  • 1
  • 2
0

Creating an index on an integer field provides faster lookup, in your case however a b-tree index should work. Also columns with a high degree of uniqueness (having few repeated values) are also very good candidates for getting indexed. So if t1.col8 has this property, you should very well go ahead with an index on it. For more reference look at this.

0

Run the explain analyze to check the query execution time.. explain

Query 1:

SELECT * 
FROM table1 t1  
WHERE (col6 >= ?)  
  AND col5 IN (?)  
  AND (t1.col8 LIKE ? OR t1.col8 LIKE ?) 
ORDER BY col7_created_date DESC 
LIMIT 50

You can add composite index on col6 and col5.. these are numeric columns so it will increase the execution time.. Also you're using IN query which make the database slower.. Avoid using the in query

Same for second query..

After creating the different indexes you can run this code..

explain analyze SELECT * 
FROM table1 t1  
WHERE (col6 >= ?)  
  AND col5 IN (?)  
  AND (t1.col8 LIKE ? OR t1.col8 LIKE ?) 
ORDER BY col7_created_date DESC 
LIMIT 50

Which will give you the idea of what is taking time..

You can also paste that in question so i can suggest more indexes.

Ashok Gadri
  • 520
  • 4
  • 11