0

Pretty new to postgres and seem to have an issue here and I am not sure which direction to go in.

I have a query that is causing some performance issues and I cannot work out how to optimize it.

The query is actually quite simple:

SELECT transactions.* FROM transactions
LEFT OUTER JOIN companies ON "companies"."id" = "transactions"."company_id"
WHERE companies.code ILIKE '%777%'
ORDER BY transactions.id desc LIMIT 10

I want to find all transactions that are for a company that has 777 in the name. There are around 20million transactions in the database and around 200 companies.

The query currently times out even though we have a LIMIT of 10. I believe this is because the ILIKE is not using an index so it's horribly slow.

Would this be an instance where you may want to add the WHERE filter to the JOIN instead? I have tested this and it works lightning-quick IF a record is found. If no record is found it again times out.

Is there a type of Index we should be looking at which would improve the speed here?

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • 2
    note that your `WHERE` condition effectively turns your outer join back into an inner join. –  Nov 17 '21 at 15:43
  • (1) I doubt that any RDBMS would be able to use index for comparisons **starting** with a wild card. (2) If you want to keep your `JOIN` as `OUTER` move your condition from `WHERE` to m`ON`. – PM 77-1 Nov 17 '21 at 15:46
  • 1
    Please **[edit]** your question and add the [execution plan](https://www.postgresql.org/docs/current/static/using-explain.html) generated using **`explain (analyze, buffers, format text)`** (_not_ just a "simple" explain) as [formatted text](http://stackoverflow.com/help/formatting) and make sure you preserve the indention of the plan. Paste the text, then put `\`\`\`` on the line before the plan and on a line after the plan. Please also include complete `create index` statements for all indexes as well. –  Nov 17 '21 at 15:46
  • @PM77-1: yes, Postgres could use a trigram index for that kind of condition. But I guess it's the sorting of a huge number of transactions that makes this slow, not _finding_ them. –  Nov 17 '21 at 15:47
  • Check this out. https://stackoverflow.com/questions/69216373/postgres-select-ilike-text-is-slow-on-large-string-rows/69223762#69223762 You may be able to put a trigram index on your `companies.code` column to speed up your otherwise pathological `ILIKE '%constant%'` WHERE filter. – O. Jones Nov 17 '21 at 19:08
  • The reason the LIMIT 10 is not helping is because you ask it to get all results first and then sort them and then get the first 10. Remove the order by and you'll likely get some results sooner. That said, you don't know which ones you'll get; if there is no ORDER BY you need to consider consider the output being a completely random selection, even when it looks like it's the same one over and over. – deroby Nov 18 '21 at 12:46

2 Answers2

0

You can try an EXISTS condition as you don't need any columns from the companies table:

SELECT tr.* 
FROM transactions tr
WHERE exists (select *  
              from companies c 
              where c.id = tr.company_id 
                and c.code ILIKE '%777%')
ORDER BY tr.id desc 
LIMIT 10

But in the end, the order by is probably the bottleneck here. If e.g. 10 million transactions are returned, then sorting those 10 million rows will take some time.

  • Similar behavior with the subquery. Times out after 30 seconds on my read replica. If I remove the ILIKE in its entirety the system returns me 10 rows pretty quickly with the ORDER BY. – Afrodog Nov 17 '21 at 16:20
0

An EXPLAIN of the query would be helpful. That said, I don't think the ILIKE is bottlenecking you here but rather a scan on the transactions table. Followed by a potentially large ORDER BY sorting operation

Although (I think) the system is already doing this automagically, let's split up the functionalities and force the order of operations manually by using an (indexed) temptable.

Something along the lines of this:

SELECT id
  INTO TEMPORARY TABLE matching_companies
  FROM companies 
 WHERE companies.code ILIKE '%777%';
 
CREATE UNIQUE INDEX idx_matching_companies_id ON matching_companies (id);

SELECT t.* 
  FROM transactions t
  JOIN matching_companies c
    ON c."id" = t."company_id"
ORDER BY t.id desc LIMIT 10;

Assuming few companies match the 777-requirement, this brings to light that an index on the transactions table on company_id should come in handy here.

I wouldn't be too surprised if simply adding the index might already speed up your original query because IMHO the system will split the query in pretty much the same operations (minus the idx_matching_companies_id maybe)

deroby
  • 5,902
  • 2
  • 19
  • 33