0

I have one table, say buchas containing a list of products. I had to create another table buchas_type with the type (class if you will) of the products in buchas.

Now I have to reference the class of product in buchas, by adding a type_id column.

So, my problem can be divided in two parts:

I need a subquery to "guess" the product type by its name. That should be somewhat easy, since the name of the product contains somewhere the name of its type. Example:

   -----------------Name----------------- | ---Type---
               BUCHA GOB 1600                   GOB

The problem is I have a type GOB(2) that will mess things up with the type GOB. (I have also other look alike types raising the same problem).

So far, I got this:

SELECT buchas_type.id 
FROM buchas_type 
  JOIN buchas ON buchas.description LIKE '%' || buchas_type.type || '%'
ORDER BY length(type) desc LIMIT 1;

That will solve the problem with the look alike types, since it returns the longest match. However, I need a WHERE clause otherwise I get always the same buchas_type_id. If I try for instance Where buchas.id = 50 I get a correct result.

The second part of the problem is the UPDATE command itself. I need to fill up the recently created buchas_type_id from the subquery I showed in (1). So, for every row in buchas, it has to search for the type in buchas_type using the current row's description as a parameter.

How to achieve that?

Charley R.
  • 187
  • 1
  • 8

1 Answers1

1

Assuming you have a primary key on buchas, the first part of your problem can be solved with distinct on.

select * from buchas;

 buchas_id |    description    | buchas_type_id 
-----------+-------------------+----------------
         1 | BUCHA GOB 1600    |               
         2 | BUCHA GOB(2) 1700 |               
(2 rows)

select * from buchas_type;

 buchas_type_id |  type  
----------------+--------
              1 | GOB
              2 | GOB(2)
(2 rows)

select distinct on (b.buchas_id)  b.buchas_id, t.buchas_type_id, b.description, t.type
  from buchas b 
  join buchas_type t 
    on b.description ilike '%'||t.type||'%' 
 order by b.buchas_id, length(t.type) desc;

 buchas_id | buchas_type_id |    description    |  type  
-----------+----------------+-------------------+--------
         1 |              1 | BUCHA GOB 1600    | GOB
         2 |              2 | BUCHA GOB(2) 1700 | GOB(2)
(2 rows)

With this solved, you can do an update...from:

with type_map as (                                              
  select distinct on (b.buchas_id)  b.buchas_id, t.buchas_type_id
    from buchas b 
    join buchas_type t 
      on b.description ilike '%'||t.type||'%' 
   order by b.buchas_id, length(t.type) desc
)
update buchas
   set buchas_type_id = t.buchas_type_id
  from type_map t
 where t.buchas_id = buchas.buchas_id;

UPDATE 2

select * from buchas b join buchas_type t on t.buchas_type_id = b.buchas_type_id;
 buchas_id |    description    | buchas_type_id | buchas_type_id |  type  
-----------+-------------------+----------------+----------------+--------
         1 | BUCHA GOB 1600    |              1 |              1 | GOB
         2 | BUCHA GOB(2) 1700 |              2 |              2 | GOB(2)
(2 rows)
Mike Organek
  • 11,647
  • 3
  • 11
  • 26
  • Thank you Mike, it worked perfectly. While waiting for help, I kept trying to solve my problem and suspected that my subquery was the real issue. Your answer proved that right. I don't quite understand how the `Distinct On` statement solved the type lookalike issue. Do you mind explaining it? Thank you very much for your help. – Charley R. Jul 17 '20 at 17:31
  • @CharleyR. Here is a good explanation: https://stackoverflow.com/a/9795768/13808319 What we did was say that we want only one row per `buchas_id`. The row that is chosen is based on the `order by b.buchas_id, length(t.type) desc`. This gets us the longest matching `buchas_type` in case any shorter ones matched also. If two matches have the same length, then the one that gets picked is nondeterministic. Perhaps I should have put `order by b.buchas_id, length(t.type) desc, t.buchas_type_id` so that it will always return the row with the lowest `buchas_type_id` in case of ties. – Mike Organek Jul 17 '20 at 17:46
  • I see. Thank you once again. – Charley R. Jul 17 '20 at 18:00