Combine two tables into a new one so that select rows from the other one are ignored

Question

I have two tables that have identical columns. I would like to join these two tables together into a third one that contains all the rows from the first one and from the second one all the rows that have a date that doesn't exist in the first table for the same location.

Example:

transactions:

date    |location_code| product_code | quantity 
------------+------------------+--------------+----------
2013-01-20 | ABC         | 123          |  -20         
2013-01-23 | ABC         | 123          |  -13.158
2013-02-04 | BCD         | 234          |  -4.063

transactions2:

date    |location_code| product_code | quantity 
------------+------------------+--------------+----------
 2013-01-20 | BDE         | 123          |  -30         
 2013-01-23 | DCF         | 123          |  -2
 2013-02-05 | UXJ         | 234          |  -6

Desired result:

date    |location_code| product_code | quantity 
------------+------------------+--------------+----------
 2013-01-20 | ABC         | 123          |  -20         
 2013-01-23 | ABC         | 123          |  -13.158
 2013-01-23 | DCF         | 123          |  -2
 2013-02-04 | BCD         | 234          |  -4.063
 2013-02-05 | UXJ         | 234          |  -6

How would I go about this? I tried for example this:

SELECT date, location_code, product_code, type, quantity, location_type, updated_at
      ,period_start_date, period_end_date
   INTO transactions_combined
   FROM ( SELECT * FROM transactions_kitchen k
          UNION ALL
          SELECT *
            FROM transactions_admin h
            WHERE h.date NOT IN (SELECT k.date FROM k)
        ) AS t;

but that doesn't take into account that I'd like to include the rows that have the same date, but different location. I have Postgresql 9.2 in use.

Your query doesn't agree with your sample data. Please repair this broken question. While being at it, make clear what you want by displaying the result according to your sample. Also `psql` is the default command line terminal of PostgreSQL. I fixed the tag. Now also tell us the version in use. — Erwin Brandstetter, Mar 21 '13 at 23:15
Clarified the question by stating the desired result with the example data and added the version number of my PostgreSQL. And yes, I know my query is broken, that's just give a sense of what I've tried :) — jesseniem, Mar 26 '13 at 11:20

score 1 · Answer 1 · answered Mar 21 '13 at 22:06

1

Change UNION ALL to just UNION and it should return only unique rows from each table.

answered Mar 21 '13 at 22:06

Ilion

6,772
3
24
47

`UNION` only excludes *complete duplicates* and is clearly wrong. – Erwin Brandstetter Mar 21 '13 at 23:33

Erwin Brandstetter · Accepted Answer · 2020-08-25T22:09:03.023

UNION simply doesn't do what you describe. This query should:

CREATE TABLE AS 
SELECT date, location_code, product_code, quantity
FROM   transactions_kitchen k

UNION  ALL
SELECT h.date, h.location_code, h.product_code, h.quantity
FROM   transactions_admin h
LEFT   JOIN transactions_kitchen k USING (location_code, date)
WHERE  k.location_code IS NULL;

LEFT JOIN / IS NULL to exclude rows from the second table for the same location and date. See:

Select rows which are not present in other table

Use CREATE TABLE AS instead of SELECT INTO. The manual:

CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.

Or, if the target table already exists:

INSERT INTO transactions_combined (<list names of target column here!>)
SELECT ...

Aside: I would not use date as column name. It's a reserved word in every SQL standard and a function and data type name in Postgres.

Combine two tables into a new one so that select rows from the other one are ignored

2 Answers2

Linked