So I have two tables (let's say x and y). Most of the data from both table are duplicated but there are some rows that are different. I insert whole data from those two tables into a new table (let's say table_mixed). There's one column that indicate table's date e.g. 20190307 for x and 20190308 for y So, for any duplicated rows, there will be a date column that is different.
num Code col1 col2 col3..... import_date file_date
-------- --------- ----------------- ---------- ----------
01 AA ...... 20190308 20190307
01 AA ...... 20190308 20190308
02 AA ...... 20190308 20190307
03 BB ...... 20190308 20190308
What I am trying to do is, I want to query a data such that, show a non-duplicated row from both table and for any duplicated row, shows only a row with recent date.
I have done some finding and I have tried this:
select *,max(file_date) over (partition by stx_import_date) max_date
from table_mixed;
where file_date is a date that tell the different date from each table and every row from both table has the same import_date.
num Code col1 col2 col3...... import_date file_date max_date
-------- --------- ------------ ---------- ----------
01 AA ...... 20190308 20190307 20190308
01 AA ...... 20190308 20190308 20190308
02 AA ...... 20190308 20190307 20190307
03 BB ...... 20190308 20190308 20190308
The result from this query show every row (including all duplicated row) and add another column (max_date) that show only the recent file_date for each of the row. But I want the result to show only what I mentioned above and no additional column (max_date).
This is the result that I am looking for:
num Code col1 col2 col3... import_date file_date
-------- --------- ------------ ---------- --------
01 AA ...... 20190308 20190308
02 AA ...... 20190308 20190307
03 BB ...... 20190308 20190308
Thank you
P.S Not only column num, code and import date that need to be duplicated but also other columns that I ..... So, what I mean duplicated row >> every column except file_date (which I have 10+ columns)
P.S.2 I edited the example so that you guy want get me wrong. There're other columns (like col1, col2, col3 and so on) that also used. How should I use partitioned by in this case