I have a MySQL DB which is currently appending new data from CSV files. Some of the rows are duplicates apart from a few columns (i.e last_status).
I would like to filter out the whole DB by only capturing the latest record based on MAX(last_status).
Assuming I have 4 rows with similar ID:
ID, last_status,....(50 other columns)
100, 08/08/2020,...
100, 09/08/2020,...
200, 09/08/2020,...
200, 11/08/2020,...
I expect the result to be like this for all records:
ID, last_status,...(50 other columns)
100, 09/08/2020,...
200, 11/08/2020,...
I have tried this:
select *
from total_report s1
inner join
(
select MAX(last_status) as last_status, ID
from htotal_report
group by ID
) s2
on s1.ID = s2.ID
and s1.last_status = s2.last_status
The reason I need this is to visualize the data in Tableau and now it is counting the duplicated ID rows separately. I would like to create a View based on the question that could be used in Tableau to solve this.
Edit: ID is a unique ID, not an auto-incremented ID, nor a PK or FK.