I'm trying to find and remove duplicates from a table consisting of 29 columns as follows:
[Partitioned Commodity Code] (nvarchar(max),null)
[Supplier] (nvarchar(255),null)
[Manufacturer] (nvarchar(255),null)
[Order ID] (nvarchar(55),null)
[Status] (nvarchar(25),null)
[Ordered Date] (nvarchar(105),null)
[Order Method] (nvarchar(55),null)
[Ship to Address] (nvarchar(255),null)
[Address] (nvarchar(255),null)
[City] (nvarchar(55),null)
[State] (nvarchar(55),null)
[Postal Code] (nvarchar(55),null)
[Requisitioner] (nvarchar(100),null)
[Accounting Entity] (nvarchar(255),null)
[Preparer] (nvarchar(255),null)
[Item No] (float,null)
[Description] (nvarchar(max),null)
[Nominal Account] (nvarchar(255),null)
[LineItem ProjectCode] (float,null)
[Cost Center] (nvarchar(100),null)
[CostCenter Description] (nvarchar(max),null)
[Supplier Part Number] (float,null)
[Is Catalogue] (nvarchar(55),null)
[Unit Price] (float,null)
[Quantity] (float,null)
[Line Item Total] (float,null)
[Line Item Total USD] (float,null)
[Currency] (nvarchar(55),null)
[Request Cross Reference] (nvarchar(55),null)
In this dataset, there is no primary key and honestly I can't figure out what combination of keys can be treated as primary key to find and remove duplicates.
I tried the code below considering [Order ID]
:
select *
from [PO_BuySmart_Main] a
join
(select [Order ID], count(*) as OID
from [PO_BuySmart_Main]
group by [Order ID]
having count(*) > 1 ) b on a.[Order ID] = b.[Order ID]
order by a.[Order ID]
I want to consider all columns to check duplicate for and delete those records accordingly.
[PO_BuySmart_Main]
is the table name for which a single stored procedure will do this task.
Any help would be much appreciated, thanks.