-2

I have over 3000 rows of duplicate data and I basically need to delete the duplicate data where Env_ID ='1' but I firstly need to check whether there are some data rows with Env_id = '1' that are not duplicates compared to Env_ID ='4'

Is this clear? Can anyone please assist?

Hardik
  • 3,815
  • 3
  • 35
  • 45
  • so basically this is the code (not entirely accurate) select * from [Core_IBE].[dbo].[CR_ITEM] where Env_ID = '1' BUT IS NOT '4' – user3983253 Aug 27 '14 at 14:09
  • This isn't very clear can you please post some table structure, sample data, and what your expected duplicate in the sample data is. – ZeRaTuL_jF Aug 27 '14 at 14:09
  • 2
    @user3983253 can you take a look into http://sqlfiddle.com/ and add that to your question? – SOfanatic Aug 27 '14 at 14:12
  • sorry -- this would be (Item_Number) – user3983253 Aug 27 '14 at 14:13
  • So within table (CR_ITEM) I have 7300 rows, 3434 have Env_id = '1' and 3867 have Env_ID ='4'. I want to safely remove the Env ID = '1' data but I needed to firstly check whether there are any rows containing ENV ID ='1' that are not duplicates of ENV_ID = '4'. Hope this is clearer. – user3983253 Aug 27 '14 at 14:21
  • That's about the same you said in the question.. Still not making it very clear though. Try to add some example data. Are you trying to delete rows that have `Env_id=1` but not also `Env_id=4`? – NickyvV Aug 27 '14 at 14:24
  • Is there a primary key on the table? – Keith Payne Aug 27 '14 at 14:26
  • I'm guessing that you are saying that there might be two rows one with Env_id=1 and another with Env_id=4 BUT the rest of the fields of the table might be exactly the same? That's what makes them duplicate? In order to help you at least put the schema of your table. – Yanire Romero Aug 27 '14 at 14:26
  • I'll try again, sorry. So Basically the env_id = '1' is the old environment and env_id = '4' is the new environment. So when the data got mirrored across to the data warehouse all the env_id = '1' data stayed in there. Before I delete it out of the table I needed to check whether there were any rows containing env_id = '1' that are not only duplicates – user3983253 Aug 27 '14 at 14:28
  • 1
    @user3983253 instead of describing what the problem is why can't you edit your question to contain, the table structure. This would be a lot more clear than explaining with words. Just post the table structure, some sample data and what the duplicates are in the sample data. – ZeRaTuL_jF Aug 27 '14 at 14:30
  • Source_ID Env_ID Record_ID Item_Number Item_Description Item_Class 1 1 IM 0010000500K2#TI BOLA B.1.6".5-1AC KA 1 4 IM 0010000500K2#TI BOLA B.1.6".5-1AC KA – user3983253 Aug 27 '14 at 14:34
  • How about the complete schema? Constraints? At least PK? – Yanire Romero Aug 27 '14 at 14:34
  • I strongly suggest looking at some of the other "duplicate rows in a table" questions on this site. Your question is likely already answered. Here is a good one to start you off http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows – Oliver Aug 27 '14 at 14:34
  • 6 columns all with duplicate data, the only difference is the env_id. I need to check there are no ROWS containing env_id = '1' that AREN'T DUPLICATES. – user3983253 Aug 27 '14 at 14:35
  • Source_id, Env_ID and Item_Number are all PK. There are no constraints – user3983253 Aug 27 '14 at 14:42
  • @user3983253 There are loads of people here trying to help you, and asking for more information so they can help. How about you giving us the information we need to help you? "Source_ID Env_ID Record_ID Item_Number Item_Description Item_Class 1 1 IM 0010000500K2#TI BOLA B.1.6".5-1AC KA 1 4 IM 0010000500K2#TI BOLA B.1.6".5-1AC KA" is absolutely meaningless to us. Edit your question, give us a table structure and some sample data and we will help, otherwise we can't. – HoneyBadger Aug 27 '14 at 14:44
  • Ok, the column names in my table are Source_ID, ENV_ID, Record_ID, Item_Number, Item_Description, Item Class. The sample data from these columns is (in order) (1,1,IM,0010000500K2#TI,BOLA B.1.6".5-1AC,KA). This example of data is historic data and needs removing from the database, there are over 3000 rows with ENV_ID = '1', all of these need deleting because they are from an old environment. The new environment has populated data into the database that has created duplicates of the historic data, the only difference is the ENV_ID (environment id) which is now 4 (tbc)... – user3983253 Aug 27 '14 at 14:58
  • So the duplicate rows are the same (with the execption of the ENV_ID '4'). I know there are over 3000 rows with ENV_ID '1' and over 3000 rows with ENV_ID '4' that contain all same data. Before I delete the ENV_ID '1' rows I wanted to check if there are any more rows in the database containing ENV_ID '1' that are not duplicated by ENV_ID '4' rows. Please tell me this makes sense, I dont know how else to explain lol – user3983253 Aug 27 '14 at 15:02

1 Answers1

0

Here it goes:

Schema:

create table TableName 
( Source_ID varchar(100) not null,
 Env_ID varchar(100) not null,
 Record_ID  varchar(100),
 Item_Number varchar(100) not null, 
 Item_Description varchar(100), 
 Item_Class varchar(100)
)
alter table TableName add constraint PK_Env primary key (Env_ID, Source_ID, Item_Number)

insert into TableName (Source_ID,
                   Env_ID, 
                   Record_ID, 
                   Item_Number, 
                   Item_Description,
                   Item_Class)
values ('1','1','IM','0010000500K2#TI', 'BOLA B.1.6".5-1AC','KA')

insert into TableName (Source_ID,
                   Env_ID, 
                   Record_ID, 
                   Item_Number, 
                   Item_Description,
                   Item_Class)
values ('1','4','IM','0010000500K2#TI', 'BOLA B.1.6".5-1AC','KA')

Solution from: How can I remove duplicate rows?

;WITH cte
 AS (SELECT ROW_NUMBER() OVER (PARTITION BY Source_ID, Record_ID, Item_Number, Item_Description, Item_Class
                                   ORDER BY (Env_ID) desc) RN
     FROM   TableName WHERE Env_ID = '1' or Env_ID = '4')
DELETE FROM cte
WHERE  RN > 1

It checks all fields BUT Env_ID. Hope it helps and next time please: https://stackoverflow.com/help/how-to-ask

Community
  • 1
  • 1
Yanire Romero
  • 480
  • 3
  • 13