1

I have a table which looks like the follwing.

EMPLID EFFDT    EFFSEQ     DEPTID   JOBCODE FLAG1     FLAG2      FLAG3
  X    <date>   <number>   <text>   <text>  <number>  <number>   <number>
  X                         
  Y                         
  Z    <date1>  <number1>  <text1>  <text1> <number1>  <number1>  <number1>
  Z    <date2>  <number2>  <text2>  <text2> <number2>  <number2>  <number2>
  Z                         

What I'm not able to figure out is, how to eliminate the 2nd and the last row while managing to keep the 3rd row (EMPLID=Y)

In words, I need to eliminate records of employees where there is one with full population and keep the ones which is the only one which is empty. This is a part of a much larger problem I'm having to face. I wish I had the time to discuss the whole problem.

Thank you

  • 1
    **What** database system, and which version?? **SQL** is just the Structured Query Language - a language used by many database systems - SQL is **NOT** a database product... stuff like this is very often vendor-specific - so we really need to know what database system you're using.... – marc_s May 10 '12 at 18:48
  • possible duplicate of [SQL - How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows) – Chris Gessler May 10 '12 at 18:53
  • I have a local table in access. I need to know logically, how i could go about doing that I asked above. – user1387874 May 10 '12 at 18:53

3 Answers3

0

You need a 'DELETE DUPE' query. The idea behind this type of query is to group rows by some unique key, order by either primary key or maybe a date time field, then delete everything except the first one, or the last one. If you need to isolate groups that have duplicates, add a HAVING clause (i.e. having count(*) > 1)

Here's one on SO:

T-SQL: Deleting all duplicate rows but keeping one

with cte as (   
  select [foo], [bar],       
    row_number() over (partition by foo, bar order by baz) as [rn]   from table ) 
delete cte where [rn] > 1 

If you're not using SQL Server, I'm sure there's a method to do this in almost all other DBMS.

Here's another one that's more generic on SO:

How can I remove duplicate rows?

DELETE MyTable  
FROM MyTable 
LEFT OUTER JOIN (    
  SELECT MIN(RowId) as RowId, Col1, Col2, Col3     
  FROM MyTable     
  GROUP BY Col1, Col2, Col3 
) as KeepRows ON    MyTable.RowId = KeepRows.RowId 
WHERE    KeepRows.RowId IS NULL 
Community
  • 1
  • 1
Chris Gessler
  • 22,727
  • 7
  • 57
  • 83
0

Depending on the columns that you suppose to be null or not you can use :

DELETE TABLENAME
WHERE
EMPLID IN (
SELECT DISTINCT EMPLID
    FROM TABLENAME
WHERE EFFDT IS NOT NULL [OR EFFSEQ IS NOT NULL [...]]
) AND EFFDT IS NULL [AND EFFSEQ IS NULL [...]]
Amine Chafai
  • 188
  • 2
  • 10
-1

I am assuming you only want to remove the rows in your query, not from the table itself:

select *
from MyTable
where EMPLID = 'Y'
    or coalesce(EFFDT, '') <> ''
D'Arcy Rittich
  • 167,292
  • 40
  • 290
  • 283
  • Hi RedFilter. I do not understand. My table has about 50k records with a combination of cases like the emlid's x,y and z. I'm also not clear on what coalesce does. I've taken a quick look online and its not very clear. – user1387874 May 10 '12 at 18:43
  • @user `COALESCE` returns the first non-NULL value, else NULL. I am using it to convert NULLs to an empty string in this case. It is not needed if you have no NULL values. It could be replaced with `or EFFDT <> '' ` – D'Arcy Rittich May 10 '12 at 19:01