Need help figuring out logic to manipulate data in a local table (SQL)

Question

I have a table which looks like the follwing.

EMPLID EFFDT    EFFSEQ     DEPTID   JOBCODE FLAG1     FLAG2      FLAG3
  X    <date>   <number>   <text>   <text>  <number>  <number>   <number>
  X                         
  Y                         
  Z    <date1>  <number1>  <text1>  <text1> <number1>  <number1>  <number1>
  Z    <date2>  <number2>  <text2>  <text2> <number2>  <number2>  <number2>
  Z

What I'm not able to figure out is, how to eliminate the 2nd and the last row while managing to keep the 3rd row (EMPLID=Y)

In words, I need to eliminate records of employees where there is one with full population and keep the ones which is the only one which is empty. This is a part of a much larger problem I'm having to face. I wish I had the time to discuss the whole problem.

Thank you

**What** database system, and which version?? **SQL** is just the Structured Query Language - a language used by many database systems - SQL is **NOT** a database product... stuff like this is very often vendor-specific - so we really need to know what database system you're using.... — marc_s, May 10 '12 at 18:48
possible duplicate of [SQL - How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows) — Chris Gessler, May 10 '12 at 18:53
I have a local table in access. I need to know logically, how i could go about doing that I asked above. — user1387874, May 10 '12 at 18:53

score 0 · Answer 1 · edited May 23 '17 at 12:04

You need a 'DELETE DUPE' query. The idea behind this type of query is to group rows by some unique key, order by either primary key or maybe a date time field, then delete everything except the first one, or the last one. If you need to isolate groups that have duplicates, add a HAVING clause (i.e. having count(*) > 1)

Here's one on SO:

T-SQL: Deleting all duplicate rows but keeping one

with cte as (   
  select [foo], [bar],       
    row_number() over (partition by foo, bar order by baz) as [rn]   from table ) 
delete cte where [rn] > 1

If you're not using SQL Server, I'm sure there's a method to do this in almost all other DBMS.

Here's another one that's more generic on SO:

How can I remove duplicate rows?

DELETE MyTable  
FROM MyTable 
LEFT OUTER JOIN (    
  SELECT MIN(RowId) as RowId, Col1, Col2, Col3     
  FROM MyTable     
  GROUP BY Col1, Col2, Col3 
) as KeepRows ON    MyTable.RowId = KeepRows.RowId 
WHERE    KeepRows.RowId IS NULL

score 0 · Answer 2 · answered Mar 30 '13 at 18:33

Depending on the columns that you suppose to be null or not you can use :

DELETE TABLENAME
WHERE
EMPLID IN (
SELECT DISTINCT EMPLID
    FROM TABLENAME
WHERE EFFDT IS NOT NULL [OR EFFSEQ IS NOT NULL [...]]
) AND EFFDT IS NULL [AND EFFSEQ IS NULL [...]]

D'Arcy Rittich · Answer 3 · 2012-05-10T19:01:59.293

-1

I am assuming you only want to remove the rows in your query, not from the table itself:

select *
from MyTable
where EMPLID = 'Y'
    or coalesce(EFFDT, '') <> ''

edited May 10 '12 at 19:01

answered May 10 '12 at 18:34

D'Arcy Rittich

167,292
40
290
283

Hi RedFilter. I do not understand. My table has about 50k records with a combination of cases like the emlid's x,y and z. I'm also not clear on what coalesce does. I've taken a quick look online and its not very clear. – user1387874 May 10 '12 at 18:43
@user `COALESCE` returns the first non-NULL value, else NULL. I am using it to convert NULLs to an empty string in this case. It is not needed if you have no NULL values. It could be replaced with `or EFFDT <> '' ` – D'Arcy Rittich May 10 '12 at 19:01

Need help figuring out logic to manipulate data in a local table (SQL)

3 Answers3