To work on database related stuffs. Mostly it is done when client send you its data in form of excel sheets and you push that data to database tables after some excel manipulations. I have also done it many times.
A very common problem faced in this approach is that it might result in duplicate rows at times, because data sent is mostly from departments like HR and finance where people are not well aware of data normalization techniques [:-)].
I will use Employee table where column names are id, name, department and email.
Below are the SQL scripts for generating the test data.
Create schema TestDB;
CREATE TABLE EMPLOYEE
(
ID INT,
NAME Varchar(100),
DEPARTMENT INT,
EMAIL Varchar(100)
);
INSERT INTO EMPLOYEE VALUES (1,'Anish',101,'anish@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (2,'Lokesh',102,'lokesh@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (3,'Rakesh',103,'rakesh@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (4,'Yogesh',104,'yogesh@howtodoinjava.com');
--These are the duplicate rows
INSERT INTO EMPLOYEE VALUES (5,'Anish',101,'anish@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (6,'Lokesh',102,'lokesh@howtodoinjava.com');
Solution:
DELETE e1 FROM EMPLOYEE e1, EMPLOYEE e2 WHERE e1.name = e2.name AND e1.id > e2.id;