0

I have a problem with database query. I have three tables projects, developers and email. In developers table, there are a lot of rows with same name but different email. I have to insert the distinct names but all the emails(in the row of name to which they belong) in email table i.e

    example
    /////////////////////////////////////////////
     developers table have records:-
     id_developer  project_id  name  email
          0            1       umar  umar@gmail.com
          1            1       umar  umar@developers.com

     Now i want to inert the data in email table as:-
     user_id    name      email_ids
        0       umar      umar@gmail.com
                          umar@developers.com
    ////////////////////////////////////////////

    projects
    ----------
    id_project
    name
    ----------

    developers
    ----------
    id_developer
    project_id
    name
    email
    ----------

    email
    ----------
    user_id
    name
    email_ids
    ----------

Following is my current query. Please help me. Thanks in advance

    INSERT INTO email(user_id, dev_name, email_ids) 
    SELECT p.id_project, 
           d.name, 
           d.email 
      FROM projects p 
            INNER JOIN developers AS d 
            ON p.id_project = d.project_id 
     WHERE d.name IN (SELECT name 
                        FROM developers 
                       GROUP BY name HAVING(COUNT(name) > 1 )) 
     GROUP BY(d.name)
Jorge Campos
  • 22,647
  • 7
  • 56
  • 87
Muhammad Umar
  • 349
  • 3
  • 14
  • 1
    Why are you trying to de-normalize your data? You shouldn't do that. – Jorge Campos May 15 '17 at 22:02
  • @JorgeCampos, I am implementing Oliva's strategy here. I which I have to identify same developers having different email ids. Which i will insert into another table to make it as a record. – Muhammad Umar May 15 '17 at 22:05
  • 1
    Sorry my ignorance here, what is "Oliva's strategy" ? Do you have any article on it? – Jorge Campos May 15 '17 at 23:29
  • Yes I have research papers on it. The strategy says that even a person have different email addresses, He would mention same name during registration process. Now I have to implement this here. I have explained the query to my best. Now what are the possible solutions to achieve this – Muhammad Umar May 15 '17 at 23:32
  • So, "even a person have different email addresses" why not have a table for person (or developer if a person here is a developer) and a table for emails linked with persons table? That would do the job just fine and you would have your data perfectly normalized. Having a field with delimited data will give you a lot of headaches in the future. – Jorge Campos May 15 '17 at 23:48
  • Are you talking about trigger.? Or if not then can you please give a solution as an answer? – Muhammad Umar May 15 '17 at 23:53
  • No, I'm not talking about a trigger, just a proper way of modeling your data. I will provide you an answer on that. – Jorge Campos May 15 '17 at 23:54
  • 1
    One question, your current attempted query implies that the project_id is the user id, so what is the developer table? I think that the user_id should be actually the developer_id, Am I right? If not, please clarify! – Jorge Campos May 16 '17 at 00:00
  • Yes you are right – Muhammad Umar May 16 '17 at 00:02
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/144296/discussion-between-muhammad-umar-and-jorge-campos). – Muhammad Umar May 16 '17 at 00:38

1 Answers1

1

After some conversation in the comments what you really need is a proper data modeling here.

Having the data the way you wan't in the database is a very bad practice.

  user_id    name      email_ids
    0       umar      umar@gmail.com
                      umar@developers.com

You will end it up having problems in the future to retrieves this data because you will have to figure out a way how to retrieve or split this data when you need then.

So, based on your current model to attend your requirement you would need just to change the table email a bit. Your model would be this way:

projects       developers        email
----------     -------------     ------------
id_project     id_developer      id
name           project_id        id_developer
               name              email
----------     -------------     ------------

So, since you already have the data in the developers table lets first drop table table email and recreate it the right way. You will need to execute:

drop table email;
create table dev_email( -- changed the name because there is a field with same name
    id INTEGER AUTO_INCREMENT NOT NULL,
    id_developer INTEGER NOT NULL, -- this column should be the same type 
                                   -- as id_developer in the table developers
    email VARCHAR(150) NOT NULL
    PRIMARY KEY pk_email (id),
    CONSTRAINT uk_developer_email UNIQUE (id_developer, email), -- that will avoid duplicates,
    CONSTRAINT fk_dev FOREIGN KEY (developer_id) 
         REFERENCES developers(id_developer)
          ON UPDATE RESTRICT ON DELETE RESTRICT
);

Now lets fill this table with the right data:

INSERT INTO dev_email (id_developer, email)
   SELECT min(id_developer), email
     FROM developers
    GROUP BY email;

After that we must delete the duplicated data from the developers table like so:

DELETE FROM developers d
  WHERE NOT EXIST (SELECT 1 
                     FROM dev_email de 
                    WHERE de.id_developer = d.id_developer);

Then we drop the column that is no longer needed in the developers table:

ALTER TABLE developers DROP COLUMN email;

This should give you a proper normalized model.

Now if you need to retrieve the developer with all emails concatenated (which is simpler than to split it) you just do:

 SELECT d.id_developer, 
        d.name,
        GROUP_CONCAT(e.email, ', ') as emails
   FROM developers d
            INNER JOIN dev_email e
               ON d.id_developer = e.id_developer
  GROUP BY d.id_developer, 
           d.name

PS.: I did all of this out of my head, please run it in a test environment first (a copy of your current database to be safe). It should be ok but better safe than sorry right?

Jorge Campos
  • 22,647
  • 7
  • 56
  • 87
  • Hi, I am getting error unidentified statement type near unique key and a comma or closing bracket was expected. I am trying to fix it. I will get back to you – Muhammad Umar May 16 '17 at 00:35
  • Oooh sorry, I just missed a comma after the unique key command. I will edit it. – Jorge Campos May 16 '17 at 00:35
  • Hahahahah I actually added the comma... after the comment xD – Jorge Campos May 16 '17 at 00:36
  • I fixed the unique syntaxe, it was wrong. It should be fine now. – Jorge Campos May 16 '17 at 00:47
  • The error you mention in the chat is because the referenced column is not the primary key or is with a different type (as I mentioned in the comments). Refer to this thread: http://stackoverflow.com/questions/4061293/mysql-cant-create-table-errno-150 – Jorge Campos May 16 '17 at 03:15
  • Hi @Jorge Campos. I will fix it today and mark it as an answer. Sorry for the late reply. Many thanks for your help – Muhammad Umar May 16 '17 at 10:28