0

I have two tables whose data is as follows:

1st Table:

ID NAME   GENDER  AGE
1   test1   M     18
2   test2   F     20
3   test3   M     18

2nd Table:

ID SubjectCode
1    PHY1
2    PHY2
3    MATH1

I have created another table with 2 foreign keys both being Primary keys of above two tables. It is as follows:

Student_ID     Subject_ID
1               PHY1
2               MATH1  
1               PHY2
2               PHY2

This third table is created by inserting student id from the table and subject id from 2nd table. But as you can see one student can apply for multiple subjects, and that creates redundancy of student IDs. I am a beginner in SQL and want to convert the third table into 2nd normal form. How do I do it? Which operations should I perform that would help me achieve it?

I was a bit confused regarding 2NF; however, I don't want the IDs in my third table to repeat. I want the IDs to remain unique and the new subject codes corresponding to the same IDs to be included in a different column. I hope I have conveyed what I mean to ask. What do you suggest to achieve it?

philipxy
  • 14,867
  • 6
  • 39
  • 83
Del Monte
  • 153
  • 3
  • 14
  • 2
    Why do you think a redundancy of Student IDs breaks 2NF? What do you think 2NF is? – Tab Alleman Aug 22 '17 at 13:24
  • 3
    I am guessing that Su_ID should be the identity value of your second table, not the SubjectCode. And since you say you are learning I would suggest you not get in the habit of naming your primary key ID. It is ambiguous and challenging. It causes confusion because a single column name has to change names based on usage which is not a great naming convention. – Sean Lange Aug 22 '17 at 13:25
  • 1
    No, multiple references to the same row (student or subject ) from the other fact table is not considered a redudancy. It's OK . – Serg Aug 22 '17 at 13:26
  • I think you are in 2NF minus that your junction table is using subject codes instead of the `ID`. I think you should relate student ID vales to subject ID values. Oops, just saw Sean's comment. – Tim Biegeleisen Aug 22 '17 at 13:27
  • 2
    If your third table's PK is a composite of both fields (it should be) then there is no redundancy. – JBC Aug 22 '17 at 13:30
  • @gordonlinoff maybe you can help him ? – Noob Aug 22 '17 at 14:19
  • Yeah sure, I can use Subject ID instead of SubjectCode but still the Student_ID column will have repeating values which I don't want. – Del Monte Aug 22 '17 at 15:27
  • Please edit your question to say what reference you are using for "normalization". – philipxy Aug 23 '17 at 22:54
  • 1
    *"as you can see one student can apply for multiple subjects, and that creates redundancy of student IDs."* No it doesn't. It repeates student IDs, which are *necessary*, not redundant. That's exactly what foreign keys are *supposed* to do--repeat in each row that references another table. – Mike Sherrill 'Cat Recall' Sep 28 '17 at 00:52

2 Answers2

1

Normalization to higher NFs (normal forms) involves FDs (functional dependencies), yet you didn't mention any. You use vague unhelpful terms like "repeat" & "redundant". Multiple occurrences of the same value in a relation appear under all NFs. You mention PKs (primary keys) but they don't matter, CKs (candidate keys) do. Although design involves multiple tables, normalinzing a table is independent of other tables, so your 1st & 2nd tables have nothing to do with normalizing your 3rd. So it seems clear you don't understand absolute basics, and you need to follow a textbook.

The two common definitions of 2NF are phrased in terms of not having any FDs of certain kinds that are said to be "in violation". Note that you have to know all the FDs that hold. You can always losslessly decompose in a certain way that gives smaller components where one such FD is no longer a problem. This can reasonably vaguely be referred to as "moving the FD to its own component, leaving another component". If you keep doing it then you will finally have all 2NF components--a 2NF design.

However, that's a bad way to decompose a table to 2NF, because it can needlessly produce more tables than necessary, preclude good higher NF designs from turning up later, and fail to "preserve FDs". Use a 3NF algorithm, which will preserve FDs, to get 2NF. We don't normalize to higher NFs by going through lower NFs anyway. (Which is why 2NF is unimportant.)

PS Using common sense expectations, your third table has sole CK {Student_ID, Subject_ID} and is in 5NF.

philipxy
  • 14,867
  • 6
  • 39
  • 83
-2

It could be better if you enter subject_id instead of subject_code details like below:

ID   St_ID     Su_ID
1    1         1
2    2         3
3    1         2

Here Id is primary key for 3rd table and other 2 are foreign keys of first 2 tables,

Hoping that ID in 2nd table is a primary key, you could do that other than that it's okay

  • 5
    Why add yet another identity to this? A composite key of St_ID and Su_ID is adequate and probably a better choice since there should only ever be one student/subject relationship according to the OP. – Sean Lange Aug 22 '17 at 13:35
  • FWIW this also violates the 5th normal form. – JBC Aug 22 '17 at 15:05
  • @JBC Under the expected interpretation, ie ID unique and projecting it away giving the 3rd table, this is in 5NF. Adding ids never changes a table's highest NF. – philipxy Aug 24 '17 at 01:09