0

Please consider the following scenario.

We have a 0NF table

StudentTeacherTable:

StudentName StudentDepartment StudentDepartmentAdd TeacherName TeacherDepartment TeacherDepartmentAdd
    John          CS                  London           Dave        Eng, CS             Oxford
    Mike          CS                  London           Dave        Eng, CS             Oxford
    Chris         Eng                 Oxford           Dave        Eng, CS             Oxford

Ideally after normalization I would like to have tables like

Student Table:

StudentName Department TeacherName
    John        CS         Dave
    Mike        CS         Dave
    Chris       Eng        Dave

Teacher Table:

Name 
Dave

TeacherDepartment Table:

TeacherName DepartmentName
     Dave         CS
     Dave         ENG

Department Table:

   Name Address
    CS   London
    ENG  Oxford

However, if I follow normalization to the 3NF. I will get

Student Table:

StudentName Department TeacherName
    John        CS         Dave
    Mike        CS         Dave
    Chris       Eng        Dave

DepartmentForStudent Table:

   Name Address
    CS   London
    ENG  Oxford

Teacher Table:

Name 
Dave

TeacherToDepartment Table:

TeacherName DepartmentName
     Dave         CS
     Dave         ENG

DepartmentForStudent Table:

   Name Address
    CS   London
    ENG  Oxford

My question is that in which step in database normalization (1NF,2NF,3NF etc) I can merge/combine the studentDepartement with teacherDepartment columns into one table to derive the normalized form above?

In other words, following normalization rules. I will end up having a StudentDepartment table and a TeacherDepartment table rather than one Department table for both Student and Teacher

user2001850
  • 477
  • 4
  • 18
  • 1
    The Student table doesn't look like it will be normalized, unless all students are only allowed to have one teacher. (_Other remarks withdrawn_.) – Jonathan Leffler Apr 05 '15 at 13:59
  • 1
    Normalization through BCNF is based on functional dependencies. What are the functional dependencies? – Mike Sherrill 'Cat Recall' Apr 07 '15 at 02:26
  • @Mike Sherrill: normalization does imply cases where you need to have references for other table records in a table record. Normalization does not forbid to use surrogate keys for those... – Quicker Mar 14 '16 at 11:40
  • 1
    *Normalization* never introduces new attributes. It doesn't forbid using surrogate keys, but *normalization* never *introduces* surrogate keys. That's a different database design activity. – Mike Sherrill 'Cat Recall' Mar 14 '16 at 13:37
  • @Mike Sherrill: not sure, where the difference is between your and my words. Would you confirm or reject the statement "normalization does imply cases where you need to have references for other table records in a table record"? Why? - btw. that statement does say nothing about how those references are modeled (via existing or new attributes)... – Quicker Mar 15 '16 at 20:21
  • 1
    The difference is that I'm saying normalization never introduces new attributes. By definition, normalization involves decomposition by taking projections; projections select a subset of the existing attributes; a subset of existing attributes can't contain new attributes. – Mike Sherrill 'Cat Recall' Mar 16 '16 at 01:24
  • The designs & names are garbled. Why 2 DepartmentForStudent tables? Should 1 be DepartmentForTeacher? With different data? Should StudentDepartment/TeacherDepartment in the last paragraph have For in them? How is Department a "merge" of them? After fixing: Normalizing won't give either design. Eg 1-column Student & Teacher are redundant if normalizing since they are projections of other tables. You want tables with those columns that can hold more rows but normalizing won't give them. After dropping them: Edit to justify design 1 & derive design 2. But: justifying 1 will answer your question. – philipxy Aug 12 '22 at 00:13
  • Your question is not clear. Normalization doesn't "merge/combine" tables. So what do you mean by "in which step in database normalization (1NF,2NF,3NF etc) I can merge/combine"? Do you mean, which step in some design process that includes normalization? Why aren't you following one? Moreover we don't normally & needn't ever normalize to higher NFs by going through lower NFs. We use an algorithm proven to give a NF. So again, what is "in which step ... (1NF,2NF,3NF etc)"? In what sense is design 1 "after normalization"? You seem to have misconceptions. So, "justify design 1 & derive design 2". – philipxy Aug 12 '22 at 01:18

2 Answers2

0

Your question has nothing to do with normalization. You are asking the question, if of if not to physically join tables of similar types and same sets of attributes. Normalization has no preference in that matter. And basically there is no wrong or right. This is more about balance trade-offs according to a specific design setup:

option 1: have multiple tables (as you did show in you example): pros: - explicit database design -> easy to read - lower memory/disk space need as no type column is needed

cons: - when using surrogate or other no-natural keys: no unique cross table identifier which may make potential upcomming needs for change hard to manage - viewing accross all tables requires lots of unions (expecially if more than two tables)

option 2: have one table with an additional type column: pro's and cons in opposite direction of option 1

G*** may find you lots of resources to that topic.


2 examples: Storing hierarchical data (e.g. single table with type vs multiple tables with 1:1 key and differences...) in Relational Database Design Patterns?

http://sqlmag.com/sql-server/trouble-type-tables

Community
  • 1
  • 1
Quicker
  • 1,247
  • 8
  • 16
  • 1
    your answer is too practical not academic enough. my question is purely about theory which has right and wrong. – user2001850 Mar 15 '16 at 09:19
  • Being sure about the right or wrong question implies you know the answer to your question, which I do not understand anymore after your comment. Would be just fair, if you could update your audience with your findings. – Quicker Mar 15 '16 at 19:38
  • 1
    i mean i am sure in the theory of normalization there is clear right and wrong (rather than right/wrong based on how much you want to optimize it). but i don't know the answer to my question. – user2001850 Mar 16 '16 at 09:28
  • WHY are you sure if you do not know the answer? You are searching an answer in a set of rules that do not fit to your question. You would not find rules about what to do with murders in the traffic law either. Stating there is an answer in the 'normalization law' to your question implies you assume, that your question fits to that 'law' (so to say: it is the right place to search). How do you get to that assumption? – Quicker Mar 16 '16 at 11:20
0

You write "Ideally after normalization I would like..."

this suggests you have been given the solution, as to an exercise. Always be careful about retro-fitting any work to a pre-set solution; in the case of normalisation, which depends on / helps to reveal relations between elements of data, you should be very circumspect about the assumptions underlying one, or another, solution.

That said, let's try and resolve this, bearing in mind that a set of normalised tables is your result, but normalisation is a process: more precisely, producing the 1, 2, the 3NF in that order, from a small data sample, is a precise process, which is often practiced when learning to normalise.

First, let us list the attributes involved. At this point, I'll add surrogate keys that are clearly needed for this data, and identify them with ID:

StudentID
StudentName
StudentDepartment
StudentDepartmentAdd
TeacherID
TeacherName
TeacherDepartment (repeating)
TeacherDepartmentAdd

Your data is confusing because the sample is small, and there are few cues as there might be in a filled in form or report. But I believe that I can make two assumptions: (1) The teacherDepartment is dependent on teacher, as the name suggests; (2) each teacher (like Dave in the data) has many students withing each department where they work. If this is the case, then "studentdept" and "teacherdept" are best processed as one attributes, the two columns help simply work out the dependencies.

Under these two assumptions, the process becomes familiar, only there are two levels of repeating groups:

      UNF                     1NF                   2NF (and 3NF)

  _TeacherID_            _TeacherID_           _TeacherID_
   TeacherName            TeacherName           TeacherName
   TeacherDepartmentAdd   TeacherDepartmentAdd  TeacherDepartmentAdd
|  Department 
|| StudentID             _TeacherID_*          _TeacherID_*
|| StudentName           _Department_          _Department_
|| StudentDepartmentAdd
                         _TeacherID_  )*       _StudentID_*
                         _Department_ )        _Department_
                         _StudentID_            TeacherID *
                          StudentName         
                          StudentDepartmentAdd _Department_
                                                StudentDepartmentAdd

                                               _StudentID_ 
                                                StudentName         

Two more assumption are needed: that the student and department determine the Teacher; and that the department determines the department address (where that department teaches). These aren't at all certain from the small data sample, but I accept them on the basis of the result you said you should obtain. In any real situation, you would ask for a larger data sample, or confirm the structure of the data with its actual users. On that basis, the 3NF is the same as the 2NF, so I do not write above.

So the data given is compatible with the results you are looking for. But, you should understand:

  • Normalisation is not normally done from such incomplete information. Here, to arrive at the expected result, we have to assume many things to compensate for the absence of real data.
  • The purpose of this process is to identify the correct choice of determinants, but it doesn't replace reasoning about the sensible determinancy relationships within your data. Again, this is obvious from this case, and the limited information given by the data sample.
boisvert
  • 3,679
  • 2
  • 27
  • 53