1

I have 4 tables. Each of them contain the following attributes:

Table 1 :
 Person (Id (Primary key), Name, Occupation, Location, SecondJob, PerHour, HoursWorked, Phone, Workphone)

Table 2 :
 Job (Id (Foreign key that refers to Person), Title, Name, Location, Salary)

Table 3 :
 SecondJob (Id (Foreign key that refers to Person), Title, Name)

Table 4:
 PhoneNumber (Id (Foreign key that refers to Person), Name, Phone, Workphone)

I can obtain the values of each attribute like Name, Title, Phone and Workphone from the Person table with the following psuedo SQL statement:

Select (ATTRIBUTE NAME) FROM Person WHERE Id IN (PERSONS ID)
  1. Does the fact that some of the information is being repeated in DIFFERENT TABLES (Data Redundancy), break (ie, not follow) the Third Normal Form (3NF)?

    Or should the values be put into the other Tables separately and reason what attribute is identifying with the Primary Key of the Table?

  2. I calculate Salary in Job by getting PerHour and HoursWorked from Person, then multiply them. I have also heard that this is redundant Data, due to the fact that is is data that you could extrapolate from existing Data within the Tables.

    But, does this break the Third Normal Form??

Chandrahas Aroori
  • 955
  • 2
  • 14
  • 27
  • 1
    This is so broken in terms of normalization. Why is "Name" showing up everywhere? Why isn't that information consolidated in the Person record? If you're deliberately de-normalizing for performance reasons do you have methods to keep this data in sync and understand the canonical source for each field? Why does PhoneNumber contain *two* numbers? You've got a lot of work here to iron out the details. – tadman Nov 19 '16 at 19:15
  • 1
    Remember in proper database design you have only [Zero, One or Infinity](http://en.wikipedia.org/wiki/Zero_one_infinity_rule), there is no two. That's why `SecondJob` as a table is extremely concerning. What if they have a third job? A fourth? A nineteenth? People change careers, get promoted, transferred, it's to be expected people will switch N times. Likewise, salary information should be associated with the job, not the Person. – tadman Nov 19 '16 at 19:17
  • Feedback on your response: 1. Whilst your second post actually had some valid points - Pointing out things like "So broken in terms of normalization", without solid pointers to as of why, is basically not useful feedback. 2. Secondjob is merely a Name. How it's formed out, still allows several different jobs to fill it - with foreign key referal by ID. 3. The normalization was still dependant upon primary keys and wholy identifying of said key. There was duplication of data. The phone part, needs modification, though. – Mikael Ascended Rusin Nov 19 '16 at 19:55
  • 1
    "put in the values into the other Tables seperately and reason what attribute is identifying with the Primary Key of the Table" is not clear. – philipxy Nov 19 '16 at 20:00
  • The main concern of the Topic was duplication of values, in DIFFERENT tables. Meaning a suggestion i brought up, is putting Duplicated Values into seperate Tables - Instead of having Duplication of the same value, in several tables. I.e : Name and ID in one Table, Job and Phone number for said Job in one, etc. – Mikael Ascended Rusin Nov 19 '16 at 20:02
  • Good data normalization means associating properties with their related entity as directly as possible, without duplication. Name is clearly a property of a person, not a job, and not a phone number. "SecondJob" is a violation of the Zero, One or Many principle, as is having a second phone number in a record. You asked if this is breaking normalization and the answer is: Yes. As for the reasons why, I don't know that I have they time to explain it in great detail, but I'd encourage you to find a good reference on the subject and learn from that. – tadman Nov 19 '16 at 20:45

1 Answers1

3

Does the fact that information is repeated in DIFFERENT TABLES (Data Redundancy), break against 3NF Normalization?

No. A table value or variable is or isn't in a given NF. This is independent of any other table. (We do also talk about a database being in NF when all of its tables are in that NF.)

Normalization can be reasonably said to remove redundancy. But there is lots of redundancy not addressed by normalization. And there is lots of redundancy that is not bad. And duplication is not necessarily redundancy. Just because data is repeated doesn't mean "information" is repeated. What data says by being or not being in a table depends on the meaning of the table.

But you seem to think that just because duplicating data in a different table doesn't violate 3NF that it doesn't violate other principles of good design. That's wrong. Also, it's 5NF that matters. The only reason lower NFs are used is that SQL DBMSs don't support 5NF well.

Or should i just put in the values into the other Tables seperately and reason what attribute is identifying with the Primary Key of the Table?

I guess you are trying to say, Should I only put the values in one table each and reconstruct the second table via queries involving shared keys? Ie, if you can get the values in a column by querying the rest of the database then should you avoid having that column? Generally speaking, yes.

Your question assumes a misconception. It's not a matter of "(exclusive) or" here. You should do both.

I calculate Salary in Job by getting PerHour and HoursWorked from Person, then multiply them. I heard that this is also redundant Data, due to it being data that you could extrapulate from existing Data in the Tables.

It is redundant given the rest of the database, because you could use a query instead. And if you don't constrain salary values appropriately then that is bad redundancy. Even if you do the column and constraint complicate the schema.

But does it break 3NF Normalization?

No, because the NF of a table is independent of other tables. But that doesn't mean it's ok.

(If you added Salary to Person, the new table would not be in 3NF. But then, SQL DBMSs have computed columns that make that ok, by making the non-3NF table with Salary a view of the 3NF table without it.)

Learn some database design method(s) and how they apply principles of good design. Your tables needlessly address overlapping aspects of the application. Also learn about JOIN in writing queries.

philipxy
  • 14,867
  • 6
  • 39
  • 83