-2

Consider the following table:

enter image description here

The primary key is a composite key consisting of PatID and PhysName. My professor says this table is in 3rd normal form. I thought it's not even in second normal form because the non-key attribute, Name, is not dependent on the entire primary key. You can identify the Name simply by looking at PatID. It is not dependent on PhysName.

philipxy
  • 14,867
  • 6
  • 39
  • 83
foobar5512
  • 2,470
  • 5
  • 36
  • 52
  • 1
    Maybe your prof is somewhat of a philosopher: "Is a patient without a physician still a patient?". I'd follow your point. – Filburt Apr 20 '17 at 19:28
  • @Filburt So when determining the form should you consider it based on the data in the table or the logical conclusion? – foobar5512 Apr 20 '17 at 19:44
  • If Name depends on PatID and PhysName, then your professor is right. But what should then Name be? From these few lines and not knowing what the relationships among the columns are, we can't tell much. – user2314737 Apr 20 '17 at 19:45
  • @user2314737 I'm a little confused. Should we consider the data itself or the relationships between the columns? – foobar5512 Apr 20 '17 at 19:50
  • 2
    From the table one cannot say much about the functional dependencies: PatID could by the key of the relation, so it could determine the other two attributes, and the table could be in 3NF. Please specify the functional dependencies, and not a simple example, otherwise it is not possible to decide in which normal form is the table. – Renzo Apr 20 '17 at 19:51
  • 1
    @Renzo This is all we were given. But in general you look at functional dependencies and not the data given in the table? – foobar5512 Apr 20 '17 at 19:53
  • 1
    Yes, otherwise it is impossible to talk about normal form. For instance, one could say that PatID, PhysName -> Name, and that this is the only non-trivial functional dependency of the relation. Then the relation **is** in 3NF. If, on the other hand, we could say that PatID -> Name is the only dependency, then the relation is **not** in 3NF. – Renzo Apr 20 '17 at 19:55
  • @foobar5512 I would say you define the relationships and afterwards you can say whether your table is normalized or not according to those relationships – user2314737 Apr 20 '17 at 19:57
  • You *could* identify a patient's Name simply by looking at the PatID. However the primary key constraint implies that there might be another record like `1337 | Clinton, H. | Seuss, M.`. To answer your question: Do not look at *this data*. So I have to correct my first comment. – Filburt Apr 20 '17 at 19:57
  • 1
    Please use text whenever you can. We can't cut and paste an image. – philipxy Apr 20 '17 at 20:31
  • http://stackoverflow.com/questions/723998/what-are-1nf-2nf-and-3nf-in-database-design?rq=1 –  Apr 24 '17 at 18:05
  • @JarrodRoberson There's no evidence the op doesn't know the definitions, there's evidence they do, they just don't understand where the relevant FDs are coming from. PS None of those answers give a correct definition of 3NF. – philipxy Apr 25 '17 at 02:57

2 Answers2

4

In order to really know whether the table is in 2NF or not, you would have to have the functional dependencies explicitly laid out for you.

Inferring the FDs from a small sample of data is a risky business. The smaller the sample, the greater the risk.

We would have to see a patient with two physicians here to see whether the name is the same. I expect it would be, but that's only common sense.

When you move on from classroom exercises to million dollar projects, you'll find that common sense is an unreliable substitute for data analysis.

Walter Mitty
  • 18,205
  • 2
  • 28
  • 58
  • No example data for a variable can tell you that a FD holds. If a patient with two physicians has the same name with both you still do not know whether the FD holds in the variable. If you are told some other things like CKs or other FDs, you might be able to figure out whether a certain FD holds, but it isn't the data that's telling you. – philipxy Apr 27 '17 at 17:00
-1

Given a table value we can see what FDs (functional dependencies) hold in it, hence what its CKs (candidate keys) are and what NFs (normal forms) it satisfies (up to BCNF). (We can't know the CKs & NFs without knowing the FDs.)

A FD (or any constraint) holds in a variable when it holds in every value that can arise. Then its CKs and satisfied NFs are based on those FDs. So for a variable, example data tells us that certain FDs don't hold, and the "trivial" FDs must hold, but for the other FDs example data just doesn't tell us whether they hold.

Since the table value doesn't have {PatId, PhysName} as CK, your instructor must mean that that some variable with that value has that CK. (Of course, you should have got value vs variable straight anyway.) In order to consider that that variable has that CK, they must have decided something like:

  • the table holds rows that make a true statement from "a physician named PhysName tends a patient they identify as PatId and know by name PatName"
  • the physicians with a given name each only knows their patients with a given id by one name
  • (we don't know that it's false that) two different physicians could identify two different patients by the same id
  • likely that each physician has a unique name
  • likely that each physician identifies every one of their patients by an id
  • likely that a physician identifies just one patient via a given id
  • likely that a physician identifies a patient via only one id
  • likely that "identifies" always means a 1:1 correspondence of entities & ids
  • likely that each patient has only one name
  • etc

You need to know whether it's value vs variable, and it's pointless to argue about a variable and constraints (including FDs) until you agree on the predicate and the BRs (Business rules).


PS Re BRs, predicates & constraints:

A proposition is a statement about a situation: "a physician named 'Scholl, F.' tends a patient they identify as 99999 and know by name 'Gore, Z.'". A predicate is a statement template mapping from a row of column names & values to a proposition: "a physician named PhysName tends a patient they identify as PatId and know by name PatName". A table variable holds the rows that form true propositions in a situation.

BRs (business rules) give variable predicates and characterize what situations can arise. Hence what table variable values can arise, hence what FDs hold, hence the CKs, etc.

philipxy
  • 14,867
  • 6
  • 39
  • 83
  • This has received 2 downvotes. I am interested to know why. If you think something is incorrect, I'll gladly explain how it's not. – philipxy Apr 27 '17 at 16:53