-1

I am creating a database for a DVD rental shop, I have various entities that are related to this question, such as Film, FilmStar.

For each film, you record its unique number, title, the year in which it was made, its category (action adventure, science fiction, horror, romance, comedy, classic, children's), its director, and all stars that appeared in it. For each film, you also want to store the type of DVD hire (new release, classics, other).

I am mostly unsure about "all the stars that appeared in it". I first thought just having an attribute in the 'Film' Entity, for example filmStar and then each star would be inserted into that attribute, for example: "John Doe, Jane Doe" for each film. But then I realised that this wouldn't be 1NF as : "the domains of attributes must include only atomic values, the value of an attribute must be a single value from the domain of that attribute", as it contains more than one value and isn't atomic.

I then thought about having a separate entity that contains certain attributes such as: filmID, filmStarID. So John Doe would have the filmStarID of '0001' (all of this would be in the FilmStar entity, which is a separate entity). But then the same problem would occur, for example the filmID attribute would have all of the filmID's that the filmStar has starred in, for example: John Doe would have "101, 115, 009". Which again wouldn't be 1NF.

I was just wondering what your thoughts are on this?

philipxy
  • 14,867
  • 6
  • 39
  • 83
Tom
  • 41
  • 6
  • ["1NF" has no single meaning.](https://stackoverflow.com/a/40640962/3404097) Nor does "unnormalized" or "UNF" or "0NF" or for that matter "relation". So you need to tell us your definitions & preferably also textbook name & edition. Then apply the definition & ask 1 specific question where stuck. PS You are not clear about what you are starting with. Your title says "Normalize database table to 1NF" but you don't say what that table is & in the body you seem be trying to ask about what table(s) you need. Please make your question clear. – philipxy Jan 26 '20 at 06:05
  • Does this answer your question? [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad) – philipxy Jan 30 '20 at 04:34

1 Answers1

1

What you're describing is a many-to-many relationship. Storing such a relationship would require a connecting table between the two related entities.

So you have two essential entities here:

Film
--------
ID
Title
etc.

CastMember
--------
ID
Name
etc.

Neither of these can store their relations to the other, because that would be a list of values rather than a single value. So the relationship itself essentially becomes an entity independent of the main entities. Something like this:

FilmCastMember
--------
FilmID
CastMemberID
NameInFilm
etc.

This relationship entity would be where you store any information specific to the relationship itself, but not descriptive of the entities being related. The lines above, for example, include NameInFilm which would be the character name played by that cast member in that film.

David
  • 208,112
  • 36
  • 198
  • 279
  • Thank you for your useful answer. So with FilmCastMember, this means that the film can only contain one CastMember though? – Tom Oct 08 '15 at 18:50
  • @Tom: Not at all. The point of this third connecting table is to store the relationship between the other entities. There are many `FilmCastMember` records per `Film` record, and many `FilmCastMember` records per `CastMember` record. So a film can have many cast members, and a cast member can be in many films. – David Oct 08 '15 at 18:52
  • Ok that makes more sense. So what would the relationships be. Film, FilmCastMember, CastMember. With FilmCastMember in the middle. So what would the relationship be between them (one to one, one to many etc) – Tom Oct 08 '15 at 19:01
  • @Tom: The connecting table represents a many-to-many relationship between `Film` and `CastMember`. If you add information to the connecting table (such as in my example) then the relationship itself becomes an entity which is many-to-one related to the other entities which it connects. – David Oct 08 '15 at 19:03
  • Thank you I understand that. One last question - I have to write an SQL Query for this: List the unique numbers and names of all stars that have appeared in at least one comedy. So how would this query work with your example? (I understand that a join will be need to be used, p.s the Film entity has an attribute called filmCatagory) – Tom Oct 08 '15 at 19:07
  • @Tom: That may be a bit more involved than can be answered in a simple comment. I suggest you look up `JOIN` syntax, make a few passes at this, and see if you can formulate a new question where you get stuck. Essentially you might start by querying the `FilmCastMember` table, joining the other tables for more data, filtering for category and other criteria, then filtering for uniqueness. That order of efforts should get the results you're looking for. – David Oct 08 '15 at 19:09
  • Thank you I will look into it, for the FilmCastMember what would you have as the primary key? – Tom Oct 08 '15 at 19:47
  • @Tom: The combination of the two foreign keys can act as a composite primary key. Or, if you'd prefer, you can give the table its own identity column as a primary key. – David Oct 08 '15 at 19:48
  • I thought so. So the SQL for FilmCastMember would be: create table FilmCastMember (filmID number(2) not null, filmStarName char(25), filmStarID number(2), primary key (??)); How would I put the composite primary key in? – Tom Oct 08 '15 at 19:52
  • I have one issue with your suggestion. With the FilmCastMember, that means I will have to insert every film and every film star, and as many films can contain many film stars, the filmID and filmStarID maybe inserted many times so this would create a lot of duplicate data and also would mean that they would not work as a primary key – Tom Oct 08 '15 at 20:36
  • @Tom: That would only fail if the same `FilmStar` was in the same `Film` more than once. Which, come to think of it, may be possible. (The same actor can play multiple roles in the same film. Such as Eddie Murphy in Coming To America.) In that case, the table should have its own identity. Sometimes the combination of foreign keys would suffice as a primary key, but it would appear that in this domain the relationship table should have its own identity as the primary key. Good catch. – David Oct 08 '15 at 20:38
  • Thats what I thought. So finally it would be: FilmStar 1.1 1.* FilmCastMember 1.* 1.1 Film Is that correct? – Tom Oct 08 '15 at 20:48
  • @Tom Merely having values appear multiple places in a table is not per se bad. (Whatever you mean by "duplicate data" other than having values appear multiple places in a table.) Re the PK, it is the *pair* of columns and despite films and stars appearing multiple times the PK says that a given *pair* of values only appears once. See [this](http://stackoverflow.com/a/32036030/3404097). @ David I think Tom is just referring to (although misunderstanding) the typical many-to-many case, eg that having both rows (f1,s1) & (f1,s2) is problematic from the f1 being problematically "duplicated". – philipxy Oct 08 '15 at 23:16
  • @philipxy No I am not misunderstanding the case, but regarding the pair of columns being the PK (I already know that this is the case) but how do I implement this. Would it just be: primary key (filmID, filmstarID)? – Tom Oct 09 '15 at 13:26
  • @David I need a little bit more help. FilmStar, FilmCastMember, Film. What exactly would the relationship and cardinality ratios between them, would it be one FilmCastMember to many films, or many FilmCastMember to one Film? – Tom Oct 09 '15 at 16:42
  • @Tom: Many relationship records to one entity record. Consider that a star can be in many films, and a film can have many stars. – David Oct 09 '15 at 17:37
  • @David Yes I understand that, but I need to record the cardinality ratios for everything, as this will be displayed on an ERD. So FilmStar 1.1--- 1.* FilmCastMember 1.* --- 1.1 Film OR FilmStar 1.* --- 1.1 FilmCastMember 1.1 --- 1.* Film. Because I think the second version makes more sense but according to your answer it would be the first way? – Tom Oct 09 '15 at 17:48
  • @Tom: I'm not familiar with whatever notation you're using for your homework. But for any given entity (Film or Star) there can be anywhere from zero to many records in the connecting table. Create the tables and see for yourself. It's a standard many-to-many relationship. – David Oct 09 '15 at 19:56
  • @David I'm studying for my PhD so it's not really homework. Are you familiar with ERD? It's basic UML notation? I know its a many-to-many relationship and that's not the point. For a table to be normalised there shouldn't be any many-to-many relationships – Tom Oct 09 '15 at 21:20
  • @Tom: `"For a table to be normalised there shouldn't be any many-to-many relationships"` - A "table" doesn't have any relationships at all, it's only a single entity. A normalized data model with multiple entities, however, is perfectly capable of having many-to-many relationships between those entities. That's exactly what this very simple example has illustrated. The real-life domain being expressed (movies and actors) has a many-to-many relationship. So the data model expressing that domain will also have a many-to-many relationship. – David Oct 09 '15 at 21:25
  • @David Within relational data model it isn't capable to have many-to-many relationships, of course it would work but every many-to-many relationships should be split up with a single entity in the middle (as this scenario states with FilmCastMember). That is basic data modelling and ERD. – Tom Oct 09 '15 at 21:30
  • @Tom: So... If you already know how to create a simple many-to-many relationship with a connecting table... Then what exactly is your question? – David Oct 09 '15 at 21:33
  • 1
    @Tom Re 'Would it just be: primary key (filmID, filmstarID)?' Yes. (No such pair can appear more than once.) Re 'FilmStar, FilmCastMember, Film' Cardinalties are used differently by different methods. Give a reference to *exactly* what convention you are using. Re 'For a table to be normalised there shouldn't be any many-to-many relationships' No. You need to learn about normalization. Still sounds like you are confounding multiple appearances of a value in a column or table with relationalization and normalization. See my link above. – philipxy Oct 09 '15 at 21:34
  • @David as I stated in my question, the way I was doing it went against 3NF and so couldn't be done like that – Tom Oct 09 '15 at 22:09
  • @philipxy No I understand normalisation, I mean that many-to-many relationships shouldn't be present within an ERD or any part of a logical database design or relational data model. That is how I've always been taught and now I'm studying my masters this has been reiterated, that there shouldn't be many-to-many relationships within an ERD – Tom Oct 09 '15 at 22:11
  • 1
    @Tom It is *clear* from your comments that you don't understand. And I know what you meant, and it's wrong, as is your 'there shouldn't be many-to-many relationships within an ERD'. Please edit your question with a complete detailed example including tables and example data and diagramming/ER method & terminology references and clearly explain/ask in terms of it. Then we can show you where you are going wrong in understanding notions/terminology. [How do I ask a good question?](http://stackoverflow.com/help/how-to-ask) (among [others](http://stackoverflow.com/help/asking)). – philipxy Oct 09 '15 at 22:27
  • @Tom (revised) Please ask a new question when you want to ask a new question. (And research it first.) Eg 'write an SQL Query' or 'how do I implement this'. PS First find your relationships. Eg for FilmCastMember: '[FilmStar] appeared in film [Film]' or '[FilmStar]'s mom hates [Film] made in 1949'. We must characterize each relationship as such a meaning/predicate to use a database. (Parameters become columns. Tables hold the rows that make a true statement.) *Then* diagram aspects (eg cardinalities) of each particular relationship. And give that all here. – philipxy Oct 09 '15 at 22:43
  • @philipxy I think I'm going to go with my PhD supervisor and various other doctors, professionals who have told me this. Thank you for your help but i'm afraid you are wrong. Maybe you should read up on basic UML and ERD's and then try and answer a simple question. – Tom Oct 10 '15 at 00:38
  • @Tom I've drafted a few comments but since I haven't come up with sufficient hypothetical misunderstanding(s) to explain what you write I'll just say: 1. "Relationship" gets used for "association" *and* "foreign key" aka is-a association, always binary & many-to-0-or-1. 2. An "associative entity" ( type/class or set/extent or instance) is *the same thing* as a "relationship" qua "association" ( type/class or set/extent or instance). Please keep me posted. – philipxy Oct 10 '15 at 05:01