MS Access Schema Decision: Multiple Tables or a lot of Null Values?

Question

I found this thread which helps my understanding somewhat, but does not answer my question:

SQL: Using NULL values vs. default values

My Question: If I am creating a schema (in an MS Access Database) that is designed to store contact information for employees, would it be better to have a single table for telephone numbers, then a single table for addresses, then another single table for email addresses, OR would it be better to have a single table that stores all of these records, but might have NULL values for several of the fields in more than half of the records?

I would like to store the different elements of a street address in separate fields: For Addresses: one field for the street number and name, one field for the city, one for the state, one for the country, one for the zip code, and also one for any other name for the address ("ATTN:" or similar), and maybe more; For Telephone Numbers: essentially one for a name and one for a number; For Emails: essentially the same as Telephone - name and number. This would leave many NULL/Blank values in the list for telephone numbers... in fact, I would estimate probably 70% of the records would have 5 or more null values, on the scale of 5,000 to 10,000 records.

I would want to be able to display them both in separate lists as well as in a combined list, filtered and grouped. Either structure could support this (through JOINS/UNIONS and WHERE clauses). In terms of simplicity of table structure, a single list would seem obvious - ONE table is "neater" than three or more tables.

The answer, I think, should hinge on the efficiency of "storing" potentially tens of thousands of NULL values vs. the efficiency of indexing different tables, and spending time ensuring UNIONs line up with datatypes and constructing various other methods to combine data that is already SOMEWHAT related.

I hope I have presented my thoughts clearly enough! I welcome links, answers, and comments as well as questions.

I vote for @HansUp's answer, but would clarify that I'd store email and telephone numbers in the same table. Whether you want to store area code separately or not is up to you. I think you're likely overthinking the whole thing, though. — David-W-Fenton, May 29 '11 at 00:01

score 3 · Accepted Answer · answered May 23 '11 at 18:57

I would approach the design with a bias favoring separate tables for each entity class. Person is an entity class. If you have no more than a single phone number for each person, you can make it work by storing the number as an attribute of the Persons table.

However, what I usually see is the desire for the flexibility to store multiple types of phone numbers for each person: home; work; cell; fax; etc. Storing those in a single table (Person_ID, work_phone, home_phone, cell_phone) leads to a brittle design. When the managers tell you to add a field for another phone number type, you're forced to revise the table structure, as well as queries, forms, and reports which use that table.

I would lean towards a separate table with a one-to-many relationship between People and PhoneNumbers --- so that each phone number and its type is a separate row in the PhoneNumbers table. That design avoids the brittleness of the single table approach. And it also avoids your concern over storing so many Null values --- if there is no phone number for a Person, you don't have a row for that Person in PhoneNumbers.
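A minimal sketch of that one-to-many layout, using Python's built-in `sqlite3` module as a stand-in for Access (the table and column names here are assumptions for illustration, and Access SQL syntax differs in places). It shows that a person without a phone number simply has no row in PhoneNumbers, so no Null placeholders are stored:

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (
    person_id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL
);
-- One row per phone number; the type is data, not a column name.
CREATE TABLE PhoneNumbers (
    phone_id   INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES People(person_id),
    phone_type TEXT NOT NULL,   -- e.g. 'home', 'work', 'cell', 'fax'
    phone      TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO People (person_id, full_name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)
# Ann has two numbers; Bob has none, so he gets no rows at all.
conn.executemany(
    "INSERT INTO PhoneNumbers (person_id, phone_type, phone) VALUES (?, ?, ?)",
    [(1, "work", "555-0100"), (1, "cell", "555-0101")],
)

# LEFT JOIN keeps people with zero numbers in the result.
phone_counts = conn.execute("""
    SELECT p.full_name, COUNT(n.phone_id)
    FROM People AS p
    LEFT JOIN PhoneNumbers AS n ON n.person_id = p.person_id
    GROUP BY p.person_id, p.full_name
    ORDER BY p.person_id
""").fetchall()
print(phone_counts)
```

Adding a new kind of number (say, a pager) under this layout is just another row with a different `phone_type`; the table structure stays the same.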

However I really don't know whether this suggestion is appropriate for your situation. I think it depends on the complexity of your data needs.

As for the "convenience" of a single table, that seems inconsequential to me. Access is relational, so you use a query to gather up the related pieces from multiple tables into a full view of the data you need ... which can resemble a single table. If you're deliberately avoiding that relational capability, perhaps you wouldn't lose much by storing your contact information in a spreadsheet instead.
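As a rough illustration of a query presenting several tables as one list, here is a sketch in Python with `sqlite3` standing in for Access. All table and column names are hypothetical, and the `||` concatenation operator is SQLite's; Access SQL uses `&` instead. It combines phone numbers and addresses into a single result with `UNION ALL`:

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (person_id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE PhoneNumbers (
    person_id  INTEGER NOT NULL REFERENCES People(person_id),
    phone_type TEXT NOT NULL,
    phone      TEXT NOT NULL
);
CREATE TABLE Addresses (
    person_id    INTEGER NOT NULL REFERENCES People(person_id),
    address_type TEXT NOT NULL,
    street       TEXT NOT NULL,
    city         TEXT,
    state        TEXT
);
INSERT INTO People VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO PhoneNumbers VALUES (1, 'work', '555-0100');
INSERT INTO Addresses VALUES
    (1, 'office', '1 Main St', 'Springfield', 'IL'),
    (2, 'home',   '9 Elm Rd',  'Dayton',      'OH');
""")

# Each branch of the UNION yields the same four text columns,
# so the combined result reads like a single contact table.
combined = conn.execute("""
    SELECT p.full_name AS full_name, 'phone' AS kind,
           n.phone_type AS subtype, n.phone AS detail
    FROM People AS p
    JOIN PhoneNumbers AS n ON n.person_id = p.person_id
    UNION ALL
    SELECT p.full_name, 'address', a.address_type,
           a.street || ', ' || a.city || ', ' || a.state
    FROM People AS p
    JOIN Addresses AS a ON a.person_id = p.person_id
    ORDER BY full_name, kind
""").fetchall()

for row in combined:
    print(row)
```

Because city and state remain separate columns in the hypothetical Addresses table, sorting or grouping by them needs no string parsing; only the combined display flattens them into one text field.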

HansUp

I would like the flexibility of being able to store up to ten numbers (or more) for one person or only one, and any number of addresses rather than only one (some people work in more than one location, and may have a P.O. Box as well as a street address). In my mental concept, this is accounted for with a field that specifies the type of contact info (phone number, address, email) and another field that then further specifies the type of email or phone number, etc. There is already a Profile table that holds names and other 1-to-1 items, then several other tables that contain (continued) – Code Jockey May 23 '11 at 21:29
... talents, work history, degrees and awards (and more) in a generally relational but not exactly fully normalized schema. If I were to make a new table to hold contact information, and have fields that differentiated emails from addresses, with the specifications to separate city from state, etc., many of the fields needed to store a street address would be of no use for an email address (especially if I added room number, floor, apartment, etc.). Does anyone know how much space 50,000 NULL values "take up" in an Access Database? Would it change if migrated to SQL Server? – Code Jockey May 23 '11 at 21:35
specifically @HansUp: I agree with the one-to-many relationship of person to telephone number, but since there are many types of contact information - not just phone numbers - should all numbers and addresses be stored in the same or in separate tables? **In short:** if you had to store an unknown number of phone numbers, email addresses, and street/office addresses for a certain number of profiles, would you use one table? two tables? three? four? (four because home/personal addresses often have different components than work addresses, from apt#, Address2, and ATTN: to room/building/floor #) – Code Jockey May 23 '11 at 21:42
At first blush, I would use one table for telephone numbers and another separate table for addresses. I would use an inclusive structure for the addresses and not worry that some of those attributes would be Null for a given address. But it seems to me this is considerably more complex than my understanding of your initial description ... so I think at a minimum the "single table for everything approach" is not under consideration. – HansUp May 23 '11 at 22:12
As far as storage of Nulls, I don't think their storage space is a significant concern. I can't remember whether the engine stores Null as an actual Null byte, or simply the absence of a value ... or some other method. Regardless of the method, I don't believe it should be a significant concern compared to the amount of storage required for your non-Null data elements. – HansUp May 23 '11 at 22:16
Okie doke - thanks for the thoughts, I will probably use at least two tables, keeping everything in strings, and deal with the slightly more complex query structure. I probably did not effectively relate the complexity, so thanks everyone for your patience and contributions! – Code Jockey May 24 '11 at 15:27

score 0 · Answer 2 · answered May 23 '11 at 18:32

Unlike tracking information for business customers, companies usually have simple requirements for storing employee information. There's no need to get into billing, shipping, or office addresses and various phone numbers. It's just not that complex.

For most of your employees the Address2 field may not be needed, but so what? I don't think personal email addresses are necessary once someone is hired (they would be on the CV/resume and used during the interview process). 2-3 phone numbers should cover it.

I'm just not sure there is any business need for the amount of complexity you'd be adding with different tables.

JeffO

I suppose I'm not sure there is a real need for my customer to know the things they say they want to know, but they have specified that they want to be able to sort and group by city and state; I COULD simply parse out the city/state from a single field, but I'm thinking the more I can let Access do, the better. I merely assume that this indicates a potential volatility in requirements that could lead to more requirements I don't fully understand, and I'm trying to cover my bases. (:/) Was that supposed to answer my question, or tell me why there should not be requirements that make me ask it? – Code Jockey May 23 '11 at 18:57
If it was merely a comment (as I said I welcomed comments), then I should say: Thank you, but I don't think my customers would be happy with what I think I can provide through such a schema and interface. – Code Jockey May 23 '11 at 18:59
