1

I have a design question.

I have to store approx 100 different attributes in a table which should be searchable also. So each attribute will be stored in its own column. The value of each attribute will always be less than 200, so I decided to use TINYINT as data type for each attribute.

Is it a good idea to create a table which will have approx 100 columns (Each of TINYINT)? What could be wrong in this design?

Or should I classify the attributes into some groups (Say 4 groups) and store them in 4 different tables (Each approx have 25 columns)

Or any other data storage technique I have to follow.

Just for example the table is Table1 and it has columns Column1,Column2 ... Column100 of each TINYINT data type.

Since size of each row is going to be very small, Is it OK to do what I explained above?

I just want to know the advantages/disadvantages of it.

If u think that it is not a good idea to have a table with 100 columns, then please suggest other alternatives.

Please note that I don;t want to store the information in composite form (e.g. few xml columns)

Thanks in advance

  • Is there a one to one relationship between the attribute and what it is associated with? – Corey Sunwold Jul 27 '09 at 15:00
  • What is the database application? If MySQL, tag MySQL. Also, I don;t understand "composite form." – Smandoli Jul 27 '09 at 15:03
  • The table has one non-nullable field based on which decided which attributes to read. Each attribute is independent. –  Jul 27 '09 at 15:09
  • So far there are 4 answers (not counting Kumar's which is a comment really). They are all based on INADEQUATE INFORMATION, and I am down-voting the question because of this problem. – Smandoli Jul 27 '09 at 15:52
  • Good related question: http://stackoverflow.com/questions/870808/entity-attribute-value-database-vs-strict-relational-model-ecommerce-question – jcollum Jul 27 '09 at 18:13
  • I'm still not clear on this: are all of the columns in the table required for every row (i.e. not null)? – jcollum Jul 27 '09 at 18:19
  • @jcollum -- that's a fascinating link you provided on EAV. But I don't see an issue here of what I'll call massive flexibility. In fact I don't see any issues ... given the inadequate specification. – Smandoli Jul 27 '09 at 18:39

6 Answers6

5

Wouldn't a many-to-many setup work here?

Say Table A would have a list of widget, which your attributes would apply to

Table B has your types of attributes (color, size, weight, etc), each as a different row (not column)

Table C has foreign keys to the widget id (Table A) and the attribute type (Table B) and then it actually has the attribute value

That way you don't have to change your table structure when you've got a new attribute to add, you simply add a new attribute type row to Table C

Clay Mitchell
  • 362
  • 3
  • 10
  • 2
    It makes the search query a _little_ more complicated, but still something anybody with at least 1yr experience with sql should be able to handle with no trouble at all. – Joel Coehoorn Jul 27 '09 at 15:12
  • I think it would be relevant to ask if he will be adding/removing attributes often before suggesting this method. – Joe Phillips Jul 27 '09 at 15:26
  • 1
    I do work with these "vertical" tables on a daily basis and they are painful to deal with. Stay away unless it makes sense to use them! – Joe Phillips Jul 27 '09 at 15:39
  • 1
    This is called "Entity-attribute-value model". Terribly inefficient to write queries. Google up and see for yourself. – A-K Jul 27 '09 at 17:47
  • How is it painful? It's also significantly more flexible than building one giant static table... – Clay Mitchell Jul 27 '09 at 17:48
  • I've worked with the EAV model in the past. Not that bad, as long as you don't take it too far (ONE_TABLE model). Just have strict limits on what can go in the EAV and you'll be fine. It's easy to build views that render the EAV to tables. – jcollum Jul 27 '09 at 18:21
  • Suppose you need five columns with some conditions on three of them: col1=1 or col2=2 or col3=3. Why don't you write a five-way full outer join to produce the required result set and see for yourself how horribly ugly your query looks like. – A-K Jul 27 '09 at 20:40
  • It seems like this is the basic design approach behind all of the non-relational databases that all the kids are talking about these days. – Rafe Jul 28 '09 at 15:17
3

Its ok to have 100 columns. Why not? Just employ code generation to reduce handwriting of this columns.

Mike Chaliy
  • 25,801
  • 18
  • 67
  • 105
2

I wouldn't worry much about the number of columns per se (unless you're stuck using some really terrible relational engine, in which case upgrading to a decent one would be my most hearty recommendation -- what engine[s] do you plan/need to support, btw?) but about the searchability thereby.

Does the table need to be efficiently searchable by the value of an attribute? If you need 100 indexes on that table, THAT might make insert and update operations slow -- how frequent are such modifications (vs reads to the table and especially searches on attribute values) and how important is their speed to you?

If you do "need it all" there just possibly may be no silver bullet of a "perfect" solution, just compromises among unpleasant alternatives -- more info is needed to weigh them. Are typical rows "sparse", i.e. mostly NULL with just a few of the 100 attributes "active" for any given row (just different subsets for each)? Is there (at least statistically) some correlation among groups of attributes (e.g. most of the time when attribute 12 is worth 93, attribute 41 will be worth 27 or 28 -- that sort of thing)?

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
1

BAsed on your last, it seems to me that you may have a bad design. WHat is the nature of these columns? Are you storing information together that shouldn't be together, are you storing information that shoul be in related tables?

So really what we need to best help you is to see what the nature of the data you have is.

what would be in
column1,column3,column10 vice column4,column15,column20,column25

HLGEM
  • 94,695
  • 15
  • 113
  • 186
1

I had a table with 250 columns. There's nothing wrong. For some cases, it's how it works.

unless some of the columns you are defining have a meaning "per se" as independent entities and they can be shared by multiple rows. In that case, it makes sense to normalize out the set of columns in a different table, and put a column in the original table (possibly with a foreign key constraint)

Stefano Borini
  • 138,652
  • 96
  • 297
  • 431
-2

I think the correct way is to have a table that looks more like:

CREATE TABLE [dbo].[Settings](
    [key] [varchar](250) NOT NULL,
    [value] tinyint NOT NULL
) ON [PRIMARY]

Put an index on the key column. You can eventually make a page where the user can update the values.

Having done a lot of these in the real world, I don't understand why anyone would advocate having each variable be its own column. You have "approx 100 different attributes" so far, you don't think you are going to want to add and delete to this list? Every time you do it is a table change and a production release? You will not be able to build something to hand the maintenance off to a power user. Your reports are going to be hard-coded too? Things take off and you reach the max number of columns of 1,024 are you going to rework the whole thing?

It is nothing to expand the table above - add Category, LastEditDate, LastEditBy, IsActive, etc. or to create archiving functionality. Much more awkward to do this with the column based solution.

Performance is not going to be any different with this small amount of data, but to rely on the programmer to make and release a change every time the list changes is unworkable.

JBrooks
  • 9,901
  • 2
  • 28
  • 32
  • 3
    Do you ever saw queries against such tables? Trust me they are ugly. – Mike Chaliy Jul 27 '09 at 15:08
  • Ugly? Depends on what you are doing, usually it is pretty straight forward. It is better than changing your table structure every time you add a new key. You can always pivot this with case statements. – JBrooks Jul 27 '09 at 15:20
  • 2
    this is the same principal as clay's answer... i don't know why this answer is -1, and the other is +3 – John Jul 27 '09 at 15:21
  • 1
    This is called "Entity-attribute-value model". Terribly inefficient to write queries. Google up and see for yourself. – A-K Jul 27 '09 at 17:47
  • For 100 rows? I think SQL Server can handle it. – JBrooks Sep 12 '09 at 03:59