3

Are there any cases where I can have a textual field such as a description in a fact table?

I currently have a fact table of meeting events (grain: row per meeting) with a number of dimensions such as date, client, location etc. I need to put the meeting subject in the fact table. Is this ok even though it is not a measure (I have not seen any examples of this). Can't move it to a separate dimension as it will always be the same size (no of rows) as the fact.

Any ideas or advice from past experience?

Thanks

S.Lott
  • 384,516
  • 81
  • 508
  • 779
NabilS
  • 1,421
  • 1
  • 19
  • 31
  • I am by no means knowledgeable in the data-warehousing realm, but I've been to a few talks about it. Is it not normal convention in this case to move your subject to a "junk dimension"? – Cᴏʀʏ Jun 26 '09 at 02:03
  • I did have a look at junk dimensions and it does look like a solution however the following quite swayed me: "If the number of rows in the junk dimension approaches or exceeds the number of rows in the fact table, the design should be clearly re-evaluated.". This is the case in my example. http://www.ralphkimball.com/html/designtipsPDF/DesignTips2003/KimballDT48DeClutter.pdf – NabilS Jun 26 '09 at 08:51

2 Answers2

2

It can, in the form of a "degenerate dimension:" a dimension so insignificant that there's no need to create a table for it. A common example would be invoice numbers: they're not metrics, but because they're so unique it'd be a false economy to have 32-bit FK to an Invoices table with a 128-bit CHAR(16) field, with as many records as your fact table. This should be done cautiously, as they make fact table rows wider.

Junk dimensions are usually a better option, if you have more than a couple degenerate dimensions. Of course, if there's a dimension to which you could reasonably attach the text instead, that's better still.

1

What you are describing sounds like a dimension derived from other dimensions rather than a fact table. I have done this many times where I have a structure of a primary key, a combination of foreign keys and a string column to denote a name. Product Definition comes to mind as an example. Shipping location (with its various lookups tied in) comes in as another.

Consider the following example: Locations: Fort Lauderdale, West Palm Beach, Miami

Each location may have multiple shipping locations. A shipping location has various attributes as in Packing system, Converyor belt system, Product Weight Range, Types of Pickups. All of these are in lookup tables.

So, I have a table called ShippingLocation with the following columns
- ShippingLocationId(PK)
- PackagingSystemId (FK)
- ConveyorBeltTypeId (FK)
- ProductWeightRangeId (FK)
- ShippingLocationName VarChar (200)

It seems very logical to me that the name of the shipping location would be in the same place where the shipping location is defined and it's attributes are defined. The only possible normalization I see here is that I can take it to a 1-to-1 table. IMO, that is a useless normalization.

Raj More
  • 47,048
  • 33
  • 131
  • 198
  • Not sure about that. There are 100's of new meetings every day, each on of them has a subject. If I put meetings into a dimension, I would have each row in my fact table (it would probably be a factless fact table) mapping one-to-one to a row in the dimension. This surely is not right??? – NabilS Jun 25 '09 at 22:04
  • The shipping location example is definitely a dimension as it is something that I would group on and not an event or transaction. However the meeting is an event. It is more detail than something I would group on. – NabilS Jun 25 '09 at 23:11