What is the most efficient way to organise these tables + rows? - MySQL

Question

I'm going to be running lotteries on my website, and hence there needs to be somewhere to store the tickets and numbers. I will definitely have a table, called tickets in which each row will have their own ticket id, their associated lottery id and all other information (like id of the user to whom it belongs).

However, my question is whether I should make another field in tickets to hold the numbers chosen on the ticket. It's not an option to create multiple fields such as number1, number2 etc, as each lottery will have different types of tickets (i.e lottery1 may ask you to choose 4 numbers, and lottery2 may ask you to choose 6).

So I can either make a new field which is either VARCHAR or TEXT to accept comma-separated ticket numbers, i.e: 1,2,3,4,5,6 or make another new table called numbers where each row would have the ticket id and number associated with it. However I'm not sure if this method is very efficient, as for just one ticket of 6 numbers, there would need to be 1 row in the tickets table, and 6 rows in the numbers table.

Which of these options is most efficient? Or is there an even better way to do it than this? Please remember that at the end of the lottery the code will need to cycle through each ticket to check if they have won - so option 2 might be too resource-hogging there.

The number "is a property of" a ticket, no? In which case, make it part of the ticket. Using a simple CSV or DSV shouldn't *automatically* be considered denormalized because the number itself *is an entity*, just like a SSN or a phone number. If the lotto-numbers are stored `min,..,max` then a simple *regular equality and index* can be used. For "sub lotto number" querying, e.g. "tickets that contain the number N", then that's another use-case/issue .. — , Jul 27 '12 at 00:09
The CSV method would, to me, seem to be the most efficient and easiest, because you could just `explode()` the data once you get it. — David, Jul 27 '12 at 00:16
For what it's worth, _be careful_ with the local laws where you will be conducting your lottery. — David, Jul 27 '12 at 00:16
.. the number of rows isn't really an issue (SQL databases scale across rows well), but it sounds like the issue is being overly complicated. I urge considering numbers are entities. — , Jul 27 '12 at 00:16
How would I use an index in this case? While it's true that no one member can purchase multiple identical tickets, the same ticket could be purchased by more than one member. — Keir Simmons, Jul 27 '12 at 00:17
@KeirSimmons Then there is a Purchase M-1 Ticket relationship. The index would be over the Normalized Ticket # `min,..,max`. Each person might *also* have their own "chosen" number (for display), e.g. `b,a,c` and `c,a,b` would both have the Ticket Value `a,b,c`. — , Jul 27 '12 at 00:18
Sorry, what do you mean by "The index would be over the Normalized Ticket # min,..,max" ? — Keir Simmons, Jul 27 '12 at 00:22
[programmers.SE](http://programmers.stackexchange.com/) is a better fit for whiteboard (design) issues. SO is more for questions about existing code. — outis, Jul 27 '12 at 01:20
Possible dup of [Is it really worth it to normalize the “Toxi” way? ( 3NF )](http://stackoverflow.com/q/2063820/), [What is the best way to store multiple values for lists when values are of varying amounts and not repeated throughout rows?](http://stackoverflow.com/q/5126133/). See also [Comma separated values in a database field](http://stackoverflow.com/q/738133/), [issue in sql Query](http://stackoverflow.com/q/1881382/), [SQL Server: store more integers in a single value](http://stackoverflow.com/q/1966664/). — outis, Jul 27 '12 at 01:46

score 1 · Accepted Answer · 2012-07-27T03:06:01.053

In the following "Ticket[Number]" should be taken to mean "Selected Set of Lottery Numbers". Remember that Set(a,b,c) is equal to Set(c,b,a).

I would have it like this:

Purchase
  -PersonID // associate Person (one person can have many purchases)
  -TicketID // associate Ticket (a purchase is for one "ticket",
            //                   which can be purchased many times)
  -DisplayTicketNumber // for Human Display

Ticket
  -TicketNumber

That is, Purchase:M-1:Ticket

The DisplayTicketNumber is the number, as the user selected it, e.g. "3,1,2" while, on the other hand, the TicketNumber is the normalized ticket number where the small values are put first. The final form is thus min,..,max or similar. That is, any number of DisplayTicketNumbers that have the same set of values (in any order) will have the same TicketNumber:

DisplayTicketNumber  TicketNumber
1,2,3                1,2,3
2,3,1                1,2,3
3,2,1                1,2,3
3,2,1,4              1,2,3,4 .. and etc

Then put an index on TicketNumber so what a simple WHERE TicketNumber = @normalizedTicketNumber will be a very fast index.

I would actually argue this is an acceptably normalized design and the TicketNumber (along with say a Raffle number) forms a Key. My arguments for this are thus:

A TicketNumber is an opaque value that uniquely identifies a Ticket (per Raffle). One does not need to "know the details" inside the DB model. (There might be a need in some cases, but not here.)
The DisplayTicketNumber is an artifact of the users input; yet multiple DisplayTicketNumbers can represent the same TicketNumber. While this does represent possible "duplication" it is important to realize that this is a Friendly Display value that represents a list (which has more information than a set) of numbers chosen.
1. In a case like this I would make the DisplayTicketNumber (and TicketNumber) immutable with triggers so that, after creation, no database inconsistencies can be introduced here.
2. If a FK can be computed then the constraint between DisplayTicketNumber and TicketNumber can be enforced without immutability.

(I have omitted various details like having different TicketNumbers for different Raffles, etc. I also show a TicketId for a FK, but I also hinted that RaffleId,TicketNumber is an acceptable [non-surrogate] Key.)

Also, the Ticket table could be eliminated: since very few Lottery Number Sets will be shared so, if there is no extra associated Ticket information, then removing it might be an acceptable denormalization. One advantage of this is then the TicketNumber could be moved into the Purchase table and then turned into a computed column (which is still indexed) that normalized the Ticket value.

And, if MySQL allows using a computed column in a FK then using the relationship PK(Ticket.TicketNumber) -> FK(Purchase.TicketNumber), where Purchase.TicketNumber is computed, could be used to increase model integrity without eliminating the Ticket table. (I do not use MySQL, however, so I cannot say if this is viable or not.)

Happy coding.

score 0 · Answer 2 · answered Jul 27 '12 at 00:13

I'd use the second option where you make the new table, called numbers, and you have the numbers associated with it.

But I'd also add a field in the tickets table, where you indicate the amount of numbers that can be selected, and a condition where you check if the amount of numbers inserted using that ticket (using COUNT in a query) is less than the amount of numbers that can be selected, then insert.

What is the most efficient way to organise these tables + rows? - MySQL

2 Answers2