0

Sorry if this question is too silly or neurotic... But I can't figure it out by myself. So I want to see how others deal with it.

My question is:

I want to write a program show progress of do some thing. So I need to record which state is it currently in so that someone can check it by anytime. there are two method:

  1. Use two field to represent the progress state: step and is_finished.

  2. Just one filed: step. For example, if this thing need 5 step, then 6 means finished. ( 0 means not started? )

Compare above two methods.

two field:

  1. Seems more clear. And the most important is that logically speaking step and finished or not are two concepts? I'm not sure about this.

  2. If thing are finished. We change is_finished field to true ( or 1 as you like ). But what to do with step field now? Plus one, or just not touch it because it has no meaning any more now?

one field:

  1. Simple, space saving. But not very intuitive. For example, we don't know what 6 really means by just looking at this field because it may represent finish or middle step. It need other information e.g. total step to determine. And potentially this meaning is not very stable if the total steps will change ( is_finished field in two field method would not affected by this).

So How do you will deal with it? Thanks!

UPDATE:

I forgot some point maybe useful in the previous post:

The story is: We provide a web-based service for customers. (This service has time limitation e.g. 1 year term) After customer purchase it our deployment programe prepare hardware(virtual machine) and deploy some software which need some time to finish. And we want to provide progress info for customer. When deployment is finished, the customer should be informed.

Database design:

  1. It need a usage state field to represent running normal, running but owe (expired), stop. What confusing me is should it include not deployed yet and deploying information or not?

  2. The progress info should include some other info e.g. the start time so we can tell how much time elapsed since start. But this info is no need to be persistent because we won't care about these info as long as it's finished. So I decide to store these progress info in a separate (temporary) table. Then I think it need another field in another more persistent table to tell if things are done . So can we combine it into the usage state field mentioned above?

fubupc
  • 63
  • 1
  • 8
  • 1
    For your second method you could also consider setting finished to -1 for instance and you would not need the `total_step` field – Pankrates Jan 30 '13 at 07:26
  • @Pankrates Yes. that's a good idea! So you prefer one field method, right? – fubupc Jan 30 '13 at 10:27

2 Answers2

1

It remains a matter of taste but I would suggest the second option of a single int field step. On inserting a new record, initialize the value of step to 0 which would indicate "not started yet". Any positive integer value would obviously denote the current step. As soon as the trajectory is completed I would set step to NULL. As you correctly stated this method does require solid documentation but I think that it is not too confusing

Pankrates
  • 3,074
  • 1
  • 22
  • 28
  • I like your idea which using special value to represent finished state. But later I thought of another reason to use two field on separate tables to represent. Please see my update.. Thanks! – fubupc Feb 01 '13 at 04:24
1

I like the one-field approach better, for the following reasons:

  • (Assuming you want to search on steps) you can "cover" all steps using only one simple index.
  • Should you ever want to attach some additional information to each of the steps, the one-field approach can easily accommodate a FOREIGN KEY towards a new table containing that information.
  • Requires slightly less storage space. Storage is cheap these days, but that's not the point - caching and network performance is.

Two-field approach:

  • (Assuming you want to search on steps) might require a "fatter" composite index or even two indexes (which takes space, lowers the cache effectiveness and incurs maintenance cost for INSERT/UPDATE/DELETE operations).
  • Requires a CHECK to defend the database from "impossible" combinations. Funny enough, some DBMSes don't enforce CHECKs (I'm looking at you, MySQL).
  • Requires slightly more storage space (and therefore slightly less of it fits into cache, takes up slightly more network bandwidth etc.).

NOTE: Should you choose to use NULLs, that could have "interesting" consequences under certain DBMSes (for example, Oracle doesn't index NULLs).

For example, we don't know what 6 really means

That doesn't really matter, as long as the client application knows what it means.

Design the database for applications, not humans.

And potentially this meaning is not very stable if the total steps will change

True, but you have the same problem with two-field approach as well, if new step is added in the "middle" of existing steps.

  • Either UPDATE the table accordingly,
  • or never change the step values. For example, if the step 5 is the last one, then newly added step 6 is considered earlier despite having greater value - your application (or the additional table I mentioned) will know the order of steps, even if their values are not ordered. If you really want "order by value" without resorting to UPDATE, make the steps: 10, 20, 30 etc, so you can insert new steps in the gaps (the old BASIC line number trick).
Community
  • 1
  • 1
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • thanks very much for your detailed answer. About using what number to represent `finished` state. I think -1 or some other special value should be better. Assuming at first total step is 5, and the table have some finished record so the `step` for them are 6. Then someday total step changes to 8. At this point meaning of `step` for old record will be wrong. – fubupc Jan 31 '13 at 05:29
  • @fubupc Yes, but only if the extra steps are added to the end, the existing values are unaffected. If they are added to the beginning or middle, then existing values would need to be "moved" somehow, whether you use special value for the last step or not. – Branko Dimitrijevic Jan 31 '13 at 11:32
  • Yes. I think all the non-finished record when total steps change will be the case as you said. But it seems already-finished record will be the always correct if using special value e.g -1 for last step? Actually I have another reason to use two field on separate tables to represent. Please see my update.. Thanks! – fubupc Feb 01 '13 at 04:22