Regarding your first question, on the runtime of the non-optimal DFA: purely theoretically, your intuition that it should still run in O(n) is correct. However, imagine (as an example) the following piece of (Python-like) pseudo-code for the Kleene-star operator:
# given that the Kleene-star operator for 'r' starts at position i
while i < len(string) and string[i] == 'r':
    accepting = True
    i += 1
# redundant state: a second, identical loop over the same condition
while i < len(string) and string[i] == 'r':
    accepting = True
    i += 1
# from here, testing of the input string continues at position i
As you can see, the two while-loops are identical and can be understood as a redundant state. However, "splitting" a loop like this will (among other things) decrease your branch-prediction accuracy and therefore increase the overall runtime (see Mysticial's brilliant explanation of branch prediction here for more details).
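For comparison, the minimized version of the same fragment needs only one loop; a minimal sketch in the same style, still assuming the variables string, i and accepting from above:

# minimal version: one state, one loop, one branch to predict
while i < len(string) and string[i] == 'r':
    accepting = True
    i += 1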
Many other, similarly "practical" arguments can be made for why a non-optimal DFA will be slower. Among them: a higher memory usage, as you mentioned (and in many cases more memory means slower, because memory is, by comparison, a slower part of the computer); more "ifs", since each additional state requires its own checks of the input to pick a successor; and possibly more loops (as in the example), which would make the algorithm slower not only because of branch prediction, but simply because some programming languages are just very slow on loops.
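To make the memory argument concrete, here is a minimal sketch of a table-driven DFA simulation (the dictionaries MINIMAL and NON_MINIMAL, the two-letter alphabet and the function run are my own illustration, not taken from the question). Every redundant state adds another row to the transition table that has to be stored and kept in cache:

# A table-driven DFA for the language r* over the alphabet {'r', 'x'}.
# Minimal version: 2 states (0 = "seen only r's so far", 1 = dead state).
MINIMAL = {
    0: {'r': 0, 'x': 1},
    1: {'r': 1, 'x': 1},
}

# Non-minimal version: state 2 duplicates state 0 (they are equivalent),
# so the table has an extra row although the recognized language is the same.
NON_MINIMAL = {
    0: {'r': 2, 'x': 1},
    1: {'r': 1, 'x': 1},
    2: {'r': 0, 'x': 1},  # equivalent to state 0
}

def run(table, accepting, text, start=0):
    # simulate the DFA: one table lookup per input character
    state = start
    for ch in text:
        state = table[state][ch]
    return state in accepting

print(run(MINIMAL, {0}, "rrr"))         # True
print(run(NON_MINIMAL, {0, 2}, "rrr"))  # True as well, but with a bigger table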
Regarding your second question: here I am not sure what you mean. After all, if you do the conversion properly, you should arrive at a pretty optimal DFA in the first place.
EDIT:
In the discussion the idea came up that, starting from one NFA, one could end up with several different DFAs that differ in efficiency (in whatever measure chosen), not because of the implementation, but because of the structure of the DFA itself.
This is not possible once you minimize, for the minimal DFA is unique (up to renaming of states). Here is an outline of a proof:
- Assume that our procedure for creating and minimizing a DFA is optimal.
- When applying the procedure, we start by constructing a DFA (e.g. via the subset construction). In this step, we can create arbitrarily many equivalent states. These states are all connected to the graph of the NFA in some way.
- In the next step we eliminate all unreachable states. This is irrelevant to performance, for an unreachable state corresponds to "dead code" that is never executed.
- In the last step, we minimize the DFA by grouping equivalent states. This is where it becomes interesting, for the idea is that we could do this grouping in different ways, resulting in different DFAs with different performance. However, the only "choice" we have is assigning a state to a different group (a small sketch of this grouping step follows after the argument below).
So, for argument's sake, we assume we could do that.
But, by the idea behind the minimization algorithm, we can only group equivalent states. So if we had different choices for grouping a particular state, then by transitivity of equivalence, not only would that state be equivalent to both groups, but the two groups would be equivalent to each other as well. Hence, if we could group differently, the algorithm would not have been optimal, because it should have merged all the states of both groups into one group in the first place.
Therefore, the assumption that there can be different minimizations has to be wrong.
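As a concrete illustration of the grouping step, here is a minimal sketch of the usual partition-refinement approach (the names minimize and group_of and the example automaton are my own, not part of the question):

def minimize(states, alphabet, delta, accepting):
    # start from the coarsest distinction: accepting vs. non-accepting states
    partition = [set(accepting), set(states) - set(accepting)]
    partition = [g for g in partition if g]

    def group_of(state, part):
        # index of the group that currently contains the state
        for idx, g in enumerate(part):
            if state in g:
                return idx

    changed = True
    while changed:
        changed = False
        new_partition = []
        for group in partition:
            # states stay together only if, for every symbol, their
            # successors fall into the same group of the current partition
            buckets = {}
            for s in group:
                signature = tuple(group_of(delta[s][a], partition) for a in alphabet)
                buckets.setdefault(signature, set()).add(s)
            new_partition.extend(buckets.values())
            if len(buckets) > 1:
                changed = True
        partition = new_partition
    return partition

# the non-minimal DFA for r* from the sketch above: states 0 and 2 are equivalent
delta = {0: {'r': 2, 'x': 1}, 1: {'r': 1, 'x': 1}, 2: {'r': 0, 'x': 1}}
print(minimize([0, 1, 2], ['r', 'x'], delta, [0, 2]))  # [{0, 2}, {1}]

The refinement can only split a group when two of its states are provably inequivalent; it never has to choose between two equally valid groupings, which is exactly the point of the argument above.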