Generate shorthand function signature

Question

Let's assume a function foo() with the following four overloads:

foo(a, b)
foo(a, b, d)
foo(a, c)
foo(a, c, d)

I want to generate a concise string that represents all overloads at once. In this case, the result should be foo(a, (b|c), [d]).

Edit: There is usually more than one concise representation. My goal is to get a representation that is as short as possible, counting only parameters. So foo(a, (b|c), [d]) has length 4 and is thus better than foo(a, ((b, [d])|(c, [d]))), which has length 5.

Is there an existing algorithm to solve this (or a similar) problem?

If not, can anyone sketch an approach?

I'm not picky about the programming language (I'm using C#, though).

The rules are:

Parameters with the same name represent the same thing for all overloads. a is a, b is b...
When collecting all distinct parameters over all overloads (in this case, a, b, c, d), every overload will adhere to this parameter order.
[...] means that the enclosed sub-expression can be omitted as a whole.
(...|...|...) means a choice of one of the sub-expressions. For readability's sake, such a sub-expression must not be empty.

To illustrate further: The (rather contrived) function bar()

bar(a, b,          f, g, h, i)
bar(a, b,          f, g, h)
bar(a, b,          f, g)

bar(a,    c,          g, h, i)
bar(a,    c,          g, h)
bar(a,    c,          g)

bar(a,       d,    f, g, h, i)
bar(a,       d,    f, g, h)
bar(a,       d,    f, g)

bar(a,          e, f, g, h, i)
bar(a,          e, f, g, h)
bar(a,          e, f, g)

should be represented as bar(a, (((b|d|e), f)|c), g, [h, [i]]).

How about 1. bar(a, b) and bar(b, a); 2. bar(a), bar(a, b), bar(a, c), and bar(a, b, c)? I guess, following the rules above, there may be some conditions causing conflicts. — Tao HE, Dec 05 '13 at 03:02
In your second example, `g` is always present as a parameter, so it cannot be inside parenthesis which imply a choice of parameters. This is not how you give the concise string. Am I right? Also you don't list `c` in the concise string at all. — under_the_sea_salad, Dec 05 '13 at 06:20
@Tao HE: That's what I meant with the 2nd bullet point: Conveniently, all overloads will have the same parameter order. All parameters form a sorted list, and each overload is a subset following the same order. — Daniel Wolf, Dec 05 '13 at 10:22
@ijkilchenko: You are right, I messed up the concise string for the 2nd example. Fixed. — Daniel Wolf, Dec 05 '13 at 10:23
@Tao HE: Your 2nd example can be represented as bar(a, [b], [c]). — Daniel Wolf, Dec 05 '13 at 10:50

MaMazav · Answer 1 · 2013-12-05T11:49:21.190

1

Actually that problem can be reduced to simplifying logic circuit. You can use Karnaugh map to perform the simplification:

http://en.wikipedia.org/wiki/Karnaugh_map

Edit: the circuit minimization problem: http://en.wikipedia.org/wiki/Circuit_minimization

The reduction from the overloading problem to circuit minimization based on the assumption that no order change exist between the function parameter. The reduction is performed by writing a True Table in which the input parameters of the circuit are exactly all the possible parameters of the function, and for each existing overloading the output of the circuit will be '1' for the row in which all (and exactly) the used parameters of the overloading are '1'.

edited Dec 05 '13 at 11:49

answered Dec 05 '13 at 11:40

MaMazav

1,773
3
19
33

Looks intriguing, but I'm not sure whether a minimal logic circuit automatically corresponds to a minimal concise string. Take the two overloads foo(a, b, d); foo(a, c, d). Karnaugh cannot simplify them and yields the same as its input: a&b&!c&d | a&!b&c&d. Translating this into "concise string" syntax, we get foo((a, b, d)|(a, c, d)). This is longer than the optimal solution, foo(a, (b|c), d). I guess the problem is that the available operators in a Boolean expression are not completely equivalent to those in a concise string. – Daniel Wolf Dec 06 '13 at 20:40
Actually you're right. However, I think that in the common case of sparse amount of valid overloading forms (which is equivalent to low number of '1' in the True Table) using "maxterms canonical form" rather minterms form will yield much better result. http://en.wikipedia.org/wiki/Canonical_form_%28Boolean_algebra%29 – MaMazav Dec 07 '13 at 16:46

score 0 · Answer 2 · answered Dec 05 '13 at 08:43

I don't know if there is a standard way to solve it, but here is a suggested heuristic. Notice that I don't consider performance in this suggestion.

You can always represent such overloads as a "trivial" form of "OR"ed expression of all possible combinations:
```
foo( (a, b) | (a, b, d) | (a, c) | (a, c, d) )
```
If you want to extract more simple forms, you can try a greedy algorithm. Start from the trivial form of ORed expressions. Then use the following basic step - compare pairs of expressions to see if they can be grouped by:
- Or |, . e.g. (a, b) | (a, c) -> (a, (b|c))
- Optionality [], . e.g. (a, b) | (a, b, d) -> (a, b, [d])
The basic step should be done:
- until no grouping is possible.
- recursively: the algorithm should traverse sub-expressions to check if internal pairs can be grouped.

The above algorithm does not guarantee optimal form. For example, here is a possible execution of the above algorithms on the foo input ('{' are for readability only, they are identical to '(' ):

    (a, b) | (a, b, d) | (a, c) | (a, c, d)
    (a, b, [d]) | (a, c) | (a, c, d)
    (a, b, (c | [d]) | (a, c, d)
    a, {(b, {c | [d]}) | (c, d)}

which is much complicated than the form you presented, a, (b|c), [d]. To make an optimal form of the expression you should first declare what is an optimal form. Based on such declaration you can decide whether you can use the greedy algorithm as a starting point and force the result to be optimized for your needs, or you should have another algorithm at all.

Here are to demonstrations of how you can get more optimized forms:

The algorithm can be forced to group by OR expressions before optionality. Then the above execution is invalid, and a typical execution will be like:

(a, b) | (a, b, d) | (a, c) | (a, c, d)
{a, (b | c)} | (a, c) | (a, c, d)
{a, (b | c)} | (a, b, d) | (a, c, d)
a, { (b | c) | (b, d) | (c, d) }
a, { (b | c) | (b | c, d) }
a, { (b | c) , [d] }

The algorithm can be back-traced for all possible orders of grouping operations to find the most optimal form exist.

The iterative approach looks good. I failed to mention it, but I was really hoping for an algorithm that guarantees an optimal solution. I just expanded my question with a definition of what constitutes an optimal solution. — Daniel Wolf, Dec 05 '13 at 10:43

score 0 · Accepted Answer · edited May 23 '17 at 10:32

First, let's assign some nomenclature.

[...] is an Option.
..., ... is a Sequence.
... | ... is a Choice.

It seems the problem is tricky for two reasons. First, the syntax just isn't the same as in a Boolean expressions. For instance, although a Choice is similar to an OR, it means "take any one", not "take at least one". So an algorithm that generates an optimal Boolean expression might result in a suboptimal result once "translated" to our syntax.

Second, the optimal solution might be something like a Sequence within a Choice within a Sequence within an Option. So any algorithm that can only create one kind of structure (like a Choice of Sequences) cannot possibly always return an optimal solution.

The following describes the solution I've found. There is also a working implementation.

First, we need to create a list of all distinct parameters over all overloads. As per the question, each overload will stick to this parameter order. So each overload can be represented as a Boolean array where each entry indicates whether the corresponding parameter is present. Now, the parameter list along with the list of overloads is given to a recursive function that works like this:

Remove duplicate overloads, so that each overload is distinct.
If there is only one overload, return a Sequence of the used parameters.
If one of the overloads is empty: call the function recursively with all other overloads and return the result within an Option.
Split the parameter list into constant areas (that are the same for all overloads) and independent areas. Each independent area should be kept as short as possible. An area is independent if you can take any overload and replace the flags within that area with those from any other overload. If this splitting results in at least one constant area, return a Sequence containing the constant parts and recurse for the independent areas.
If all this fails, it may be because the overloads are too dissimilar. So create multiple groups of similar overloads. To do this by brute force, generate all partitions (see How to find all partitions of a set). For each group of overloads within each partition, call the function recursively and combine the results with a Choice. Then return the shortest of these Choices.

I believe that for the reasons stated above, an algorithm that finds optimal solutions cannot be much simpler. I cannot prove that this algorithm does in fact always find the optimal solution, but for the signatures I've tried so far, it did.

Generate shorthand function signature

3 Answers3