1

I am a beginner to prolog. The task is to generate n-digit numbers without using ones and zeroes. How would this be done? Do you generate random numbers and then delete 1s and 0s (sounds inefficient)?

false
  • 10,264
  • 13
  • 101
  • 209
Crabzmatic
  • 115
  • 9

4 Answers4

4

What you describe is definitely one way to do it. As you mention, it is not a particularly good way to do it, since it is so inefficient.

Nevertheless, the approach you mention is so common that it has its own name: It is often referred to as generate and test, since we first generate, and then either reject or accept the solution, or modify it further so that it satisfies all constraints.

A typically much more efficient approach is to first constrain the whole solution so that all requirements are expressed, and then to let the system search only within the already constrained space. This is especially easy to do in Prolog since it provides built-in constraints that you can post before you even start the search for solutions, and they will be automatically taken into account before and also during the search.

For example, you could do it as follows, using your Prolog system's CLP(FD) constraints to express the desired requirements over integers:

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 2..9,
        reverse(Ds, Rs),
        foldl(pow10, Rs, 0-0, _-Num).

pow10(D, Pow0-S0, Pow-S) :-
        Pow #= Pow0 + 1,
        S #= D*10^Pow0 + S0.

Thus, we represent the number as a list of digits, and relate this list to the corresponding integer using suitable constraints. The key point is that all this happens before a single solution is actually generated, and all solutions that are found will satisfy the stated constraints. Moreover, this is a very general relation that works in all directions, and we can use it to test, generate and complete solutions.

Here is an example query, asking for how such numbers with 3 digits look like in general:

?- n_digits(3, Ds, N).

In response, we get:

Ds = [_11114, _11120, _11126],
_11114 in 2..9,
_11114*100#=_11182,
_11182 in 200..900,
_11182+_11236#=N,
_11236 in 22..99,
_11282+_11126#=_11236,
_11282 in 20..90,
_11120*10#=_11282,
_11120 in 2..9,
_11126 in 2..9,
N in 222..999.

We can use label/1 to obtain concrete solutions:

?- n_digits(3, Ds, N), label(Ds).
Ds = [2, 2, 2],
N = 222 ;
Ds = [2, 2, 3],
N = 223 ;
Ds = [2, 2, 4],
N = 224 ;
Ds = [2, 2, 5],
N = 225 ;
etc.

When describing tasks where integers are involved, CLP(FD) constraints are often a very good fit and allow very general solutions.

For example, it is easy to incorporate additional requirements:

?- n_digits(3, Ds, N),
   N #> 300,
   label(Ds).
Ds = [3, 2, 2],
N = 322 ;
Ds = [3, 2, 3],
N = 323 ;
Ds = [3, 2, 4],
N = 324 ;
etc.

We're not in Sparta anymore. See for more information!

mat
  • 40,498
  • 3
  • 51
  • 78
2

There is surely a lot of elegance in the clp(fd) solution. If you just want a random number with N digits 2..9, there is a more direct approach

n_digits(N, Ds, Num) :-
    length(Ds, N),
    maplist(random_between(0'2, 0'9), Ds),
    number_codes(Num, Ds).

If you use between instead of random_between, you generate numbers as above. I've produced a comparison at http://swish.swi-prolog.org/p/JXELeTrX.swinb, where we can see

?- time(n_digits(1000, _Ds, N)).
6,010 inferences, 0.003 CPU in 0.003 seconds (100% CPU, 1814820 Lips)

Vs. using the clp(fd) solution which runs out of memory (256Mb) after 42 seconds.

?- time((n_digits(1000, _Ds, N), label(_Ds))).
62,143,001 inferences, 42.724 CPU in 42.723 seconds (100% CPU, 1454531 Lips)
Out of global stack
Jan Wielemaker
  • 1,670
  • 10
  • 18
  • 1
    If you want higher performance, please use a more efficient CLP(FD) version for comparison. – mat Apr 15 '17 at 18:15
2

I would like to complement the existing answers with an additional version that I think is worth studying on its own.

Please consider the following:

n_digits(N, Ds, Expr) :-
        length(Ds, N),
        Ds ins 2..9,
        foldl(pow10, Ds, 0, Expr).

pow10(D, S, S*10+D).

Sample query:

?- time((n_digits(1000, Ds, Expr), label(Ds))).
% 104,063 inferences, 0.013 CPU in 0.017 seconds (75% CPU, 8214635 Lips)
Ds = [2, 2, 2, 2, 2, 2, 2, 2, 2|...],
Expr =  ((((... * ... + 2)*10+2)*10+2)*10+2)*10+2 .

Here, we also declaratively describe the list with constraints, and in addition build an arithmetic CLP(FD) expression that evaluates to the resulting number.

As with the other CLP(FD)-based version, this lets us easily post additional constraints on the number itself.

For example:

?- n_digits(1000, Ds, Expr),
   Expr #> 5*10^999,
   label(Ds).
Ds = [5, 2, 2, 2, 2, 2, 2, 2, 2|...],
Expr =  ((((... * ... + 2)*10+2)*10+2)*10+2)*10+2 .

When you encounter a CLP(FD)-based version that you find too slow for your use case, my recommendation is to look for ways to improve its efficiency. I cannot—at least not with a straight face—recommend to instead turn to lower-level approaches. I have already seen too many talented Prolog programmers getting stuck in the morass of low-level language constructs, and never finding their way to actually improving the higher-level aspects so that they become both general and efficient enough for all use cases that are relevant in practice. This is where their talents would have been needed the most!

mat
  • 40,498
  • 3
  • 51
  • 78
  • Even if you, one of the top very few people with deep knowledge of clp(fd) and its internals need two attempts to get this performance wise ok, I think you prove one of the objections I have against clp(fd): it is extremely hard to understand the performance implications of using clp(fd) in a particular way. It is great if you need general code, but if you have a clear task at hand that has a clear algorithmic solution I'd go for that. The simple algorithmic version is typically both easier to understand and faster. – Jan Wielemaker Apr 17 '17 at 07:24
  • 1
    I have seen this argument before, and it certainly is tempting. Its essence, it is: "We should not use CLP(FD) because we have no experience with it." I first always post a straight forward version. If someone finds it too slow, I typically easily find faster versions. That's the case no matter if constraints are involved or not. But you need to collect experience with the methods you want to apply in *all* cases! In my view, this effort is best channelled into the most general methods we have, since they carry within them the possibility of more general solutions with acceptable performance! – mat Apr 17 '17 at 08:53
  • Prolog allows you to write an algorithm that has a declarative reading. Clp(fd) allows you to write a purely declarative specification of solutions to problems over integers. That is great for many things. It comes at a price. No clp(fd) implementation can contain optimal algorithmic solutions to all possible problems over integers, let alone choose the right one. Where Prolog's search strategy is defined as part of the language and can be used to write _algorithms_, the user has little clue how clp(fd) will approach a given set of constraints. – Jan Wielemaker Apr 18 '17 at 09:40
2

I would like to add a further answer to consider the task from an additional perspective.

This time, I would like to start with Jan's answer:

n_digits(N, Ds, Num) :-
    length(Ds, N),
    maplist(random_between(0'2, 0'9), Ds),
    number_codes(Num, Ds).

Its primary merit is that it is very straight forward and also fast:

?- time(n_digits(1000, Ds, Num)).
% 6,007 inferences, 0.002 CPU in 0.002 seconds (92% CPU, 2736674 Lips)

In fact, it's so fast that it only works some of the time:

?- length(_, N), n_digits(3, Ds, 345).
N = 548,
Ds = [51, 52, 53] ;
N = 1309,
Ds = [51, 52, 53] ;
N = 1822,
Ds = [51, 52, 53] .

However, by this time we are already accustomed to the well-known fact that the correctness of solutions is only of secondary concern when compared to their performance, so let us continue as if the solution were correct.

We can quite directly map this to CLP(FD) constraints, by modifying the bold part as follows:

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 0'2..0'9,
        labeling([random_value(0)], Ds),
        number_codes(Num, Ds).

Does anyone have trouble understanding this? Please let me know, I will do what I can to make it clear to everyone who is interested. The best approach is to simply file a new question if anybody would like to know more.

For comparison, here is the previous query again, which illustrates that this predicate acts as we would expect from a relation:

?- length(_, N), n_digits(3, Ds, 345).
N = 0,
Ds = [51, 52, 53] ;
N = 1,
Ds = [51, 52, 53] ;
N = 2,
Ds = [51, 52, 53] .

In this concrete case, the cost of using CLP(FD) constraints is about one order of magnitude in performance, using a constraint solver that is not optimized for performance, mind you:

?- time(n_digits(1000, Ds, Num)).
% 134,580 inferences, 0.023 CPU in 0.026 seconds (87% CPU, 5935694 Lips)

I dare barely mention that the CLP(FD)-based version works more reliably, since this is of so little value in practice. Thus, depending on your use case, the overhead may well be prohibitive. Let us suppose that for this concrete case, we are willing to sacrifice 2 percent of a second in order to use CLP(FD) constraints.

Out of many possible examples, let us now consider the following slight variation of the task:

Let us describe lists of digits that satisfy all constraints, and are also palindromes.

Here is a possible declarative description of palindromes:

palindrome --> [].
palindrome --> [_].
palindrome --> [P], palindrome, [P].

With the CLP(FD)-based version, we can easily combine this constraint with the already stated ones:

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 0'2..0'9,
        phrase(palindrome, Ds),
        labeling([random_value(0)], Ds),
        number_codes(Num, Ds).

Sample query:

?- time(n_digits(1000, Ds, Num)).
% 12,552,433 inferences, 1.851 CPU in 1.943 seconds (95% CPU, 6780005 Lips)

For comparison, how can we incorporate this straight-forward additional constraint into Jan's version? It is tempting to do it as follows:

n_digits(N, Ds, Num) :-
    length(Ds, N),
    phrase(palindrome, Ds),
    maplist(random_between(0'2, 0'9), Ds),
    number_codes(Num, Ds).

Pretty straight-forward too, right? The only problem is that it yields:

?- time(n_digits(1000, Ds, Num)).
% 4,013 inferences, 0.041 CPU in 0.046 seconds (88% CPU, 97880 Lips)
false.

Why is this the case?

And now good luck explaining this to any newcomer to the language! When arguing against the purported complexity of explaining CLP(FD)-based versions, please take into account the much, much higher complexity of explaining such purely procedural phenomena, which cannot be understood by declarative reasoning alone.

Further, good look with solving the task at all with low-level features in such a way that additional constraints can still be easily added.

By the way, sometimes, the query will actually succeed!

?- length(_, N), time(n_digits(3, Ds, Num)).
% 28 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 848485 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1400000 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 1166667 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 26 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 1444444 Lips)
N = 10,
Ds = [52, 50, 52],
Num = 424 .

I can only say: This approach cannot in honesty be recommended, can it?

If palindromes are too unrealistic for you, consider the following task:

Let us describe lists of digits without 0, 1 and 5.

Again, using CLP(FD), this is straight-forward:

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 0'2..0'9,
        maplist(#\=(0'5), Ds),
        labeling([random_value(0)], Ds),
        number_codes(Num, Ds).

Sample query:

?- time(n_digits(1000, Ds, Num)).
% 203,529 inferences, 0.036 CPU in 0.038 seconds (93% CPU, 5662707 Lips)

For comparison, how would you do it without CLP(FD) constraints?


Now let us combine both requirements:

Let us describe lists of digits without 0, 1 and 5 that are also palindromes.

Again, using the CLP(FD) version, we can simply state the requirement:

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 0'2..0'9,
        maplist(#\=(0'5), Ds),
        phrase(palindrome, Ds),
        labeling([random_value(0)], Ds),
        number_codes(Num, Ds).

Sample query:

?- time(n_digits(1000, Ds, Num)).
% 20,117,000 inferences, 2.824 CPU in 2.905 seconds (97% CPU, 7123594 Lips)

These examples illustrate that with the general mechanism of constraints, you have actually somewhere to go in case you need slight variations of previous solutions. The code is quite straight-forward to adapt, and it scales reasonably well.

If there is anything that is too hard to understand, please file a question!

Now, as one last remark: Both CLP(FD)-based versions I posted earlier give you something that goes way beyond such a direct translation. With the other versions, you can post additional constraints not only on the list of digits, but also on the number itself! That's simply completely out of reach for any version that doesn't use constraints! Such constraints are also taken into account before the search for solutions even begins!

mat
  • 40,498
  • 3
  • 51
  • 78