
Briefly

Can a neural network emulate factorial decomposition (or some other method) to provide a list permutation, given the permutation's unique index?

Application

I have a list of 10 things, and what they are is irrelevant. What I care about is that my 10 things can be placed into 3628800 (or 10!) unique orders, because then I can express any list order of my 10 things using an unsigned integer and factorial decomposition:

Order 0: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Order 1: 0, 1, 2, 3, 4, 5, 6, 7, 9, 8
Order ....
Order 3628799: 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
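
For example, the last index corresponds to the factorial-base digits (9, 8, 7, 6, 5, 4, 3, 2, 1, 0), since 9·9! + 8·8! + 7·7! + 6·6! + 5·5! + 4·4! + 3·3! + 2·2! + 1·1! + 0·0! = 3628799.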

This allows for the parallel distribution of analysis on different list orders of my 10 things.

A common example is the travelling salesman problem:

1. I give 500 different computers each a range of unsigned integers:
   0    -> 7257  for computer 0,
   7258 -> 14515 for computer 1,
   etc.

2. Each computer first calculates the list order from its unsigned integer
   index by using factorial decomposition.
   e.g. Order 1 -> 0, 1, 2, 3, 4, 5, 6, 7, 9, 8

3. The total distance between the cities, visited in the described order,
   is calculated.

4. The shortest distance from each computer is collected, and the shortest
   of those is taken, leaving us with a single unsigned integer index that
   describes the shortest possible permutation of cities.

The same process can be used to brute-force virtually any problem with a boundable error surface, although it often demands far more computational power than is feasible.
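
A minimal sketch of steps 2 through 4 for a single worker, assuming a city distance matrix dist[a][b] (dist and shortestInRange are illustrative names) and using the ithPermutationOfNElements function given in the next section:

function shortestInRange (dist, n, lo, hi)
{
   var best = { index: -1, length: Infinity };

   for (var i = lo; i <= hi; ++i)
   {
      // step 2: decode the integer index into a city ordering
      var perm = ithPermutationOfNElements(n, i);

      // step 3: total distance of the cities visited in that order
      var length = 0;
      for (var k = 0; k < n - 1; ++k)
         length += dist[perm[k]][perm[k + 1]];

      // step 4 (local half): keep the shortest seen by this worker
      if (length < best.length)
         best = { index: i, length: length };
   }

   return best; // collected from all workers, the overall minimum wins
}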

Recursive Algorithmic Solution

We can calculate the N-th permutation of any fixed-size list (though we will need large-integer support for bigger lists) using factorial decomposition (originally outlined in PHP), provided here in JavaScript for clarity:

function ithPermutationOfNElements (n, i)
{
   var j, k = 0;
   var fact = [];
   var perm = [];

   // compute factorial numbers
   fact[k] = 1;
   while (++k < n)
      fact[k] = fact[k - 1] * k;

   // compute factorial code
   for (k = 0; k < n; ++k)
   {
      perm[k] = Math.floor(i / fact[n - 1 - k]);
      i = i % fact[n - 1 - k];
   }

   // readjust values to obtain the permutation
   // start from the end and check if preceding values are lower
   for (k = n - 1; k > 0; --k)
      for (j = k - 1; j >= 0; --j)
         if (perm[j] <= perm[k])
            perm[k]++;

   return perm;
}
console.log(ithPermutationOfNElements(4, 23)); // [ 3, 2, 1, 0 ]
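
The factorials overflow JavaScript's exact integer range once n exceeds 18 (19! is larger than Number.MAX_SAFE_INTEGER), so bigger lists need the large-integer support mentioned above. A minimal sketch of the same function using BigInt (ithPermutationBig is an illustrative name; BigInt requires a modern JavaScript engine):

function ithPermutationBig (n, i) // i must be a BigInt, e.g. 23n
{
   var j, k = 0;
   var fact = [1n];
   var perm = [];

   // factorials as BigInts, so they stay exact for any n
   while (++k < n)
      fact[k] = fact[k - 1] * BigInt(k);

   // compute the factorial code; BigInt division truncates like Math.floor
   for (k = 0; k < n; ++k)
   {
      perm[k] = Number(i / fact[n - 1 - k]);
      i = i % fact[n - 1 - k];
   }

   // same readjustment step as above
   for (k = n - 1; k > 0; --k)
      for (j = k - 1; j >= 0; --j)
         if (perm[j] <= perm[k])
            perm[k]++;

   return perm;
}
console.log(ithPermutationBig(4, 23n)); // [ 3, 2, 1, 0 ]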

Neural Network Solution?

Can any neural network architecture and training combination emulate this function, given i as its only input neuron and n output neurons representing each element of the permutation?

  • Very nice and clear question. – Niklas B. Mar 09 '14 at 18:26
  • Given that the correct answer is yes and the reasonable answer is no, an excellent follow-up question would be: how many layers would a NN need to accomplish this, and what is the best way to train it? – placeybordeaux Mar 10 '14 at 00:55
  • @Peter Micheal Lacey-Bordeaux Solving this problem using a NN is completely different to the normal use of NNs. The 'algorithm' way to do it would be with many neurons acting as logic gates, so you would have neuron NAND gates / flip-flops etc. acting as adders/multipliers/latches etc., until you have essentially built a Turing machine at a high level. It would in no way resemble a normal NN as they are used by the majority of the AI world. Further, you already have a perfectly good Turing machine right in front of you. – andrelucas Mar 12 '14 at 14:19
  • @andrelucas if you can incorporate and expand on that in your answer I'll select it as the winner – Adrian Seeley Mar 12 '14 at 16:32
  • @andrelucas Yes I know, however many people are taught about NNs in such a way that this is not obvious. I would love to see some formal proofs of how inefficient NNs can be, in fact I think that having NNs solving problems like these could model a good lower bound for performance and maybe even drive better learning algorithms for them. – placeybordeaux Mar 13 '14 at 15:24
  • @Peter You won't find such formal proofs, simply because, if you consider the simple NN AND gate I have shown below to be an actual, working NN, then by that definition an actual hardware AND gate is also a NN. So now anything made from a combination of normal hardware logic gates is a NN (including a normal computer). So the performance/efficiency of factorial decomposition implemented in procedural logic will be identical to a NN implementation - if you use the loosest possible definition of the term. But nobody in the AI world would point to a hardware AND gate and say, 'yes, that is a NN'. – andrelucas Mar 16 '14 at 22:37

4 Answers


A neuron can operate as a logic gate, and thus a neural network can perform any calculation a computer can. However, in that sense it simply emulates logic gates inefficiently, re-implementing procedural logic one neuron at a time, so it is not a good solution for this problem.

In general, neural networks are good with 'real' or 'natural' data. They also generally operate on floats, not integers. So if there is a pattern to be learnt, a NN might learn it, but the answer you get will be e.g. 0.783267. You could then denormalize this to 89743, but it's unlikely to be exactly right. For your requirement, being one integer off the right answer is completely wrong.

By contrast, for a face recognition NN returning 0.787 or 0.786 for a particular image, both could be considered correct.

Your problem is better suited to a traditional, procedural code solution, with only one correct answer for each input. Generally in AI, you are looking for an answer that is correct to within a certain range or probability distribution.

Regarding implementing algorithms with NNs:
You can have many neurons acting as logic gates, so now you have neuron NAND gates / flip-flops etc. acting as adders/multipliers/latches etc., until you have essentially built a Turing machine, explicitly wired together gate by gate. It will in no way resemble a normal NN as they are used by the majority of the AI world. Further, you already have a perfectly good Turing machine right in front of you.

Here is the code for a neural network AND gate in Matlab. No training is required: I've used configure instead of train, and just set the weights manually. By building the other logic gate types in the same way, you could construct an entire Turing machine.

and = feedforwardnet(1);

truthTable = [0 0 1 1; 0 1 0 1];
and_out = [0 0 0 1];

and = configure(and, truthTable, and_out);

vals = [-2 -2 -1  2 0];

and.IW{1} = vals(1:2); % input1 to hidden, input2 to hidden
and.LW{2,1} = vals(3); % hidden to output
and.b{1} = vals(4);     % bias to hidden
and.b{2} = vals(5);     % bias to output

y = sim(and, truthTable)         % network output for each input pair
round(y)                         % threshold the float outputs to binary
mse = mean((y - and_out) .^ 2)   % mean squared error against the truth table


y =
    0.0000    0.0180    0.0180    0.9820
ans =
     0     0     0     1
mse =
   2.4263e-04
andrelucas
  • Just a quick thought about that though: the probability distribution/range could limit the problem space that needs to be analysed. E.g. computer 0 is searching through [0, 7257]; what if a NN could reduce that space to something like [500, 600], greatly reducing the computation needed to solve the problem? Hopefully the same NN could reduce the space for each other computer as well, effectively making the whole search faster. Of course, to be useful, the NN would have to converge on some solution faster than the straight procedural solution. – kand Mar 09 '14 at 16:41
  • Not for this particular problem. We're looking for the ith permutation of a fixed list. How do you know if the NN got the narrowed range correct? You have to use a procedural algorithm to prove it. And by that stage, you've already solved the original problem. – andrelucas Mar 09 '14 at 16:51
  • @SlaterTyranus care to expand on that? – andrelucas Mar 10 '14 at 13:43
  • @andrelucas an actual answer would take a huge amount of time, but here are a few points: 1. Neural nets have no requirements as to working with either floats or integers and can do both interchangeably, depending on the activation function you use. 2. Standard NNs are NOT static. 3. The number of input and output nodes is completely irrelevant; neural nets actually prefer smaller inputs for the sake of computational complexity. 4. Functions and algorithms are equivalent (see: Turing completeness). 5. The results are the relevant portion of an algorithm. – Slater Victoroff Mar 11 '14 at 19:17
  • @andrelucas Character limit stopped me, so continuing here: 6. The entire process of training a neural net involves supervised learning, which means you MUST show it the output in order to train it at all. 7. There's no reason that the neural net would have to be a recurrent neural net, which you fail to explain or mention. I agree that it's not a good solution, but it IS possible, which is about the only correct thing you said. – Slater Victoroff Mar 11 '14 at 19:20
  • I think @andrelucas might have been trying to say that, given the inputs and the outputs, a NN would opt to learn the identity (i.e. a 1-to-1 mapping) of any provided training numbers. – Adrian Seeley Mar 11 '14 at 20:31
  • @Slater Tyranus There is no strict mathematical definition of what exactly is, and is not, a NN. This is completely different to e.g. the Riemann hypothesis, which can be proved exactly true for all possible cases, or not. You confidently assert a NN can operate on ints? I confidently assert that such a configuration, restricted to integer-only activation functions, is no more than a collection of adders and multipliers (it cannot use division, trig, the number e, etc.) and does not meet the definition of a NN. Whose definition? Exactly - there is no strict definition. ...contd... – andrelucas Mar 12 '14 at 14:07
  • ... Neither of us can be proved strictly right or wrong. The exception is embedded NNs built from a small number of logic gates, which use large ints that are essentially optimized hardware implementations of floats. At a low level, float/int don't even exist - only 0s and 1s. Lower down, even 1 does not exist - only an analog voltage from 4.98-5.02V. We can continue like this for all 7 of your points. Even 'algorithm' has no strict, exact mathematical definition for all possible cases. The OP, knowing nothing about NNs, asked for some general advice - and that's how I answered the question. – andrelucas Mar 12 '14 at 14:07
  • @andrelucas There are in fact strict mathematical definitions of neural nets. Specifically, you've got an activation function, a weight matrix, and a bias matrix. I confidently assert that neural nets work on ints because the most common activation function is a step function, and binarization has been shown to be instrumental (Hinton 2010). Addition and multiplication are the only operations a neural net is ever capable of! The whole point of Turing completeness is that you can construct any function (including trig functions) from those simple operations (see: CORDIC). – Slater Victoroff Mar 12 '14 at 16:09
  • I can mathematically prove that neural nets are capable of things you argue they are not capable of. Your last tangent is nonsensical, and algorithm absolutely has a strict mathematical definition. I really don't think further arguing is helpful, but I really wish you would bother researching neural nets before spreading this kind of gross misinformation. – Slater Victoroff Mar 12 '14 at 16:13
  • Ok, look at it this way. Give me an example of _any_ computing paradigm, which includes at minimum two numbers (pick any data type) and one processing unit (adder or any type), that you do _not_ consider to be a NN. No matter what you say, I will then say that your suggestion is in fact also a NN, _by your own definition_. A definition which holds that _any_ computing paradigm, from a simple adder to a Cray supercomputer, is a NN. It's not a useful definition. How do you differentiate it from a genetic algorithm, or a web server? By your definition, both of those are NNs. – andrelucas Mar 12 '14 at 16:52

A little-known fact is that recurrent neural nets are Turing complete and can thus perform any computation a computer can (see the result by Siegelmann).

This means neither (a) that you can easily find the necessary weights with a learning algorithm, nor (b) that a feed-forward net, which is probably what you are looking at, can do it.

Nevertheless, this seems like a task you don't want to do with a neural net.

bayer

Generally, neural networks are universal function approximators, so in theory yes.

More specifically, notice that for a particular (fixed) i value, the neural network that solves it is trivial (in fact, it requires no hidden nodes or activation functions; it is a linear problem).
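
A minimal sketch of that triviality, with the weights set by hand rather than trained (fixedIndexNet is an illustrative name): for the fixed index i = 23 with n = 4, the output biases alone can encode the permutation, and the input can be ignored entirely.

function fixedIndexNet (i)
{
   var weights = [0, 0, 0, 0]; // the input contributes nothing
   var biases  = [3, 2, 1, 0]; // the permutation for the fixed i = 23
   return biases.map(function (b, k) { return weights[k] * i + b; });
}
console.log(fixedIndexNet(23)); // [ 3, 2, 1, 0 ]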

As a brute-force, naive solution for the general problem with unfixed i: a neural network large enough to encode all 10! possible linear encodings, with a hidden layer essentially acting as a mux based on the i input, would solve the problem. More efficient networks probably exist, and it would be interesting to try a recurrent architecture on this problem.
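
A toy version of that mux idea for n = 3, so only 3! = 6 permutations (muxNet and the threshold-style hidden activation are illustrative stand-ins, not a trained network): hidden unit j fires only when i equals j, and its outgoing weights are permutation j itself.

function muxNet (i)
{
   // all 3! = 6 permutations, in factorial-decomposition order
   var perms = [[0,1,2], [0,2,1], [1,0,2], [1,2,0], [2,0,1], [2,1,0]];

   // hidden layer: a one-hot mux selecting on the i input
   var hidden = perms.map(function (_, j) {
      return Math.abs(i - j) < 0.5 ? 1 : 0;
   });

   // output layer: unit k sums hidden[j] * perms[j][k]
   return [0, 1, 2].map(function (k) {
      return perms.reduce(function (s, p, j) { return s + hidden[j] * p[k]; }, 0);
   });
}
console.log(muxNet(3)); // [ 1, 2, 0 ]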

In any case, while a solution certainly exists, the better question is whether this is a good way to solve it. If a problem boils down to some simple pseudocode, I would avoid a neural network implementation unless it is for academic purposes.

Zaez
  • A NN given all 10! encodings would be emulating the results of the algorithm, not the algorithm itself. Factorial decomposition is an algorithm, not a function. – andrelucas Mar 10 '14 at 13:45
  • To help with any confusion, I am not suggesting that all 10! possible results be inputs into the net, but rather that the hidden layer is large enough to independently solve all 10! results concurrently, and a final layer which simply highly weights the sets corresponding to the correct ordering based on the i input. It would be a massively giant network, and would never train properly, but it does show that a network exists that could solve the problem. – Zaez Mar 10 '14 at 23:15
  • see my response regarding NN algorithm / logic gates to Peter Micheal Lacey-Bordeaux's comment above. – andrelucas Mar 12 '14 at 14:23

I think it might be possible. Input your data and a single number (the position of the element you want, starting at 0 or 1). Have the network produce a single number representing the element number (round it). Add that number to your list. Then do it again, except increase the position number you feed to the neural network by 1.
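
A sketch of that loop, assuming a hypothetical trained model net(i, position) that returns an approximate element number (permutationViaNet and net are illustrative names):

function permutationViaNet (net, n, i)
{
   var perm = [];
   for (var position = 0; position < n; ++position)
      perm.push(Math.round(net(i, position))); // one element per query
   return perm;
}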

A recursive neural network would be ideal. But I'm still not certain if the underlying function can be learned or approximated effectively. I think that it could be.

Houshalter