
I am trying to build a simple evolution simulation of agents controlled by neural networks. In the current version each agent has a feed-forward neural net with one hidden layer. The environment contains a fixed amount of food, each piece represented as a red dot. When an agent moves, it loses energy, and when it is near food, it gains energy. An agent with 0 energy dies. The input of the neural net is the current angle of the agent and a vector to the closest food. Every time step, the angle of movement of each agent is changed by the output of its neural net. The aim, of course, is to see food-seeking behavior evolve after some time. However, nothing happens.
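For concreteness, here is a minimal sketch of one such agent (all names, the hidden-layer size, and the weight initialisation are my own illustrative choices, not a description of your code):

```python
import numpy as np

class Agent:
    """One agent: a feed-forward net with a single hidden layer.
    Inputs: current heading angle and the (dx, dy) vector to the
    closest food. Output: a change of heading in (-1, 1)."""

    def __init__(self, n_hidden=4, rng=None):
        rng = rng if rng is not None else np.random.default_rng()
        self.w1 = rng.normal(0.0, 1.0, (n_hidden, 3))  # input -> hidden
        self.w2 = rng.normal(0.0, 1.0, (1, n_hidden))  # hidden -> output
        self.angle = rng.uniform(-np.pi, np.pi)
        self.energy = 1.0

    def steer(self, dx, dy):
        x = np.array([self.angle, dx, dy])
        hidden = np.tanh(self.w1 @ x)
        return float(np.tanh(self.w2 @ hidden))  # delta-angle per time step
```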

I don't know if the problem is the structure of the neural net (too simple?) or the reproduction mechanism. To prevent a population explosion, the initial population is about 20 agents, and as the population approaches 50, the reproduction chance approaches zero. When reproduction does occur, the parent is chosen by going over the list of agents from beginning to end and checking, for each agent, whether a random number between 0 and 1 is less than the ratio of this agent's energy to the sum of the energy of all agents. If so, the search is over and this agent becomes a parent: we add to the environment a copy of this agent, with some probability of mutation in one or more of the weights of its neural network.

Thanks in advance!

user1767774
    What exactly do you mean by "nothing happens"? – timday Feb 21 '13 at 19:38
  • The agents move randomly and change direction from time to time, but they don't look for the food. – user1767774 Feb 21 '13 at 20:40
    BTW if you haven't come across it yet and need some inspiration for this sort of project, go read: http://ttapress.com/553/crystal-nights-by-greg-egan/ – timday Feb 21 '13 at 21:41
  • I would recommend checking out [NeuralFit](https://neuralfit.net/); it is a neuro-evolution library for Python, and you can give it your own evaluation function. In your case that would be evaluating the performance of the agent in your scavenging environment. I am working on an example similar to what you describe, so it should be there in a month or so. – Thomas Wagenaar Jan 13 '23 at 20:44

2 Answers


If the environment is benign enough (e.g. it's easy to find food) then just moving randomly may be a perfectly viable strategy, and reproductive success may be influenced far more by luck than by anything else. Also consider unintended consequences: e.g. if offspring are co-sited with their parent, then both are immediately in competition with each other in the local area, and this might be sufficiently disadvantageous to lead to the death of both in the longer term.

To test your system, introduce an individual with a "premade" neural network set up to steer it directly towards the nearest food (your model is such that such a thing exists and is reasonably easy to write down, right? If not, it's unreasonable to expect it to evolve!). Introduce that individual into your simulation amongst the dumb masses. If it doesn't quickly dominate, that suggests your simulation isn't set up to reinforce such behaviour. But if the individual enjoys reproductive success and it and its descendants take over, then your simulation is doing something right and you need to look elsewhere for the reason such behaviour isn't evolving.
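The behaviour such a "premade" individual should exhibit can be written down directly (a hypothetical helper, independent of any network; the angle-wrapping detail matters so the agent turns the short way round):

```python
import math

def ideal_turn(angle, dx, dy, gain=0.5):
    """Turn toward the nearest food. `angle` is the current heading,
    (dx, dy) the vector to the food; `gain` is an arbitrary constant."""
    target = math.atan2(dy, dx)
    # wrap the heading error into (-pi, pi]
    err = (target - angle + math.pi) % (2 * math.pi) - math.pi
    return gain * err
```

If the evolved networks can't approximate something like this, the behaviour has little chance of emerging.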

Update in response to comment:

Seems to me this mixing of angles and vectors is dubious. Whether individuals can evolve towards the "move straight towards nearest food" behaviour must depend rather on how well an atan function can be approximated by your network (I'm sceptical). Again, this suggests more testing:

  • set aside all the ecological simulation and just test perturbing a population of your style of random networks to see if they can evolve towards the expected function.
  • (simpler, better) Have the network output a vector (instead of an angle): the direction the individual should move in (of course this means having 2 output nodes instead of one). Obviously the "move straight towards food" strategy is then just a straight pass-through of the "direction towards food" vector components, and the interesting thing is then to see whether your random networks evolve towards this simple "identity function" (also should allow introduction of a readymade optimised individual as described above).

I'm dubious about the "fixed amount of food" too. (I assume you mean that as soon as a red dot is consumed, another one is introduced.) A more "realistic" model might be to introduce food at a constant rate and not impose any artificial population limits: population limits are then determined by the limitations of the food supply. E.g. if you introduce 100 units of food a minute and individuals need 1 unit of food per minute to survive, then your simulation should tend towards a long-term average population of 100 individuals without any need for a clamp to avoid a "population explosion" (although boom-and-bust, feast-or-famine dynamics may emerge depending on the details).
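A toy deterministic model shows the equilibrium (the update rule and its parameters are illustrative, not a recipe for your simulation): only fed individuals survive, and surplus food fuels births, so the population settles at the food rate without any explicit cap.

```python
def step(pop, food_rate=100.0, birth_gain=0.1):
    survivors = min(pop, food_rate)        # only fed individuals survive
    surplus = max(0.0, food_rate - pop)    # leftover food fuels births
    return survivors + birth_gain * surplus

pop = 20.0
for _ in range(200):
    pop = step(pop)
# pop converges to food_rate = 100
```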

timday
  • Thank you, it's a great idea, but I am not sure how I should choose the right weights. I use tanh(x) as an activation function. Let a,b,c be the weights and alpha,dx,dy the angle of the agent and the horizontal and vertical distances to the closest food, respectively. I want a,b,c such that tanh(a*alpha+b*dx+c*dy) = epsilon*(atan(dy/dx)-alpha) - I want the output to be some constant times the difference between the angle to the food (atan(dy/dx)) and the current angle. This is one equation with 3 unknowns. By the way, what do you think about the reproduction mechanism I described? – user1767774 Feb 21 '13 at 21:38
    I have followed your suggestion and made the network output a vector, and after a few minutes food-seeking behavior evolved! Thank you (-: What is the reason for this significant change? Intuitively the vector holds both the information about the angle and the less-relevant information about the magnitude, so one could expect worse results than when the output is a simple scalar (angle)... – user1767774 Feb 21 '13 at 22:17
    My guess: because it's much easier for your neural network model to evolve towards the simple "passthrough" of food direction to movement direction than it is to evolve an atan function. – timday Feb 22 '13 at 12:01

This sounds like a problem for Reinforcement Learning; there is a good online textbook, too.

Dave