58

I have done a fair amount of work with genetic algorithms quite successfully and thus far ignored genetic programming. As far as I know, most programs remain written by programmers, and I'm curious to know what is holding genetic programming back?

Some possible explanations I thought of are:

  1. The search space is just too large to find useful programs among the noise
  2. Most real applications can't supply sufficient data to allow fitness evaluation of such a space.
  3. It is difficult to reduce the efficacy of many real applications down to a single fitness measure. In other words, writing a suitable fitness function would probably entail the same amount of work as writing the actual program.

Any ideas?

Dan Dyer
zenna
  • This answer should have been asked at [Artificial Intelligence Stack Exchange](https://ai.stackexchange.com/), but, unfortunately, it did not exist 9-10 years ago. – nbro Nov 27 '20 at 15:45

14 Answers

40

This is something I have been considering in my own research, and I'd say there are many reasons:

  1. The vast majority of research in the GP field has focused on producing formulas rather than the sort of software that gets produced by most programmers. There are plenty of computer scientists in the field, but very few are focused on producing the sort of programs you would expect, so advances have been slow in that area.

  2. There is a significant over-emphasis on using LISP because it can produce nice tree structures which are easy to manipulate; unfortunately, imperative programs have been neglected because they involve solving some tricky problems. For GP to be taken seriously by programmers it has to produce imperative programs.

  3. Real programs contain looping constructs, but loops are difficult to implement in GP without all sorts of ugly constraints to prevent infinite looping.

  4. Genetic Programming does not scale well. It is fine for small problems, with a small available language, but as you say in your first point - the search space becomes too large very quickly.

  5. Compared to a human programmer, GP can be very slow. It is however very parallelisable so is likely to benefit substantially as larger numbers of processor cores become the norm.
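On point 3, one common workaround is to run every candidate program under a hard step budget and simply score non-terminating candidates as failures. A minimal sketch, assuming a made-up toy instruction set (nothing here is a real GP library):

```python
# Run an evolved program under a hard step budget so that candidates
# with infinite loops are scored as failures rather than hanging the run.
# The (op, arg) instruction set below is purely hypothetical.

def run_with_budget(program, x, max_steps=1000):
    """Interpret a list of (op, arg) instructions; abort on budget overrun."""
    acc, pc, steps = x, 0, 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:          # would-be infinite loop: give up
            return None
        op, arg = program[pc]
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        elif op == "jnz":              # jump if acc != 0 -> loops possible
            if acc != 0:
                pc = arg
                continue
        pc += 1
    return acc

# A terminating candidate...
assert run_with_budget([("add", 2), ("mul", 3)], 1) == 9
# ...and one that loops forever, safely cut off:
assert run_with_budget([("jnz", 0)], 1) is None
```

Anything that overruns the budget gets the worst possible fitness, which sidesteps the halting question at the cost of occasionally killing a slow-but-correct candidate.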

Another valid answer would be that it is difficult to trust code that has been automatically generated. This is true, but in practice I doubt it has much impact, because GP is not able to produce the right sort of programs in the first place.

Robert Harvey
Tom Castle
  • +1 for point #1. In my experience most programmers produce data-in/data-out/calculation applications where all of the logic is relatively simple. I don't think that the primary bottlenecks in software development (at least business software development) have anything to do with finding more efficient algorithms for specific tasks. – John Bledsoe Dec 07 '10 at 21:28
26

The hard part about genetic programming is writing a good scoring function:

  • The scoring function must correctly judge whether the algorithm has the desired properties. This is harder than it sounds -- much harder than writing a test suite. The algorithm may adapt to any quirk of your scoring function and optimize for it. Trying to evolve strcmp? Your algorithm may instead discover a mathematical pattern in the lengths of your pass/fail test cases.

  • The scoring function must be capable of judging whether the algorithm under test is partially working. Genetic programming relies on "hill climbing". A tiny beneficial mutation needs to cause a tiny incremental improvement in score. If your scoring function can only output true/false then you're searching randomly, not genetically.
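The hill-climbing point can be made concrete. In this sketch the target behaviour, the test values and the almost-right candidate are all invented for illustration; the contrast is between a pass/fail score and one that awards partial credit:

```python
# Binary vs. graded scoring for a hypothetical candidate that should
# behave like abs(). All test values are illustrative.

TESTS = [(-3, 3), (-1, 1), (0, 0), (2, 2), (5, 5)]

def binary_score(candidate):
    """All-or-nothing: useless for hill climbing."""
    return 1.0 if all(candidate(x) == y for x, y in TESTS) else 0.0

def graded_score(candidate):
    """Partial credit: total distance to the right answers, so a
    slightly better mutant gets a slightly better score."""
    error = sum(abs(candidate(x) - y) for x, y in TESTS)
    return 1.0 / (1.0 + error)

nearly_right = lambda x: x if x > 0 else -x + 1   # off by one for x <= 0
assert binary_score(nearly_right) == 0.0          # no gradient at all
assert graded_score(nearly_right) > graded_score(lambda x: 0)
assert graded_score(abs) == 1.0                   # perfect solution
```

Under the binary score the almost-right mutant looks no better than junk, so selection has nothing to work with; the graded score gives evolution a slope to climb.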

If you try your hand at it you'll find that genetic programming is the ultimate in lateral thinking: Your program will tend to solve the problem in every way you didn't think of, most of them unexpected and (for serious applications) probably useless. One famous case involved an attempt to evolve an oscillator using basic electronic components. It was judged on the simplicity of the circuit and whether the output oscillated. It produced something so simple the researchers were sure it couldn't work, but it did: It was picking up and amplifying radio waves from the environment.

Edit to cite:

Graham-Rowe, Duncan. "Radio emerges from the electronic soup." New Scientist, vol. 175, no. 2358, p. 19 (August 31, 2002). The original link has since broken; the article is now available at http://www.newscientist.com/article/dn2732-radio-emerges-from-the-electronic-soup.html

Ben Jackson
  • Upvoted because this is just a good example of GA being applied to a practical problem. It points the way to GA being used in many areas. –  Dec 28 '12 at 18:54
10

I know I'm late to this party, but I'd just like to make a couple of points. I had the incredible good fortune to work with John Koza on his Genetic Programming 4 book.

The kind of programming that most of us do day to day - hooking up GUI events, pushing pixels, databasing, etc. - is most certainly not the type of program that GP aims to build.

What John Koza does with his cluster of about a hundred machines (if I remember right!) is to search for solutions to interesting problems (think NP-hard). It's sad, but most of the problems we programmers work on day to day are not very interesting or difficult problems - mostly just irritating and time-consuming.

What Ben Jackson said about a genetically engineered electronic oscillator is the closest thing in this thread to what Dr. Koza and team actually do. There were many examples of circuits in the GP series of books.

What Tom Castle said here about imperative programs is not exactly true. The seminal example of this from John and company is the antenna-design run. It is pretty much a 3D drawing program, with commands like those of the LOGO drawing language, that designs an antenna: it has moveto/lineto-type commands and pushes and pops matrices on a stack. The GP package I looked at just last week, jgap, has direct support for this: a container-type nonterminal that can contain void-returning statements and then has a return statement at the end. I believe it even has something like local variables, though I didn't look too closely.

The original LISP design that early GP runs centered around is kind of a pain from time to time, that is certainly true. It's something of a rite of passage, I think, to be annoyed at translating the output of a GP run into more usable code.

TL;DR: GP is not really an automatic programming system. It's an automated invention system.

fletch
  • Guess a "just" hundred-headed Beowulf was the initial, DEC-Alpha based small prototype. The main portion of his research tasks were evolved on a way larger toy, ~ 1000-dual-CPU diskless-machines hierarchically connected into a GNU / Beowulf cluster. Anyway, **must 've been a great time to work with J.R.Koza**, cool to have found a time to mention this piece of your personal experience here. **Thanks** @fletch – user3666197 Sep 03 '17 at 21:52
5

I'd say 1. and 3.

In particular, concerning point 3, I would say that in most cases it is even easier to write the actual program than to come up with a suitable target function and to check that it leads to a correct or acceptable solution (how do you know that an algorithm derived from genetic programming works as expected?).

Concerning point 1, I would say that the search space has an infinite number of dimensions.

Andre Holzner
  • Good points, although I think it only has an infinite number of dimensions if you consider your program as being potentially infinitely long. If we consider the program as a binary string of 100 bits, there are 2^100 programs, and you could say there are 100 dimensions. Of course we probably don't know how big the program will be. – zenna Dec 07 '10 at 20:05
4

It's simply because it has been proven to be theoretically impossible (at least for correct programs).

Let's assume you have infinite computing power, setting aside search-space size, slowness and other 'speed' issues. Now you face only two problems:

  • Will the generated program stop?
  • How can you be sure that the generated programs behave the way you want them to?

The first problem is a central question of computability theory, called the halting problem. Turing proved in 1936 that it is undecidable over the set of all program-input pairs. This means deciding it is possible in some cases, but not in all: there is no automated process that can determine whether an arbitrary program halts or not. So for this, you can't do much ;)

The second issue relates to program correctness. In genetic programming, validation is usually done through example values, which does not constitute any proof of correctness. This is comparable to unit testing: it gives an OK for a number of examples, but no general proof. For example, if I write a prime number checker and test it only with 3, 5, 7 and 11, then a program that returns true for every odd number would pass the test.
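To make the prime-checker example concrete (a small sketch; the function names are mine):

```python
# A wrong program that nevertheless passes the weak test suite from
# the prime-checker example above.

def is_prime(n):                      # a correct reference checker
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

wrong = lambda n: n % 2 == 1          # "every odd number is prime"

weak_suite = [3, 5, 7, 11]            # the only cases we check
assert all(wrong(n) == is_prime(n) for n in weak_suite)   # passes!
assert wrong(9) != is_prime(9)        # ...but 9 exposes the fraud
```

Evolution will happily settle on `wrong` here, because nothing in the scoring distinguishes it from a real solution.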

Going one step further would be to use automated proofs. Automated proof of correctness for algorithms is in fact deeply related to automated theorem proving in mathematics. You describe your program in an axiomatized system and try to automatically prove the correctness of your statement. Here again you face strong theoretical barriers: Gödel's incompleteness theorems. These state, among other things, that for even very simple axiomatized systems, ones able to perform arithmetic on the natural numbers, there is no algorithm (no "effective procedure") able to prove all the true theorems about those numbers. Concretely, this means that even for simple programs you will not always be able to prove correctness automatically.

Even without a proof of correctness, using sample test cases to validate the genetic program is very prone to over-specialization, the phenomenon known as overfitting in machine learning. That is, the learned program will do fine on the provided test cases but may go totally ballistic on other inputs.

Julien
  • This answer gets regularly downvoted, feel free to add a constructive comment ! – Julien Nov 23 '15 at 19:38
  • The halting problem is a theoretical one. A program which does not halt in a **practical** timeout (The halting solution :P) is unlikely to be of much use. A huge amount of software in use today is far from "correct" by anyone's measure. If GP can evolve software cheaper than people and attain the same or better quality then it may gain wider traction. A purely theoretical approach to correctness & evolution is currently too difficult as similarly seen in many AI areas. The big data approach to machine learning coupled with a "software engineering" perspective may enliven the field. – Brendan Cody-Kenny May 11 '16 at 20:25
  • This is old, but I think the main problem with your answer is that you formulate it as fact ("it is because") while it is actually a conjecture (and not a bad one, IMHO). Sure, you state facts, but the question is "why did it not take off?" It may be because of those flaws, but those flaws apply to normal programming as well, and it does fine in spite of them. – Mike Wise Jul 14 '16 at 05:30
  • I do not see it as a conjecture but as a fact that holds in most programming cases. See in the accepted answer the problem of infinite loops, which is only a single case of the halting issue. Whether you like it or not, theory in CS, like nature in physics, will always be there to tell you the truth. The difference with normal programming is that you don't generate your program randomly; you (should) follow a somewhat rational method to complete your programming task. In my opinion, formal methods are a more practical approach to program generation than genetic programming. – Julien Oct 14 '17 at 21:45
4

Three things:

  1. As Andre says, it's very hard to write a fitness function that is sufficient. This is the ultimate version of Test Driven Development. You'd have to write unit tests that provide 100% coverage of an as-yet-unwritten program. Even then, 100% coverage by itself is unlikely to be sufficient. When we've solved the problem of formally specifying the precise behaviour of all aspects of a complex system, and then verifying that a given program satisfies that specification, then we might have a chance.
  2. Genetic Programming is non-deterministic and better suited to generating approximate solutions rather than exact solutions.
  3. Genetic Programming, when applied to any problem of reasonable complexity, is phenomenally computationally expensive. Back in 1999, Genetic Programming Inc was using a 1,000-node cluster for their work in the field.
Dan Dyer
4

GP can't quickly solve novel problems that are beyond the knowledge of the person creating the fitness function. So at present it can only be used on problems we already know how to solve but can't feasibly solve exactly, because of the size of the search space - such as the Travelling Salesman Problem, which can be solved more quickly with a GA.

The reason is actually pretty simple. To solve novel problems, the tools you give the GP need to be simple enough, or complete enough, that the GP search space becomes a true analogue to real genetics.

Genetic Programming, like real genetics, is subject to the same growth pattern as all platform-growth systems. This means that a GP will progress to the point where its core utilities hit a platform, level off, and then take a LONG time before it stumbles onto a new paradigm and builds up to a new platform.

This pro-evolution video illustrates these platform-growth patterns: http://www.youtube.com/watch?v=mcAq9bmCeR0 It takes a long while to develop a new hand, but once one appears, an additional hand becomes the new thing and quickly advances to an optimal example of more hands. (I should mention that this video is most likely using a GA, not GP, but genetics are genetics.)

This is all about the same logic that goes into Singularity Theory, BTW.

But what this means for GP is that it is pretty much only good for research, not for practical application within a program - with a few simple exceptions where the requirements are slightly more in-depth than a GA can handle, such as some scheduling programs, where the programming search space is fairly small, the tools needed to solve it are well understood, and a well-defined fitness function can be written.
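For contrast, the Travelling Salesman case mentioned above really is GA-friendly: the genome is just a permutation and the fitness function is one line. A minimal sketch, where the city coordinates, population size and mutation scheme are all arbitrary illustrative choices:

```python
import random

# Minimal GA sketch for TSP: tours are permutations of city indices,
# fitness is tour length (lower is better), mutation swaps two cities.
# Everything here (city count, seed, parameters) is illustrative.

random.seed(0)
CITIES = [(random.random(), random.random()) for _ in range(12)]

def tour_length(tour):
    return sum(
        ((CITIES[a][0] - CITIES[b][0]) ** 2 +
         (CITIES[a][1] - CITIES[b][1]) ** 2) ** 0.5
        for a, b in zip(tour, tour[1:] + tour[:1]))   # close the loop

def mutate(tour):
    i, j = random.sample(range(len(tour)), 2)
    child = tour[:]
    child[i], child[j] = child[j], child[i]           # swap two cities
    return child

def evolve(generations=300, pop_size=30):
    pop = [random.sample(range(len(CITIES)), len(CITIES))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=tour_length)                     # elitist selection
        survivors = pop[:pop_size // 2]
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=tour_length)

best = evolve()
assert sorted(best) == list(range(len(CITIES)))       # still a valid tour
```

Because the best tour is always kept, the shortest length found never increases from one generation to the next; the whole thing fits in a page, which is exactly why GAs on fixed-shape genomes took off where GP on open-ended programs did not.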

DampeS8N
2

Maybe that most programmers are programmers, and not computer scientists?

Genetic programming requires serious smarts. You need to be able to break the problem down appropriately, and you need an appropriate problem to start with. And you need to know enough to know that genetic algorithms are an option. And the problem needs to not already have a well known solution.

Not everything needs a fancy algorithm. Of all the code that is written in the world, do 'standard' webapps, OSs, device programming, really need genetic algorithms?

And when it comes down to it, most programming effort is devoted to solving simple problems where a complicated approach is not needed.

hvgotcodes
  • Whether or not that's true, that statement still holds that there are computer scientists, and thus doesn't really answer the question. – Kendrick Dec 07 '10 at 19:15
  • @kendrick, i think it does, at least rhetorically. If something is being held back, why? Is it because there are not enough people who know how to do it? – hvgotcodes Dec 07 '10 at 19:16
  • I'll give you that. My opinion is that there are certainly enough people working in the field of computer science that more resources could be dedicated to the field if it were interesting enough and/or the value of applying resources to the problems was deemed high enough. – Kendrick Dec 07 '10 at 19:21
2

I think a big part of the reason is risk management. Bear with me: when a programmer sits down to write a program, they have at least some idea of how long it'll take and of what they can do.

When instead a programmer sits down to write a program that will generate a program (using genetic programming), the uncertainty shoots through the roof: it's unclear how long the process will take, and it's unclear how good the end program can be.

There is also uncertainty in other places: how easy will it be to adjust the program later, or to fix bugs? Generated code can be nigh-impossible to debug.

redtuna
1

There are some projects tackling the above-mentioned problems in GP. An example would be OpenCog's MOSES.

Misgevolution
1

Primordial soup is suspicious and unappetizing. For my production code I prefer Intelligent Design.

Josephine
  • I wonder if anyone has successfully used GA/GP to produce quality UI? Using things like average task success rate, stickiness indexing, and visitor-to-sale ratios. – DampeS8N Dec 09 '10 at 12:48
  • Interesting thought dampe. I think this is a problem that a GA could tackle, but most of the work would have to be done by the programmer, the GA would probably just be changing the position or size of elements. If you could use GP to write the HTML / CSS from scratch, that would be something. – zenna Dec 11 '10 at 12:59
1

My personal view, after a couple of years of GP research at university and following the field afterwards: GP fails to scale.

Simple problems represented by simple fitness functions GP can solve all right. But as mentioned before, formulating a fitness function that describes the task of evolving strcmp, or a calculator, or even a simple text editor is almost impossible using classic approaches.

I do like the notion of a GP fitness function being TDD in perfection, though :-) Maybe some bright mind will come up with a better way of directing the simulated evolution some day, but that has yet to happen.

I have some hopes for GP in the area of implicit fitness functions where I'm currently doing some 'private research'. We'll see how far it'll get!

Cheers, Jay

Jay
  • Jay, how about self-modifying the fitness function so it evolves as the program does? This way you need not develop the perfect fitness function, just an algorithm that permits a perfect fitness function to evolve. –  Dec 28 '12 at 18:49
  • @user922475 An amusing thought. Would we need then a fitness function to evaluate the fitness function? – Nicholas Morley Feb 22 '18 at 15:21
  • That's been done in the form of co-evolving populations. No dedicated fitness function, just 'being better' than programs in the other population. Has also been done before in a somewhat similar form in a major experiment commonly referred to as *life on Earth* :-) – Jay Mar 14 '18 at 07:48
0

GP and GA - or evolutionary algorithms in general - are more like hobbies. They are not used for real-life applications, barring exceptions. The reason is that these algorithms are based on randomness; there is no certainty of getting a good answer.

Moreover, this area is flooded with sub-par research work, because it's easy to understand and implement compared to other mathematically intense techniques. So students just find an objective to optimize in some engineering or science application, and there you have a new piece of work to get published.

Just compare the sponsors of their conferences with those of other AI/ML/optimization conferences. That makes their "current" importance to industry clear.

But it's an area of research, and it's up to us to make it work. At present it's just a hobby!

kosmos
  • Definitely agree with your vision. Sub-par research unfortunately applies to many research fields in applied computing. – Julien Nov 17 '16 at 13:10
  • @EulDjulius The problem is that these researchers form a community of their own and within it keep publishing their work, totally unmindful of the fact that they need to compare with other state-of-the-art algorithms. Rarely will you find genetic programming mentioned in ICML or other top conferences like IJCAI. – kosmos Nov 18 '16 at 04:30
0

Nowadays, programming means defining a fine-grained specification in a machine-readable way: you tell the program exactly what you want it to do and what the result should look like, and not much more than that. That means if you want to generate the result by, e.g., genetic programming, you still have to produce this machine-readable specification, in the form of a fitness function. This would lead to at least the same, and probably a bigger, amount of work. So it's simply the wrong field for genetic programming, which is made for problems with an easy-to-define but hard-to-reach specification.

Edit: you could now say that in most projects this fine-grained specification is done anyway, in the form of tests. I would say that for a genetic programming approach, tests are way too imprecise to specify your needs. They are example-based, unlike programming against a formal specification language. Genetic programming would probably produce results that suit the test cases but behave wrongly in real situations with new inputs. So to get a specification from tests that's as exact as a specification in a programming language, you would need a huge number of test cases covering every eventuality. In the end you would do much more work than programming the thing directly.

tObi