20 votes

Genetic Algorithms

Introduction to Genetic Algorithms

Genetic algorithms can be used to solve problems that are difficult or impossible to solve with traditional algorithms. Much like neural networks, they provide a good-enough solution in a short amount of time, but rarely find the best one. While they're not as popular or as widely used as neural networks, they still have their place: we can use them to solve complicated problems very fast, without expensive training rigs and with little knowledge of math.

Genetic algorithms can be used for a variety of tasks, for example determining the best radio antenna shape, aerodynamic shapes of cars and planes, windmill shapes, or various queueing problems. We'll use one to print "Hello, World!".

How does it work?

A genetic algorithm works in three steps.

  1. Generate random solutions
  2. Test how good they are
  3. Pick the best ones, breed and mutate them, go to step 2
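The three steps can be sketched as a loop. Here's a toy version that evolves an integer toward 42; the problem and all parameters (population size, survivor count, mutation range) are chosen purely for illustration:

```python
from random import randint, choice

def fitness(x):
    return -abs(x - 42)               # step 2: score a solution (higher is better)

population = [randint(0, 100) for _ in range(20)]   # step 1: random solutions
for _ in range(50):                                 # step 3: repeat
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                        # pick the best ones
    mutants = [choice(parents) + randint(-3, 3)     # breed + mutate
               for _ in range(15)]
    population = parents + mutants
population.sort(key=fitness, reverse=True)
print(population[0])                                # should land at or near 42
```

Because the best parents are always kept (elitism), the top score never gets worse from one generation to the next.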

It works just like evolution in nature. First, we generate randomised solutions to our problem (in this case: random strings of letters).

Then, we test each solution and give it points, where better solutions gain more points. In our problem, we would give one point for each correct letter in the string.

Afterwards, we pick the best solutions and breed them together (just combine the strings). It's not a bad idea to mutate (or randomize) the string a bit as well.

We collect the offspring and repeat the process until we find a good enough solution.

Generate random solutions

First of all, we need to decide in which form we will encode our solutions. In this case, it will simply be a string. If we wanted to build race cars, we would encode each solution (each car) as an array of numbers, where the first number would be the size of the first wheel, the second number the size of the second wheel, etc. If we wanted to build animals that try to find food, fight and survive, we would choose a decision tree (something like this).

So let's start by making a few solutions, or entities. One hundred should be enough.

from random import randint

goal = "Hello, World!"
allowed_characters = list("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM ,!")

def get_random_entities(n, string_length):
    entities = []
    for _ in range(n):
        entity = ""
        for _ in range(string_length):
            entity += allowed_characters[randint(0, len(allowed_characters) - 1)]
        entities.append(entity)
    return entities

print(get_random_entities(100, 13))

Test how good they are

This is called a "fitness function": it determines how good a solution is, be it a car (travel distance), an animal (food gathered), or a string (number of correct letters).

The simplest function we can use right now just counts the correct letters. If we wanted, we could use something like Levenshtein distance instead.

def get_fitness(entity):
    points = 0
    for i in range(0, len(entity)):
        if goal[i] == entity[i]:
            points += 1
    return points
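For the Levenshtein variant mentioned above, a sketch using the standard dynamic-programming algorithm (negated, so that a smaller distance still means higher fitness; `goal` is repeated here only so the snippet is self-contained):

```python
goal = "Hello, World!"  # as defined earlier

def get_fitness_levenshtein(entity):
    # Classic DP for edit distance between entity and goal.
    prev = list(range(len(goal) + 1))
    for i, c_e in enumerate(entity, 1):
        curr = [i]
        for j, c_g in enumerate(goal, 1):
            cost = 0 if c_e == c_g else 1
            curr.append(min(prev[j] + 1,          # delete a character
                            curr[j - 1] + 1,      # insert a character
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return -prev[-1]  # negate: smaller distance = higher fitness
```

Unlike the letter-counting version, this also rewards strings of the wrong length that are "almost right".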

Crossover and mutation

Now it's time to select the best entities and throw away the less fortunate ones. Let's order the entities by their fitness.

Crossover is the process of taking two entities (strings) and breeding them to create a new one. For example, we could give the offspring one part from one parent and the other part from the second parent.

There are many ways to do this, and I encourage you to try multiple approaches when you do something like this yourself.

P:  AAAABBB|BCCCC
P:  DDDDEEE|FGGGG

F1: AAAABBB|FGGGG

Or we can just choose at random which letter comes from which parent, which works best here. After we have the offspring (F1), we should mutate it. What if we were unfortunate, and H (which we need for our "Hello, World!") was not in any of the 100 entities? So we take the string, and for each character there is a small chance to mutate it - change it at random.

F1:  ADDDEBEFGCGG
F1`: ADHDEBEFGCGG

And it's done. Now we kill a certain part of the old population. I don't know which percentage is best, but I usually kill about 90% of the old population. The 90% that we killed will be replaced by new offspring.

There is just one more thing: which entities do we select for crossover? It isn't a bad idea - and it generally works just fine - to simply give better entities a higher chance to breed.
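One common way to do this is fitness-proportionate ("roulette wheel") selection; a sketch, with `random.choices` doing the weighting for us:

```python
import random

def select_parent(entities, fitnesses):
    # Shift all weights up by one so even zero-fitness entities keep a
    # small chance (random.choices rejects all-zero weights).
    weights = [f + 1 for f in fitnesses]
    return random.choices(entities, weights=weights, k=1)[0]
```

An entity with fitness 13 is then picked roughly fourteen times as often as one with fitness 0, but nothing is ruled out entirely.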

def get_offspring(first_parent, second_parent, mutation_chance):
    new_entity = ""
    for i in range(len(first_parent)):
        if randint(0, 100) < mutation_chance:
            # Mutation: replace this character with a random one
            new_entity += allowed_characters[randint(0, len(allowed_characters) - 1)]
        elif randint(0, 1) == 0:
            # Crossover: take this character from the first parent...
            new_entity += first_parent[i]
        else:
            # ...or from the second
            new_entity += second_parent[i]
    return new_entity
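The article links the complete code rather than listing the main loop, so here is a possible sketch of how the pieces fit together, repeating compact versions of the functions above so it runs standalone. The parameters (population 100, top 10% survive, 5% mutation) match the spirit of the article, but the exact values in the linked code may differ:

```python
from random import randint

goal = "Hello, World!"
allowed_characters = list("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM ,!")

def get_random_entities(n, string_length):
    return ["".join(allowed_characters[randint(0, len(allowed_characters) - 1)]
                    for _ in range(string_length)) for _ in range(n)]

def get_fitness(entity):
    return sum(1 for g, e in zip(goal, entity) if g == e)

def get_offspring(first_parent, second_parent, mutation_chance):
    new_entity = ""
    for i in range(len(first_parent)):
        if randint(0, 100) < mutation_chance:
            new_entity += allowed_characters[randint(0, len(allowed_characters) - 1)]
        elif randint(0, 1) == 0:
            new_entity += first_parent[i]
        else:
            new_entity += second_parent[i]
    return new_entity

population = get_random_entities(100, len(goal))
for generation in range(1, 1000):
    population.sort(key=get_fitness, reverse=True)
    best = population[0]
    if best == goal:
        break
    survivors = population[:10]                      # keep the top 10%
    population = survivors + [
        get_offspring(survivors[randint(0, 9)],      # parents drawn from the
                      survivors[randint(0, 9)], 5)   # survivors, 5% mutation
        for _ in range(90)                           # refill the killed 90%
    ]
print(f"Generation {generation}, best score: {get_fitness(best)} ::: {best}")
```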

When we add everything together, we get this output:

Generation 1, best score: 2 ::: QxZPjoptHfNgX
Generation 2, best score: 3 ::: XeNlTOQuAZjuZ
Generation 3, best score: 4 ::: weolTSQuoZjuK
Generation 4, best score: 5 ::: weTgnC uobNdJ
Generation 5, best score: 6 ::: weTvny uobldb
Generation 6, best score: 6 ::: HellSy mYbZdC
Generation 7, best score: 7 ::: selOoXBWoAKn!
Generation 8, best score: 8 ::: HeTloSoWYZlh!
Generation 9, best score: 8 ::: sellpX WobKd!
Generation 10, best score: 9 ::: welloq WobSdb
Generation 11, best score: 9 ::: selloc WoZjd!
Generation 12, best score: 10 ::: wellxX WoVld!
Generation 13, best score: 10 ::: welltX World!
Generation 14, best score: 10 ::: welltX World!
Generation 15, best score: 10 ::: welltX World!
Generation 16, best score: 11 ::: zellov Wobld!
Generation 17, best score: 11 ::: Hellty World!
Generation 18, best score: 11 ::: welloX World!
Generation 19, best score: 11 ::: welloX World!
Generation 20, best score: 11 ::: welloX World!
Generation 21, best score: 12 ::: welloX World!
Generation 22, best score: 12 ::: Helloy World!
Generation 23, best score: 12 ::: Helloy World!
Generation 24, best score: 12 ::: Helloy World!
Generation 25, best score: 12 ::: Helloy World!
Generation 26, best score: 12 ::: Helloy World!
Generation 27, best score: 12 ::: Helloy World!
Generation 28, best score: 12 ::: Helloy World!
Generation 29, best score: 12 ::: Helloy World!
Generation 30, best score: 12 ::: Helloy World!
Generation 31, best score: 12 ::: Helloy World!
Generation 32, best score: 12 ::: Helloy World!
Generation 33, best score: 12 ::: Helloy World!
Generation 34, best score: 13 ::: Helloy World!
Generation 35, best score: 13 ::: Hello, World!

As we can see, we find a pretty good solution very fast, but it takes much longer to find the perfect one. The complete code is here.

Maintaining diversity

When we solve difficult problems, it becomes increasingly important to maintain diversity. When all your entities are basically the same (which happened in this example), it's difficult to find solutions other than those almost identical to the current best one. There might be a much better solution, but we never find it, because all solutions that differ from the current best one are discarded. Solving this is the real challenge of genetic algorithms. One idea is to boost diverse solutions in the fitness function: for every solution, we compute its distance to the current best solution and add bonus points based on that distance.
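The distance bonus could be sketched like this; Hamming distance and the weight of 0.1 are illustrative choices, not values from the article:

```python
goal = "Hello, World!"  # as defined earlier

def hamming_distance(a, b):
    # Number of positions where two equal-length strings differ.
    return sum(1 for x, y in zip(a, b) if x != y)

def get_fitness_with_diversity(entity, best_entity, diversity_weight=0.1):
    base = sum(1 for g, e in zip(goal, entity) if g == e)  # correct letters
    # Bonus for being different from the current best solution.
    return base + diversity_weight * hamming_distance(entity, best_entity)
```

The weight needs tuning: too high and the population drifts away from good solutions, too low and it collapses onto the current best again.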

8 comments

  1. [3]
    stu2b50

    Not a major issue, but I do think the comparison with neural networks is a bit disingenuous. Hypothetically, you could even train a neural network with a genetic algorithm. It would not even get close to finding any kind of optima, but you could.

    Genetic algorithms are just a type of optimization. With neural networks, the more direct comparison would be genetic optimization vs gradient descent (and tbh, gradient descent typically wins), i.e the optimization techniques used. However, genetic algorithms are great where you're dealing with a non-continuous, non-differentiable, or otherwise blackbox objective function.

    3 votes
    1. Staross

      It would not even get close to finding any kind of optima, but you could.

      I wouldn't be so sure about that:

      Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion. The Deep GA successfully evolves networks with over four million free parameters, the largest neural networks ever evolved with a traditional evolutionary algorithm.

      https://arxiv.org/abs/1712.06567

      And some of the more advanced evolution strategy algorithms are really good and well understood (IIRC, CMA-ES has a proof of global convergence, for example).

      4 votes
    2. cos

      I've only just started delving into machine learning, but aren't genetic algorithms better at escaping local minima, especially in comparison to gradient descent?

      1 vote
  2. DataWraith

    With regard to diversity, one approach that has worked well for me (on toy-ish problems at least) is Lexicase selection. It is simple to implement and maintains diversity without the need for an explicit diversity measure.

    It selects individuals for the next generation one at a time by repeatedly testing the entire previous population on a set of small test cases (e.g. "did it get the third letter right?"). Individuals that fail a test are removed from consideration for the current reproductive slot until only one is left or you run out of test cases, in which case one of the remaining individuals is chosen at random.

    Since the order of the test cases is randomly permuted for each selection, equal weight is given to all cases (on average), but individuals that do well on difficult cases that few others can solve will be disproportionally rewarded, leading to more diversity in the population.
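A minimal sketch of the selection step described in this comment, with one test case per character position (illustrative only, not DataWraith's code):

```python
import random

def lexicase_select(population, goal):
    cases = list(range(len(goal)))   # one test case per character position
    random.shuffle(cases)            # random order for each selection
    pool = list(population)
    for i in cases:
        passing = [e for e in pool if e[i] == goal[i]]
        if passing:                  # drop entities that fail this case
            pool = passing
        if len(pool) == 1:
            break
    return random.choice(pool)       # tie-break among the remainder
```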

    3 votes
  3. wanda-seldon

    I have added this post to the layperson's introduction to meta-post, with @Soptik's approval.

    2 votes
  4. [3]
    alexandria

    Something I've specifically wrestled with is maintaining randomness with tree-based genotypes. Is a random walk sufficient to choose the mutated structure? Can I do better?

    1 vote
    1. [2]
      Soptik

      When I last used trees, I had big problems with maintaining diversity. While mutation (replacing a subtree with a randomly generated one) worked, eventually I ended up with one solution, unable to find anything else.

      There is probably a better way to mutate a structure, but I've never heard of any.

      Btw, I asked someone who is very interested in this, and he recommended giving bonus points to solutions that differ from the best solution. I haven't tried it yet, since I'm just rewriting the program, but I'm curious how it will work out.

      1. alexandria

        Yeah. I had a skim through Karl Simm's papers on this (He seemed to get pretty good results?) and I didn't see much about the method that he used. I might dig out his mail address and ask him. The book I've read on it seems to pretty exclusively target binary-based phenotypes, which is frustrating.

        Btw, I asked someone who is very interested in this, and he recommended giving bonus points to solutions that differ from the best solution. I haven't tried it yet, since I'm just rewriting the program, but I'm curious how it will work out.

        Oh, interesting! I'll have to make a note for that :)