Introduction to Genetic Algorithms Genetic algorithms can be used to solve problems that are difficult, or impossible to solve with traditional algorithms. Much like neural networks, they provide...
Introduction to Genetic Algorithms
Genetic algorithms can be used to solve problems that are difficult, or impossible to solve with traditional algorithms. Much like neural networks, they provide good-enough solution in short amount of time, but rarely find the best one. While they're not as popular as neural networks nor as widely used, they still have their place, as we can use them to solve complicated problems very fast, without expensive training rigs and with no knowledge of math.
Genetic algorithms can be used for variety of tasks, for example for determining the best radio antenna shape, aerodynamic shapes of cars and planes, wind mill shapes, or various queing problems. We'll use it to print "Hello, World!".
How does it work?
Genetic algorithm works in three steps.
- Generate random solutions
- Test how good they are
- Pick the best ones, breed and mutate them, go to step 2
It works just like evolution in nature. First, we generate randomised solutions to our problem (in this case: random strings of letters).
Then, we test each solution and give it points, where better solutions gain more points. In our problem, we would give one point for each correct letter in the string.
Afterwards, we pick the best solutions and breed it together (just combine the strings). It's not bad idea to mutate (or randomize) the string a bit.
We collect the offsprings, and repeat the process until we find good enough solution.
Generate random solutions
First of all, we need to decide in which form we will encode our solutions. In this case, it will be simply string. If we wanted to build race cars, we would encode each solution (each car) as array of numbers, where first number would be size of the first wheel, the second number would be size of the second wheel, etc. If we wanted to build animals that try to find food, fight and survive, we would choose a decision tree (something like this).
So let's start and make few solutions, or entities. One hundred should be enough.
from random import randint goal = "Hello, World!" allowed_characters = list("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM ,!") def get_random_entity(n, string_length): entities =  for _ in range(0, n): entity = "" for _ in range(0, string_length): entity += allowed_characters[randint(0, len(allowed_characters)-1)] entities.append(entity) return entities print(get_random_entity(100, 13))
Test how good they are
This is called a "fitness function". Fitness function determines how good a solution is, be it a car (travel distance), animal (food gathered), or a string (number of correct letters).
The most simple function we can use right now will simply count correct letters. If we wanted, we could make something like Levenshtein distance instead.
def get_fitness(entity): points = 0 for i in range(0, len(entity)): if goal[i] == entity[i]: points += 1 return points
Crossover and mutation
Now it's time to select the best ones and throw away the less fortunate entities. Let's order entities by their fitness.
Crossover is a process, when we take two entities (strings) and breed them to create new one. For example, we could just give the offspring one part from one parent and another part from second parent.
There are many ways how to do this, and I encourage you to try multiple approaches when you will be doing something like this.
P: AAAABBB|BCCCC P: DDDDEEE|FGGGG F1: AAAABBB|FGGGG
Or we can just choose at random which letter will go from which parent, which works the best here. After we have the offsprint (
F1), we should mutate it. What if we were unfortunate, and
H(which we need for our
Hello, World!) was not in any of the 100 entities? So we take the string and for each character of the string, there is a small chance to mutate it - change it at random.
F1: ADDDEBEFGCGG F1`: ADHDEBEFGCGG
And it's done. Now kill certain part of old population. I don't know which percentage is best, but I usually kill about 90% of old population. The 90% that we killed will be replaced by new offsprings.
There is just one more thing: which entities do we select for crossover? It isn't bad idea - and it generally works just fine - to just give better entities higher chance to breed.
def get_offspring(first_parent, second_parent, mutation_chance): new_entity = "" for i in range(0, len(first_parent)): if randint(0, 100) < mutation_chance: new_entity += allowed_characters[randint(0, len(allowed_characters)-1)] else: if randint(0, 1) == 0: new_entity += first_parent[i] else: new_entity += second_parent[i] return new_entity
When we add everything together, we get this output:
Generation 1, best score: 2 ::: QxZPjoptHfNgX Generation 2, best score: 3 ::: XeNlTOQuAZjuZ Generation 3, best score: 4 ::: weolTSQuoZjuK Generation 4, best score: 5 ::: weTgnC uobNdJ Generation 5, best score: 6 ::: weTvny uobldb Generation 6, best score: 6 ::: HellSy mYbZdC Generation 7, best score: 7 ::: selOoXBWoAKn! Generation 8, best score: 8 ::: HeTloSoWYZlh! Generation 9, best score: 8 ::: sellpX WobKd! Generation 10, best score: 9 ::: welloq WobSdb Generation 11, best score: 9 ::: selloc WoZjd! Generation 12, best score: 10 ::: wellxX WoVld! Generation 13, best score: 10 ::: welltX World! Generation 14, best score: 10 ::: welltX World! Generation 15, best score: 10 ::: welltX World! Generation 16, best score: 11 ::: zellov Wobld! Generation 17, best score: 11 ::: Hellty World! Generation 18, best score: 11 ::: welloX World! Generation 19, best score: 11 ::: welloX World! Generation 20, best score: 11 ::: welloX World! Generation 21, best score: 12 ::: welloX World! Generation 22, best score: 12 ::: Helloy World! Generation 23, best score: 12 ::: Helloy World! Generation 24, best score: 12 ::: Helloy World! Generation 25, best score: 12 ::: Helloy World! Generation 26, best score: 12 ::: Helloy World! Generation 27, best score: 12 ::: Helloy World! Generation 28, best score: 12 ::: Helloy World! Generation 29, best score: 12 ::: Helloy World! Generation 30, best score: 12 ::: Helloy World! Generation 31, best score: 12 ::: Helloy World! Generation 32, best score: 12 ::: Helloy World! Generation 33, best score: 12 ::: Helloy World! Generation 34, best score: 13 ::: Helloy World! Generation 35, best score: 13 ::: Hello, World!
As we can see, we find pretty good solution very fast, but it takes very long to find perfect solution. The complete code is here.
When we solve difficult problems, it starts to be increasingly important to maintain diversity. When all your entities are basically the same (which happened in this example), it's difficult to find other solutions than those that are almost the same as the currently best one. There might be a much better solution, but we didn't find it, because all solutions that are different to currently best one are discarded. Solving this is the real challenge of genetic algorithms. One of the ideas is to boost diverse solutions in fitness function. So for every solution, we compute distance to the current best solutions and add bonus points for distance from it.20 votes
Hi everyone! In this challenge, we will build simple genetic algorithm. The goal is to create genetic algorithm that will learn and output predefined text ("Hello World!"). The goal can be...
Hi everyone! In this challenge, we will build simple genetic algorithm.
The goal is to create genetic algorithm that will learn and output predefined text ("Hello World!").
The goal can be achieved with any language and you'll need just simple loops, collection and knowledge how to create and use objects, even beginners can try to complete this challenge.
I'll try to explain it as best as I can. Genetic algorithms are approximation algorithms - they often do not find the best solution, but they can find very good solutions, fast. It's used when traditional algorithms are either way too slow, or they even don't exist. It's used to, for example, design antennas, or wind turbines. We will use it to write "Hello World".
First of all, we define our
Entity. It is solution to given problem, it can be list of integers that describe antenna shape, decision tree, or string ("Hello World"). Each entity contains the solution (
string solution) and fitness function. Fitness function says, how good our entity is. Our fitness function will return, how similar is entity
solutiontext to "Hello World" string.
But how will the program work? First of all, we will create list of entities
List<Entity>. We will make, for example, 1000 entities (randomly generated). Their
Entity.solutionwill be randomized string of length 11 (because "Hello World" is 11 characters long).
Once we have these entities, we will repeat following steps, until the best entity has
fitness == 1.0, or 100% similarity to target string.
First of all, we compute fitness function of all entities. Then, we will create empty list of entities of length 1000. Now, we will 1000-times pick two entities (probably weighted based on their fitness) and combine their strings. We will use the string to create new entity and we will add the new entity to the new list of entities.
Now, we delete old entities and replace them with entities we just made.
The last step is mutation - because what if no entity has the "W" character? We will never get our "Hello World". So we will go through every entity and change 5% (or whatever number you want) of characters in their solution to random characters.
We let it run for a while - and it is done!
So to sum up what we did:
entities <- 1000 random entities while entities.best.fitness < 1: for every entity: compute fitness newEntities <- empty list 1000-times: choose two entities from "entities", based on their fitness combine solutions of these entities and make newEntity newEntities.add(newEntity) for every entity: mutate // Randomly change parts of their strings print(entities.best.solution) // Hello World!
Now go and create the best, fastest, and most pointless, genetic algorithm we've ever seen!23 votes