Students from the Computer Technology Department and staff of the Computer Technologies International Laboratory have long participated in the GECCO conference, which is organized by ACM SIGEVO, the Association for Computing Machinery's special interest group on genetic and evolutionary computation. The conference has been held since 1999; at this event, scientists and engineers from all over the world discuss the most relevant topics in fields such as genetic algorithms, complex systems, digital entertainment and arts, machine learning, and many others.
At GECCO-2017, Maksim Buzdalov and Benjamin Doerr presented the results of their research: they defined the conditions under which the recently proposed (1+(λ, λ)) genetic algorithm efficiently solves the Boolean satisfiability problem (SAT), and they proved this mathematically (you can read the full article here).
GECCO-2017. Credit: social networks
Genetic algorithms are commonly used for solving optimization tasks. Suppose a task can be solved in different ways, and each candidate solution corresponds to a particular value of the objective function that needs to be optimized. The algorithm evaluates this function, uses the results to select several of the best solutions, creates new ones from them, and repeats the process until it finds a satisfactory solution. In terms of Darwin's theory of evolution, a solution is an individual, a set of solutions is a population, the objective function is the fitness function, and the genetic algorithm plays the role of natural selection. You can read more about genetic algorithms here.
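The loop described above can be sketched in a few lines of Python. This is a generic skeleton, not the algorithm from the paper: the function names (`fitness`, `random_solution`, `mutate`, `crossover`) and the truncation-selection strategy are illustrative assumptions supplied by the caller.

```python
import random

def genetic_algorithm(fitness, random_solution, mutate, crossover,
                      pop_size=20, generations=100, rng=None):
    """Generic GA skeleton: evaluate the population, keep the fittest
    half, breed new solutions from it, and repeat."""
    rng = rng or random.Random(0)
    population = [random_solution(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)
```

Because the fitter half of the population always survives, the best fitness seen never decreases from one generation to the next.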
Genetic algorithms, like other algorithms from the field of evolutionary computation, are in essence instruments for solving practical tasks. Yet, to apply them in practice, one has to understand why and to what extent they are effective, which calls for theoretical research. Over the last 20 years, it has become clear that the best results are obtained by studying simple optimization tasks and simple evolutionary algorithms. Execution time here is commonly measured as the number of evaluations of the fitness function.
A good example is the OneMax problem and the (1+1) evolutionary algorithm. OneMax is a simple problem that consists in maximizing the number of ones in a bit string. The (1+1) evolutionary algorithm is one of the simplest evolutionary algorithms: it maintains a "parent" and an "offspring" solution, the latter derived from the former by a mutation operator that flips each gene of the "parent" with a certain probability. If the "offspring" turns out to be at least as fit, it replaces the "parent".
For quite a long time, the (1+1) evolutionary algorithm was deemed the best for solving the OneMax problem; more advanced algorithms (for instance, genetic algorithms that use the crossover operator, which combines two solutions at once) were less effective. An important open question was whether the crossover operator was needed at all. What is more, the (1+1) evolutionary algorithm is not theoretically optimal: when the optimization is almost finished, it extracts less information from each fitness function evaluation, because it cannot learn anything from unsuccessful solutions, while the successful ones become very similar to each other.
How can one compensate for these drawbacks?
In 2013, Benjamin Doerr and his co-authors proposed a new genetic algorithm, (1+(λ, λ)), which compensates for the aforementioned drawbacks. Each iteration of the algorithm has two stages: during the first, offspring are generated using a more aggressive mutation operator (which makes it possible to extract information even from unsuccessful offspring); during the second, the best of the offspring passes its best traits to the "parent" via the crossover operator. Thanks to that, the new algorithm solves the OneMax problem asymptotically faster, both in theory and in practice.
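The two-stage iteration can be sketched as follows for OneMax. The parameter choices p = λ/n for the mutation rate and c = 1/λ for the crossover bias are the standard ones from the literature on this algorithm; treating them as fixed here is an assumption of this sketch (the paper discussed in this article analyzes the algorithm on SAT, not OneMax).

```python
import random

def one_max(bits):
    return sum(bits)

def one_plus_lambda_lambda(n, lam, rng=None):
    """Sketch of the (1+(λ, λ)) GA on OneMax, with mutation rate
    p = λ/n and crossover bias c = 1/λ."""
    rng = rng or random.Random(1)
    p, c = lam / n, 1.0 / lam
    x = [rng.randint(0, 1) for _ in range(n)]
    while one_max(x) < n:
        # Stage 1, aggressive mutation: sample a flip count ℓ ~ Bin(n, p),
        # then flip exactly ℓ random bits in each of λ offspring.
        ell = sum(rng.random() < p for _ in range(n))
        mutants = []
        for _ in range(lam):
            m = x[:]
            for i in rng.sample(range(n), ell):
                m[i] ^= 1
            mutants.append(m)
        best_mutant = max(mutants, key=one_max)
        # Stage 2, biased crossover: each of λ offspring takes every gene
        # from the best mutant with probability c, else from the parent.
        offspring = [
            [mb if rng.random() < c else xb for xb, mb in zip(x, best_mutant)]
            for _ in range(lam)
        ]
        best = max(offspring, key=one_max)
        if one_max(best) >= one_max(x):
            x = best
    return x
```

Even when the best mutant is worse than the parent, its good genes can still reach the parent through the biased crossover, which is how the algorithm extracts information from unsuccessful offspring.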
The algorithm also outperforms other algorithms on other problems. For instance, in 2014 it was shown that this algorithm is good at solving the satisfiability problem (SAT), one of the central problems in computational complexity theory. In another article, from 2015, it was proved that under certain conditions the SAT problem behaves similarly to OneMax, which suggested that the (1+(λ, λ)) algorithm would be more effective at solving SAT as well. At the GECCO conference, Maksim Buzdalov and Benjamin Doerr spoke about how they succeeded in proving it.
Best Paper Award
"Not only did we theoretically prove that the (1+(λ, λ)) algorithm works fast, but we also showed it in practice. We've learned to solve SAT instances with 1,000,000 variables and tens of millions of clauses in just 45 minutes on an ordinary laptop, and OneMax problems with 16 million variables in just a couple of minutes. Our competitors at the conference lagged behind by at least a factor of 10 in both the number of variables and the time spent on the task," shares Maksim Buzdalov.
He added that though the algorithm still needs refinement, programmers already use it in research. In the future, Maksim and his colleagues plan to collaborate with another research team on developing an algorithm that combines all the best traits of their results so far for solving discrete optimization problems. The solutions developed by Maksim Buzdalov and Benjamin Doerr can also be used in deep learning, during the synthesis of neural network structures.
Another achievement of ITMO scientists at GECCO-2017 was being named the best collaborative project in the student section; this award went to Master's student Nina Bulanova and Maksim Buzdalov. In theoretical research on evolutionary algorithms, the crossover of two solutions usually yields a single offspring; in practice, however, crossover operators that produce two offspring are more commonly used. The researchers from ITMO University proved that using both "offspring" can provably speed up optimization on problems such as OneMax and several others.
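The two-offspring operator mentioned above can be illustrated with uniform crossover; the function name and details here are an illustrative assumption, not the operator from the awarded paper.

```python
import random

def uniform_crossover_two_children(a, b, rng=None):
    """Uniform crossover that returns BOTH complementary children:
    wherever the first child inherits a gene from parent a, the second
    inherits the corresponding gene from parent b, and vice versa."""
    rng = rng or random.Random(0)
    mask = [rng.randint(0, 1) for _ in a]
    child1 = [x if m else y for m, x, y in zip(mask, a, b)]
    child2 = [y if m else x for m, x, y in zip(mask, a, b)]
    return child1, child2
```

Together, the two children contain exactly the genes of the two parents, so evaluating both extracts information from a crossover that would otherwise be discarded with the second child.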