进化算法跟遗传算法_进化计算机科学:遗传算法简介

微生嘉祥
2023-12-01

进化算法跟遗传算法

by Ben Mmari

通过本·马里

进化计算机科学:遗传算法简介 (The Computer Science of Evolution: an Introduction to Genetic Algorithms)

Being a computer scientist with an interest in evolution and biological processes, the topic of genetic algorithms, and more broadly, evolutionary computation is to me what a candy shop is to a 5-year-old: Heaven. The mere possibility of being able to merge two of my interests in such a seamless manner has been extremely exhilarating, and it would be wrong for me to keep this knowledge and excitement all to myself.

作为计算机科学家,他对进化和生物过程,遗传算法以及更广泛的意义上的进化感兴趣,对我来说,对于五岁的糖果店来说,它就像糖果店:天堂。 能够以一种无缝的方式合并我的两个兴趣的可能性非常令人振奋,而我将这种知识和兴奋全部保留给我自己是错误的。

So in an attempt to test out some of my learnings thus far, and share my findings with the rest of the world, I have decided to put together a series of articles on this topic.

因此,为了尝试检验到目前为止的一些知识并与世界其他地方分享我的发现,我决定就该主题撰写一系列文章。

In this post, I will provide a brief introduction to genetic algorithms and explain how they imitate the same natural processes that have been taking place on Earth for billions of years.

在这篇文章中,我将简要介绍遗传算法,并解释它们如何模仿地球上数十亿年以来发生的相同自然过程。

地球上的生命 (Life on Earth)

Over the past 3.5 billion years, mother nature, father time, evolution and natural selection have collaborated together to produce all of the specialized forms of life that we see on earth today: like the carnivorous Venus Flytrap plant; the ocean-dwelling Atlantic Flying Fish; echolocation-using bats; long-necked giraffes; super-quick cheetahs, dancing Honeybees; and of course, yours truly, the street smart Homo sapiens.

在过去的35亿年中,大自然的母亲,父亲的时间,进化和自然选择共同协作,产生了我们今天在地球上看到的所有特殊的生命形式:例如肉食性的维纳斯捕蝇草; 居住在海洋中的大西洋飞鱼; 回声定位蝙蝠; 长颈长颈鹿; 超级快速的猎豹,跳舞的蜜蜂; 当然,您的确是街头智慧的智人。

Needless to say, life on Earth is one of, if not the most successful experiments ever run in our universe; and judging from the impressive outcomes of this experiment, it is clear that evolution is clearly onto something.

不用说,地球上的生命即使不是最成功的实验,也是我们宇宙中进行的实验之一。 从该实验令人印象深刻的结果来看,很明显,进化显然已经存在。

Recently, we humans — just one of the many end products of this process — realized that we could also take advantage of this ingenious approach to progressive problem solving, and since the 1950s, computer scientist, geneticists, mathematicians, and biologist, have attempted to mimic these biological processes through the implementation of computer simulations. With the aim of producing optimal solutions for difficult, non-trivial problems, in an efficient manner.

最近,我们人类-这个过程的众多最终产品之一-意识到我们还可以利用这种独创的方法来逐步解决问题,并且自1950年代以来,计算机科学家,遗传学家,数学家和生物学家都试图通过执行计算机模拟来模仿这些生物过程。 为了以有效的方式为困难的,不重要的问题提供最佳解决方案。

盲人制表师 (The blind watchmaker)

One of the first books I came across that sparked my interest in the field of evolutionary biology was The Blind Watchmaker, by Richard Dawkins. In this book, Richard Dawkins explains how complex mechanisms like echolocation (a process that bats use to navigate, hunt and forage, also known as bio-sonar), complex structures like spiderwebs (which spiders use to attract and catch their prey), and complex instruments like the human eye (those two spherical objects that you are currently using to read this article) are simply the result of thousands, if not millions of years of evolution and adaptation.

理查德·道金斯(Richard Dawkins)撰写的The Blind Watchmaker是我发现的第一本引起我对进化生物学领域兴趣的书。 在本书中,理查德·道金斯(Richard Dawkins)解释了诸如回声定位 (蝙蝠用来导航,狩猎和觅食的过程,也称为生物声纳)的复杂机制,诸如蜘蛛网(蜘蛛用来吸引和捕捉猎物)等复杂结构的机制,以及诸如人眼这样的复杂仪器(您当前正在使用的两个球形物体)只是数千甚至数千万年的进化和适应的结果。

Even though these marvels of nature give the impression that they were built with a purpose from the get-go (i.e by a conscious ‘maker’), they are actually just a result of iterations upon iterations of trial and error, bundled up with ever-changing selection pressure (i.e a change in climate, habitat, or the behaviour and capabilities of predators or prey). So while they may look and behave like the outcome of precise, forward-thinking engineering, they are actually the result of a completely blind process, a process that does not know beforehand what the perfect ‘solution’ would be.

尽管这些自然奇观给人的印象是它们是从一开始就建立起来的目的(即由有意识的“创造者”制造),但它们实际上仅仅是反复试验和反复的结果,与以往-改变选择压力(即气候,栖息地或掠食者或猎物行为和能力的变化)。 因此,尽管它们看起来和行为类似于精确的,具有前瞻性的工程结果,但实际上它们是完全盲目的过程的结果,而该过程事先并不知道完美的“解决方案”是什么。

什么是遗传算法,为什么我们需要它们? (What are genetic algorithms and why do we need them?)

Genetic algorithms are a technique used to generate high-quality solutions to optimization and search problems, which are based on fundamental biological processes. These algorithms are used in situations where the possible range of solutions is very large, and where the more basic approaches to problem-solving like exhaustive search/brute force would consume too much time and effort.

遗传算法是一种用于生成基于最基本生物学过程的优化和搜索问题的高质量解决方案的技术。 这些算法用于解决方案的可能范围非常大的情况,并且更彻底的问题解决方法(如穷举搜索/蛮力)将花费大量时间和精力。

The traveling salesman problem asks the following question: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?” It is an NP-hard problem in combinatorial optimization.

旅行推销员问题提出以下问题:“给出一个城市列表以及每对城市之间的距离,访问每个城市并返回原城市的最短路线是什么?” 这是组合优化中的一个NP难题。

We can use genetic algorithms to provide high-quality solutions to this problem, at a much lower cost than the more primitive problem-solving techniques, like exhaustive search, which would require you to permute through all possible solutions.

我们可以使用遗传算法为这个问题提供高质量的解决方案,而其成本要比更原始的解决问题的技术(例如穷举搜索)低得多,后者需要您仔细研究所有可能的解决方案。

遗传算法如何工作? (How do genetic algorithms work?)

An algorithm works by iterating through a number of steps, up until it reaches a predefined termination point. Each iteration of the genetic algorithm produces a new generation of possible solutions, which, in theory, should be an improvement on the previous generation.

算法通过迭代多个步骤来工作,直到达到预定的终止点为止。 遗传算法的每次迭代都会产生新一代可能的解决方案,从理论上讲,这应该是对上一代解决方案的改进。

The steps are as follows:

步骤如下:

1. Create an initial population of N possible solutions (the primordial soup)

1.创建N个可能的解决方案(原始汤)的初始种群

The first step of the algorithm is to create an initial group of solutions that serve as the base solutions in generation 0. Each solution in this initial population will carry a set of chromosomes, which are made up of a collection of genes, where each gene is assigned to one of the possible variables of the problem. It is important that the solutions in the initial population are created with randomly assigned genes, in order to have a high degree of genetic variation.

该算法的第一步是创建一组初始溶液,用作第0代中的基本溶液。该初始种群中的每个溶液将携带一组染色体,这些染色体由一组基因组成,其中每个基因被分配给问题的可能变量之一。 重要的是要在初始种群中创建具有随机分配基因的解决方案,以实现高度的遗传变异。

2. Rank the solutions of the population by fitness (survival of the fittest, part 1).

2.按适合度(适者生存,第1部分)对总体的解决方案进行排名。

In this step, the algorithm needs to be able to determine what makes one solution more ‘fit’ than another solution. This is determined by the fitness function. The aim of the fitness function is to evaluate the genetic viability of the solutions within the population, placing those with the most viable, favorable & superior genetic traits at the top of the list.

在此步骤中,算法需要能够确定是什么使一个解决方案比另一种解决方案更“适合”。 这由适应度函数确定。 适应度函数的目的是评估解决方案在群体中的遗传生存力,将具有最可行,最有利和最优异遗传特征的解决方案列为首位。

In the traveling salesman problem, the fitness function could be a calculation of the total distance traveled by the solution. Where a shorter distance equates to higher fitness.

在旅行商问题中,适应度函数可以是解决方案旅行的总距离的计算。 距离越短意味着身体素质越高。

3. Cull the weaker solutions (survival of the fittest, part 2)

3.选择较弱的解决方案(适者生存,第2部分)

In this step, the algorithm removes the less fit solutions from the population. The ‘fittest’ does not necessarily mean the strongest, the fastest or the fiercest, as humans usually tend to assume. Survival of the fittest simply means that the better equipped an organism is to survive in its environment, the more likely it is to live long enough to reproduce and spread its genes onto the next generation.

在这一步骤中,该算法从总体中删除不太适合的解。 “优胜劣汰”并不一定意味着人类通常会假设的最强,最快或最猛。 优胜劣汰的生存仅仅意味着有机体在其环境中生存的能力越强,生存的时间就越长,足以繁殖并向下一代传播其基因。

Steps 3 and 4 are collectively known as selection.

步骤3和4统称为选择

4. Breed the stronger solutions (survival of the fittest, part 3)

4.培育更强大的解决方案(适者生存,第3部分)

The remaining solutions are then paired with each other in order to mate and reproduce offspring. During this process, in its most basic form, each parent will contribute a % of their genes (in nature it is a 50/50 split) to each of their offspring, where P1(G)% +P2(G)% = 100%. The process of determining which of the parents’ genes will be inherited by the offspring is known as crossover.

然后将其余溶液彼此配对,以配对并繁殖后代。 在此过程中,以其最基本的形式,每个父母将为其每个后代贡献一定比例的基因(本质上是50/50的分裂),其中P1(G)%+ P2(G)%= 100 %。 确定后代将继承哪些父母基因的过程称为杂交

5. Mutate the genes of the offspring (mutation)

5. 突变后代的基因( 突变 )

The offspring will contain a percentage of the ‘mother’s’ genes, and a percentage of the ‘fathers’ genes and occasionally there will be a ‘mutation’ of one or more of these genes. A mutation is essentially a genetic abnormality, a copying error which causes one or more of the offspring’s genes to differ from the genes it inherited from its parents. In genetic algorithms, in some cases a mutation will increase the fitness of the offspring, in other cases, it will reduce it.

后代将包含一定百分比的“母亲”基因和一定比例的“父亲”基因,并且偶尔会有一个或多个这些基因的“突变”。 突变本质上是一种遗传异常,是一种复制错误,导致一个或多个后代的基因与其从父母那里继承的基因不同。 在遗传算法中,在某些情况下,突变会提高后代的适应性,在其他情况下,则会降低后代的适应性。

It is important to note that there does not need to be a mutation with each offspring, the required mutation frequency can also be a parameter of the algorithm.

重要的是要注意,每个后代都不需要突变,所需的突变频率也可以是算法的参数。

In genetic algorithms, selection, crossover, and mutation are known as genetic operators.

在遗传算法中,选择,交叉和变异被称为遗传算子

6. Termination

6.终止

Steps 2 to 5 will be repeated up until a predefined termination point. This termination point can be one of the following:

步骤2至5将重复进行,直到预定义的终止点为止。 该终止点可以是以下之一:

  1. Maximum time/resource allocation reached.

    达到最大时间/资源分配。
  2. Fixed number of generations have passed.

    固定的世代已过去。
  3. The fitness of the dominant solution cannot be surpassed by any future generations.

    主导解决方案的适用性是任何后代都无法超越的。

解决方案融合 (Solution convergence)

1. Global optimum

1.全球最优

In the ideal situation, the fittest solution will have the highest fitness value possible, i.e it will be the optimal solution, meaning that there will be no need to continue with the algorithm and produce further generations.

在理想情况下,最适合的解决方案将具有尽可能高的适应度值,即它将是最佳解决方案,这意味着无需继续进行算法并产生更多的代数。

2. Local optimum

2.局部最优

In some cases, if the parameters of the algorithm are not reasonable, the population may tend towards a premature convergence upon a less optimal solution, which is not the global optimum that we are after, but rather a local one. Once here, continuing the algorithm and producing further generations may be futile.

在某些情况下,如果算法的参数不合理,则总体可能会趋向于在次优解上过早收敛,最优解不是我们追求的全局最优,而是局部最优。 一旦到达这里,继续算法并产生更多的代可能是徒劳的。

如果没有突变会怎样? (What would happen if there were no mutations?)

On first glance, mutations may seem like an unnecessary, irrelevant part of the process. But without this fundamental aspect of randomness, evolution by natural selection would be completely restricted to the genetic variety set by the initial population, and there would be no new traits introduced into the population after that. This would severely hinder nature’s problem-solving capabilities, and life on earth would not be able to ‘adapt’ to its environment, at least not physically.

乍一看,突变似乎是过程中不必要的,不相关的部分。 但是如果没有这种随机性的基本方面,自然选择的进化将完全局限于初始种群设定的遗传多样性,此后就不会有新的性状引入种群。 这将严重阻碍自然界解决问题的能力,地球上的生命将至少在物理上无法适应环境。

If this was the case in our genetic algorithm, at some point in our simulation, the future generations of the population would not be able to explore part of the solution space that their predecessors did not explore. A simulation without any mutations would severely restrict the genetic variation within the population, and in most cases — depending on the initial population — prevent us from ever reaching a global optimum.

如果在我们的遗传算法中是这种情况,那么在模拟的某个时刻,子孙后代将无法探索其前辈未曾探索过的部分解空间。 没有任何突变的模拟将严重限制种群内部的遗传变异,并且在大多数情况下(取决于初始种群),使我们无法达到全局最优。

如果人口规模不够大怎么办? (What would happen if the population size was not large enough?)

I was recently at the Jukani Wildlife Sanctuary in Plettenberg, where I had the privilege of meeting a white tiger. He was a truly majestic animal. He was large, he looked ferocious, and, he was also 80% blind and getting progressively worse as the years went by.

我最近在普莱滕贝格(Plettenberg)的尤卡尼野生动物保护区(Jukani Wildlife Sanctuary),有幸遇见了一只白老虎。 他是一个真正的雄伟动物。 他很大,看上去很凶猛,而且失明的人也占80%,并且随着岁月的流逝变得越来越糟。

Why was he blind? Because he is a product of generations of inbreeding. These white tigers are only produced when two tigers that carry a recessive gene controlling coat color are bred together. Thus, in order to ensure the continuation of these tigers in captivity, people have been breeding these tigers within a very limited population in order to either show them off at circuses, parade them at zoos, or keep them as household pets.

他为什么瞎了? 因为他是几代近交的产物。 仅当将两只携带隐性基因控制毛色的虎一起繁殖时,才会产生这些白虎。 因此,为了确保圈养这些老虎,人们一直在非常有限的种群中繁殖这些老虎,以便在马戏团里炫耀它们,在动物园里游行或作为家庭宠物饲养。

But one of the negative effects of inbreeding is that you severely limit the genetic variation within the species, which progressively increases the chances that harmful recessive traits will be passed onto the offspring.

但是近交的负面影响之一是,您严重限制了物种内部的遗传变异,从而逐渐增加了有害隐性性状传给后代的机会。

Even in the wild, inbreeding can still be a massive problem. Over the past few decades, the rhino population in Southern Africa has been significantly impacted due to poaching, and if the population size reaches a low enough number it would mean that maintaining the genetic diversity of these threatened rhino species would be extremely difficult. So even if poaching doesn’t completely lead them to extinction, inbreeding could.

即使在野外,近交仍然是一个巨大的问题。 在过去的几十年中,南部非洲的犀牛种群由于盗猎而受到了严重影响,如果种群数量达到足够低的水平,这将意味着维持这些濒临灭绝的犀牛物种的遗传多样性将极为困难。 因此,即使偷猎并没有完全导致他们灭绝,近亲繁殖还是有可能。

Of course, humans are no strangers to inbreeding. One famous result of continuous inbreeding within our own species is Charles (Carlos) the II of Spain.

当然,人类对近交并不陌生。 西班牙二世的查尔斯(卡洛斯)在我们自己的物种中进行近亲繁殖的一个著名成果是。

“The Habsburg King Carlos II of Spain was sadly degenerated with an enormous misshapen head. His Habsburg jaw stood so much out that his two rows of teeth could not meet; he was unable to chew. His tongue was so large that he was barely able to speak. His intellect was similarly disabled.”

“西班牙的哈布斯堡王卡洛斯二世不幸地堕落了,脑袋畸形。 哈布斯堡王朝的下巴显得如此突出,以至于他的两排牙齿无法汇合。 他无法咀嚼。 他的舌头太大,几乎无法说话。 他的智力也同样残疾。”

‘Inbreeding’ in our genetic algorithm, essentially means the breeding of solutions that have a very similar genetic makeup, which, thankfully, in this case, would not result in offspring with a predisposition to any physical abnormalities. But if the population is very small and if all of the solutions share a very similar genetic makeup then the fitness of the future generations of the population will be severely restricted. Meaning that it could take much longer to converge upon a globally optimal solution if we even get there at all.

在我们的遗传算法中,“近交”本质上是指具有非常相似的遗传组成的解决方案的繁殖,幸运的是,在这种情况下,不会导致后代易患任何身体异常。 但是,如果人口很小 ,并且所有解决方案都具有非常相似的遗传组成,那么子孙后代的适应性将受到严格限制。 这意味着,即使我们根本无法达成共识,收敛到全局最优解决方案也可能需要更长的时间。

Inbreeding is not always a bad thing, it just depends on which stage of the simulation you are in. In very advanced stages of the simulation, as the population converges towards a global/local optima, it is obviously very hard to avoid inbreeding, because, in some cases, many of the dominant solutions will be very similar to each other, and thus, will share a lot of the same genetic traits.

近交并不总是一件坏事,它仅取决于您处于模拟的哪个阶段。在模拟的非常高级的阶段,随着种群向全局/局部最优收敛,显然很难避免近交,因为在某些情况下,许多优势解决方案将彼此非常相似,因此将共享许多相同的遗传特征。

结语 (Wrapping up)

Alright, that should cover the basics. If you have any questions, requests, or genetic mutations to contribute, please leave a comment below.

好吧,这应该涵盖基础知识。 如果您有任何疑问,要求或遗传突变可以做出贡献,请在下面发表评论。

In the next post, we will delve into some code as we look at how each of the genetic operators outlined above plays out in the world of programming. I used the Ruby programming language for the software simulation that I worked on, and in it, I show how in only a few generations, a genetic algorithm can produce a predefined word or phrase from an initial collection of complete and utter gibberish. All of the code will be hosted on Github.

在下一篇文章中,我们将研究上面概述的每个遗传算子如何在编程世界中发挥作用,从而深入研究一些代码。 我使用Ruby编程语言进行了软件仿真,并在其中展示了遗传算法如何在短短的几代时间内就可以从完整且完全乱码的初始集合中产生预定义的单词或短语。 所有代码都将托管在Github上。

翻译自: https://www.freecodecamp.org/news/the-computer-science-of-evolution-an-introduction-to-genetic-algorithms-b3871286c7e7/

进化算法跟遗传算法

 类似资料: