遺傳算法中幾種不同選擇算子及Python實(shí)現(xiàn)

作者：iPytLab 2017-09-22 15:03:08

遺傳算法(genetic algorithms, GAs)是一種自適應(yīng)的啟發(fā)式搜索算法, 它模仿達(dá)爾文進(jìn)化論中的“適者生存”的原則, 最終獲取優(yōu)化目標(biāo)的最優(yōu)解。

前言

本文對遺傳算法中的幾種選擇策略進(jìn)行了總結(jié), 其中包括:

Proportionate Roulette Wheel Selection
Linear Ranking Selection
Exponential Ranking Selection
Tournament Selection

對于每種選擇策略我都使用Python進(jìn)行了相應(yīng)的實(shí)現(xiàn)并以內(nèi)置插件的形式整合進(jìn)了本人所寫的遺傳算法框架GAFT中。對需要使用遺傳算法優(yōu)化問題以及學(xué)習(xí)遺傳算法的童鞋可以作為參考。

項(xiàng)目鏈接:

GitHub: https://github.com/PytLab/gaft
PyPI: https://pypi.python.org/pypi/gaft

遺傳算法中的選擇算子

遺傳算法(genetic algorithms, GAs)是一種自適應(yīng)的啟發(fā)式搜索算法, 它模仿達(dá)爾文進(jìn)化論中的“適者生存”的原則, 最終獲取優(yōu)化目標(biāo)的最優(yōu)解。下圖描述了一個簡單的遺傳算法流程:

對于種群中需要進(jìn)行雜交的物種選擇方法有很多，而且選擇一種合適的選擇策略對于遺傳算法整體性能的影響將是很大的。如果一個選擇算法選擇多樣性降低，便會導(dǎo)致種群過早的收斂到局部最優(yōu)點(diǎn)而不是我們想要的全局最優(yōu)點(diǎn)，也就是所謂的”早熟”。而選擇策略過于發(fā)散則會導(dǎo)致算法難以收斂到最優(yōu)點(diǎn)。因此在這兩點(diǎn)中我們需要進(jìn)行平衡才能使遺傳算法以一種高效的方式收斂到全局最優(yōu)點(diǎn)。

GAFT框架中的算子插件

GAFT是我根據(jù)自己需求開發(fā)的一個遺傳算法框架，相關(guān)介紹的博客可以參見《GAFT-一個使用Python實(shí)現(xiàn)的遺傳算法框架》,《使用MPI并行化遺傳算法框架GAFT》。該框架提供了插件接口，用戶可以通過自定義算子以及on-the-fly分析插件來放到gaft框架中運(yùn)行遺傳算法流程對目標(biāo)問題進(jìn)行優(yōu)化。

本部分我稍微介紹下gaft關(guān)于遺傳算子相關(guān)接口規(guī)范，以及編寫能用于gaft的算子的編寫方法。

在gaft中遺傳算子的編寫都是需要繼承框架內(nèi)置的基類，然后根據(jù)基類提供的接口，實(shí)現(xiàn)自己的算子。其中基類的定義都在/gaft/plugin_interfaces/operators/目錄下，下面我以選擇算子為例，介紹下接口。

gaft中選擇算子的基類為GASelection，其中在遺傳算法流程中會調(diào)用該類實(shí)例的select方法，進(jìn)而根據(jù)算子中的相關(guān)選擇策略完成從種群中選取一對物種作為父親和母親產(chǎn)生子代。基類的定義為：

class GASelection(metaclass=SelectionMeta): 
    ''' 
    Class for providing an interface to easily extend the behavior of selection 
    operation. 
    ''' 
    def select(self, population, fitness): 
        ''' 
        Called when we need to select parents from a population to later breeding. 
        :param population: The current population. 
        :type population: GAPopulation 
        :return parents: Two selected individuals for crossover. 
        :type parents: Tuple of tow GAIndividual objects. 
        ''' 
        raise NotImplementedError

select的方法的參數(shù)為當(dāng)前種群population以及相應(yīng)的適應(yīng)度函數(shù)fitness，其中population需要是GAPopulation對象，fitness也必須是callable的對象。

當(dāng)然，這些在Python這種動態(tài)類型語言中貌似看起來有些雞肋，但是為了能夠更加規(guī)范使用者，我利用Python的元類在實(shí)例化類對象的時候?qū)涌诘膶?shí)現(xiàn)以及接口的參數(shù)類型加以限制。具體的實(shí)現(xiàn)都在/gaft/plugin_interfaces/metaclasses.py中，有興趣的童鞋可以看看實(shí)現(xiàn)方法。

具體自定義算子的編寫方法我將在下一部分同選擇策略一起貼出來。

不同的選擇策略

本部分我主要對四種不同的選擇策略進(jìn)行總結(jié)并加以gaft插件形式的Python實(shí)現(xiàn)。

選擇算子決定了哪些個體將會從種群中被選擇出來用于繁衍下一代種群中的新個體。其主要的原則就是:

the better is an individual; the higher is its chance of being a parent

選擇算子在遺傳算法迭代中將適應(yīng)度函數(shù)引入進(jìn)來，因?yàn)檫m應(yīng)度函數(shù)式標(biāo)定一個個體是否足夠“好”的重要標(biāo)準(zhǔn)。但是選擇過程又不能僅僅完全依賴于適應(yīng)度函數(shù)，因?yàn)橐粋€種群中的最優(yōu)物種并不一定是在全局最優(yōu)點(diǎn)附近。因此我們也應(yīng)該給相對來說并那么“好”的個體一點(diǎn)機(jī)會讓他們繁衍后代, 避免“早熟”。

Proportionate Roulette Wheel Selection

此輪盤賭選擇策略，是最基本的選擇策略之一，種群中的個體被選中的概率與個體相應(yīng)的適應(yīng)度函數(shù)的值成正比。我們需要將種群中所有個體的適應(yīng)度值進(jìn)行累加然后歸一化，最終通過隨機(jī)數(shù)對隨機(jī)數(shù)落在的區(qū)域?qū)?yīng)的個體進(jìn)行選取，類似賭場里面的旋轉(zhuǎn)的輪盤。

每個個體ai被選中的概率為:

好了，下面可以將此算法寫成一個可以gaft中執(zhí)行的算子。

from random import random 
from bisect import bisect_right 
from itertools import accumulate 
  
from ...plugin_interfaces.operators.selection import GASelection 
  
class RouletteWheelSelection(GASelection): 
    def __init__(self): 
        ''' 
        Selection operator with fitness proportionate selection(FPS) or 
        so-called roulette-wheel selection implementation. 
        ''' 
        pass 
  
    def select(self, population, fitness): 
        ''' 
        Select a pair of parent using FPS algorithm. 
        ''' 
        # Normalize fitness values for all individuals. 
        fit = [fitness(indv) for indv in population.individuals] 
        min_fit = min(fit) 
        fit = [(i - min_fit) for i in fit] 
  
        # Create roulette wheel. 
        sum_fit = sum(fit) 
        wheel = list(accumulate([i/sum_fit for i in fit])) 
  
        # Select a father and a mother. 
        father_idx = bisect_right(wheel, random()) 
        father = population[father_idx] 
        mother_idx = (father_idx + 1) % len(wheel) 
        mother = population[mother_idx] 
  
        return father, mother

過程主要分為下面幾個:

繼承GASelection類
實(shí)現(xiàn)select方法
select的參數(shù)為GAPopulation實(shí)例和適應(yīng)度函數(shù)
根據(jù)算法選擇出兩個需要繁衍的物種并返回即可

Tournament Selection

由于算法執(zhí)行的效率以及易實(shí)現(xiàn)的的特點(diǎn)，錦標(biāo)賽選擇算法是遺傳算法中最流行的選擇策略。在本人的實(shí)際應(yīng)用中的確此策略比基本的輪盤賭效果要好些。他的策略也很直觀，就是我們再整個種群中抽取n個個體，讓他們進(jìn)行競爭(錦標(biāo)賽)，抽取其中的最優(yōu)的個體。參加錦標(biāo)賽的個體個數(shù)成為tournament size。通常當(dāng)n=2便是最常使用的大小，也稱作Binary Tournament Selection.

Tournament Selection的優(yōu)勢:

更小的復(fù)雜度O(n)
易并行化處理
不易陷入局部最優(yōu)點(diǎn)
不需要對所有的適應(yīng)度值進(jìn)行排序處理

下圖顯示了n=3的Tournament Selection的過程:

可以開始寫成自定義算子在gaft運(yùn)行了:

from random import sample 
  
from ...plugin_interfaces.operators.selection import GASelection 
  
class TournamentSelection(GASelection): 
    def __init__(self, tournament_size=2): 
        ''' 
        Selection operator using Tournament Strategy with tournament size equals 
        to two by default. 
        ''' 
        self.tournament_size = tournament_size 
  
    def select(self, population, fitness): 
        ''' 
        Select a pair of parent using Tournament strategy. 
        ''' 
        # Competition function. 
        complete = lambda competitors: max(competitors, key=fitness) 
  
        # Check validity of tournament size. 
        if self.tournament_size >= len(population): 
            msg = 'Tournament size({}) is larger than population size({})' 
            raise ValueError(msg.format(self.tournament_size, len(population))) 
  
        # Pick winners of two groups as parent. 
        competitors_1 = sample(population.individuals, self.tournament_size) 
        competitors_2 = sample(population.individuals, self.tournament_size) 
        father, mother = complete(competitors_1), complete(competitors_2) 
  
        return father, mother

下面兩個介紹的選擇策略都是基于排序的選擇策略，上面提到的第一種基本輪盤賭選擇算法，有一個缺點(diǎn)，就是如果一個個體的適應(yīng)度值為0的話，則被選中的概率將會是0, 這個個體將不能產(chǎn)生后代。于是我們需要一種基于排序的算法，來給每個個體安排相應(yīng)的選中概率。

在Linear Ranking Selection中，種群中的個體首先根據(jù)適應(yīng)度的值進(jìn)行排序，然后給所有個體賦予一個序號，最好的個體為N, 被選中的概率為Pmax, 最差的個體序號為1, 被選中的概率為Pmin，于是其他的在他們中間的個體的概率便可以根據(jù)如下公式得到:

實(shí)現(xiàn)代碼:

from random import random 
from itertools import accumulate 
from bisect import bisect_right 
  
from ...plugin_interfaces.operators.selection import GASelection 
  
class LinearRankingSelection(GASelection): 
    def __init__(self, pmin=0.1, pmax=0.9): 
        ''' 
        Selection operator using Linear Ranking selection method. 
        Reference: Baker J E. Adaptive selection methods for genetic 
        algorithms[C]//Proceedings of an International Conference on Genetic 
        Algorithms and their applications. 1985: 101-111. 
        ''' 
        # Selection probabilities for the worst and best individuals. 
        self.pmin, self.pmax = pmin, pmax 
  
    def select(self, population, fitness): 
        ''' 
        Select a pair of parent individuals using linear ranking method. 
        ''' 
        # Individual number. 
        NP = len(population) 
        # Add rank to all individuals in population. 
        sorted_indvs = sorted(population.individuals, key=fitness, reverse=True) 
  
        # Assign selection probabilities linearly. 
        # NOTE: Here the rank i belongs to {1, ..., N} 
        p = lambda i: (self.pmin + (self.pmax - self.pmin)*(i-1)/(NP-1)) 
        probabilities = [self.pmin] + [p(i) for i in range(2, NP)] + [self.pmax] 
  
        # Normalize probabilities. 
        psum = sum(probabilities) 
        wheel = list(accumulate([p/psum for p in probabilities])) 
  
        # Select parents. 
        father_idx = bisect_right(wheel, random()) 
        father = population[father_idx] 
        mother_idx = (father_idx + 1) % len(wheel) 
        mother = population[mother_idx] 
  
        return father, mother

類似上面的Linear Ranking選擇策略，這種指數(shù)排序便是在確定每個個體的選擇概率的時候使用了指數(shù)形式的表達(dá)式, 其中c為底數(shù)，滿足0<c<1:

實(shí)現(xiàn)代碼:

from random import random 
from itertools import accumulate 
from bisect import bisect_right 
  
from ...plugin_interfaces.operators.selection import GASelection 
  
class ExponentialRankingSelection(GASelection):  
    def __init__(self, base=0.5): 
        ''' 
        Selection operator using Exponential Ranking selection method. 
        :param base: The base of exponent 
        :type base: float in range (0.0, 1.0) 
        ''' 
        if not (0.0 < base < 1.0): 
            raise ValueError('The base of exponent c must in range (0.0, 1.0)') 
        self.base = base 
  
    def select(self, population, fitness): 
        ''' 
        Select a pair of parent individuals using exponential ranking method. 
        ''' 
        # Individual number. 
        NP = len(population) 
        # NOTE: Here the rank i belongs to {1, ..., N} 
        p = lambda i: self.base**(NP - i) 
        probabilities = [p(i) for i in range(1, NP + 1)] 
        # Normalize probabilities. 
        psum = sum(probabilities) 
        wheel = list(accumulate([p/psum for p in probabilities])) 
        # Select parents. 
        father_idx = bisect_right(wheel, random()) 
        father = population[father_idx] 
        mother_idx = (father_idx + 1) % len(wheel) 
        mother = population[mother_idx] 
        return father, mother

總結(jié)

本文對于遺傳算法中四種不同的選擇策略進(jìn)行了介紹和總結(jié)，同時對于本文所寫的遺傳算法框架的自定義算子接口進(jìn)行了簡要的介紹，針對本文中的選擇策略分別根據(jù)接口的要求實(shí)現(xiàn)了相應(yīng)的算子，這些算子也作為GAFT框架的內(nèi)置算子放入到GAFT中，對于使用GAFT的童鞋可以直接拿來使用。

參考

Shukla, Anupriya, Hari Mohan Pandey, and Deepti Mehrotra. “Comparative review of selection techniques in genetic algorithm.” Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), 2015 International Conference on. IEEE, 2015.

責(zé)任編輯：龐桂玉來源： Python開發(fā)者