为4x4 Tictaoe游戏实现Alpha-Beta修剪

2024-09-24 22:18:00 发布

您现在位置:Python中文网/ 问答频道 /正文

首先,我最近为一个3x3的tic-tac-toe游戏编写了一个minimax算法,但现在我想为一个4x4游戏编写一个函数。我试图改变代码,我为4x4的3x3它不会工作,因为从一个9!到16岁!现在,我正试图将Alpha-beta剪枝算法移植到minimax算法中,以减少所有的可能性

这是我现在拥有的适用于3x3的minimax代码

def minimax(state, depth, player):
    """
    AGENT function that choose's the best move to pick
    The AGENT is the Maximizing player
    """
    if depth == 0 or game_over(state):
        score = evaluate(state)
        return [-1, -1, score]
    if player == AGENT:
        best = [-1, -1, -inf]
    else:
        best = [-1, -1, +inf]
        
    for cell in empty_cells(state):
        x, y = cell[0], cell[1]
        state[x][y] = player
        score = minimax(state, depth - 1, -player)
        state[x][y] = 0
        score[0], score[1] = x, y

        if player == AGENT:
            if score[2] > best[2]:
                best = score  # max value
        else:
            if score[2] < best[2]:
                best = score  # min value

    return best

看看维基百科中的Alpha-beta剪枝算法伪代码(fail-soft-Alpha-beta)

function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth = 0 or node is a terminal node then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            α := max(α, value)
            if value ≥ β then
                break (* β cutoff *)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            β := min(β, value)
            if value ≤ α then
                break (* α cutoff *)
        return value

下面的代码是我跟随psecode的尝试

def alphabeta(state,depth,alpha,beta,player):
    if depth == 0 or game_over(state):
        score = evaluate(state)
        return [-1, -1, score]
    if player == AGENT:
        best = [-1, -1, -inf]
    for cell in empty_cells(state):
        x, y = cell[0], cell[1]
        state[x][y] = player
        score = max(best, alphabeta(depth - 1, -player, alpha, beta, False))
        state[x][y] = 0
        score[0], score[1] = x, y
        alpha =max(alpha, best)
        if best >= beta:
            break
        return best
    else:
        best = [-1, -1, +inf]
        for cell in empty_cells(state):
            x, y = cell[0], cell[1]
            state[x][y] = player
            score = min(best, alphabeta(depth - 1, -player, alpha, beta,True))
            state[x][y] = 0
            score[0], score[1] = x, y
            beta = min(beta, best)
            if best <= alpha :
                break
            return best

我只是想知道我是否正确地实现了Alpha-beta修剪。任何帮助都将不胜感激


Tags: alphanodereturnifvaluecellbetaagent