2020年4月10日 19:28:18未分类评论58阅读模式

英文:

State maintenance goes wrong in minimax search on board game

问题

我正在实现一个棋盘游戏上的极小化极大算法，但在搜索状态树时遇到问题。我相当肯定问题要么出在以下方面：

我对极小化极大算法的实现有误，或者
我构建下一个状态的方式，极小化极大将在其中进行搜索，或者
是1和2的结合

我已经以最佳的面向对象方式编写了游戏代码。首先，这是为了练习，但也是为了尽可能地解耦各个部分。但是，这个决策可能已经成为我在实现极小化极大算法时进展的一个障碍，因为游戏本身的代码变得有些混乱。这使得在游戏中调试状态变得有些复杂。然而，通过仔细的调试，我确信在我玩一个交替双方人类玩家的游戏时，游戏状态在我进行状态切换时是维持良好的。但是一旦我调用极小化极大搜索，游戏状态就会迅速崩溃，就好像我做极小化极大递归或构建下一个状态的方式是错误的。

在展示与极小化极大相关的工作之前，我将首先介绍一下如何对游戏进行建模，以确保您拥有所需的信息。之后，我将展示带有激活了极小化极大的样本游戏输出。我已经尝试对输出进行了注释，以便您可以看到出了什么问题。

让我们开始吧。

游戏是一个3 x 4单元格的棋盘游戏，如下所示：

+---+---+---+
|   |   |   | 0 <- 黑色玩家的起始行
+---+---+---+
|   |   |   | 1
+---+---+---+
|   |   |   | 2
+---+---+---+
|   |   |   | 3 <- 红色玩家的起始行
+---+---+---+
  0   1   2

我现在将介绍我构建的类。我会省略掉构造函数和get/set方法之类的通用代码，只突出会改变类实例状态的重要方法。

Board 类的模型如下：

public class Board
{
    public static final int ROWS = 4;
    public static final int COLUMNS = 3;
    private Cell[][] cells;

    public Board()
    {
        initialize();
    }

    private void initialize()
    {
        cells = new Cell[ROWS][COLUMNS];
        for (int row = 0; row < ROWS; row++)
        {
            for (int column = 0; column < COLUMNS; column++)
            {
                cells[row][column] = new Cell(row, column);
            }
        }
    }

    public Board getDeepCopy(State state)
    {
        Board boardCopy = new Board();

        for (int row = 0; row < ROWS; row++)
            for (int column = 0; column < COLUMNS; column++)
                boardCopy.cells[row][column] = cells[row][column].getDeepCopy(state);

        return boardCopy;
    }

    //其他通用逻辑
    //...
}

我在几个类中创建了一个 getDeepCopy(State state) 方法，用于构建当前游戏状态的副本，极小化极大将在其中进行搜索，以免更改游戏的“真实”状态。因此，这个方法将在其他类中重复出现，如您所见。

Cell 类如下所示：

public class Cell
{
    private String content;
    private Piece piece;
    private int row;
    private int column;

    public Cell getDeepCopy(State state)
    {
        Cell c = new Cell(row, column);
        c.content = content;
        c.piece = piece != null ? piece.getDeepCopy(state) : null;
        return c;
    }

    //其他通用逻辑
    //...
}

Piece 类定义如下：

public class Piece
{
    private Point point;
    private String symbol;

    public Piece getDeepCopy(State state)
    {
        return state.getBoard().getCells()[this.getPoint().getY()][this.getPoint().getX()].getPiece();
    }

    //其他通用逻辑
    //...
}

游戏有两名玩家，一名红色玩家和一名黑色玩家。他们各自有4个可用的棋子。

Player 类如下所示：

public class Player
{
    private Color color;
    private int score;
    private Stack<Piece> pieces;
    private ArrayList<Piece> placedPieces;
    public static final int MAX_PIECES = 4;

    private void initializePieces()
    {
        pieces = new Stack<>();
        placedPieces = new ArrayList<>();
        for (int i = 0; i < MAX_PIECES; i++)
        {
            pieces.push(new Piece(color.getSymbol()));
        }
    }

    public Piece getAvailablePiece()
    {
        return pieces.empty() ? null : pieces.pop();
    }

    public void returnPiece(Piece piece)
    {
        if (pieces.size() != MAX_PIECES)
        {
            pieces.push(piece);
        }
    }

    public void updatePlacedPieces(Piece piece)
    {
        if (!placedPieces.contains(piece))
            placedPieces.add(piece);
        else
            for (Piece match : placedPieces)
                if (piece.equals(match))
                    match.setPoint(piece.getPoint());

    }

    //其他通用逻辑
    //...
}

现在是游戏状态和极小化极大算法的较重要部分。所有游戏状态在一个名为 State 的类中表示，如下所示（注意 getDeepCopy() 方法，我怀疑这是主要问题之一）：

public class State
{
    private int points;
    private Board board;
    private Player turn;
    private Player opponent;
    private Player redPlayer;
    private Player blackPlayer;
    private Move move;


<details>
<summary>英文:</summary>

I am working on implementing minimax on a board game but am having issues when searching the state tree. I am fairly certain it is either

1. my implementation of minimax which is wrong, or
2. the way I build my next state on which minimax will do its search, or
3. a combination of 1 and 2

I sat out writing the code for the game in the best object-oriented way I could. This is first and foremost for practice but also to try and decouple things as best as possible. But this decision has *maybe* become a bit of an obstacle for my progress with minimax because the code for the game itself has become a little convoluted. This has made debugging state in the game a little complex. However, through careful debugging I feel confident that the state of the game is maintained well when I play a game where I alternate between two human players -- i.e. I do not do any minimax search on the state of the game. But as soon as I invoke minimax search, then the game state start to fall apart rather quickly, as if the way I do my minimax recursion or build my next state is wrong.

Before I show the work I have done in relation to minimax I&#39;ll first introduce how the game is modeled to make sure you have the information you need. After this I will show a sample output of the game running with minimax activated. I have tried to comment the output so you can see what is going *wrong*.

Let&#39;s get to it.

The game is a board game of 3 x 4 cells like so:

+---+---+---+
| | | | 0 <- Starting row of Black player
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| | | | 3 <- Starting row of Red player
+---+---+---+
0 1 2


I&#39;ll introduce the classes I have constructed now. I will leave out generic code like constructors and get/set methods and only highlight important methods which change the state of the class instance.

The `Board` class is modeled in the following way:

public class Board
{
public static final int ROWS = 4;
public static final int COLUMNS = 3;
private Cell[][] cells;

public Board()
{
    initialize();
}

private void initialize()
{
    cells = new Cell[ROWS][COLUMNS];
    for (int row = 0; row &lt; ROWS; row++)
    {
        for (int column = 0; column &lt; COLUMNS; column++)
        {
            cells[row][column] = new Cell(row, column);
        }
    }
}

public Board getDeepCopy(State state)
{
    Board boardCopy = new Board();

    for (int row = 0; row &lt; ROWS; row++)
        for (int column = 0; column &lt; COLUMNS; column++)
            boardCopy.cells[row][column] = cells[row][column].getDeepCopy(state);

    return boardCopy;
}

//Other generic logic
...

}


I have created a `getDeepCopy(State state)` method in a couple of the classes which I use to construct a copy of the current state of the game on which minimax will do its search, so as not to change the *real* state of the game. So this method will repeat itself in other classes, as you will see.

The `Cell` class looks like follows

public class Cell
{
private String content;
private Piece piece;
private int row;
private int column;

public Cell getDeepCopy(State state)
{
    Cell c = new Cell(row, column);
    c.content = content;
    c.piece = piece != null ? piece.getDeepCopy(state) : null;
    return c;
}

//Other generic logic
...

}


The `Piece` class is defined as

public class Piece
{
private Point point;
private String symbol;

public Piece getDeepCopy(State state)
{
    return state.getBoard().getCells()[this.getPoint().getY()][this.getPoint().getX()].getPiece();
}

//Other generic logic
...

}


The game has 2 players, a Red and a Black player. They each have 4 pieces available

The `Player` class looks like so:

public class Player
{
private Color color;
private int score;
private Stack<Piece> pieces;
private ArrayList<Piece> placedPieces;
public static final int MAX_PIECES = 4;

private void initializePieces()
{
    pieces = new Stack&lt;&gt;();
    placedPieces = new ArrayList&lt;&gt;();
    for (int i = 0; i &lt; MAX_PIECES; i++)
    {
        pieces.push(new Piece(color.getSymbol()));
    }
}

public Piece getAvailablePiece()
{
    return pieces.empty() ? null : pieces.pop();
}

public void returnPiece(Piece piece)
{
    if (pieces.size() != MAX_PIECES)
    {
        pieces.push(piece);
    }
}

public void updatePlacedPieces(Piece piece)
{
    if (!placedPieces.contains(piece))
        placedPieces.add(piece);
    else
        for (Piece match : placedPieces)
            if (piece.equals(match))
                match.setPoint(piece.getPoint());

}

//Other generic logic
...

}


Now to the more heavy parts of the game, the game state and the minimax algorithm. All game state is represented in a `State` class as below (take note of the `getDeepCopy()` method, which I suspect is one of the main culprits)

public class State
{
private int points;
private Board board;
private Player turn;
private Player opponent;
private Player redPlayer;
private Player blackPlayer;
private Move move;
private State parent;
private ArrayList<Move> availableMoves;
private int depth;

// Root state (s_0)
public State(int points)
{
    this.points = points;
    redPlayer = new Player(Color.Red);
    blackPlayer = new Player(Color.Black);
    turn = redPlayer;
    opponent = blackPlayer;
    depth = 0;
    board = new Board();
    parent = null;
}

// Child states from root state (s_1, s_2, ... s_n)
public State(State parent, Move move)
{
    this.parent = parent;
    this.move = move;
    redPlayer = parent.getRedPlayer();
    blackPlayer = parent.getBlackPlayer();
    depth = parent.depth + 1;
    points = parent.getPoints();
    board = parent.getBoard();
    passTurn(parent.getTurn());
    availableMoves = Logic.getAllMoves(this);
}

public void passTurn(Player lastTurn)
{
    if (lastTurn.equals(redPlayer))
    {
        turn = blackPlayer;
        opponent = redPlayer;
    }
    else
    {
        turn = redPlayer;
        opponent = blackPlayer;
    }
    System.out.println(String.format(&quot;\nTurn: %s&quot;, turn.toString()));
}

public State getDeepCopy(State state, Move move)
{
    State stateCopy = new State();
    stateCopy.move = move;
    stateCopy.parent = state.getParent();
    stateCopy.points = state.getPoints();
    stateCopy.redPlayer = state.getRedPlayer();
    stateCopy.blackPlayer = state.getBlackPlayer();
    stateCopy.turn = state.getTurn();
    stateCopy.opponent = state.getOpponent();
    stateCopy.depth = state.getDepth();
    stateCopy.board = state.getBoard().getDeepCopy(state); // Here the deep copy begins
    stateCopy.availableMoves = state.getAvailableMoves();
    return stateCopy;
}

//Other generic logic

}


The game is running in a main class called `Game` where I alternate between a human move and an AI move. All game logic is handled in a class `Logic`, with only `static` methods. This class implements the rules of the game, e.g. if a player wants to insert a piece, and is assisted by a `BoardManager` class (also only with `static` methods) which does operations on the board (placing a piece, removing a piece, etc.). So the `Logic` class holds no state, neither does the `BoardManager` class, they only modify current state based on the Game&#39;s instance of state. 

Here is the `Game `class with its most important methods highlighted

public class Game
{
private State state;
private boolean playing;

public void play() throws IOException
{
    drawBoard();
    System.out.println(String.format(&quot;First player to make a move is %s&quot;, state.getTurn().toString()));
    playing = true;

    while (!isGameOver() &amp;&amp; playing)
        if (isHumanPlayer())
            doHumanMove();
        else
            doAIMove();
}

private void doHumanMove() throws IOException
{
    Choice choice = getChoice();

    boolean isMoveOk = false;
    Move move = null;

    switch (choice)
    {
        case INSERT:
            System.out.print(&quot;Enter cell to insert on: &quot;);
            Point to = getPointFromUser();
            move = new Move(to, state.getRedPlayer(), Move.Type.Insert);
            isMoveOk = Logic.doMove(move, state);
            break;
        case ...
    }
    if (isMoveOk)
    {
        drawBoard();
        state = new State(state, move);
    }
}

private void doAIMove()
{
    Move move = MiniMax.search(0, state, false);
    state = new State(state, move);
    drawBoard();
}

}


Now to the nitty-gritty part, the minimax class. I invoke minimax using the `search()` method, as can be seen above in the `play()` method of `Game`. I then call `miniMax()` on a `depth`, the current `State` and whether the current player is the `maximizer`. I hope it is self-explanatory by looking at the code

public class MiniMax
{
private static final int maxDepth = 2;

public static Move search(int depth, State state, boolean isPlayerMaximizer)
{
    System.out.println(&quot;Starting MiniMax search...&quot;);
    Value v = miniMax(depth, state, isPlayerMaximizer);
    return v.getMove();
}

private static Value miniMax(int depth, State state, boolean isPlayerMaximizer)
{
    state.getBoard().draw();

    if (depth == maxDepth || state.getTurn().getScore() == state.getPoints())
        return getScore(state);
    
    if (isPlayerMaximizer)
        return getMax(depth, state);

    else
        return getMin(depth, state);
}

private static Value getMax(int depth, State state)
{
    int bestScore = Integer.MIN_VALUE;
    Move bestMove = null;

    state.setAvailableMoves(Logic.getAllMoves(state)); // Is this wrong?
    for (Move move : state.getAvailableMoves())
    {
        State nextState = state.getDeepCopy(state, move); // Copy of state to work on
        Logic.doMove(move, nextState);
        nextState.passTurn(nextState.getTurn());
        Value val = miniMax(depth + 1, nextState, false);

        if (val.getValue() &gt;= bestScore)
        {
            bestScore = val.getValue();
            bestMove = move;
        }
    }

    if (bestMove != null)
        Logic.doMove(bestMove, state);

    return new Value(bestScore, bestMove);
}

private static Value getMin(int depth, State state)
{
    int bestScore = Integer.MAX_VALUE;
    Move bestMove = null;
    state.setAvailableMoves(Logic.getAllMoves(state));
    for (Move move : state.getAvailableMoves())
    {
        State nextState = state.getDeepCopy(state, move);
        Logic.doMove(move, nextState);
        state.passTurn(nextState.getTurn());
        Value val = miniMax(depth + 1, nextState, true);

        if (val.getValue() &lt;= bestScore)
        {
            bestScore = val.getValue();
            bestMove = move;
        }
    }

    if (bestMove != null)
        Logic.doMove(bestMove, state);

    return new Value(bestScore, bestMove);
}

private static Value getScore(State state)
{
    Player maximizer = state.getRedPlayer();
    Player minimizer = state.getBlackPlayer();

    if (state.getTurn() == maximizer &amp;&amp; maximizer.getScore() == state.getPoints())
        return new Value(10, state.getMove());

    else if (state.getTurn() == minimizer &amp;&amp; minimizer.getScore() == state.getPoints())
        return new Value(-10, state.getMove());

    else
        return new Value(0, state.getMove());
}

private static class Value {
    int value;
    Move move;

    public Value(int value, Move move) {
        this.value = value;
        this.move = move;
    }

    public Move getMove() {
        return move;
    }

    public int getValue()
    {
        return value;
    }
}

}


Here is a run of the game where I invoke minimax

Welcome!

// Initial state (human player makes a move)

+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| | | | 3
+---+---+---+
0 1 2

First player to make a move is Red

Choose an option:

Insert
Diagonal move
Attack
Jump
Exit

Input: 1
Enter cell to insert on: 0,3
Red is trying to do a Insert move...
Red placed a piece on the board at (0, 3).

// So far so good

+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

Turn: Black

Starting MiniMax search... // OK, time for the AI to make a move

+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

Black is trying to do a Insert move...
Black placed a piece on the board at (0, 0).

Turn: Red

// So far so good

+---+---+---+
| B | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

Black is trying to do a Insert move...
Black placed a piece on the board at (1, 0).

Turn: Red

// Black just did two moves in a row, looks like the turn is not passed on properly?
// Looks like after just 1 move with minimax the state of the game becomes wrong?

+---+---+---+
| B | B | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

Black is trying to do a Insert move...
Black placed a piece on the board at (2, 0).

Turn: Red

// Take note that the piece inserted by black at (1, 0) no longer exists on the board below
// Does getDeepCopy() not work properly?

+---+---+---+
| B | | B | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

Black is trying to do a Diagonal move...
Black moved diagonally from (0, 0) to (1, 1)

Turn: Red

// Same thing here, now the piece at (2, 0) is gone
// But the piece which was moved from (0, 0) to (1, 1) is still there

+---+---+---+
| | | | 0
+---+---+---+
| | B | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2

// Console output is duplicated because the move now happened on the actual game state
Black is trying to do a Diagonal move...
Black moved diagonally from (0, 0) to (1, 1)

// Red is trying to INSERT on a row which does not belong to Red
// Looks like the turn was not updated properly?

Red is trying to do a Insert move...
Invalid move: not starting row of Red.

Turn: Black


Thank you for reading this rather lengthy post. I hope I can gain some insights into what I am doing wrong so I can make this minimax search do its job properly!

</details>

专注分享java语言的经验与见解，让所有开发者获益！

状态维护在博弈搜索的极小化极大算法中出现问题。

问题

Go like channels in Java

在低资源环境下使用Apache Cassandra和Go服务器

avatica-go客户端读取Phoenix查询服务器：[驱动程序：连接错误]

向Spring端点发送POST请求，返回状态码400。

Spring Boot控制器从Golang应用程序接收到的重定向请求会被重复执行两次。

可以在不将其读入内存的情况下多次重用HTTP请求体吗？

How to register my go lang microservice in Spring Eureka Service Discovery

在应用程序-go + BDD-java中模拟第三方服务

What is value, reference vs pointer and what these three example used to pass?

Do goroutines and light-weight Java threads mean we never need use thread pools and async code again?

发表评论