英文:
State maintenance goes wrong in minimax search on board game
问题
我正在实现一个棋盘游戏上的极小化极大算法,但在搜索状态树时遇到问题。我相当肯定问题要么出在以下方面:
- 我对极小化极大算法的实现有误,或者
- 我构建下一个状态的方式,极小化极大将在其中进行搜索,或者
- 是1和2的结合
我已经以最佳的面向对象方式编写了游戏代码。首先,这是为了练习,但也是为了尽可能地解耦各个部分。但是,这个决策可能已经成为我在实现极小化极大算法时进展的一个障碍,因为游戏本身的代码变得有些混乱。这使得在游戏中调试状态变得有些复杂。然而,通过仔细的调试,我确信在我玩一个交替双方人类玩家的游戏时,游戏状态在我进行状态切换时是维持良好的。但是一旦我调用极小化极大搜索,游戏状态就会迅速崩溃,就好像我做极小化极大递归或构建下一个状态的方式是错误的。
在展示与极小化极大相关的工作之前,我将首先介绍一下如何对游戏进行建模,以确保您拥有所需的信息。之后,我将展示带有激活了极小化极大的样本游戏输出。我已经尝试对输出进行了注释,以便您可以看到出了什么问题。
让我们开始吧。
游戏是一个3 x 4单元格的棋盘游戏,如下所示:
+---+---+---+
| | | | 0 <- 黑色玩家的起始行
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| | | | 3 <- 红色玩家的起始行
+---+---+---+
0 1 2
我现在将介绍我构建的类。我会省略掉构造函数和get/set方法之类的通用代码,只突出会改变类实例状态的重要方法。
Board
类的模型如下:
public class Board
{
public static final int ROWS = 4;
public static final int COLUMNS = 3;
private Cell[][] cells;
public Board()
{
initialize();
}
private void initialize()
{
cells = new Cell[ROWS][COLUMNS];
for (int row = 0; row < ROWS; row++)
{
for (int column = 0; column < COLUMNS; column++)
{
cells[row][column] = new Cell(row, column);
}
}
}
public Board getDeepCopy(State state)
{
Board boardCopy = new Board();
for (int row = 0; row < ROWS; row++)
for (int column = 0; column < COLUMNS; column++)
boardCopy.cells[row][column] = cells[row][column].getDeepCopy(state);
return boardCopy;
}
//其他通用逻辑
//...
}
我在几个类中创建了一个 getDeepCopy(State state)
方法,用于构建当前游戏状态的副本,极小化极大将在其中进行搜索,以免更改游戏的“真实”状态。因此,这个方法将在其他类中重复出现,如您所见。
Cell
类如下所示:
public class Cell
{
private String content;
private Piece piece;
private int row;
private int column;
public Cell getDeepCopy(State state)
{
Cell c = new Cell(row, column);
c.content = content;
c.piece = piece != null ? piece.getDeepCopy(state) : null;
return c;
}
//其他通用逻辑
//...
}
Piece
类定义如下:
public class Piece
{
private Point point;
private String symbol;
public Piece getDeepCopy(State state)
{
return state.getBoard().getCells()[this.getPoint().getY()][this.getPoint().getX()].getPiece();
}
//其他通用逻辑
//...
}
游戏有两名玩家,一名红色玩家和一名黑色玩家。他们各自有4个可用的棋子。
Player
类如下所示:
public class Player
{
private Color color;
private int score;
private Stack<Piece> pieces;
private ArrayList<Piece> placedPieces;
public static final int MAX_PIECES = 4;
private void initializePieces()
{
pieces = new Stack<>();
placedPieces = new ArrayList<>();
for (int i = 0; i < MAX_PIECES; i++)
{
pieces.push(new Piece(color.getSymbol()));
}
}
public Piece getAvailablePiece()
{
return pieces.empty() ? null : pieces.pop();
}
public void returnPiece(Piece piece)
{
if (pieces.size() != MAX_PIECES)
{
pieces.push(piece);
}
}
public void updatePlacedPieces(Piece piece)
{
if (!placedPieces.contains(piece))
placedPieces.add(piece);
else
for (Piece match : placedPieces)
if (piece.equals(match))
match.setPoint(piece.getPoint());
}
//其他通用逻辑
//...
}
现在是游戏状态和极小化极大算法的较重要部分。所有游戏状态在一个名为 State
的类中表示,如下所示(注意 getDeepCopy()
方法,我怀疑这是主要问题之一):
public class State
{
private int points;
private Board board;
private Player turn;
private Player opponent;
private Player redPlayer;
private Player blackPlayer;
private Move move;
<details>
<summary>英文:</summary>
I am working on implementing minimax on a board game but am having issues when searching the state tree. I am fairly certain it is either
1. my implementation of minimax which is wrong, or
2. the way I build my next state on which minimax will do its search, or
3. a combination of 1 and 2
I sat out writing the code for the game in the best object-oriented way I could. This is first and foremost for practice but also to try and decouple things as best as possible. But this decision has *maybe* become a bit of an obstacle for my progress with minimax because the code for the game itself has become a little convoluted. This has made debugging state in the game a little complex. However, through careful debugging I feel confident that the state of the game is maintained well when I play a game where I alternate between two human players -- i.e. I do not do any minimax search on the state of the game. But as soon as I invoke minimax search, then the game state start to fall apart rather quickly, as if the way I do my minimax recursion or build my next state is wrong.
Before I show the work I have done in relation to minimax I'll first introduce how the game is modeled to make sure you have the information you need. After this I will show a sample output of the game running with minimax activated. I have tried to comment the output so you can see what is going *wrong*.
Let's get to it.
The game is a board game of 3 x 4 cells like so:
+---+---+---+
| | | | 0 <- Starting row of Black player
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| | | | 3 <- Starting row of Red player
+---+---+---+
0 1 2
I'll introduce the classes I have constructed now. I will leave out generic code like constructors and get/set methods and only highlight important methods which change the state of the class instance.
The `Board` class is modeled in the following way:
public class Board
{
public static final int ROWS = 4;
public static final int COLUMNS = 3;
private Cell[][] cells;
public Board()
{
initialize();
}
private void initialize()
{
cells = new Cell[ROWS][COLUMNS];
for (int row = 0; row < ROWS; row++)
{
for (int column = 0; column < COLUMNS; column++)
{
cells[row][column] = new Cell(row, column);
}
}
}
public Board getDeepCopy(State state)
{
Board boardCopy = new Board();
for (int row = 0; row < ROWS; row++)
for (int column = 0; column < COLUMNS; column++)
boardCopy.cells[row][column] = cells[row][column].getDeepCopy(state);
return boardCopy;
}
//Other generic logic
...
}
I have created a `getDeepCopy(State state)` method in a couple of the classes which I use to construct a copy of the current state of the game on which minimax will do its search, so as not to change the *real* state of the game. So this method will repeat itself in other classes, as you will see.
The `Cell` class looks like follows
public class Cell
{
private String content;
private Piece piece;
private int row;
private int column;
public Cell getDeepCopy(State state)
{
Cell c = new Cell(row, column);
c.content = content;
c.piece = piece != null ? piece.getDeepCopy(state) : null;
return c;
}
//Other generic logic
...
}
The `Piece` class is defined as
public class Piece
{
private Point point;
private String symbol;
public Piece getDeepCopy(State state)
{
return state.getBoard().getCells()[this.getPoint().getY()][this.getPoint().getX()].getPiece();
}
//Other generic logic
...
}
The game has 2 players, a Red and a Black player. They each have 4 pieces available
The `Player` class looks like so:
public class Player
{
private Color color;
private int score;
private Stack<Piece> pieces;
private ArrayList<Piece> placedPieces;
public static final int MAX_PIECES = 4;
private void initializePieces()
{
pieces = new Stack<>();
placedPieces = new ArrayList<>();
for (int i = 0; i < MAX_PIECES; i++)
{
pieces.push(new Piece(color.getSymbol()));
}
}
public Piece getAvailablePiece()
{
return pieces.empty() ? null : pieces.pop();
}
public void returnPiece(Piece piece)
{
if (pieces.size() != MAX_PIECES)
{
pieces.push(piece);
}
}
public void updatePlacedPieces(Piece piece)
{
if (!placedPieces.contains(piece))
placedPieces.add(piece);
else
for (Piece match : placedPieces)
if (piece.equals(match))
match.setPoint(piece.getPoint());
}
//Other generic logic
...
}
Now to the more heavy parts of the game, the game state and the minimax algorithm. All game state is represented in a `State` class as below (take note of the `getDeepCopy()` method, which I suspect is one of the main culprits)
public class State
{
private int points;
private Board board;
private Player turn;
private Player opponent;
private Player redPlayer;
private Player blackPlayer;
private Move move;
private State parent;
private ArrayList<Move> availableMoves;
private int depth;
// Root state (s_0)
public State(int points)
{
this.points = points;
redPlayer = new Player(Color.Red);
blackPlayer = new Player(Color.Black);
turn = redPlayer;
opponent = blackPlayer;
depth = 0;
board = new Board();
parent = null;
}
// Child states from root state (s_1, s_2, ... s_n)
public State(State parent, Move move)
{
this.parent = parent;
this.move = move;
redPlayer = parent.getRedPlayer();
blackPlayer = parent.getBlackPlayer();
depth = parent.depth + 1;
points = parent.getPoints();
board = parent.getBoard();
passTurn(parent.getTurn());
availableMoves = Logic.getAllMoves(this);
}
public void passTurn(Player lastTurn)
{
if (lastTurn.equals(redPlayer))
{
turn = blackPlayer;
opponent = redPlayer;
}
else
{
turn = redPlayer;
opponent = blackPlayer;
}
System.out.println(String.format("\nTurn: %s", turn.toString()));
}
public State getDeepCopy(State state, Move move)
{
State stateCopy = new State();
stateCopy.move = move;
stateCopy.parent = state.getParent();
stateCopy.points = state.getPoints();
stateCopy.redPlayer = state.getRedPlayer();
stateCopy.blackPlayer = state.getBlackPlayer();
stateCopy.turn = state.getTurn();
stateCopy.opponent = state.getOpponent();
stateCopy.depth = state.getDepth();
stateCopy.board = state.getBoard().getDeepCopy(state); // Here the deep copy begins
stateCopy.availableMoves = state.getAvailableMoves();
return stateCopy;
}
//Other generic logic
}
The game is running in a main class called `Game` where I alternate between a human move and an AI move. All game logic is handled in a class `Logic`, with only `static` methods. This class implements the rules of the game, e.g. if a player wants to insert a piece, and is assisted by a `BoardManager` class (also only with `static` methods) which does operations on the board (placing a piece, removing a piece, etc.). So the `Logic` class holds no state, neither does the `BoardManager` class, they only modify current state based on the Game's instance of state.
Here is the `Game `class with its most important methods highlighted
public class Game
{
private State state;
private boolean playing;
public void play() throws IOException
{
drawBoard();
System.out.println(String.format("First player to make a move is %s", state.getTurn().toString()));
playing = true;
while (!isGameOver() && playing)
if (isHumanPlayer())
doHumanMove();
else
doAIMove();
}
private void doHumanMove() throws IOException
{
Choice choice = getChoice();
boolean isMoveOk = false;
Move move = null;
switch (choice)
{
case INSERT:
System.out.print("Enter cell to insert on: ");
Point to = getPointFromUser();
move = new Move(to, state.getRedPlayer(), Move.Type.Insert);
isMoveOk = Logic.doMove(move, state);
break;
case ...
}
if (isMoveOk)
{
drawBoard();
state = new State(state, move);
}
}
private void doAIMove()
{
Move move = MiniMax.search(0, state, false);
state = new State(state, move);
drawBoard();
}
}
Now to the nitty-gritty part, the minimax class. I invoke minimax using the `search()` method, as can be seen above in the `play()` method of `Game`. I then call `miniMax()` on a `depth`, the current `State` and whether the current player is the `maximizer`. I hope it is self-explanatory by looking at the code
public class MiniMax
{
private static final int maxDepth = 2;
public static Move search(int depth, State state, boolean isPlayerMaximizer)
{
System.out.println("Starting MiniMax search...");
Value v = miniMax(depth, state, isPlayerMaximizer);
return v.getMove();
}
private static Value miniMax(int depth, State state, boolean isPlayerMaximizer)
{
state.getBoard().draw();
if (depth == maxDepth || state.getTurn().getScore() == state.getPoints())
return getScore(state);
if (isPlayerMaximizer)
return getMax(depth, state);
else
return getMin(depth, state);
}
private static Value getMax(int depth, State state)
{
int bestScore = Integer.MIN_VALUE;
Move bestMove = null;
state.setAvailableMoves(Logic.getAllMoves(state)); // Is this wrong?
for (Move move : state.getAvailableMoves())
{
State nextState = state.getDeepCopy(state, move); // Copy of state to work on
Logic.doMove(move, nextState);
nextState.passTurn(nextState.getTurn());
Value val = miniMax(depth + 1, nextState, false);
if (val.getValue() >= bestScore)
{
bestScore = val.getValue();
bestMove = move;
}
}
if (bestMove != null)
Logic.doMove(bestMove, state);
return new Value(bestScore, bestMove);
}
private static Value getMin(int depth, State state)
{
int bestScore = Integer.MAX_VALUE;
Move bestMove = null;
state.setAvailableMoves(Logic.getAllMoves(state));
for (Move move : state.getAvailableMoves())
{
State nextState = state.getDeepCopy(state, move);
Logic.doMove(move, nextState);
state.passTurn(nextState.getTurn());
Value val = miniMax(depth + 1, nextState, true);
if (val.getValue() <= bestScore)
{
bestScore = val.getValue();
bestMove = move;
}
}
if (bestMove != null)
Logic.doMove(bestMove, state);
return new Value(bestScore, bestMove);
}
private static Value getScore(State state)
{
Player maximizer = state.getRedPlayer();
Player minimizer = state.getBlackPlayer();
if (state.getTurn() == maximizer && maximizer.getScore() == state.getPoints())
return new Value(10, state.getMove());
else if (state.getTurn() == minimizer && minimizer.getScore() == state.getPoints())
return new Value(-10, state.getMove());
else
return new Value(0, state.getMove());
}
private static class Value {
int value;
Move move;
public Value(int value, Move move) {
this.value = value;
this.move = move;
}
public Move getMove() {
return move;
}
public int getValue()
{
return value;
}
}
}
Here is a run of the game where I invoke minimax
Welcome!
// Initial state (human player makes a move)
+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| | | | 3
+---+---+---+
0 1 2
First player to make a move is Red
Choose an option:
- Insert
- Diagonal move
- Attack
- Jump
- Exit
Input: 1
Enter cell to insert on: 0,3
Red is trying to do a Insert move...
Red placed a piece on the board at (0, 3).
// So far so good
+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
Turn: Black
Starting MiniMax search... // OK, time for the AI to make a move
+---+---+---+
| | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
Black is trying to do a Insert move...
Black placed a piece on the board at (0, 0).
Turn: Red
// So far so good
+---+---+---+
| B | | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
Black is trying to do a Insert move...
Black placed a piece on the board at (1, 0).
Turn: Red
// Black just did two moves in a row, looks like the turn is not passed on properly?
// Looks like after just 1 move with minimax the state of the game becomes wrong?
+---+---+---+
| B | B | | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
Black is trying to do a Insert move...
Black placed a piece on the board at (2, 0).
Turn: Red
// Take note that the piece inserted by black at (1, 0) no longer exists on the board below
// Does getDeepCopy() not work properly?
+---+---+---+
| B | | B | 0
+---+---+---+
| | | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
Black is trying to do a Diagonal move...
Black moved diagonally from (0, 0) to (1, 1)
Turn: Red
// Same thing here, now the piece at (2, 0) is gone
// But the piece which was moved from (0, 0) to (1, 1) is still there
+---+---+---+
| | | | 0
+---+---+---+
| | B | | 1
+---+---+---+
| | | | 2
+---+---+---+
| R | | | 3
+---+---+---+
0 1 2
// Console output is duplicated because the move now happened on the actual game state
Black is trying to do a Diagonal move...
Black moved diagonally from (0, 0) to (1, 1)
// Red is trying to INSERT on a row which does not belong to Red
// Looks like the turn was not updated properly?
Red is trying to do a Insert move...
Invalid move: not starting row of Red.
Turn: Black
Thank you for reading this rather lengthy post. I hope I can gain some insights into what I am doing wrong so I can make this minimax search do its job properly!
</details>
专注分享java语言的经验与见解,让所有开发者获益!
评论