aboutsummaryrefslogtreecommitdiff
path: root/doc/tex/results.tex
diff options
context:
space:
mode:
Diffstat (limited to 'doc/tex/results.tex')
-rw-r--r--doc/tex/results.tex416
1 files changed, 416 insertions, 0 deletions
diff --git a/doc/tex/results.tex b/doc/tex/results.tex
new file mode 100644
index 0000000..3c586e4
--- /dev/null
+++ b/doc/tex/results.tex
@@ -0,0 +1,416 @@
+\newcommand{\boards}[5]{
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.4\textwidth]{#1}
+ \includegraphics[width=0.4\textwidth]{#2}
+ \caption{#3}
+ \label{#4}
+ #5
+ \end{center}
+\end{figure}
+}
+
+\newcommand{\moveHeatmap}[9]{
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.35\textwidth]{#1}
+ \includegraphics[width=0.5\textwidth]{#2}
+ \\[\smallskipamount]
+ \includegraphics[width=0.35\textwidth]{#3}
+ \includegraphics[width=0.5\textwidth]{#4}
+ \\[\smallskipamount]
+ \includegraphics[width=0.35\textwidth]{#5}
+ \includegraphics[width=0.5\textwidth]{#6}
+ \caption{#7}
+ \label{#8}
+ #9
+ \end{center}
+\end{figure}
+}
+
+\section{Results}
+
+\subsection{Monte Carlo Tree Search Evaluation}
+
+The Monte Carlo Algorithm tries to explore the tree of possibilities as
+efficiently as possible. With this approach, it can be expected to fail when
+alone on a problem such big as the game of Go. Nonetheless, there are some areas
+where it can be useful. It will be evaluated by its capabilities while playing
+games but also when presented with go problems.
+
+The Monte Carlo algorithm has been set to do 5 explorations with 10 simulations
+each when it is asked for a move. In the hardware used this makes it think for
+40 seconds on an empty 9x9 board.
+
+\subsubsection{Monte Carlo Tree Search against itself}
+
+\boards
+{img/matches/mctsVSmcts01.jpg}
+{img/matches/mctsVSmcts02.jpg}
+{Monte Carlo Tree Search against itself.}
+{fig:mctsVSmcts}
+{Both boards are from the same match. The first shows only the first 12 moves
+for clarity. The second shows the finished game.}
+
+When set to play against itself the algorithm shows what looks like random play.
+Some moves could be debated as sensible, but many of them are on the second or
+fist line, which so early on the game are the two worst places to play at.
+
+It seems clear that this algorithm, at least with this specific configuration,
+can't handle the size of an empty Go board, even on the 9x9 size.
+
+A record of the game is shown in \fref{fig:mctsVSmcts}.
+
+\subsubsection{Monte Carlo Tree Search against Go problems}
+
+\boards
+{img/problems/mctsProblem01.jpg}
+{img/problems/mctsProblem02.jpg}
+{Monte Carlo against the first of Cho Chikun's problems.}
+{fig:mctsProblem01}
+
+Since tree exploration on smaller boards or advanced games with little empty
+spaces should be easier the algorithm has also been tested on some Go problems.
+
+A Go problem or tsumego is a predefined layout of the board, or of a section of
+the board, for which the player must find some beneficial move. Life and death
+problems are a subset of tsumegos in which the survival of a group depends on
+finding the correct sequence to save or kill the group. One collection of such
+tsumegos is \textit{Cho Chikun's Encyclopedia of Life and Death}, part of which
+are available on OGS\cite{ogsLifeAndDeath}, an online go server.
+
+The first of these problems and what the algorithm suggested as moves is shown
+in \fref{fig:mctsProblem01}.
+
+Black makes the first move, which means the solution is to find some favorable
+outcome for black, which in this case is killing the white group. The white
+group has a critical point on B1. If white plays on B1 they make two eyes and
+live, but if black plays there first white can't make two eyes and dies, so B1
+is the solution to the tsumego. This is one of the easiest Go problems.
+
+The algorithm neglects this solution. While asked five times to generate a move
+for the starting position it suggested B1 only once.
+
+But notably, after making another move, it consistently suggested B1 for white,
+which is the solution now that white has to play. So in the end it was able to
+solve the tsumego, probably because after making a move it had already explored
+part of the tree but it was difficult that it explored the solution for the
+first move.
+
+The engine was tested against other tsumegos but it was not able to solve them,
+so no more are shown here.
+
+\subsection{Neural Network Training}
+
+Each network has been training by receiving batches of SGF files which are first
+converted to lists of moves by the Training System and then provided to the
+train function of the networks. This has been executed with the following
+command:
+
+\inputminted[fontsize=\small]{bash}{listings/trainCommand.sh}
+
+Which lists the contents of a folder containing multiple SGF files, shuffles the
+list, takes some of them and passes them to train.py as arguments. The combined
+list of game positions and moves made from them in all the games selected make
+up the sample of the training. The number of files selected can of course be
+adapted to the situation.
+
+The networks were configured to train for 20 epochs, that is, processing the
+full batch and updating the weights on the network 20 times. 10\% of the sample
+were used as validation. This means they were not used for training but to
+check the accuracy and loss of the network after training with the rest of
+the batch. This technique of validation can help detect overfitting, which
+happens when a network does a very good job on the population it has been
+trained on but it fails when receiving unseen data.
+
+The training shows loss decrementing and accuracy incrementing on each epoch for
+both training and validation samples, which suggests there is learning happening
+and no overfitting occurring.
+
+The number of files to train on has been selected by trying to keep the
+resulting training time manageable for the developer. It was found that
+incrementing the files per batch did not increment the training time linearly,
+as with batches of 10 games an epoch of training could be completed in one
+minute, in three minutes batches of 20 games, and in up to an hour with batches
+of 100 games.
+
+The outputs from this training process can be seen on \flist{code:denseTraining}
+and \flist{code:convTraining} for the dense network and the convolutional
+network respectively.
+
+\begin{listing}[h]
+ \inputminted[fontsize=\scriptsize]{text}{listings/denseTraining.txt}
+ \caption{Dense neural network training log.}
+ \label{code:denseTraining}
+\end{listing}
+
+\begin{listing}[h]
+ \inputminted[fontsize=\scriptsize]{text}{listings/convTraining.txt}
+ \caption{Convolutional neural network training log.}
+ \label{code:convTraining}
+\end{listing}
+
+\subsection{Neural Network Evaluation}
+
+One way of studying the results of the training could be to programmatically
+test if the network is able to predict human moves when provided with a game
+file. This has the drawback of not directly testing the strength of the network,
+since it would be tested on predicting the moves of some human player, whose
+level could greatly vary, and not on its playing capabilities alone. This also
+has already been tested to some extent by the validation portion of the samples
+providing for training, and the results show the network increasing its accuracy
+of predicting human moves through the training epochs.
+
+Another way is to make the network play either against itself, another network,
+or a human player, and analyze the score it provides to each possible play. The
+output of the network is a vector with the likelihood of making each move, so
+we can take this values as how strongly the engine suggests each of them.
+
+\subsection{Neural Networks Against Themselves}
+
+The case of neural networks playing against themselves is interesting in that it
+shows moves with high certainty, that is, the network strongly suggests a
+specific move. It can be thought that this would be the most common line of play
+the engine has seen, with one certain move leading into another.
+
+This happened for both the dense neural network and the convolutional network.
+
+\subsubsection{Dense network Against Itself}
+
+The initial moves of a match of the dense network against itself, along with
+heatmaps showing the output from the network (which chance it gives for each
+move), can be seen on Figs.~\ref{fig:denseVSdense01}, \ref{fig:denseVSdense02},
+\ref{fig:denseVSdense03} and \ref{fig:denseVSdense04}.
+
+The dense network starts on the center of the board, which is one of the
+standard openings in the 9x9 board. It starts on a very good track, but we must
+acknowledge that the empty board is a position present on every go match it has
+trained on and so it should know it well. It probably means the center was the
+most played opening in the sample. It is interesting to check the heatmap of
+this move, since the selected move has only a score of 0.27. Other common
+openings on the 9x9 are represented on the vertices with some significant score.
+
+The heatmap of the response to the center play is also interesting. The four
+highest scored moves are equivalent, since the first move has not broken any
+symmetry. The move they represent, two spaces from the first stone, is also the
+consensual response. The three moves following in score are also equivalent
+between them.
+
+The first white stone has broken the symmetry in one direction, and so the
+second black move is suggested with more likelihood than the previous one. This
+is also a very common sequence of play.
+
+The following moves are a white attack followed by black's sensible response,
+and a strong white extension (arguably to the wrong side, but the minutia of why
+the other side would be better is probably not grasped by the engine). The
+following moves could be called looking for complication, which is a strategy.
+
+Overall some moves could be criticised but the engine is showing at least a
+basic knowledge of the game's strategy.
+
+\subsubsection{Convolutional Network Against Itself}
+
+The same information on a match of the convolutional network against itself can
+be seen on Figs.~\ref{fig:convVSconv01}, \ref{fig:convVSconv02},
+\ref{fig:convVSconv03} and \ref{fig:convVSconv04}.
+
+The convolutional network also likes to start on the center of the board and is
+more certain of the move than the dense network. Its next two plays are also the
+same, but the match starts to diverge from the fourth move.
+
+Instead of attacking the black stone on the side, white plays directly under at
+B5, keeping the edge of symmetry. This is strategically important because while
+symmetry is present playing on either side makes no difference, while playing
+after the opponent has broken symmetry gives the opportunity to respond and
+capitalize on the now better side to play.
+
+Black responds to the white stone by attacking it at B4 and white then extends
+to the more open side with F7.
+
+The match up to this point is discussed on Sensei's Library as an interesting
+sequence deriving from the center first stone\cite{sl_sword}. The discussion
+there states that white should have extended to the other side.
+
+The following moves are some sensible continuations, with approaches and
+responses on each side. They are probably not the best moves, but the idea
+behind them can be read from the board.
+
+\subsubsection{Dense Network Against Convolutional}
+
+The first 12 and the last 6 moves of a match of the dense network against the
+convolutional network can be seen on be seen on Figs.~\ref{fig:denseVSconv01},
+\ref{fig:denseVSconv02}, \ref{fig:denseVSconv03}, \ref{fig:denseVSconv04},
+\ref{fig:denseVSconv05} and \ref{fig:denseVSconv06}.
+
+The dense network is given black. The convolutional network plays then as white.
+
+The first three moves are known territory for both networks, since both play the
+same sequence when matched against themselves.
+
+After white plays under the black stone as the fourth move, black makes a maybe
+too careful extension, but it seemed very sure about it. Now white has some
+doubts. It seems to like the extensions to both sides, if a bit more to the more
+open side, a preference both engines have previously shown at this point of the
+match. The move it prefers the most, though, would be to play again under black.
+So, as this move is impossible, it defers to the extension to the open side with
+F3.
+
+Black now makes the mistake of overattacking the lone white stone on B5 with B6,
+and white capitalizes on it by keeping extending with D3. B5 is still a move
+with a not negligible score, but it is no longer the highest.
+
+The following moves are questionable, with black keeping building a very tight
+group and getting no territory and white attacking too directly with a lone
+stone into that strong group.
+
+In the final moves of the game we see that black has built a living group on the
+right side and things are looking good for it on the rest of the board. White
+has passed a bunch of times, starting too early, which means it gives too high
+an importance to passing. Since the networks learned only by seeing moves in
+response to boards, it has not learnt that it should not pass until there is no
+more to gain. But it is also important to note that passing was not the most
+likely move for the engine to make, it just happens that the move it would like
+to make is already taken and the highest chance available move is to pass.
+
+\moveHeatmap
+{img/matches/denseVSdense_board_01.jpg}
+{img/matches/denseVSdense_heatmap_01.png}
+{img/matches/denseVSdense_board_02.jpg}
+{img/matches/denseVSdense_heatmap_02.png}
+{img/matches/denseVSdense_board_03.jpg}
+{img/matches/denseVSdense_heatmap_03.png}
+{Dense network against itself, moves 1 through 3.}
+{fig:denseVSdense01}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_04.jpg}
+{img/matches/denseVSdense_heatmap_04.png}
+{img/matches/denseVSdense_board_05.jpg}
+{img/matches/denseVSdense_heatmap_05.png}
+{img/matches/denseVSdense_board_06.jpg}
+{img/matches/denseVSdense_heatmap_06.png}
+{Dense network against itself, moves 4 through 6.}
+{fig:denseVSdense02}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_07.jpg}
+{img/matches/denseVSdense_heatmap_07.png}
+{img/matches/denseVSdense_board_08.jpg}
+{img/matches/denseVSdense_heatmap_08.png}
+{img/matches/denseVSdense_board_09.jpg}
+{img/matches/denseVSdense_heatmap_09.png}
+{Dense network against itself, moves 7 through 9.}
+{fig:denseVSdense03}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_10.jpg}
+{img/matches/denseVSdense_heatmap_10.png}
+{img/matches/denseVSdense_board_11.jpg}
+{img/matches/denseVSdense_heatmap_11.png}
+{img/matches/denseVSdense_board_12.jpg}
+{img/matches/denseVSdense_heatmap_12.png}
+{Dense network against itself, moves 10 through 12.}
+{fig:denseVSdense04}
+
+\moveHeatmap
+{img/matches/convVSconv_board_01.jpg}
+{img/matches/convVSconv_heatmap_01.png}
+{img/matches/convVSconv_board_02.jpg}
+{img/matches/convVSconv_heatmap_02.png}
+{img/matches/convVSconv_board_03.jpg}
+{img/matches/convVSconv_heatmap_03.png}
+{Convolutional network against itself, moves 1 through 3.}
+{fig:convVSconv01}
+
+\moveHeatmap
+{img/matches/convVSconv_board_04.jpg}
+{img/matches/convVSconv_heatmap_04.png}
+{img/matches/convVSconv_board_05.jpg}
+{img/matches/convVSconv_heatmap_05.png}
+{img/matches/convVSconv_board_06.jpg}
+{img/matches/convVSconv_heatmap_06.png}
+{Convolutional network against itself, moves 4 through 6.}
+{fig:convVSconv02}
+
+\moveHeatmap
+{img/matches/convVSconv_board_07.jpg}
+{img/matches/convVSconv_heatmap_07.png}
+{img/matches/convVSconv_board_08.jpg}
+{img/matches/convVSconv_heatmap_08.png}
+{img/matches/convVSconv_board_09.jpg}
+{img/matches/convVSconv_heatmap_09.png}
+{Convolutional network against itself, moves 7 through 9.}
+{fig:convVSconv03}
+
+\moveHeatmap
+{img/matches/convVSconv_board_10.jpg}
+{img/matches/convVSconv_heatmap_10.png}
+{img/matches/convVSconv_board_11.jpg}
+{img/matches/convVSconv_heatmap_11.png}
+{img/matches/convVSconv_board_12.jpg}
+{img/matches/convVSconv_heatmap_12.png}
+{Convolutional network against itself, moves 10 through 12.}
+{fig:convVSconv04}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_01.jpg}
+{img/matches/denseVSconv_heatmap_01.png}
+{img/matches/denseVSconv_board_02.jpg}
+{img/matches/denseVSconv_heatmap_02.png}
+{img/matches/denseVSconv_board_03.jpg}
+{img/matches/denseVSconv_heatmap_03.png}
+{Dense network against convolutional, moves 1 through 3.}
+{fig:denseVSconv01}
+{The dense network plays as black.}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_04.jpg}
+{img/matches/denseVSconv_heatmap_04.png}
+{img/matches/denseVSconv_board_05.jpg}
+{img/matches/denseVSconv_heatmap_05.png}
+{img/matches/denseVSconv_board_06.jpg}
+{img/matches/denseVSconv_heatmap_06.png}
+{Dense network against convolutional, moves 4 through 6.}
+{fig:denseVSconv02}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_07.jpg}
+{img/matches/denseVSconv_heatmap_07.png}
+{img/matches/denseVSconv_board_08.jpg}
+{img/matches/denseVSconv_heatmap_08.png}
+{img/matches/denseVSconv_board_09.jpg}
+{img/matches/denseVSconv_heatmap_09.png}
+{Dense network against convolutional, moves 7 through 9.}
+{fig:denseVSconv03}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_10.jpg}
+{img/matches/denseVSconv_heatmap_10.png}
+{img/matches/denseVSconv_board_11.jpg}
+{img/matches/denseVSconv_heatmap_11.png}
+{img/matches/denseVSconv_board_12.jpg}
+{img/matches/denseVSconv_heatmap_12.png}
+{Dense network against convolutional, moves 10 through 12.}
+{fig:denseVSconv04}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_72.jpg}
+{img/matches/denseVSconv_heatmap_72.png}
+{img/matches/denseVSconv_board_73.jpg}
+{img/matches/denseVSconv_heatmap_73.png}
+{img/matches/denseVSconv_board_74.jpg}
+{img/matches/denseVSconv_heatmap_74.png}
+{Dense network against convolutional, moves 72 through 74.}
+{fig:denseVSconv05}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_75.jpg}
+{img/matches/denseVSconv_heatmap_75.png}
+{img/matches/denseVSconv_board_76.jpg}
+{img/matches/denseVSconv_heatmap_76.png}
+{img/matches/denseVSconv_board_77.jpg}
+{img/matches/denseVSconv_heatmap_77.png}
+{Dense network against convolutional, moves 75 through 77.}
+{fig:denseVSconv06}
+
+% Needed line so the processing of the last command doesn't find an EOF