aboutsummaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorInigoGutierrez <inigogf.95@gmail.com>2022-07-01 15:40:57 +0200
committerInigoGutierrez <inigogf.95@gmail.com>2022-07-01 15:40:57 +0200
commit4cc55348c8dbb1902a1246fba66237d5c59f0349 (patch)
treeebc363dec6ae00f711be7ebd6e31530f25af1d9f /doc
parent6724aeb3ba98c1b9f042344734c2d683e79dfc64 (diff)
downloadimago-4cc55348c8dbb1902a1246fba66237d5c59f0349.tar.gz
imago-4cc55348c8dbb1902a1246fba66237d5c59f0349.zip
Finished writing documentation.
Diffstat (limited to 'doc')
-rw-r--r--doc/Makefile12
-rw-r--r--doc/diagrams/gameRepresentation.puml17
-rw-r--r--doc/diagrams/skinparams.puml4
-rw-r--r--doc/listings/convModel.txt23
-rw-r--r--doc/listings/convTraining.txt40
-rw-r--r--doc/listings/denseModel.txt17
-rw-r--r--doc/listings/denseTraining.txt40
-rw-r--r--doc/listings/trainCommand.sh1
-rw-r--r--doc/tex/biber.bib22
-rw-r--r--doc/tex/conclusions.tex128
-rw-r--r--doc/tex/imago.tex (renamed from doc/tex/tfg.tex)49
-rw-r--r--doc/tex/implementation.tex35
-rw-r--r--doc/tex/introduction.tex2
-rw-r--r--doc/tex/planification.tex45
-rw-r--r--doc/tex/previousWorks.tex22
-rw-r--r--doc/tex/results.tex416
-rw-r--r--doc/tex/systemAnalysis.tex55
-rw-r--r--doc/tex/systemDesign.tex327
18 files changed, 1089 insertions, 166 deletions
diff --git a/doc/Makefile b/doc/Makefile
index 497e9b1..284000e 100644
--- a/doc/Makefile
+++ b/doc/Makefile
@@ -1,15 +1,19 @@
.SUFFIXES: .puml .png
-docName = tfg
+docName = imago
outputFolder = out
-texFiles = tex/tfg.tex tex/introduction.tex tex/planification.tex tex/implementation.tex tex/systemAnalysis.tex tex/biber.bib Makefile
+texFiles = tex/$(docName).tex tex/introduction.tex tex/planification.tex tex/systemAnalysis.tex tex/systemDesign.tex tex/implementation.tex tex/results.tex tex/conclusions.tex tex/biber.bib
-diagramImgs = diagrams/gameRepresentation.png diagrams/planificationWorkPlanEngine.png diagrams/planificationWorkPlanGame.png diagrams/sgfModule.png diagrams/useCases.png diagrams/analysisClasses.png diagrams/useCase_generateAMove.png diagrams/useCase_useAsBackend.png diagrams/useCase_playAMatch.png diagrams/interfaces.png diagrams/engineModule.png diagrams/modules.png diagrams/fullClasses.png
+diagramImgs = diagrams/planificationWorkPlanEngine.png diagrams/planificationWorkPlanGame.png diagrams/useCases.png diagrams/analysisClasses.png diagrams/useCase_generateAMove.png diagrams/useCase_useAsBackend.png diagrams/useCase_playAMatch.png diagrams/interfaces.png diagrams/gameModule.png diagrams/engineModule.png diagrams/trainingModule.png diagrams/modules.png diagrams/fullClasses.png
+
+imgs = img/imago.jpg img/models/denseModel.png img/models/convModel.png
+
+listings = listings/denseModel.txt listings/convModel.txt listings/denseTraining.txt listings/convTraining.txt listings/trainCommand.sh
all: $(docName).pdf
-$(docName).pdf: $(texFiles) $(diagramImgs)
+$(docName).pdf: $(texFiles) $(diagramImgs) $(imgs) $(listings) Makefile
[ -d $(outputFolder) ] || mkdir $(outputFolder)
xelatex -shell-escape -output-directory $(outputFolder) tex/$(docName).tex
biber $(outputFolder)/$(docName)
diff --git a/doc/diagrams/gameRepresentation.puml b/doc/diagrams/gameRepresentation.puml
deleted file mode 100644
index db5d5a5..0000000
--- a/doc/diagrams/gameRepresentation.puml
+++ /dev/null
@@ -1,17 +0,0 @@
-@startuml
-
-!include skinparams.puml
-
-!include GameState.pumlc
-!include GameTree.pumlc
-!include GameMove.pumlc
-!include GameBoard.pumlc
-
-GameState -> GameTree
-GameState --> GameMove: Last move
-GameTree *--> GameMove
-GameMove --> GameMove: Previous move
-GameMove *--> GameMove: Next moves
-GameBoard <- GameMove
-
-@enduml
diff --git a/doc/diagrams/skinparams.puml b/doc/diagrams/skinparams.puml
index c94ad2d..d9dce03 100644
--- a/doc/diagrams/skinparams.puml
+++ b/doc/diagrams/skinparams.puml
@@ -1,10 +1,10 @@
@startuml
'Old style
-skin rose
+'skin rose
skinparam {
- monochrome true
+ 'monochrome true
shadowing false
linetype polyline
}
diff --git a/doc/listings/convModel.txt b/doc/listings/convModel.txt
new file mode 100644
index 0000000..5c90975
--- /dev/null
+++ b/doc/listings/convModel.txt
@@ -0,0 +1,23 @@
+Model: "sequential"
+_________________________________________________________________
+ Layer (type) Output Shape Param #
+=================================================================
+ conv2d (Conv2D) (None, 9, 9, 32) 608
+
+ max_pooling2d (MaxPooling2D (None, 4, 4, 32) 0
+ )
+
+ conv2d_1 (Conv2D) (None, 4, 4, 64) 18496
+
+ max_pooling2d_1 (MaxPooling (None, 2, 2, 64) 0
+ 2D)
+
+ flatten (Flatten) (None, 256) 0
+
+ dense (Dense) (None, 82) 21074
+
+=================================================================
+Total params: 40,178
+Trainable params: 40,178
+Non-trainable params: 0
+_________________________________________________________________
diff --git a/doc/listings/convTraining.txt b/doc/listings/convTraining.txt
new file mode 100644
index 0000000..6108abc
--- /dev/null
+++ b/doc/listings/convTraining.txt
@@ -0,0 +1,40 @@
+Epoch 1/20
+39501/39501 - 279s - loss: 3.7391 - accuracy: 0.1064 - val_loss: 3.1316 - val_accuracy: 0.2023 - 279s/epoch - 7ms/step
+Epoch 2/20
+39501/39501 - 241s - loss: 2.6512 - accuracy: 0.3046 - val_loss: 2.0729 - val_accuracy: 0.4484 - 241s/epoch - 6ms/step
+Epoch 3/20
+39501/39501 - 214s - loss: 1.6459 - accuracy: 0.6014 - val_loss: 1.2040 - val_accuracy: 0.7143 - 214s/epoch - 5ms/step
+Epoch 4/20
+39501/39501 - 228s - loss: 0.9016 - accuracy: 0.8417 - val_loss: 0.6430 - val_accuracy: 0.8735 - 228s/epoch - 6ms/step
+Epoch 5/20
+39501/39501 - 230s - loss: 0.4704 - accuracy: 0.9380 - val_loss: 0.3520 - val_accuracy: 0.9378 - 230s/epoch - 6ms/step
+Epoch 6/20
+39501/39501 - 222s - loss: 0.2735 - accuracy: 0.9520 - val_loss: 0.2329 - val_accuracy: 0.9549 - 222s/epoch - 6ms/step
+Epoch 7/20
+39501/39501 - 215s - loss: 0.2117 - accuracy: 0.9495 - val_loss: 0.1837 - val_accuracy: 0.9583 - 215s/epoch - 5ms/step
+Epoch 8/20
+39501/39501 - 215s - loss: 0.1797 - accuracy: 0.9533 - val_loss: 0.1787 - val_accuracy: 0.9556 - 215s/epoch - 5ms/step
+Epoch 9/20
+39501/39501 - 225s - loss: 0.1607 - accuracy: 0.9553 - val_loss: 0.1952 - val_accuracy: 0.9446 - 225s/epoch - 6ms/step
+Epoch 10/20
+39501/39501 - 249s - loss: 0.1486 - accuracy: 0.9572 - val_loss: 0.1544 - val_accuracy: 0.9597 - 249s/epoch - 6ms/step
+Epoch 11/20
+39501/39501 - 208s - loss: 0.1380 - accuracy: 0.9586 - val_loss: 0.1467 - val_accuracy: 0.9651 - 208s/epoch - 5ms/step
+Epoch 12/20
+39501/39501 - 210s - loss: 0.1321 - accuracy: 0.9592 - val_loss: 0.1313 - val_accuracy: 0.9665 - 210s/epoch - 5ms/step
+Epoch 13/20
+39501/39501 - 204s - loss: 0.1276 - accuracy: 0.9598 - val_loss: 0.1282 - val_accuracy: 0.9665 - 204s/epoch - 5ms/step
+Epoch 14/20
+39501/39501 - 193s - loss: 0.1222 - accuracy: 0.9604 - val_loss: 0.1174 - val_accuracy: 0.9686 - 193s/epoch - 5ms/step
+Epoch 15/20
+39501/39501 - 183s - loss: 0.1182 - accuracy: 0.9607 - val_loss: 0.1747 - val_accuracy: 0.9433 - 183s/epoch - 5ms/step
+Epoch 16/20
+39501/39501 - 166s - loss: 0.1147 - accuracy: 0.9611 - val_loss: 0.1186 - val_accuracy: 0.9679 - 166s/epoch - 4ms/step
+Epoch 17/20
+39501/39501 - 163s - loss: 0.1119 - accuracy: 0.9616 - val_loss: 0.1112 - val_accuracy: 0.9699 - 163s/epoch - 4ms/step
+Epoch 18/20
+39501/39501 - 168s - loss: 0.1095 - accuracy: 0.9618 - val_loss: 0.1020 - val_accuracy: 0.9706 - 168s/epoch - 4ms/step
+Epoch 19/20
+39501/39501 - 161s - loss: 0.1072 - accuracy: 0.9625 - val_loss: 0.1058 - val_accuracy: 0.9699 - 161s/epoch - 4ms/step
+Epoch 20/20
+39501/39501 - 173s - loss: 0.1052 - accuracy: 0.9624 - val_loss: 0.1031 - val_accuracy: 0.9727 - 173s/epoch - 4ms/step
diff --git a/doc/listings/denseModel.txt b/doc/listings/denseModel.txt
new file mode 100644
index 0000000..006e321
--- /dev/null
+++ b/doc/listings/denseModel.txt
@@ -0,0 +1,17 @@
+Model: "sequential"
+_________________________________________________________________
+ Layer (type) Output Shape Param #
+=================================================================
+ dense (Dense) (None, 9, 9, 81) 243
+
+ dense_1 (Dense) (None, 9, 9, 81) 6642
+
+ flatten (Flatten) (None, 6561) 0
+
+ dense_2 (Dense) (None, 82) 538084
+
+=================================================================
+Total params: 544,969
+Trainable params: 544,969
+Non-trainable params: 0
+_________________________________________________________________
diff --git a/doc/listings/denseTraining.txt b/doc/listings/denseTraining.txt
new file mode 100644
index 0000000..0bfcb51
--- /dev/null
+++ b/doc/listings/denseTraining.txt
@@ -0,0 +1,40 @@
+Epoch 1/20
+148338/148338 - 1151s - loss: 1.1130 - accuracy: 0.6942 - val_loss: 0.6097 - val_accuracy: 0.8249 - 1151s/epoch - 8ms/step
+Epoch 2/20
+148338/148338 - 1084s - loss: 0.5366 - accuracy: 0.8572 - val_loss: 0.4744 - val_accuracy: 0.8617 - 1084s/epoch - 7ms/step
+Epoch 3/20
+148338/148338 - 1071s - loss: 0.4250 - accuracy: 0.8895 - val_loss: 0.4161 - val_accuracy: 0.8813 - 1071s/epoch - 7ms/step
+Epoch 4/20
+148338/148338 - 1118s - loss: 0.3678 - accuracy: 0.9066 - val_loss: 0.3493 - val_accuracy: 0.9024 - 1118s/epoch - 8ms/step
+Epoch 5/20
+148338/148338 - 1092s - loss: 0.3320 - accuracy: 0.9170 - val_loss: 0.3103 - val_accuracy: 0.9185 - 1092s/epoch - 7ms/step
+Epoch 6/20
+148338/148338 - 1097s - loss: 0.3078 - accuracy: 0.9241 - val_loss: 0.3132 - val_accuracy: 0.9145 - 1097s/epoch - 7ms/step
+Epoch 7/20
+148338/148338 - 1074s - loss: 0.2899 - accuracy: 0.9293 - val_loss: 0.2779 - val_accuracy: 0.9257 - 1074s/epoch - 7ms/step
+Epoch 8/20
+148338/148338 - 1114s - loss: 0.2762 - accuracy: 0.9330 - val_loss: 0.2709 - val_accuracy: 0.9246 - 1114s/epoch - 8ms/step
+Epoch 9/20
+148338/148338 - 1111s - loss: 0.2660 - accuracy: 0.9351 - val_loss: 0.2577 - val_accuracy: 0.9319 - 1111s/epoch - 7ms/step
+Epoch 10/20
+148338/148338 - 1104s - loss: 0.2563 - accuracy: 0.9374 - val_loss: 0.2446 - val_accuracy: 0.9388 - 1104s/epoch - 7ms/step
+Epoch 11/20
+148338/148338 - 1084s - loss: 0.2489 - accuracy: 0.9394 - val_loss: 0.2441 - val_accuracy: 0.9348 - 1084s/epoch - 7ms/step
+Epoch 12/20
+148338/148338 - 1088s - loss: 0.2419 - accuracy: 0.9407 - val_loss: 0.2538 - val_accuracy: 0.9337 - 1088s/epoch - 7ms/step
+Epoch 13/20
+148338/148338 - 1090s - loss: 0.2365 - accuracy: 0.9416 - val_loss: 0.2538 - val_accuracy: 0.9312 - 1090s/epoch - 7ms/step
+Epoch 14/20
+148338/148338 - 1063s - loss: 0.2314 - accuracy: 0.9430 - val_loss: 0.2484 - val_accuracy: 0.9308 - 1063s/epoch - 7ms/step
+Epoch 15/20
+148338/148338 - 1111s - loss: 0.2271 - accuracy: 0.9436 - val_loss: 0.2373 - val_accuracy: 0.9359 - 1111s/epoch - 7ms/step
+Epoch 16/20
+148338/148338 - 1124s - loss: 0.2235 - accuracy: 0.9443 - val_loss: 0.2542 - val_accuracy: 0.9257 - 1124s/epoch - 8ms/step
+Epoch 17/20
+148338/148338 - 1074s - loss: 0.2202 - accuracy: 0.9451 - val_loss: 0.2368 - val_accuracy: 0.9327 - 1074s/epoch - 7ms/step
+Epoch 18/20
+148338/148338 - 1120s - loss: 0.2181 - accuracy: 0.9453 - val_loss: 0.2462 - val_accuracy: 0.9286 - 1120s/epoch - 8ms/step
+Epoch 19/20
+148338/148338 - 1121s - loss: 0.2159 - accuracy: 0.9460 - val_loss: 0.2375 - val_accuracy: 0.9316 - 1121s/epoch - 8ms/step
+Epoch 20/20
+148338/148338 - 1115s - loss: 0.2154 - accuracy: 0.9458 - val_loss: 0.2273 - val_accuracy: 0.9352 - 1115s/epoch - 8ms/step
diff --git a/doc/listings/trainCommand.sh b/doc/listings/trainCommand.sh
new file mode 100644
index 0000000..a5f09c4
--- /dev/null
+++ b/doc/listings/trainCommand.sh
@@ -0,0 +1 @@
+./train.py $(ls ../collections/minigo/matches/*.sgf | shuf | head -n 50)
diff --git a/doc/tex/biber.bib b/doc/tex/biber.bib
index 8884962..d0a714e 100644
--- a/doc/tex/biber.bib
+++ b/doc/tex/biber.bib
@@ -132,3 +132,25 @@
urldate = {2022},
url = {https://plantuml.com}
}
+
+@online{sl_sword,
+ title = {9x9 Tengen Openings at Sensei's Library},
+ urldate = {2022},
+ url = {https://senseis.xmp.net/?9x9TengenOpenings}
+}
+
+@online{ogsLifeAndDeath,
+ author = {sunspark},
+ title = {Cho Chikun's Encyclopedia of Life and Death - Elementary: 1 / 900},
+ date = {2016},
+ urldate = {2022},
+ url = {https://online-go.com/puzzle/2824}
+}
+
+@online{fsf_free,
+ author = {Free Software Foundation},
+ title = {What is Free Software? - GNU Project - Free Software Foundation},
+ date = {2022},
+ urldate = {2022},
+ url = {https://www.gnu.org/philosophy/free-sw.html}
+}
diff --git a/doc/tex/conclusions.tex b/doc/tex/conclusions.tex
new file mode 100644
index 0000000..ebbf7fd
--- /dev/null
+++ b/doc/tex/conclusions.tex
@@ -0,0 +1,128 @@
+\section{Conclusions}
+
+Some problems found during the project, an evaluation of the results and what
+could be done in the future to improve the system are discussed here.
+
+\subsection{Problems with the Implementation of the Game}
+
+While Go has a simple set of rules, they lead to many borderline cases which
+must be specifically addressed. For example, the ko rule obliges to check the
+previous board positions after each move so no boards are repeated.
+
+These problems have been solved by designing the tree of moves of a match so
+each move stores the full board layout after it has been played. This of course
+increases the memory needed to run the application, but has been chosen over the
+alternative of having to recreate the board parsing the tree backwards for each
+move to check if ko is occurring, if a move makes a capture and which stones
+have already been captured, etc. It is the old problem of memory versus
+processing, which in this case has been answered with memory. Which one would be
+the best solution would require a deep analysis out of the scope of the project.
+
+\subsection{Problems with the Monte Carlo Tree Search Algorithm}
+
+The implementation Monte Carlo Tree Search algorithm was not precise by itself
+when asked for moves. This could be attributed to various factors:
+
+\begin{itemize}
+
+ \item The way the score is counted makes it difficult to evaluate who is
+ winning at the beginning or middle of the game, so the score of a move
+ has to be evaluated by running playing simulations. These simulations
+ are not precise on analyzing the result since the moves in them are
+ played at random.
+
+ \item The configuration used, 5 explorations with 10 simulations each for
+ cycle, does not allow for many good moves to be found.
+
+\end{itemize}
+
+To solve these problems, either a good score evaluation engine has to be
+implemented, which is another machine learning problem by itself, or the
+explorations and simulations of the engine tuned up to values unable to be run
+on the development hardware.
+
+\subsection{Problems with Neural Networks}
+
+Some concerns with the design and results of the neural networks are discussed
+on this section.
+
+\subsubsection{Finding the correct design}
+
+When approaching neural networks design with a basic understanding of their
+inner workings it is easy to get lost in all the different factors that can be
+tuned. The number and variant of layers, their number of nodes, their activation
+function, the learning rate; they can all be tweaked and doing so in the favour
+of the design requires a good knowledge of what is being done.
+
+The basic approach was to design the input and output shapes and then adjust the
+network based on that. At first, the desired output was a 9x9 two-dimensional
+matrix with the probabilities of playing on each move. A heatmap generated from
+a failed solution of this approach can be seen on \fref{fig:oldHM}, where the
+network distributed the probability for each row instead of the whole board. The
+fitting of shapes of input, internal layers and output has found to be tricky at
+first.
+
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=\textwidth]{img/matches/oldLineByLineHeatmap.png}
+ \caption{A heatmap from an old design of the dense neural network.}
+ \label{fig:oldHM}
+ Note that each row adds up to one, meaning that the engine is selecting
+ the best move for each row and not for the whole board.
+ \end{center}
+\end{figure}
+
+Finally, based on the design of classification networks where each possible
+class is mapped to a value in the one-dimensional output vector, it was decided
+that the output was not a two-dimensional matrix but a one-dimensional vector.
+This also helped address the possibility of passing, treating it as just another
+possible move that was previously left out because of not knowing how to fit it
+on the matrix of possible moves. This vector is then reshaped to a matrix
+representing the board when needed, for example to generate the heatmaps used
+for evaluation.
+
+\subsubsection{Mistakes on Networks' Play Patterns}
+
+One notable mistake made by the networks, specially the convolutional network,
+was passing to much. Passing is considered just another move, so the networks
+have no grasp that they should not pass until there are no more valuable moves
+to be made. A new design problem could be to create a specific passing policy.
+
+Another observation is that the engines seem to follow traditional openings and
+make reasonable decisions at the start of the game, with some interesting
+variations. But as the game progresses their moves make less sense. This could
+be because they have seen many examples of starting moves but complications at
+the middle and end game are much more diverse and the networks had not been
+trained on similar situations.
+
+\subsection{Viable Extensions}
+
+There are several ways this project could be improved, but were left out by lack
+of resources or time. Some interesting ones are explored on this section.
+
+\subsubsection{Joining Monte Carlo Tree Search and Neural Networks}
+
+The tree search algorithm is better at the end of the game, where the tree to
+explore is much more manageable. On the other hand, the neural networks are
+better at the beginning, where the positions are more common and patterns are
+simpler. The logical next step would be to join both: the network could select
+the best moves to be made, and the tree search algorithm could explore them to
+select the best and grow its explored tree in meaningful directions.
+
+This is also what was done by AlphaGo. By seeing the results obtained in this
+project it makes sense they made that decision.
+
+\subsubsection{Train the Engine by Itself}
+
+By training a game engine on human matches, it can at best get as good as the
+human players it has learned from.
+
+Top engines learn not from humans but by playing against themselves. This would
+be an interesting problem to tackle, if requiring a new approach to the design
+of the networks and a lot of training resources.
+
+\subsubsection{Presenting the Engine to the World}
+
+Some Go servers allow for bot players to be registered. This would provide a
+source of training against players from all around the world and also an
+evaluation over time of the engine strength.
diff --git a/doc/tex/tfg.tex b/doc/tex/imago.tex
index 4c64bb2..8242467 100644
--- a/doc/tex/tfg.tex
+++ b/doc/tex/imago.tex
@@ -1,4 +1,4 @@
-\documentclass{article}
+\documentclass[12pt]{article}
\usepackage{geometry}
\usepackage{graphicx}
@@ -6,17 +6,12 @@
\usepackage{hyperref}
\usepackage{csquotes}
\usepackage{enumitem}
+\usepackage[indent=20pt]{parskip} % Space between paragraphs
+\usepackage{indentfirst} % Indent first paragraph of sections
-\usepackage{chngcntr}
-\counterwithin{figure}{section}
-
-\usepackage[backend=biber, style=numeric, sorting=none]{biblatex}
-\addbibresource{tex/biber.bib}
-
-\usepackage{minted} % Code importing and formatting
-\setminted{linenos, breaklines}
+\geometry{left=3cm,top=2cm,bottom=2cm,right=3cm}
-\geometry{left=4cm,top=2cm,bottom=2cm,right=4cm}
+\usepackage{chngcntr}
\hypersetup{colorlinks=false,
linkcolor=black,
@@ -27,6 +22,12 @@
\urlstyle{mono}
+\usepackage[backend=biber, style=numeric, sorting=none]{biblatex}
+\addbibresource{tex/biber.bib}
+
+\usepackage{minted} % Code importing and formatting
+\setminted{linenos, breaklines}
+
\newcommand{\program}{Imago}
\newcommand{\fref}[1]{Fig.~\ref{#1}}
@@ -34,12 +35,19 @@
%\newcommand{\uurl}[1]{\underline{\url{#1}}}
\newcommand{\tabitem}{~~\llap{\textbullet}~~}
+
\begin{document}
\frenchspacing
-\title{\program\\
-\large An AI player of the game of Go}
+\title{
+ \begin{center}
+ \includegraphics[width=0.4\textwidth]{img/logoUniovi.png}\hfill
+ \includegraphics[width=0.3\textwidth]{img/logoEII.png}
+ \end{center}~\\[10pt]
+ \program\\
+ \large An AI player of the game of Go
+}
\author{Íñigo Gutiérrez Fernández}
@@ -118,22 +126,39 @@ inclusion to use this template is included here.
\setcounter{tocdepth}{4}
\tableofcontents
+\clearpage
+
+\counterwithin{figure}{section}
+\renewcommand{\listfigurename}{Figures}
\listoffigures
+\clearpage
+
+\renewcommand{\listoflistingscaption}{Listings}
+\counterwithin{listing}{section}
\listoflistings
\clearpage
\input{tex/introduction.tex}
+\clearpage
\input{tex/planification.tex}
+\clearpage
\input{tex/systemAnalysis.tex}
+\clearpage
\input{tex/systemDesign.tex}
+\clearpage
\input{tex/implementation.tex}
+\clearpage
+
+\input{tex/results.tex}
+\clearpage
+\input{tex/conclusions.tex}
\clearpage
% References (bibliography)
diff --git a/doc/tex/implementation.tex b/doc/tex/implementation.tex
index f88d57f..d46e805 100644
--- a/doc/tex/implementation.tex
+++ b/doc/tex/implementation.tex
@@ -1,6 +1,23 @@
\section{Implementation}
-\subsection{Development environment}
+\subsection{Development Hardware}
+
+The development and evaluation machine is a Satellite L50-C laptop. Its
+specifications are shown as a list for readability.
+
+\begin{itemize}
+ \item \textbf{CPU}: Intel i5-6200U, 2.800GHz
+ \item \textbf{Integrated GPU}: Intel Skylake GT2
+ \item \textbf{Dedicated GPU}: Nvidia Geforce 930M
+ \item \textbf{RAM}: 8 GiB
+\end{itemize}
+
+\subsection{Development Software}
+
+The tools selected for the development of the project and the documentation are
+listed and explained on this section. All the tools used are either
+free\cite{fsf_free} or open source software. The development machine runs 64
+bits Arch Linux as its operating system.
\subsubsection{Language}
@@ -53,7 +70,7 @@ These are some utility libraries commonly used for frequent programming tasks:
\item \textbf{copy}: to obtain deep copies of multidimensional arrays.
\end{itemize}
-\subsubsection{Development tools}
+\subsubsection{Development Tools}
\paragraph{Neovim\cite{neovim}}
@@ -64,14 +81,14 @@ environment with a very fluid usage experience. That, and the fact that the
developer considers Vim's modal editing the best writing experience possible on
a computer, have made Neovim the editor of choice.
-\begin{itemize}
- %TODO: Write about neovim extensions
- \item FZF
- \item Extensions
-
-\end{itemize}
+%TODO: Write about neovim extensions
+%\begin{itemize}
+% \item FZF
+% \item Extensions
+%
+%\end{itemize}
-\subsubsection{Documentation tools}
+\subsubsection{Documentation Tools}
\paragraph{\LaTeX\cite{latex}}
diff --git a/doc/tex/introduction.tex b/doc/tex/introduction.tex
index 6defbd9..2c710ea 100644
--- a/doc/tex/introduction.tex
+++ b/doc/tex/introduction.tex
@@ -12,5 +12,5 @@
game at heart, Go has nonetheless been interpreted as a stylized representation
of fighting a war, settling a frontier, cornering a market, thrashing out an
argument, or even of fortune-telling and prophecy. Go has always been one of the
- most played games in the world. \parencite{sl_go}
+ most played games in the world.\cite{sl_go}
\end{displayquote}
diff --git a/doc/tex/planification.tex b/doc/tex/planification.tex
index 00695a1..f0f7535 100644
--- a/doc/tex/planification.tex
+++ b/doc/tex/planification.tex
@@ -1,6 +1,6 @@
\section{Planning}
-\subsection{Driving needs}
+\subsection{Driving Needs}
As one of the deepest and most studied games in the world, Go presents a very
interesting problem for artificial intelligence. Implementing not only the
@@ -29,18 +29,18 @@ Presented here are the ideal targets of the project.
recorded in the SGF format.
\end{itemize}
-\subsection{Project stages}
+\subsection{Project Stages}
The project will be organized in several stages based on the different
components and needs.
-\subsubsection{Game implementation}
+\subsubsection{Game Implementation}
The rules of the game must be implemented, ideally in a way they can be tested
by direct human play. This system will at its bare minimum represent the
Japanese Go rules (area scoring, no superko rule, no suicide moves).
-\subsubsection{Engine implementation}
+\subsubsection{Engine Implementation}
The key of this project is to create some kind of system able to generate strong
moves based on any given board configuration: this will be such system. It will
@@ -50,7 +50,7 @@ either player. It should also be modular enough so different algorithms can be
selected and tested against each other as an experimental search for the best of
them.
-\subsubsection{Artificial Intelligence algorithms}
+\subsubsection{Artificial Intelligence Algorithms}
Different algorithms for the engine to use should be implemented and tested. The
results of this development and testing process should be presented as part of
@@ -66,7 +66,7 @@ Computer Science at the University of Oviedo.
The used material consists of a development and testing machine owned by the
student with specifications stated later on the project plan.
-\subsection{Work plan}
+\subsection{Work Plan}
The sole developer will be the student, who is currently working as a Junior
Software Engineer on a 35 hour per week schedule and with no university
@@ -92,9 +92,9 @@ Friday. Gantt diagrams for the planned working schedule are shown as
\end{center}
\end{figure}
-\subsection{Previous works}
+\subsection{Previous Works}
-\subsubsection{Existing engines}
+\subsubsection{Existing Engines}
\paragraph{AlphaGo}
@@ -103,28 +103,28 @@ owned by Google. It revolutionized the world of Go in 2015 and 2016 when it
respectively became the first AI to win against a professional Go player and
then won against Lee Sedol, a Korean player of the highest professional rank and
one of the strongest players in the world at the time. Its source code is
-closed, but a paper \parencite{natureAlphaGo2016} written by the team and
+closed, but a paper\cite{natureAlphaGo2016} written by the team and
published on Nature is available on
https://storage.googleapis.com/deepmind-media/alphago/AlphaGoNaturePaper.pdf.
The unprecedented success of AlphaGo served as inspiration for many AI projects,
including this one.
-\paragraph{KataGo~\cite{katago}}
+\paragraph{KataGo\cite{katago}}
An open source project based on the AlphaGo paper that also achieved superhuman
strength of play. The availability of its implementation and documentation
presents a great resource for this project.
-\paragraph{GnuGo~\cite{gnugo}}
+\paragraph{GnuGo\cite{gnugo}}
A software capable of playing Go part of the GNU project. Although not a strong
engine anymore, it is interesting for historic reasons as the free software
engine for which the GTP protocol was first defined.
-\subsubsection{Existing standards}
+\subsubsection{Existing Standards}
-\paragraph{GTP~\cite{gtp}}
+\paragraph{GTP\cite{gtp}}
GTP (\textit{Go Text Protocol}) is a text based protocol for
communication with computer go programs. It is the protocol used by GNU Go and
@@ -132,20 +132,23 @@ the more modern and powerful KataGo. By supporting GTP the engine developed for
this project can be used with existing GUIs and other programs, making it easier
to use it with the tools users are already familiar with.
-\paragraph{SGF~\cite{sgf}}
+\paragraph{SGF\cite{sgf}}
-SGF (\textit{Smart Game Format}) is a text format widely used for storing
-records of Go matches which allows for variants, comments and other metadata.
-Many popular playing tools use it. By supporting SGF vast existing collections
-of games can be used to train the decision algorithms based on neural networks.
+SGF (\textit{Smart Go Format} or, in a more general context, \textit{Smart Game
+Format}) is a text format widely used for storing records of Go matches which
+allows for variants, comments and other metadata. It was devised for Go but it
+supports other games with similar turn-based structure. Many popular playing
+tools use it. By supporting SGF vast existing collections of games, such as
+those played on online Go servers, can be used to train the decision algorithms
+based on neural networks.
-\subsubsection{Sabaki~\cite{sabaki}}
+\subsubsection{Sabaki\cite{sabaki}}
Sabaki is a go board software compatible with GTP engines. It can serve as a GUI
for the engine developed in this project and as an example of the advantages of
following a standardized protocol.
-\subsubsection{Keras~\cite{keras}}
+\subsubsection{Keras\cite{keras}}
Keras is a deep learning API for Python allowing for the high-level definition
of neural networks. This permits easily testing and comparing different network
@@ -153,7 +156,7 @@ layouts.
\subsection{Technological Infrastructure}
-\subsubsection{Programming language}\label{sec:programmingLanguage}
+\subsubsection{Programming Language}\label{sec:programmingLanguage}
The resulting product of this project will be one or more pieces of software
able to be run locally on a personal computer. The programming language of
diff --git a/doc/tex/previousWorks.tex b/doc/tex/previousWorks.tex
deleted file mode 100644
index 6e503a3..0000000
--- a/doc/tex/previousWorks.tex
+++ /dev/null
@@ -1,22 +0,0 @@
-\section{Previous works}
-
-\subsection{SGF}
-
-SGF (\textit{Smart Go Format} or, in a more general context, \textit{Smart Game
-Format}) is a text file format specification for records of games or collections
-of them. It was devised for Go but it supports other games with similar
-turn-based structure. It supports move variations, annotations, setups and game
-metadata. By supporting SGF our application can be used to analyse existing
-games registered by other applications, such as those played on online Go
-servers.
-
-The SGF specification can be found at
-\url{https://www.red-bean.com/sgf/user_guide/index.html}
-
-\subsection{GTP}
-
-GTP (\textit{Go Text Protocol}) is a text based protocol for communication with
-computer go programs. [ref https://www.lysator.liu.se/~gunnar/gtp/] It is the
-protocol used by GNU Go and the more modern and powerful KataGo. By supporting
-GTP our application can be used with existing GUIs and other programs, making
-it easier to use it with the tools users are already familiar with.
diff --git a/doc/tex/results.tex b/doc/tex/results.tex
new file mode 100644
index 0000000..3c586e4
--- /dev/null
+++ b/doc/tex/results.tex
@@ -0,0 +1,416 @@
+\newcommand{\boards}[5]{
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.4\textwidth]{#1}
+ \includegraphics[width=0.4\textwidth]{#2}
+ \caption{#3}
+ \label{#4}
+ #5
+ \end{center}
+\end{figure}
+}
+
+\newcommand{\moveHeatmap}[9]{
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.35\textwidth]{#1}
+ \includegraphics[width=0.5\textwidth]{#2}
+ \\[\smallskipamount]
+ \includegraphics[width=0.35\textwidth]{#3}
+ \includegraphics[width=0.5\textwidth]{#4}
+ \\[\smallskipamount]
+ \includegraphics[width=0.35\textwidth]{#5}
+ \includegraphics[width=0.5\textwidth]{#6}
+ \caption{#7}
+ \label{#8}
+ #9
+ \end{center}
+\end{figure}
+}
+
+\section{Results}
+
+\subsection{Monte Carlo Tree Search Evaluation}
+
+The Monte Carlo Algorithm tries to explore the tree of possibilities as
+efficiently as possible. With this approach, it can be expected to fail when
+alone on a problem such big as the game of Go. Nonetheless, there are some areas
+where it can be useful. It will be evaluated by its capabilities while playing
+games but also when presented with go problems.
+
+The Monte Carlo algorithm has been set to do 5 explorations with 10 simulations
+each when it is asked for a move. In the hardware used this makes it think for
+40 seconds on an empty 9x9 board.
+
+\subsubsection{Monte Carlo Tree Search against itself}
+
+\boards
+{img/matches/mctsVSmcts01.jpg}
+{img/matches/mctsVSmcts02.jpg}
+{Monte Carlo Tree Search against itself.}
+{fig:mctsVSmcts}
+{Both boards are from the same match. The first shows only the first 12 moves
+for clarity. The second shows the finished game.}
+
+When set to play against itself the algorithm shows what looks like random play.
+Some moves could be debated as sensible, but many of them are on the second or
+fist line, which so early on the game are the two worst places to play at.
+
+It seems clear that this algorithm, at least with this specific configuration,
+can't handle the size of an empty Go board, even on the 9x9 size.
+
+A record of the game is shown in \fref{fig:mctsVSmcts}.
+
+\subsubsection{Monte Carlo Tree Search against Go problems}
+
+\boards
+{img/problems/mctsProblem01.jpg}
+{img/problems/mctsProblem02.jpg}
+{Monte Carlo against the first of Cho Chikun's problems.}
+{fig:mctsProblem01}
+
+Since tree exploration on smaller boards or advanced games with little empty
+spaces should be easier the algorithm has also been tested on some Go problems.
+
+A Go problem or tsumego is a predefined layout of the board, or of a section of
+the board, for which the player must find some beneficial move. Life and death
+problems are a subset of tsumegos in which the survival of a group depends on
+finding the correct sequence to save or kill the group. One collection of such
+tsumegos is \textit{Cho Chikun's Encyclopedia of Life and Death}, part of which
+are available on OGS\cite{ogsLifeAndDeath}, an online go server.
+
+The first of these problems and what the algorithm suggested as moves is shown
+in \fref{fig:mctsProblem01}.
+
+Black makes the first move, which means the solution is to find some favorable
+outcome for black, which in this case is killing the white group. The white
+group has a critical point on B1. If white plays on B1 they make two eyes and
+live, but if black plays there first white can't make two eyes and dies, so B1
+is the solution to the tsumego. This is one of the easiest Go problems.
+
+The algorithm neglects this solution. While asked five times to generate a move
+for the starting position it suggested B1 only once.
+
+But notably, after making another move, it consistently suggested B1 for white,
+which is the solution now that white has to play. So in the end it was able to
+solve the tsumego, probably because after making a move it had already explored
+part of the tree but it was difficult that it explored the solution for the
+first move.
+
+The engine was tested against other tsumegos but it was not able to solve them,
+so no more are shown here.
+
+\subsection{Neural Network Training}
+
+Each network has been training by receiving batches of SGF files which are first
+converted to lists of moves by the Training System and then provided to the
+train function of the networks. This has been executed with the following
+command:
+
+\inputminted[fontsize=\small]{bash}{listings/trainCommand.sh}
+
+Which lists the contents of a folder containing multiple SGF files, shuffles the
+list, takes some of them and passes them to train.py as arguments. The combined
+list of game positions and moves made from them in all the games selected make
+up the sample of the training. The number of files selected can of course be
+adapted to the situation.
+
+The networks were configured to train for 20 epochs, that is, processing the
+full batch and updating the weights on the network 20 times. 10\% of the sample
+were used as validation. This means they were not used for training but to
+check the accuracy and loss of the network after training with the rest of
+the batch. This technique of validation can help detect overfitting, which
+happens when a network does a very good job on the population it has been
+trained on but it fails when receiving unseen data.
+
+The training shows loss decrementing and accuracy incrementing on each epoch for
+both training and validation samples, which suggests there is learning happening
+and no overfitting occurring.
+
+The number of files to train on has been selected by trying to keep the
+resulting training time manageable for the developer. It was found that
+incrementing the files per batch did not increment the training time linearly,
+as with batches of 10 games an epoch of training could be completed in one
+minute, in three minutes batches of 20 games, and in up to an hour with batches
+of 100 games.
+
+The outputs from this training process can be seen on \flist{code:denseTraining}
+and \flist{code:convTraining} for the dense network and the convolutional
+network respectively.
+
+\begin{listing}[h]
+ \inputminted[fontsize=\scriptsize]{text}{listings/denseTraining.txt}
+ \caption{Dense neural network training log.}
+ \label{code:denseTraining}
+\end{listing}
+
+\begin{listing}[h]
+ \inputminted[fontsize=\scriptsize]{text}{listings/convTraining.txt}
+ \caption{Convolutional neural network training log.}
+ \label{code:convTraining}
+\end{listing}
+
+\subsection{Neural Network Evaluation}
+
+One way of studying the results of the training could be to programmatically
+test if the network is able to predict human moves when provided with a game
+file. This has the drawback of not directly testing the strength of the network,
+since it would be tested on predicting the moves of some human player, whose
+level could greatly vary, and not on its playing capabilities alone. This also
+has already been tested to some extent by the validation portion of the samples
+providing for training, and the results show the network increasing its accuracy
+of predicting human moves through the training epochs.
+
+Another way is to make the network play either against itself, another network,
+or a human player, and analyze the score it provides to each possible play. The
+output of the network is a vector with the likelihood of making each move, so
+we can take this values as how strongly the engine suggests each of them.
+
+\subsection{Neural Networks Against Themselves}
+
+The case of neural networks playing against themselves is interesting in that it
+shows moves with high certainty, that is, the network strongly suggests a
+specific move. It can be thought that this would be the most common line of play
+the engine has seen, with one certain move leading into another.
+
+This happened for both the dense neural network and the convolutional network.
+
+\subsubsection{Dense network Against Itself}
+
+The initial moves of a match of the dense network against itself, along with
+heatmaps showing the output from the network (which chance it gives for each
+move), can be seen on Figs.~\ref{fig:denseVSdense01}, \ref{fig:denseVSdense02},
+\ref{fig:denseVSdense03} and \ref{fig:denseVSdense04}.
+
+The dense network starts on the center of the board, which is one of the
+standard openings in the 9x9 board. It starts on a very good track, but we must
+acknowledge that the empty board is a position present on every go match it has
+trained on and so it should know it well. It probably means the center was the
+most played opening in the sample. It is interesting to check the heatmap of
+this move, since the selected move has only a score of 0.27. Other common
+openings on the 9x9 are represented on the vertices with some significant score.
+
+The heatmap of the response to the center play is also interesting. The four
+highest scored moves are equivalent, since the first move has not broken any
+symmetry. The move they represent, two spaces from the first stone, is also the
+consensual response. The three moves following in score are also equivalent
+between them.
+
+The first white stone has broken the symmetry in one direction, and so the
+second black move is suggested with more likelihood than the previous one. This
+is also a very common sequence of play.
+
+The following moves are a white attack followed by black's sensible response,
+and a strong white extension (arguably to the wrong side, but the minutia of why
+the other side would be better is probably not grasped by the engine). The
+following moves could be called looking for complication, which is a strategy.
+
+Overall some moves could be criticised but the engine is showing at least a
+basic knowledge of the game's strategy.
+
+\subsubsection{Convolutional Network Against Itself}
+
+The same information on a match of the convolutional network against itself can
+be seen on Figs.~\ref{fig:convVSconv01}, \ref{fig:convVSconv02},
+\ref{fig:convVSconv03} and \ref{fig:convVSconv04}.
+
+The convolutional network also likes to start on the center of the board and is
+more certain of the move than the dense network. Its next two plays are also the
+same, but the match starts to diverge from the fourth move.
+
+Instead of attacking the black stone on the side, white plays directly under at
+B5, keeping the edge of symmetry. This is strategically important because while
+symmetry is present playing on either side makes no difference, while playing
+after the opponent has broken symmetry gives the opportunity to respond and
+capitalize on the now better side to play.
+
+Black responds to the white stone by attacking it at B4 and white then extends
+to the more open side with F7.
+
+The match up to this point is discussed on Sensei's Library as an interesting
+sequence deriving from the center first stone\cite{sl_sword}. The discussion
+there states that white should have extended to the other side.
+
+The following moves are some sensible continuations, with approaches and
+responses on each side. They are probably not the best moves, but the idea
+behind them can be read from the board.
+
+\subsubsection{Dense Network Against Convolutional}
+
+The first 12 and the last 6 moves of a match of the dense network against the
+convolutional network can be seen on be seen on Figs.~\ref{fig:denseVSconv01},
+\ref{fig:denseVSconv02}, \ref{fig:denseVSconv03}, \ref{fig:denseVSconv04},
+\ref{fig:denseVSconv05} and \ref{fig:denseVSconv06}.
+
+The dense network is given black. The convolutional network plays then as white.
+
+The first three moves are known territory for both networks, since both play the
+same sequence when matched against themselves.
+
+After white plays under the black stone as the fourth move, black makes a maybe
+too careful extension, but it seemed very sure about it. Now white has some
+doubts. It seems to like the extensions to both sides, if a bit more to the more
+open side, a preference both engines have previously shown at this point of the
+match. The move it prefers the most, though, would be to play again under black.
+So, as this move is impossible, it defers to the extension to the open side with
+F3.
+
+Black now makes the mistake of overattacking the lone white stone on B5 with B6,
+and white capitalizes on it by keeping extending with D3. B5 is still a move
+with a not negligible score, but it is no longer the highest.
+
+The following moves are questionable, with black keeping building a very tight
+group and getting no territory and white attacking too directly with a lone
+stone into that strong group.
+
+In the final moves of the game we see that black has built a living group on the
+right side and things are looking good for it on the rest of the board. White
+has passed a bunch of times, starting too early, which means it gives too high
+an importance to passing. Since the networks learned only by seeing moves in
+response to boards, it has not learnt that it should not pass until there is no
+more to gain. But it is also important to note that passing was not the most
+likely move for the engine to make, it just happens that the move it would like
+to make is already taken and the highest chance available move is to pass.
+
+\moveHeatmap
+{img/matches/denseVSdense_board_01.jpg}
+{img/matches/denseVSdense_heatmap_01.png}
+{img/matches/denseVSdense_board_02.jpg}
+{img/matches/denseVSdense_heatmap_02.png}
+{img/matches/denseVSdense_board_03.jpg}
+{img/matches/denseVSdense_heatmap_03.png}
+{Dense network against itself, moves 1 through 3.}
+{fig:denseVSdense01}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_04.jpg}
+{img/matches/denseVSdense_heatmap_04.png}
+{img/matches/denseVSdense_board_05.jpg}
+{img/matches/denseVSdense_heatmap_05.png}
+{img/matches/denseVSdense_board_06.jpg}
+{img/matches/denseVSdense_heatmap_06.png}
+{Dense network against itself, moves 4 through 6.}
+{fig:denseVSdense02}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_07.jpg}
+{img/matches/denseVSdense_heatmap_07.png}
+{img/matches/denseVSdense_board_08.jpg}
+{img/matches/denseVSdense_heatmap_08.png}
+{img/matches/denseVSdense_board_09.jpg}
+{img/matches/denseVSdense_heatmap_09.png}
+{Dense network against itself, moves 7 through 9.}
+{fig:denseVSdense03}
+
+\moveHeatmap
+{img/matches/denseVSdense_board_10.jpg}
+{img/matches/denseVSdense_heatmap_10.png}
+{img/matches/denseVSdense_board_11.jpg}
+{img/matches/denseVSdense_heatmap_11.png}
+{img/matches/denseVSdense_board_12.jpg}
+{img/matches/denseVSdense_heatmap_12.png}
+{Dense network against itself, moves 10 through 12.}
+{fig:denseVSdense04}
+
+\moveHeatmap
+{img/matches/convVSconv_board_01.jpg}
+{img/matches/convVSconv_heatmap_01.png}
+{img/matches/convVSconv_board_02.jpg}
+{img/matches/convVSconv_heatmap_02.png}
+{img/matches/convVSconv_board_03.jpg}
+{img/matches/convVSconv_heatmap_03.png}
+{Convolutional network against itself, moves 1 through 3.}
+{fig:convVSconv01}
+
+\moveHeatmap
+{img/matches/convVSconv_board_04.jpg}
+{img/matches/convVSconv_heatmap_04.png}
+{img/matches/convVSconv_board_05.jpg}
+{img/matches/convVSconv_heatmap_05.png}
+{img/matches/convVSconv_board_06.jpg}
+{img/matches/convVSconv_heatmap_06.png}
+{Convolutional network against itself, moves 4 through 6.}
+{fig:convVSconv02}
+
+\moveHeatmap
+{img/matches/convVSconv_board_07.jpg}
+{img/matches/convVSconv_heatmap_07.png}
+{img/matches/convVSconv_board_08.jpg}
+{img/matches/convVSconv_heatmap_08.png}
+{img/matches/convVSconv_board_09.jpg}
+{img/matches/convVSconv_heatmap_09.png}
+{Convolutional network against itself, moves 7 through 9.}
+{fig:convVSconv03}
+
+\moveHeatmap
+{img/matches/convVSconv_board_10.jpg}
+{img/matches/convVSconv_heatmap_10.png}
+{img/matches/convVSconv_board_11.jpg}
+{img/matches/convVSconv_heatmap_11.png}
+{img/matches/convVSconv_board_12.jpg}
+{img/matches/convVSconv_heatmap_12.png}
+{Convolutional network against itself, moves 10 through 12.}
+{fig:convVSconv04}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_01.jpg}
+{img/matches/denseVSconv_heatmap_01.png}
+{img/matches/denseVSconv_board_02.jpg}
+{img/matches/denseVSconv_heatmap_02.png}
+{img/matches/denseVSconv_board_03.jpg}
+{img/matches/denseVSconv_heatmap_03.png}
+{Dense network against convolutional, moves 1 through 3.}
+{fig:denseVSconv01}
+{The dense network plays as black.}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_04.jpg}
+{img/matches/denseVSconv_heatmap_04.png}
+{img/matches/denseVSconv_board_05.jpg}
+{img/matches/denseVSconv_heatmap_05.png}
+{img/matches/denseVSconv_board_06.jpg}
+{img/matches/denseVSconv_heatmap_06.png}
+{Dense network against convolutional, moves 4 through 6.}
+{fig:denseVSconv02}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_07.jpg}
+{img/matches/denseVSconv_heatmap_07.png}
+{img/matches/denseVSconv_board_08.jpg}
+{img/matches/denseVSconv_heatmap_08.png}
+{img/matches/denseVSconv_board_09.jpg}
+{img/matches/denseVSconv_heatmap_09.png}
+{Dense network against convolutional, moves 7 through 9.}
+{fig:denseVSconv03}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_10.jpg}
+{img/matches/denseVSconv_heatmap_10.png}
+{img/matches/denseVSconv_board_11.jpg}
+{img/matches/denseVSconv_heatmap_11.png}
+{img/matches/denseVSconv_board_12.jpg}
+{img/matches/denseVSconv_heatmap_12.png}
+{Dense network against convolutional, moves 10 through 12.}
+{fig:denseVSconv04}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_72.jpg}
+{img/matches/denseVSconv_heatmap_72.png}
+{img/matches/denseVSconv_board_73.jpg}
+{img/matches/denseVSconv_heatmap_73.png}
+{img/matches/denseVSconv_board_74.jpg}
+{img/matches/denseVSconv_heatmap_74.png}
+{Dense network against convolutional, moves 72 through 74.}
+{fig:denseVSconv05}
+
+\moveHeatmap
+{img/matches/denseVSconv_board_75.jpg}
+{img/matches/denseVSconv_heatmap_75.png}
+{img/matches/denseVSconv_board_76.jpg}
+{img/matches/denseVSconv_heatmap_76.png}
+{img/matches/denseVSconv_board_77.jpg}
+{img/matches/denseVSconv_heatmap_77.png}
+{Dense network against convolutional, moves 75 through 77.}
+{fig:denseVSconv06}
+
+% Needed line so the processing of the last command doesn't find an EOF
diff --git a/doc/tex/systemAnalysis.tex b/doc/tex/systemAnalysis.tex
index e4962d3..a3d66de 100644
--- a/doc/tex/systemAnalysis.tex
+++ b/doc/tex/systemAnalysis.tex
@@ -1,6 +1,6 @@
\section{System Analysis}
-\subsection{System reach determination}
+\subsection{System Reach Determination}
These are the main goals the final product must reach.
@@ -277,7 +277,7 @@ The Training System will process SGF files storing records of games, train the
neural network models over those games and store the result. These models can
then be imported by the engine and be used to generate moves.
-\subsubsection{Interface between subsystems}
+\subsubsection{Interface Between Subsystems}
The Training System depends on the NeuralNetwork interface of the Engine System
and uses it to train and store the neural network models.
@@ -287,9 +287,9 @@ System. The Engine System uses it to store the state of a game and provide it
to the decision algorithms. The Training System uses it to create the internal
representation of a game resulting from the processing of an SGF file.
-\subsection{Class analysis}
+\subsection{Class Analysis}
-\subsubsection{Class diagram}
+\subsubsection{Class Diagram}
The classes resulting from the analysis phase are shown in
\fref{fig:analysisClasses}.
@@ -302,15 +302,14 @@ The classes resulting from the analysis phase are shown in
\end{center}
\end{figure}
-\subsubsection{Class description}
+\subsubsection{Class Description}
\newcommand{\interclassSpace}{30pt}
\paragraph{Engine System}
-
\indent \\
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{EngineIO} \\
\midrule
@@ -318,7 +317,6 @@ The classes resulting from the analysis phase are shown in
Offers the interface with the engine. \\
\midrule
\textbf{Responsibilities} \\
- % TODO: Single responsibility would be better?
\tabitem{Read input.} \\
\tabitem{Do some preprocessing.} \\
\tabitem{Forward commands to the engine logic component.} \\
@@ -332,7 +330,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{EngineLogic} \\
\midrule
@@ -362,7 +360,7 @@ The classes resulting from the analysis phase are shown in
now generate moves as from a new game.} \\
}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{DecisionAlgorithm} \\
\midrule
@@ -382,7 +380,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{MonteCarloTreeSearch} \\
\midrule
@@ -404,7 +402,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{MCTSNode} \\
\midrule
@@ -443,7 +441,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{Keras} \\
\midrule
@@ -467,7 +465,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{NeuralNetwork} \\
\midrule
@@ -505,7 +503,7 @@ The classes resulting from the analysis phase are shown in
\indent \\
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{GameState} \\
\midrule
@@ -532,7 +530,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{GameBoard} \\
\midrule
@@ -560,7 +558,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{GameMove} \\
\midrule
@@ -603,7 +601,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{GameBoard} \\
\midrule
@@ -637,13 +635,11 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-%TODO: Finish the classes of the Game System
-
\paragraph{Training System}
\indent \\
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{Trainer} \\
\midrule
@@ -658,13 +654,14 @@ The classes resulting from the analysis phase are shown in
%TODO: Explain why this is empty
\midrule
\textbf{Proposed methods} \\
- %TODO: Explain why this is empty
+ \tabitem{\textbf{loadGameTree()}: Reads a file and generates a GameMove tree
+ from its contents.} \\
\bottomrule
\end{tabular}
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{Parser} \\
\midrule
@@ -686,7 +683,7 @@ The classes resulting from the analysis phase are shown in
\vspace{\interclassSpace}
-\begin{tabular}{p{\linewidth}}
+\begin{tabular}{p{0.9\linewidth}}
\toprule
\textbf{ASTNode} \\
\midrule
@@ -751,17 +748,17 @@ against another machine player.
The engine interface reads the input for generating a move as stated by the
GTP protocol and outputs the coordinates of the board to play.
-\subsection{Use case analysis and scenarios}
+\subsection{Use Case Analysis and Scenarios}
\begin{figure}[h]
\begin{center}
- \includegraphics[width=\textwidth]{diagrams/useCase_playAMatch.png}
+ \includegraphics[width=0.8\textwidth]{diagrams/useCase_playAMatch.png}
\caption{Use case: Play a match.}
\label{fig:useCase_playAMatch}
\end{center}
\end{figure}
-\begin{tabular}{lp{0.7\linewidth}}
+\begin{tabular}{lp{0.6\linewidth}}
\toprule
\multicolumn{2}{c}{\textbf{Play a match}} \\
\midrule
@@ -804,7 +801,7 @@ GTP protocol and outputs the coordinates of the board to play.
\end{center}
\end{figure}
-\begin{tabular}{lp{0.7\linewidth}}
+\begin{tabular}{lp{0.6\linewidth}}
\toprule
\multicolumn{2}{c}{\textbf{Generate a move}} \\
\midrule
@@ -846,7 +843,7 @@ GTP protocol and outputs the coordinates of the board to play.
\end{center}
\end{figure}
-\begin{tabular}{lp{0.7\linewidth}}
+\begin{tabular}{lp{0.6\linewidth}}
\toprule
\multicolumn{2}{c}{\textbf{Use as backend for machine player}} \\
\midrule
diff --git a/doc/tex/systemDesign.tex b/doc/tex/systemDesign.tex
index ccd1c48..80a2ccb 100644
--- a/doc/tex/systemDesign.tex
+++ b/doc/tex/systemDesign.tex
@@ -1,95 +1,324 @@
\section{System Design}
-\subsection{Class diagram}
+\subsection{Class Diagram}
-The full class diagram is shown in \fref{fig:fullClasses}
+The full class diagram is shown in \fref{fig:fullClasses}.
\begin{figure}[h]
\begin{center}
- \includegraphics[width=\textwidth]{diagrams/fullClasses.png}
- \caption{Full class diagram.}
- \label{fig:fullClasses}
+ \makebox[0pt]{\begin{minipage}{1.2\textwidth}
+ \includegraphics[width=\textwidth]{diagrams/fullClasses.png}
+ \caption{Full class diagram.}
+ \label{fig:fullClasses}
+ \end{minipage}}
\end{center}
\end{figure}
-% From here
+The design of each system of the diagram is explained after this section
+together with diagrams for each subsystem, since the full class diagram can be
+too big to be comfortably analyzed.
+\subsection{Game}
+
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.55\textwidth]{diagrams/gameModule.png}
+ \caption{Design of the implementation of the game of Go.}
+ \label{fig:game}
+ A game is represented as a tree of moves.
+ \end{center}
+\end{figure}
+
+A regular Go match is composed of a list of moves. But since game review and
+variants exploration is an important part of Go learning, \program{} and most
+playing and analysis existing programs allow for navigation back and forth
+through the board states of a match and for new variants to be created from each
+of these board states. Therefore, a match is represented as a tree of moves. The
+GameMove class has the information about a specific move and also a reference to
+the previous move and to a list of following moves, implementing this tree
+structure and allowing for navigating both forward and backwards in the move
+history.
+
+The state of the board at any given move must be stored so liberties, captures
+count and legality of moves can be addressed, so it is represented with the
+GameState class, which holds a reference to the current move.
+
+Moves depend on a representation of the game board to have access to its current
+layout and count of captured stones. There are also many logic operations needed
+to be performed on the board, such as getting the stones in a group, counting
+their liberties or checking if a move is playable. The layout of the board and
+these operations are implemented as the GameBoard class.
+
+A game can be started by the executable \texttt{go.py}.
+
+These classes and their relationships can be seen in \fref{fig:game}.
\subsection{Engine}
+\begin{figure}[h]
+ \begin{center}
+ \includegraphics[width=0.8\textwidth]{diagrams/engineModule.png}
+ \caption{Design of the GTP engine.}\label{fig:engine}
+ \end{center}
+\end{figure}
+
An implementation of GTP, that is, the piece of software which offers the GTP
-interface to other applications.\@ It is designed to be used by a software
+interface to other applications. It is designed to be used by a software
controller but can also be directly run, mostly for debugging purposes. Its
design is shown in \fref{fig:engine}. The core of the engine is related with
three components, each with a separate responsibility:
\begin{itemize}
- \item The IO component is the one called from other applications and offers
- the text interface. It reads and processes input and calls corresponding
- commands from the core of the engine.
- \item The EngineBoard component stores the state of the match, recording
- information such as the history of boards positions and whose turn goes
- next. The engine core uses it for these state-storing purposes.
- \item The EngineAI component is responsible of analyzing the match and
- generate moves. The engine core uses it when a decision has to be made
- by the AI, such as when a move needs to be generated by the engine.
+
+ \item The ImagoIO component is the one imported from executables or other
+ applications and offers the text interface. It reads and processes input
+ and calls corresponding commands from the core of the engine.
+
+ \item The GameEngine contains the logic of the commands available from the
+ IO component. It uses a GameState to keep a record of the game and uses
+ a DecisionAlgorithm to generate moves.
+
+ \item The DecisionAlgorithm component is responsible of analyzing the match
+ and generate moves. The engine core uses it when a decision has to be
+ made by the AI, such as when a move needs to be generated by the engine.
+
+\end{itemize}
+
+Two implementations of DecisionAlgorithm have been made: one for the Monte
+Carlo Tree Search algorithm (on the MCTS class) and the other for neural
+networks (on the Keras class).
+
+The Keras class also makes use of the NeuralNetwork class, which offers
+functions for creating, training, saving and using neural network models. The
+designs of the network are implemented in the subclasses DenseNeuralNetwork and
+ConvNeuralNetwork as examples of dense and convolutional networks, respectively.
+
+The engine can be started with the executable \texttt{imagocli.py}.
+
+\subsubsection{Monte Carlo Tree Search Explained}
+
+Monte Carlo Tree Search is an algorithm that can be useful for exploring
+decision trees. It was used by AlphaGo in conjunction with neural networks as
+explained in the AlphaGo 2016 paper\cite{natureAlphaGo2016}.
+
+The algorithm assigns a score to each explored node based on how likely the
+player who makes the corresponding move is to win and updates this score on each
+exploration cycle.
+
+The exploration of the tree has 4 steps:
+
+\begin{enumerate}
+
+ \item \textbf{Selection}: The most promising move with unexplored children
+ is selected for exploration. Unexplored children are viable moves which
+ are not yet part of the tree.
+
+ \item \textbf{Expansion}: An unexplored children of the selected move is
+ added to the tree. This children is selected at random.
+
+ \item \textbf{Simulation}: The score of the new move is evaluated by playing
+ random matches from it.
+
+ \item \textbf{Backpropagation}: The score of the new move, as well as its
+ previous moves up to the root of the tree, is updated based on the
+ results of the simulation.
+
+\end{enumerate}
+
+The suggested move is the children of the current move with the best score from
+the perspective of the player which has to make the move.
+
+The implementation of the algorithm will use the existing GameMove class from
+the Game System to access the game logic it needs, such as to get the possible
+children from a node or to simulate random games.
+
+\subsubsection{Neural Networks Explained}
+
+A neural network is composed of nodes or ``neurons''. Each node contains a value
+named weight. During execution the node receives a numeric input, multiplies it
+for its weight and applies a function called activation function to the result.
+
+Nodes are organized in layers so that a network contains several layers and each
+layer one or more neurons. The input to the network forms the input layer, which
+contents are forwarded to the second layer. The second layer applies the weight
+and activation function as discussed before in each of its nodes and the result
+is forwarded to the next layer, and so on. At the end the last layer, called the
+output layer, contains the result of the input evaluation. Each layer can have a
+unique activation function for all its nodes and a different strategy of
+connecting to the previous and next layers.
+
+Several kinds of layers have been used in this project:
+
+\begin{itemize}
+
+ \item \textbf{Dense layers}, which connects each of its nodes to each of the
+ nodes of the previous layers.
+
+ \item \textbf{Convolutional layers}, which process their input by applying
+ a filter function to adjacent values. In the case of this project, the
+ board is filtered by grouping its vertices in 3x3 squares. The aim of
+ these layers is to detect patterns in the input, such as curves, edges
+ or more complex shapes, so they are used a lot on neural networks
+ processing images. They are used in this project because a configuration
+ of the Go board is not that different from a two-dimensional image.
+
+ \item \textbf{Max pooling layers}, which process their input in a similar
+ way to convolutional layers, but reducing the size of the input by
+ keeping the highest value in each of the groups they process. This
+ reduction accentuates the patterns detected by convolutional layers and
+ helps on the detection of bigger, more complex patterns.
+
+ \item \textbf{Flatten layers}, which just change the shape of the input so
+ that all the values are on one dimension.
+
\end{itemize}
+Combinations of these layers have been used to define two neural networks.
+
+First, a network using mainly dense layers as an example of a more general
+purpose design of a network.
+
+Then, a network with convolutional and max pooling layers to compare the
+approach used on image processing to the more general one and studying its
+utility on the analysis of the Go board.
+
+These networks have been implemented on the DenseNeuralNetwork and
+ConvNeuralNetwork classes, respectively.
+
+The networks have been designed to process boards of size 9x9, which is the
+introductory size to the game. It is the easiest both for the hardware to handle
+and for the analysis of results while keeping able to support meaningful
+matches.
+
+Both networks have the same design for their input and output.
+
+Their input is a three-dimensional matrix of size 9x9x2 with values either 0 or
+1. It represents two views of the board, one with ones as the stones of a player
+and the other with ones as the stones of the other player.
+
+Their output is a vector with 82 elements of type float. Classification networks
+typically use a vector of probabilities with one element for each class they are
+trying to classify. Here the classes are the 81 positions of the 9x9 board and
+the pass move, hence 82 total elements. Each element signifies the chance of
+playing that move for the input board position, so the element with the highest
+value represents the suggested move.
+
+\subsubsection{Dense Neural Network Design}
+
+\begin{listing}[h]
+ \inputminted{text}{listings/denseModel.txt}
+ \caption{Dense neural network model.}
+ \label{code:denseModel}
+\end{listing}
+
\begin{figure}[h]
\begin{center}
- \includegraphics[width=\textwidth]{diagrams/engineModule.png}
- \caption{Design of the GTP engine.}\label{fig:engine}
+ \includegraphics[width=0.7\textwidth]{img/models/denseModel.png}
+ \caption{Dense neural network model.}
+ \label{fig:denseNN}
\end{center}
\end{figure}
-\subsection{Modules}
+This network first uses two dense layers with 81 nodes each. This number has
+been selected so each node can have as input each of the vertices of the board.
+A flatten layer acts then to make the output one-dimensional, and a final dense
+layer provides the vector containing the likelihood of each possible move.
-One module to store the state of the game and the game tree. One module to parse
-moves. One module to read and write SGF files. Modules are shown in
-\fref{fig:modules}.
+The design of this network is shown in \flist{code:denseModel} and
+\fref{fig:denseNN} as provided by Keras' summary and plot\_model functions
+respectively.
+
+\subsubsection{Convolutional Neural Network Design}
+
+\begin{listing}[h]
+ \inputminted{text}{listings/convModel.txt}
+ \caption{Convolutional neural network model.}
+ \label{code:convModel}
+\end{listing}
\begin{figure}[h]
\begin{center}
- \includegraphics[width=\textwidth]{diagrams/modules.png}
- \caption{Modules.}\label{fig:modules}
+ \includegraphics[width=0.7\textwidth]{img/models/convModel.png}
+ \caption{Convolutional neural network model.}
+ \label{fig:convNN}
\end{center}
\end{figure}
-\subsection{Representation of a match}
+This network uses two pairs of convolutional and max pooling layers with the aim
+of being trained to recognize patterns on the board. A flatten layer acts then
+to make the output one-dimensional, and a final dense layer provides the vector
+containing the likelihood of each possible move.
-A regular Go match is composed of a list of moves. But since game review and
-variants exploration is an important part of Go learning, \program{} allows for
-navigation back and forth through the board states of a match and for new
-variants to be created from each of these board states. Therefore, a match is
-represented as a tree of moves. The state of the board at any given move must
-also be stored so liberties, captures count and legality of moves can be
-addressed, so it is represented with its own class, which holds a reference both
-to the game tree and the current move. Moves depend on a representation of the
-game board to have access to its current layout and count of captured stones.
-These classes and their relationships can be seen in
-\fref{fig:gameRepresentation}.
+The design of this network is shown in \flist{code:convModel} and
+\fref{fig:convNN} as provided by Keras' summary and plot\_model functions
+respectively.
+
+\subsection{Training}
\begin{figure}[h]
\begin{center}
- \includegraphics[width=0.7\textwidth]{diagrams/gameRepresentation.png}
- \caption{A game is represented as a tree of moves.}\label{fig:gameRepresentation}
+ \includegraphics[width=\textwidth]{diagrams/trainingModule.png}
+ \caption{Components of the SGF file parsing module.}
+ \label{fig:trainingModule}
+ Components not showing a capital C are not classes, as in they not
+ follow the object-oriented paradigm and do not implement any classes,
+ only functions.
\end{center}
\end{figure}
-\subsection{SGF}
+Neural networks can be powerful machine learning algorithms, but they have to be
+trained first so they can provide meaningful results. For a Go AI it makes sense
+to have its algorithms trained on Go games. There exists a common text format to
+store Go games: SGF. If the system is able to process SGF files, it can provide
+the games stored on them to the neural networks for training. And so the need
+for an SGF parser arises.
To parse SGF files a lexer and parser have been implemented using PLY.\@ The
result of the parsing is an AST (Annotated Syntax Tree) reflecting the contents
of the text input, each node with zero or more properties, and with the ability
-to convert themselves and their corresponding subtree into a GameTree. This is
-done for the root node, since from the SGF specification there are some
+to convert themselves and their corresponding subtree into a GameMove tree. This
+is done for the root node, since from the SGF specification there are some
properties only usable in the root node, like those which specify general game
-information and properties such as rank of players or komi. These components are
-shown in \fref{fig:sgfModule}.
+information and properties such as rank of players or komi.
-\begin{figure}[h]
- \begin{center}
- \includegraphics[width=\textwidth]{diagrams/sgfModule.png}
- \caption{Components of the SGF file parsing module.}\label{fig:sgfModule}
- \end{center}
-\end{figure}
+Here follows an explanation of the role and motivation before each component of
+the Training module to show how these previous concerns have been addressed and
+solved. These components are shown in \fref{fig:trainingModule}.
+
+\begin{itemize}
+
+ \item \textbf{SGF}: Provides a high-level method to convert a path to a SGF
+ file to a GameMove tree.
+
+ \item \textbf{sgfyacc}: The implementation of a SGF parser using PLY. Takes
+ the tokens generated by \textbf{sgflex} and creates an ASTNode tree from
+ them.
+
+ \item \textbf{sgflex}: The implementation of a SGF lexer using PLY.\@ Takes
+ text input and generates the tokens of the SGF language from them.
+
+ \item \textbf{ASTNode}: The AST resulting from the parsing of a SGF file.
+ Has a method to convert it to a tree of GameMove and so obtain the
+ contents of the SGF in the internal representation used by the project's
+ systems.
+
+ \item \textbf{Property}: The representation of a property of an ASTNode
+ tree. Each property is made of a name and one or more values and this
+ class helps handling this specific situation.
+
+The training can be started with the executable \texttt{train.py}.
+
+\end{itemize}
+
+%\subsection{Modules}
+%
+%One module to store the state of the game and the game tree. One module to parse
+%moves. One module to read and write SGF files. Modules are shown in
+%\fref{fig:modules}.
+%
+%\begin{figure}[h]
+% \begin{center}
+% \includegraphics[width=\textwidth]{diagrams/modules.png}
+% \caption{Modules.}\label{fig:modules}
+% \end{center}
+%\end{figure}