diff options
Diffstat (limited to 'doc/tex')
-rw-r--r-- | doc/tex/appendixes.tex | 42 | ||||
-rw-r--r-- | doc/tex/biber.bib | 26 | ||||
-rw-r--r-- | doc/tex/conclusions.tex | 19 | ||||
-rw-r--r-- | doc/tex/glossary.tex | 21 | ||||
-rw-r--r-- | doc/tex/imago.tex | 31 | ||||
-rw-r--r-- | doc/tex/implementation.tex | 234 | ||||
-rw-r--r-- | doc/tex/introduction.tex | 42 | ||||
-rw-r--r-- | doc/tex/planning.tex | 116 | ||||
-rw-r--r-- | doc/tex/results.tex | 112 | ||||
-rw-r--r-- | doc/tex/systemAnalysis.tex | 152 | ||||
-rw-r--r-- | doc/tex/systemDesign.tex | 240 |
11 files changed, 760 insertions, 275 deletions
diff --git a/doc/tex/appendixes.tex b/doc/tex/appendixes.tex index 89c413c..581f764 100644 --- a/doc/tex/appendixes.tex +++ b/doc/tex/appendixes.tex @@ -32,8 +32,6 @@ The costs are calculated based on a programmer salary of 20€/hour. } \end{table} -%TODO: Finish budgets - \subsubsection{Material resources} \begin{table}[h] @@ -49,3 +47,43 @@ The costs are calculated based on a programmer salary of 20€/hour. \end{table} \subsubsection{Totals} + +\begin{table}[h] + \makebox[\linewidth]{ + \begin{tabular}{l r} + \toprule + \textbf{Category} & \textbf{Cost (€)} \\ + \midrule + Work & 2500 \\ + \midrule + Materials & 600 \\ + \midrule + \textbf{Total} & \textbf{3100} \\ + \bottomrule + \end{tabular} + } +\end{table} + +\subsection{Budget for the client} + +\begin{table}[h] + \makebox[\linewidth]{ + \begin{tabular}{l r} + \toprule + \textbf{Task} & \textbf{Cost (€)} \\ + \midrule + Game preliminary research & 300 \\ + \midrule + Game implementation & 1100 \\ + \midrule + Game unit testing & 1000 \\ + \midrule + Game system testing & 100 \\ + \midrule + Materials & 600 \\ + \midrule + \textbf{Total} & \textbf{3100} \\ + \bottomrule + \end{tabular} + } +\end{table} diff --git a/doc/tex/biber.bib b/doc/tex/biber.bib index 324c0ad..a058686 100644 --- a/doc/tex/biber.bib +++ b/doc/tex/biber.bib @@ -54,6 +54,12 @@ url = {https://sabaki.yichuanshen.de} } +@online{tensorflow, + title = {TensorFlow}, + urldate = {2023}, + url = {https://www.tensorflow.org/} +} + @online{keras, title = {Keras: the Python deep learning API}, urldate = {2022}, @@ -117,6 +123,12 @@ url = {http://www.dabeaz.com/ply} } +@online{git, + title = {Git}, + urldate = {2023}, + url = {https://git-scm.com/} +} + @online{vim, title = {welcome home : vim online}, urldate = {2022}, @@ -147,6 +159,13 @@ url = {https://senseis.xmp.net/?9x9TengenOpenings} } +@book{choLifeAndDeath, + author = {Cho Chikun}, + title = {Cho Chikun Life and Death Dictionary}, + year = {1996}, + editor = {Nihon Ki-in (the Japanese Go Association)} +} + @online{ogsLifeAndDeath, author = {sunspark}, title = {Cho Chikun's Encyclopedia of Life and Death - Elementary: 1 / 900}, @@ -162,3 +181,10 @@ urldate = {2022}, url = {https://www.gnu.org/philosophy/free-sw.html} } + +@online{arch, + title = {Arch Linux}, + date = {2023}, + urldate = {2023}, + url = {https://archlinux.org/} +} diff --git a/doc/tex/conclusions.tex b/doc/tex/conclusions.tex index 16118e0..a6bc434 100644 --- a/doc/tex/conclusions.tex +++ b/doc/tex/conclusions.tex @@ -6,8 +6,8 @@ could be done in the future to improve the system are discussed here. \subsection{Problems with the Implementation of the Game} While Go has a simple set of rules, they lead to many borderline cases which -must be specifically addressed. For example, the \gls{ko} rule obliges to check the -previous board positions after each move so no boards are repeated. +must be specifically addressed. As an example, the \gls{ko} rule obliges to +check the previous board positions after each move so no boards are repeated. These problems have been solved by designing the tree of moves of a match so each move stores the full board layout after it has been played. This of course @@ -20,8 +20,8 @@ the best solution would require a deep analysis out of the scope of the project. \subsection{Problems with the Monte Carlo Tree Search Algorithm} -The implementation Monte Carlo Tree Search algorithm was not precise by itself -when asked for moves. This could be attributed to various factors: +The implemented Monte Carlo Tree Search algorithm was not precise by itself when +asked for moves. This could be attributed to various factors: \begin{itemize} @@ -31,7 +31,7 @@ when asked for moves. This could be attributed to various factors: are not precise on analyzing the result since the moves in them are played at random. - \item The configuration used, 5 explorations with 10 simulations each for + \item The configuration used, 5 explorations with 10 simulations each per cycle, does not allow for many good moves to be found. \end{itemize} @@ -48,7 +48,7 @@ on this section. \subsubsection{Finding the correct design} -When approaching neural networks design with a basic understanding of their +When approaching neural networks design with just basic understanding of their inner workings it is easy to get lost in all the different factors that can be tuned. The number and variant of layers, their number of nodes, their activation function, the learning rate; they can all be tweaked and doing so in the favour @@ -93,7 +93,8 @@ make reasonable decisions at the start of the game, with some interesting variations. But as the game progresses their moves make less sense. This could be because they have seen many examples of starting moves but complications at the middle and end game are much more diverse and the networks had not been -trained on similar situations. +trained on similar situations. These stages of the game are probably where a +strong Monte Carlo Tree Search algorithm would help the most. \subsection{Viable Extensions} @@ -109,8 +110,8 @@ simpler. The logical next step would be to join both: the network could select the best moves to be made, and the tree search algorithm could explore them to select the best and grow its explored tree in meaningful directions. -This is also what was done by AlphaGo. By seeing the results obtained in this -project it makes sense they made that decision. +This is also what was done by AlphaGo. Going by the results obtained in this +project it makes sense they went with that design. \subsubsection{Train the Engine by Itself} diff --git a/doc/tex/glossary.tex b/doc/tex/glossary.tex index b1cea02..d14dc02 100644 --- a/doc/tex/glossary.tex +++ b/doc/tex/glossary.tex @@ -22,8 +22,8 @@ \newglossaryentry{suicide} { name=suicide move, - description={A typically prohibited move which would cause the player's - group to have zero liberties} + description={A typically prohibited move which doesn't capture any stone + and results in the player's group having zero liberties} } \newglossaryentry{tsumego} @@ -32,5 +32,22 @@ description={A Go problem} } +\newglossaryentry{criticalPoint} +{ + name=critical point, + description={A position on the board which makes a group either live or die + depending on who plays there first} +} + +\newglossaryentry{eye} +{ + name=eye, + description={An area surrounded by a group. If a group has at least two eyes + it can never be captured, since the opponent would have to play in all the + eyes simultaneously to capture it} +} + +\newacronym{ai}{AI}{Artificial Intelligence} \newacronym{sgf}{SGF}{Smart Game Format} \newacronym{gtp}{GTP}{Go Text Protocol} +\newacronym{ast}{AST}{Annotated Syntax Tree} diff --git a/doc/tex/imago.tex b/doc/tex/imago.tex index 0447d70..f8e3101 100644 --- a/doc/tex/imago.tex +++ b/doc/tex/imago.tex @@ -151,8 +151,8 @@ inclusion of to use this template is included here. \clearpage -%\input{tex/introduction.tex} -%\clearpage +\input{tex/introduction.tex} +\clearpage \input{tex/planning.tex} \clearpage @@ -166,16 +166,29 @@ inclusion of to use this template is included here. \input{tex/implementation.tex} \clearpage -%\input{tex/results.tex} -%\clearpage +\input{tex/results.tex} +\clearpage + +\input{tex/appendixes.tex} +\clearpage + +\input{tex/conclusions.tex} +\clearpage + +% Glossary and acronyms -%\input{tex/appendixes.tex} -%\clearpage +\printglossary[title=Glossary of Go Terminology] +\makeatletter +\def\@currentlabelname{\@glotype@main@title} +\label{glossary} +\makeatother -%\input{tex/conclusions.tex} -%\clearpage +\printglossary[type=\acronymtype] +\makeatletter +\def\@currentlabelname{\@glotype@acronymtype@title} +\label{acronyms} +\makeatother -\printglossaries \clearpage % References (bibliography) diff --git a/doc/tex/implementation.tex b/doc/tex/implementation.tex index f89e54c..73e4cfd 100644 --- a/doc/tex/implementation.tex +++ b/doc/tex/implementation.tex @@ -1,8 +1,11 @@ \section{Implementation} +This section is about the specific tools and process used to implement +\program{}. + \subsection{Development Hardware} -The development and evaluation machine is a Satellite L50-C laptop. Its +The development and evaluation machine is a Toshiba Satellite L50-C laptop. Its specifications are shown as a list for readability. \begin{itemize} @@ -15,9 +18,9 @@ specifications are shown as a list for readability. \subsection{Development Software} The tools selected for the development of the project and the documentation are -listed and explained on this section. All the tools used are either -free \cite{fsf_free} or open source software. The development machine runs 64 -bits Arch Linux as its operating system. +listed and explained on this section. All the tools used are either free +\cite{fsf_free} or open source software. The development machine runs 64 bits +Arch Linux \cite{arch} as its operating system. \subsubsection{Language} @@ -28,58 +31,58 @@ allows easy use of the Keras library for implementing neural networks. Various Python libraries have been used to easy the development process or assist in the analysis of results. These are: -\paragraph{Keras/Tensorflow \cite{keras}} +\paragraph{Keras/TensorFlow} -Tensorflow is a platform for machine learning which provides a diverse range of -tools, one of which is a Python library for machine learning. +TensorFlow \cite{tensorflow} is a platform for machine learning which provides a +diverse range of tools, one of which is a Python library for machine learning. -Keras is a high-level API for Tensorflow allowing for the easy definition of -neural networks. It permits easily testing and comparing different network -layouts. +Keras \cite{keras} is a high-level API for TensorFlow allowing for the easy +definition of neural networks. It permits easily testing and comparing different +network layouts. -\paragraph{NumPy \cite{numpy}} +\paragraph{NumPy} -A scientific package for Python providing a lot of mathematical tools. The most -interesting for this project are its capabilities to create and transform -matrices. +A scientific package for Python providing a lot of mathematical tools +\cite{numpy}. The most interesting for this project are its capabilities to +create and transform matrices. -\paragraph{Matplotlib \cite{matplotlib}} +\paragraph{Matplotlib} -A Python library for creating graphs and other visualizations. It is used to -show the likelihood of moves the neural networks of the project create from a -board configuration. +A Python library for creating graphs and other visualizations \cite{matplotlib}. +It is used to show the likelihood of moves the neural networks of the project +create from a board configuration. -\paragraph{PLY \cite{ply}} +\paragraph{PLY} -A tool for generating compilers in Python. It is an implementation of the lex -and yacc utilities, allowing to create lexers and parsers. It is used in the -project to create the \acrshort{sgf} parser which transform \acrshort{sgf} files to internal -representations of Go matches. +A tool for generating compilers in Python \cite{ply}. It is an implementation of +the lex and yacc utilities, allowing to create lexers and parsers. It is used in +the project to create the \acrshort{sgf} parser which transform \acrshort{sgf} +files to internal representations of Go matches. \paragraph{Other utility libraries} -These are some utility libraries commonly used for frequent programming tasks: +These are some utility libraries commonly used for frequent programming tasks. \begin{itemize} - \item \textbf{sys}: to stop the execution of the program or access system info such + \item \textbf{sys}: To stop the execution of the program or access system info such as primitives maximum values. - \item \textbf{os}: to interact with files. - \item \textbf{re}: to check strings with regular expressions. - \item \textbf{random}: to get random values, for example to obtain a random - item from a list. - \item \textbf{copy}: to obtain deep copies of multidimensional arrays. + \item \textbf{os}: To interact with files. + \item \textbf{re}: To check strings with regular expressions. + \item \textbf{random}: For randomness, for example to obtain a random item + from a list. + \item \textbf{copy}: To obtain deep copies of multidimensional arrays. \end{itemize} \subsubsection{Development Tools} -\paragraph{Neovim \cite{neovim}} +\paragraph{Neovim} A text editor based on Vim \cite{vim}, providing its same functionality with -useful additions and defaults for modern computers and terminal emulators. With -some extensions and configuration it can become a powerful development -environment with a very fluid usage experience. That, and the fact that the -developer considers Vim's modal editing the best writing experience possible on -a computer, have made Neovim the editor of choice. +useful additions and defaults for modern computers and terminal emulators +\cite{neovim}. With some extensions and configuration it becomes a powerful +development environment with a very fluid usage experience. That, and the +preference of the developer of Vim's modal editing as the best writing +experience possible on a computer, have made Neovim the editor of choice. %TODO: Write about neovim extensions %\begin{itemize} @@ -88,21 +91,28 @@ a computer, have made Neovim the editor of choice. % %\end{itemize} +\paragraph{Git} + +A version control tool widely used in software development environments +\cite{git}. If the reader is not already familiar with it, it suffices to say it +allows to store and manage snapshots of a project, navigate through them and +diverge into different development branches, among other useful features. + \subsubsection{Documentation Tools} -\paragraph{\LaTeX \cite{latex}} +\paragraph{\LaTeX} -A typesetting system widely used in the investigation field, among others. It -allows for documentation like this text to be written in plain text and then -compiled to PDF or other formats, which permits keeping the source files of the -documentation small and organized plus other benefits of plain text such as -being able to use version control. +A typesetting system widely used in the investigation field, among others +\cite{latex}. It allows for documentation like this text to be written in plain +text and then compiled to PDF or other formats, which permits keeping the source +files of the documentation small and organized plus other benefits of plain text +such as being able to be used in conjunction with version control tools. -\paragraph{PlantUML \cite{puml}} +\paragraph{PlantUML} -A program which creates diagrams from plain text files. PlantUML supports syntax -for many different sorts of diagrams, mainly but not only UML. It has been used -to generate the diagrams used in this text. +A program which creates diagrams from plain text files \cite{puml}. PlantUML +supports syntax for many different sorts of diagrams, mainly but not only UML. +It has been used to generate the diagrams used in this text. \paragraph{Make} @@ -115,7 +125,137 @@ It has been used to generate this text from \LaTeX{} and PlantUML source files. The contents of the Makefile with which this document has been compiled are shown in \lref{code:makefile}. -\begin{listing}[h] +\begin{listing}[p] \inputminted{make}{Makefile} \caption{Documentation Makefile}\label{code:makefile} \end{listing} + +\subsection{Execution of the Testing Plan} + +Part of the implementation process is to execute the testing plan. The results +of this execution are provided in this section. + +\subsubsection{Execution of the Unitary Testing} + +The script used to run the tests is shown on \lref{lst:test} and its output on +\lref{lst:testOutput}. + +\begin{listing}[p] + \inputminted{bash}{listings/test.sh} + \caption{Script to run the tests and list the result.} + \label{lst:test} +\end{listing} + +\begin{listing}[p] + \inputminted[fontsize=\footnotesize]{text}{listings/testOutput.txt} + \caption{Unitary testing output.} + \label{lst:testOutput} +\end{listing} + +\subsubsection{Execution of the Integration Testing} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.2\linewidth}p{0.3\linewidth}p{0.4\linewidth}} + \toprule + \multicolumn{3}{c}{\textbf{Engine and Game modules}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} & \textbf{Results} \\ + \midrule + The GTP interface of the engine is used to play a match & + The module handles the game and can show its state. & + The engine runs correctly and is capable of keeping track of and showing the + state of a match, which shows it is making good use of the Game module. \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.2\linewidth}p{0.3\linewidth}p{0.4\linewidth}} + \toprule + \multicolumn{3}{c}{\textbf{Training and Engine module}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} & \textbf{Results} \\ + \midrule + The training process is started & + The training uses the network defined on the Engine module. & + The training output shows the network in training follows the design defined + on the Engine module. \\ + \bottomrule +\end{tabular} + +\subsubsection{Execution of the System Testing} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.2\linewidth}p{0.3\linewidth}p{0.4\linewidth}} + \toprule + \multicolumn{3}{c}{\textbf{Game interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} & \textbf{Results} \\ + \midrule + Play a game of Go with no engine & + The game can be played until the end. & + The interface continues asking for moves until both players pass + consecutively, which ends the game and the execution. \\ + \midrule + Provide a wrong move & + The interface shows it is wrong and the game continues without a change of + state. & + As expected, the interface shows a message signaling that the move is wrong + and showing an example of a move. \\ + \midrule + Close the game & + The interface closes. & + A message is shown signaling that the game is ending and the engine closes. + \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.2\linewidth}p{0.3\linewidth}p{0.4\linewidth}} + \toprule + \multicolumn{3}{c}{\textbf{Engine interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} & \textbf{Results} \\ + \midrule + Ask for the available commands & + The interface outputs the available commands. & + The list of available commands is printed one per line. \\ + \midrule + Provide a move & + The state of the engine updates with the new move. & + No output is given, but after providing the \texttt{showboard} command it is + shown that the move has been registered. \\ + \midrule + Ask for a move & + The engine suggests a move without changing the state of the current game. & + After the necessary time for generating the move, it is printed. The + \texttt{showboard} is then used to check the state of the game has not + changed. \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.2\linewidth}p{0.3\linewidth}p{0.4\linewidth}} + \toprule + \multicolumn{3}{c}{\textbf{Training interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} & \textbf{Results} \\ + \midrule + Provide some games to train on & + A neural network model is created. & + First, all the training files used are printed, and then, as expected, the + training process commences. A model is created or updated in the + \texttt{models} folder. \\ + \midrule + Start the training without providing games & + An error message is shown and the execution terminated. & + As expected, a message is shown asking for SGF files to be provided as + arguments and the execution is terminated.\\ + \bottomrule +\end{tabular} + +\subsubsection{Usability Testing} diff --git a/doc/tex/introduction.tex b/doc/tex/introduction.tex index 8b5430c..22dd98a 100644 --- a/doc/tex/introduction.tex +++ b/doc/tex/introduction.tex @@ -27,7 +27,7 @@ As one of the deepest and most studied games in the world, Go presents a very interesting problem for artificial intelligence. Implementing not only the game's simple but subtle rules, but a system capable of playing it with a satisfying level of skill, is a task worth of pursuing as an exercise on -software design, algorithmics and AI research. +software design, algorithmics and \acrfull{ai} research. On the practical level, this project can be a foundation for the development of different Go analysis algorithms by providing an existing engine to house them, @@ -43,7 +43,43 @@ Presented here are the ideal targets of the project. game's rules. \item An engine capable of analyzing board positions and generating strong moves via various decision algorithms. - \item Compatibility with existing GUIs. + \item An interface compatible with existing GUIs. \item A way for processing existing records of games, which are usually - recorded in the \acrshort{sgf} format. + recorded in the \acrfull{sgf}. +\end{itemize} + +\subsection{Rules of Go} + +Some understanding of the basics of the game is necessary to process this +document. Luckily for the reader, the rules of Go are pretty simple. If the +reader prefers, there is an interactive tutorial at +\texttt{https://online-go.com/learn-to-play-go/} going over the fundamentals and +introducing basic strategy for managing the stones which is already useful and +needed for the first games. Either way, the rules are sumarized as follows: + +\begin{itemize} + +\item There are two players. One plays as black, the other as white. Black plays + first. + +\item The player with the biggest score when the game ends wins. The score + consists of surrounded territory and captured enemy stones. Surrounded + territory is defined as the areas of empty space connected orthogonally + only to stones of one color. Each empty space on a surrounded area and each + captured enemy stone score one point. + +\item As their turn, a player can either place a stone of their color in an + empty space of the board or pass. The game ends when both players pass + consecutively. + +\item Stones of the same color orthogonally adjacent to one another are + considered connected. When one group of connected stones has no more + orthogonally adjacent empty spaces it is considered as captured and its + stones are removed from the board. + +\item Additionally, to prevent endlessly repeating plays, it is forbidden to + make a move which resets the board to the previous position. This is called + the \Gls{ko} rule, is of strategic relevance outside the scope of a basic + introduction to the game, and doesn't always come up. + \end{itemize} diff --git a/doc/tex/planning.tex b/doc/tex/planning.tex index 17d413f..4057268 100644 --- a/doc/tex/planning.tex +++ b/doc/tex/planning.tex @@ -12,7 +12,8 @@ components and needs. The rules of the game must be implemented, ideally in a way they can be tested by direct human play. This system will at its bare minimum represent the -Japanese Go rules (area scoring, no \gls{superko} rule, no \gls{suicide} moves). +Japanese Go rules (area scoring, no \gls{superko} rule, no \gls{suicide} moves. +These terms are explained in the \nameref{glossary} of this document). \subsubsection{Engine Implementation} @@ -70,6 +71,9 @@ hours a day on weekdays this amounts to 375 hours. \subsection{Previous Works} +This section lists and describes the existing projects which inspired and can be +of use in the development of \program{}. + \subsubsection{Existing Engines} \begin{figure}[h] @@ -82,19 +86,19 @@ hours a day on weekdays this amounts to 375 hours. \end{center} \end{figure} -\paragraph{AlphaGo \cite{alphaGo}} +\paragraph{AlphaGo} -A Go play and analysis engine developed by DeepMind Technologies, a company -owned by Google. It revolutionized the world of Go in 2015 and 2016 when it -respectively became the first AI to win against a professional Go player and -then won against Lee Sedol, a Korean player of the highest professional rank and -one of the strongest players in the world at the time. Its source code is -closed, but a paper written by the team has been published on Nature -\cite{natureAlphaGo2016}. The logo of the project is shown on -\fref{fig:alphaGoLogo}. +A Go play and analysis engine \cite{alphaGo} developed by DeepMind Technologies, +a company owned by Google. It revolutionized the world of Go in 2015 and 2016 +when it respectively became the first \acrshort{ai} to win against a +professional Go player and then won against Lee Sedol, a Korean player of the +highest professional rank and one of the strongest players in the world at the +time. Its source code is closed, but a paper written by the team has been +published on Nature \cite{natureAlphaGo2016}. The logo of the project is shown +on \fref{fig:alphaGoLogo}. -The unprecedented success of AlphaGo served as inspiration for many AI projects, -including this one. +The unprecedented success of AlphaGo served as inspiration for many +\acrshort{ai} projects, including this one. \begin{figure}[h] \begin{center} @@ -105,12 +109,12 @@ including this one. \end{center} \end{figure} -\paragraph{KataGo \cite{katago}} +\paragraph{KataGo} -An open source project based on the AlphaGo paper that also achieved superhuman -strength of play. The availability of its implementation and documentation -presents a great resource for this project. The logo of the project is shown on -\fref{fig:kataGoLogo}. +An open source project \cite{katago} based on the AlphaGo paper that also +achieved superhuman strength of play. The availability of its implementation and +documentation presents a great resource for this project. The logo of the +project is shown on \fref{fig:kataGoLogo}. \begin{figure}[h] \begin{center} @@ -121,22 +125,22 @@ presents a great resource for this project. The logo of the project is shown on \end{center} \end{figure} -\paragraph{GnuGo \cite{gnugo}} +\paragraph{GnuGo} -A software capable of playing Go and part of the GNU project. Although not a -strong engine anymore, it is interesting for historic reasons as the free -software engine for which the \acrfull{gtp} was first defined. The logo of the -project is shown on \fref{fig:gnuGoLogo}. +A software \cite{gnugo} capable of playing Go and part of the GNU project. +Although not a strong engine anymore, it is interesting for historic reasons as +the free software engine for which the \acrfull{gtp} was first defined. The logo +of the project is shown on \fref{fig:gnuGoLogo}. \subsubsection{Existing Standards} -\paragraph{\acrshort{gtp} \cite{gtp}} +\paragraph{\acrshort{gtp}} -The \acrfull{gtp} is a text based protocol for communication with computer Go -programs. It is the protocol used by GNU Go and the more modern and powerful -KataGo. By supporting \acrshort{gtp} the engine developed for this project can -be used with existing GUIs and other programs, making it easier to be used with -the tools target users are already familiar with. +The \acrfull{gtp} \cite{gtp} is a text based protocol for communication with +computer Go programs. It is the protocol used by GNU Go and the more modern and +powerful KataGo. By supporting \acrshort{gtp} the engine developed for this +project can be used with existing GUIs and other programs, making it easier to +be used with the tools target users are already familiar with. %TODO %\begin{listing}[h] @@ -145,16 +149,16 @@ the tools target users are already familiar with. % \label{lst:gtpExample} %\end{listing} -\paragraph{\acrshort{sgf} \cite{sgf}} +\paragraph{\acrshort{sgf}} -The \acrfull{sgf} is a text format widely used for storing records of Go matches -which allows for variants, comments and other metadata. It was devised for Go -but it supports other games with similar turn-based structure. Many popular -playing tools use it. By supporting \acrshort{sgf} vast existing collections of -games, such as those played on online Go servers, can be used to train the -decision algorithms based on neural networks. An example of a \acrshort{sgf} -file can be seen on \lref{lst:sgfExample}. The game state described in this file -is shown visually in \fref{fig:sgfExample}. +The \acrfull{sgf} \cite{sgf} is a text format widely used for storing records of +Go matches which allows for variants, comments and other metadata. It was +devised for Go but it supports other games with similar turn-based structure. +Many popular playing tools use it. By supporting \acrshort{sgf} vast existing +collections of games, such as those played on online Go servers, can be used to +train the decision algorithms based on neural networks. An example of a +\acrshort{sgf} file can be seen on \lref{lst:sgfExample}. The game state +described in this file is shown visually in \fref{fig:sgfExample}. \begin{listing}[h] \inputminted[breakafter=\]]{text}{listings/sgfExample.sgf} @@ -168,9 +172,9 @@ is shown visually in \fref{fig:sgfExample}. \begin{center} \includegraphics[width=0.5\textwidth]{img/sgfExample.png} \caption{Screenshot of Sabaki showing the \gls{tsumego} described in the - \acrshort{sgf} example from \lref{lst:sgfExample}. Note that Sabaki - marks the two continuations described in the \acrshort{sgf} file with - two transparent grey dots. + \acrshort{sgf} example from \lref{lst:sgfExample}. Note that the two + continuations described in the \acrshort{sgf} file are marked with two + transparent grey dots. }\label{fig:sgfExample} \end{center} \end{figure} @@ -184,12 +188,12 @@ is shown visually in \fref{fig:sgfExample}. \end{center} \end{figure} -\subsubsection{Sabaki \cite{sabaki}} +\subsubsection{Sabaki} -Sabaki is a Go board software compatible with \acrshort{gtp} engines. It can -serve as a GUI for the engine developed in this project and as an example of the -advantages of following a standardized protocol. Part of its graphical interface -is shown on \fref{fig:sabaki}. +Sabaki \cite{sabaki} is a Go board software compatible with \acrshort{gtp} +engines. It can serve as a GUI for the engine developed in this project and as +an example of the advantages of following a standardized protocol. Part of its +graphical interface is shown on \fref{fig:sabaki}. \begin{figure}[h] \begin{center} @@ -200,14 +204,24 @@ is shown on \fref{fig:sabaki}. \end{center} \end{figure} -\subsubsection{Keras \cite{keras}} +\subsubsection{Keras} + +Keras \cite{keras} is a deep learning API for Python allowing for the high-level +definition of neural networks. This permits easily testing and comparing +different network layouts. The logo of the project is shown on +\fref{fig:kerasLogo}. -Keras is a deep learning API for Python allowing for the high-level definition -of neural networks. This permits easily testing and comparing different network -layouts. The logo of the project is shown on \fref{fig:kerasLogo}. +\subsubsection{PLY} + +PLY \cite{ply} is a Python implementation of the compiler construction tools lex +and yacc. It will be useful for implementing the \acrshort{sgf} parser needed to +process records of Go matches. \subsection{Technological Infrastructure} +This section establishes the technological needs of the project and proposes +solutions to them. + \subsubsection{Programming Language}\label{sec:programmingLanguage} The resulting product of this project will be one or more pieces of software @@ -217,11 +231,11 @@ choice is Python \cite{python}, for various reasons: \begin{itemize} \item It has established a reputation on scientific fields and more - specifically on AI research and development. + specifically on \acrshort{ai} research and development. \item Interpreters are available for many platforms, which allows the most people possible to access the product. \item Although not very deeply, it has been used by the developer student - during its degree including in AI and game theory contexts. + during its degree including in \acrshort{ai} and game theory contexts. \end{itemize} diff --git a/doc/tex/results.tex b/doc/tex/results.tex index fac2df9..e166873 100644 --- a/doc/tex/results.tex +++ b/doc/tex/results.tex @@ -30,13 +30,16 @@ \section{Results} +This section shows an analysis of the successes and failures of the final +product and the strength of play of its various algorithms. + \subsection{Monte Carlo Tree Search Evaluation} The Monte Carlo Algorithm tries to explore the tree of possibilities as efficiently as possible. With this approach, it can be expected to fail when -alone on a problem such big as the game of Go. Nonetheless, there are some areas -where it can be useful. It will be evaluated by its capabilities while playing -games but also when presented with Go problems. +alone on a problem such as big as the game of Go. Nonetheless, there are some +areas where it can be useful. It will be evaluated by its capabilities while +playing games but also when presented with Go problems. The Monte Carlo algorithm has been set to do 5 explorations with 10 simulations each when it is asked for a move. In the hardware used this makes it think for @@ -57,7 +60,8 @@ Some moves could be debated as sensible, but many of them are on the second or fist line, which so early on the game are the two worst places to play at. It seems clear that this algorithm, at least with this specific configuration, -can't handle the size of an empty Go board, even on the 9x9 size. +can't handle the tree of possibilities of an empty Go board even at the 9x9 +size. A record of the game is shown in \fref{fig:mctsVSmcts}. @@ -72,56 +76,63 @@ A record of the game is shown in \fref{fig:mctsVSmcts}. Since tree exploration on smaller boards or advanced games with little empty spaces should be easier the algorithm has also been tested on some Go problems. -A Go problem or \gls{tsumego} is a predefined layout of the board, or of a section of -the board, for which the player must find some beneficial move. Life and death -problems are a subset of \gls{tsumego}s in which the survival of a group depends on -finding the correct sequence to save or kill the group. One collection of such -\gls{tsumego}s is \textit{Cho Chikun's Encyclopedia of Life and Death}, part of which -are available on OGS \cite{ogsLifeAndDeath}, an online Go server. +A Go problem or \gls{tsumego} is a predefined layout of the board, or of a +section of the board, for which the player must find some beneficial move. Life +and death problems are a subset of \gls{tsumego}s in which the aim is to find +the correct sequence to save or kill a group. One collection of such +\gls{tsumego}s is \textit{Cho Chikun Life and Death Dictionary} +\cite{choLifeAndDeath}, part of which are available on OGS +\cite{ogsLifeAndDeath}, an online Go server. The first of these problems and what the algorithm suggested as moves is shown in \fref{fig:mctsProblem01}. Black makes the first move, which means the solution is to find some favorable outcome for black, which in this case is killing the white group. The white -group has a critical point on B1. If white plays on B1 they make two eyes and -live, but if black plays there first white can't make two eyes and dies, so B1 -is the solution to the \gls{tsumego}. This is one of the easiest Go problems. +group has a \gls{criticalPoint} on B1. If white plays on B1 they make two +\glspl{eye} and live, but if black plays there first white can't make two +\glspl{eye} and dies, so B1 is the solution to the \gls{tsumego}. This is one of +the easiest Go problems. The algorithm neglects this solution. While asked five times to generate a move for the starting position it suggested B1 only once. But notably, after making another move, it consistently suggested B1 for white, -which is the solution now that white has to play. So in the end it was able to -solve the \gls{tsumego}, probably because after making a move it had already explored -part of the tree but it was difficult that it explored the solution for the -first move. +which is now the correct move for white. So in the end it was able to solve the +\gls{tsumego}, probably because after making a move it had already explored part +of the tree but it was difficult that it explored the solution for the first +move, that is, in its first exploration cycle. -The engine was tested against other \gls{tsumego}s but it was not able to solve them, -so no more are shown here. +The engine was tested against other \gls{tsumego}s but it was not able to solve +them, so no more are shown here. This was the one simple enough for an engine of +this characteristics and running on the testing hardware to solve. \subsection{Neural Network Training} -Each network has been training by receiving batches of \acrshort{sgf} files which are first -converted to lists of moves by the Training System and then provided to the -train function of the networks. This has been executed with the following -command: +Each network has been training by receiving batches of \acrshort{sgf} files +which are first converted to lists of moves by the Training System and then +provided to the train function of the networks. This has been executed with the +following command: \inputminted[fontsize=\small]{bash}{listings/trainCommand.sh} -Which lists the contents of a folder containing multiple \acrshort{sgf} files, shuffles the -list, takes some of them and passes them to train.py as arguments. The combined -list of game positions and moves made from them in all the games selected make -up the sample of the training. The number of files selected can of course be -adapted to the situation. +Which lists the contents of a folder containing multiple \acrshort{sgf} files, +shuffles the list, takes some of them and passes them to train.py as arguments. +The combined list of game positions and moves made from them in all the games +selected make up the sample of the training. The number of files selected can of +course be adapted to the situation. The networks were configured to train for 20 epochs, that is, processing the full batch and updating the weights on the network 20 times. 10\% of the sample -were used as validation. This means they were not used for training but to -check the accuracy and loss of the network after training with the rest of -the batch. This technique of validation can help detect overfitting, which -happens when a network does a very good job on the population it has been -trained on but it fails when receiving unseen data. +was used as validation. This means they were not used for training but to check +the accuracy and loss of the network after training with the rest of the batch. +This technique of validation can help detect overfitting, which happens when a +network does a very good job on the population it has been trained on but it +fails when receiving unseen data. + +The outputs from this training process can be seen on \lref{code:denseTraining} +and \lref{code:convTraining} for the dense network and the convolutional +network respectively. The training shows loss decrementing and accuracy incrementing on each epoch for both training and validation samples, which suggests there is learning happening @@ -134,10 +145,6 @@ as with batches of 10 games an epoch of training could be completed in one minute, in three minutes batches of 20 games, and in up to an hour with batches of 100 games. -The outputs from this training process can be seen on \lref{code:denseTraining} -and \lref{code:convTraining} for the dense network and the convolutional -network respectively. - \begin{listing}[h] \inputminted[fontsize=\scriptsize]{text}{listings/denseTraining.txt} \caption{Dense neural network training log.} @@ -163,8 +170,8 @@ of predicting human moves through the training epochs. Another way is to make the network play either against itself, another network, or a human player, and analyze the score it provides to each possible play. The -output of the network is a vector with the likelihood of making each move, so -we can take this values as how strongly the engine suggests each of them. +output of the network is a vector with the likelihood of making each move, so we +can take these values as how strongly the engine suggests each of them. \subsection{Neural Networks Against Themselves} @@ -185,10 +192,10 @@ move), can be seen on Figs.~\ref{fig:denseVSdense01}, \ref{fig:denseVSdense02}, The dense network starts on the center of the board, which is one of the standard openings in the 9x9 board. It starts on a very good track, but we must acknowledge that the empty board is a position present on every Go match it has -trained on and so it should know it well. It probably means the center was the -most played opening in the sample. It is interesting to check the heatmap of -this move, since the selected move has only a score of 0.27. Other common -openings on the 9x9 are represented on the vertices with some significant score. +trained on, so it should know it well. It probably means the center was the most +played opening in the sample. It is interesting to check the heatmap of this +move, since the selected move has only a score of 0.27. Other common openings on +the 9x9 are represented on the vertices with some significant score. The heatmap of the response to the center play is also interesting. The four highest scored moves are equivalent, since the first move has not broken any @@ -201,12 +208,13 @@ second black move is suggested with more likelihood than the previous one. This is also a very common sequence of play. The following moves are a white attack followed by black's sensible response, -and a strong white extension (arguably to the wrong side, but the minutia of why +and a strong white extension (arguably to the wrong side, but the details of why the other side would be better is probably not grasped by the engine). The -following moves could be called looking for complication, which is a strategy. +following moves could be classified as ``looking for complication'', which is a +strategy. -Overall some moves could be criticised but the engine is showing at least a -basic knowledge of the game's strategy. +Overall some moves can be criticised but the engine is showing at least a basic +knowledge of the game's strategy. \subsubsection{Convolutional Network Against Itself} @@ -219,10 +227,10 @@ more certain of the move than the dense network. Its next two plays are also the same, but the match starts to diverge from the fourth move. Instead of attacking the black stone on the side, white plays directly under at -B5, keeping the edge of symmetry. This is strategically important because while -symmetry is present playing on either side makes no difference, while playing -after the opponent has broken symmetry gives the opportunity to respond and -capitalize on the now better side to play. +B5, keeping the edge of symmetry. This is strategically important because as +long as symmetry is present playing on either side makes no difference, while +playing after the opponent has broken symmetry gives the opportunity to respond +and capitalize on the now better side to play. Black responds to the white stone by attacking it at B4 and white then extends to the more open side with F7. @@ -233,7 +241,7 @@ there states that white should have extended to the other side. The following moves are some sensible continuations, with approaches and responses on each side. They are probably not the best moves, but the idea -behind them can be read from the board. +behind them can be grasped from the board. \subsubsection{Dense Network Against Convolutional} diff --git a/doc/tex/systemAnalysis.tex b/doc/tex/systemAnalysis.tex index 5263d1f..811a8f3 100644 --- a/doc/tex/systemAnalysis.tex +++ b/doc/tex/systemAnalysis.tex @@ -1,5 +1,9 @@ \section{System Analysis} +This section provides a preliminary analysis of the needs of the project, such +as its requirements, components and use cases. It also includes the +specification of the testing plan. + \subsection{System Reach Determination} These are the main goals the final product must reach. @@ -12,8 +16,8 @@ These are the main goals the final product must reach. \item A library for representing the game of Go. It can be used for the decision algorithms to keep the state of the game and can also be used - in an interactive application for a user to play the game and test the - code. + in an interactive application for a human user to play the game and test + the code. \item An engine program as a way of presenting an interface for using these algorithms. The engine will use the \acrshort{gtp} so it can be used with an @@ -50,7 +54,7 @@ requisites needed for the system. coordinates of the vertex to play on or as ``pass''. \begin{enumerate} \item The text introduced for the move must follow the - regular expression \texttt{([A-Z][0-9]+|pass)} + regular expression \texttt{([A-HJ-Z][0-9]+)|(pass)} \item If the move is not valid it must be notified to the user and another move asked for. \end{enumerate} @@ -149,8 +153,8 @@ requisites needed for the system. \item The trainer can import existing games. \begin{enumerate} \item Records of games stored as \acrshort{sgf} can be imported. - \item Files containing records of games are provided as arguments to - the trainer. + \item Files containing records of games can be provided as arguments + to the trainer. \end{enumerate} \end{enumerate} @@ -178,13 +182,13 @@ requisites needed for the system. \begin{enumerate} \item For understanding the workings of the application the user needs to be - familiar with the basics of the game of Go. + familiar with the rules of the game of Go. \item For directly using the engine the user needs to be familiar with command line interfaces. - \item For directly using the trainer the user needs to know the different - network models available. + \item For directly using the trainer the user needs to know about the + different network models available. \end{enumerate} @@ -194,7 +198,7 @@ requisites needed for the system. \begin{enumerate} - \item The game program will be a Python file able to be executed by the + \item The game program will be a Python script able to be executed by the Python interpreter. \item The game program will make use of standard input and standard output @@ -205,7 +209,7 @@ requisites needed for the system. \item Standard output will be used for messages directed to the user. \end{enumerate} - \item The engine program will be a Python file able to be executed by the + \item The engine program will be a Python script able to be executed by the Python interpreter. \item The engine program will make use of standard input and standard output @@ -216,11 +220,11 @@ requisites needed for the system. commands. \end{enumerate} - \item The trainer program will be a Python file able to be executed by the + \item The trainer program will be a Python script able to be executed by the Python interpreter. - \item The engine program will make use of standard input and standard output - for communication. + \item The trainer program will make use of standard input and standard + output for communication. \begin{enumerate} \item Standard input will be used for reading commands. \item Standard output will be used for showing the result of @@ -289,12 +293,15 @@ internal representation of a game resulting from the processing of an \subsection{Class Analysis} +This section describes the needed classes expected as a result of the analysis +of the project. + \subsubsection{Class Diagram} The classes resulting from the analysis phase are shown in \fref{fig:analysisClasses}. -\begin{figure}[h] +\begin{figure}[p] \begin{center} \includegraphics[width=\textwidth]{diagrams/analysisClasses.png} \caption{General classes obtained from the analysis @@ -716,13 +723,14 @@ non-human. \item The human player who interacts with the playing interface. \item The human user who interacts with the engine. + \item The human user who starts the training. \item A GUI software which uses the engine to generate moves. \end{itemize} \subsection{Use Cases} -\begin{figure}[h] +\begin{figure}[p] \begin{center} \includegraphics[width=\textwidth]{diagrams/useCases.png} \caption{Use cases.} @@ -738,20 +746,25 @@ use case is explained next. The game interface reads the moves presented by the player and shows their result on the board. +\paragraph{Generate a move} + +The engine interface reads the input for generating a move as stated by the +\acrshort{gtp} protocol and outputs the coordinates of the board to play. + +\paragraph{Train the engine} + +The training system is started to process matches stored as \acrshort{sgf} files +and to train a neural network. + \paragraph{Use as backend for machine player} The engine is used as the backend for generating moves for a machine player, this is, for automated play, either against a human who is using the GUI or against another machine player. -\paragraph{Generate a move} - -The engine interface reads the input for generating a move as stated by the -\acrshort{gtp} protocol and outputs the coordinates of the board to play. - \subsection{Use Case Analysis and Scenarios} -\begin{figure}[h] +\begin{figure}[p] \begin{center} \includegraphics[width=0.8\textwidth]{diagrams/useCase_playAMatch.png} \caption{Use case: Play a match.} @@ -759,6 +772,8 @@ The engine interface reads the input for generating a move as stated by the \end{center} \end{figure} +\vspace{\interclassSpace} + \begin{tabular}{lp{0.6\linewidth}} \toprule \multicolumn{2}{c}{\textbf{Play a match}} \\ @@ -794,7 +809,7 @@ The engine interface reads the input for generating a move as stated by the \vspace{\interclassSpace} -\begin{figure}[h] +\begin{figure}[p] \begin{center} \includegraphics[width=\textwidth]{diagrams/useCase_generateAMove.png} \caption{Use case: Generate a move.} @@ -836,7 +851,49 @@ The engine interface reads the input for generating a move as stated by the \vspace{\interclassSpace} -\begin{figure}[h] +\begin{figure}[p] + \begin{center} + \includegraphics[width=\textwidth]{diagrams/useCase_trainTheEngine.png} + \caption{Use case: Train the engine.} + \label{fig:useCase_trainTheEngine} + \end{center} +\end{figure} + +\begin{tabular}{lp{0.6\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Train the engine}} \\ + \midrule + \textbf{Preconditions} & Some SGF files have been acquired to be used as + training games. \\ + \midrule + \textbf{Postconditions} & A neural network model has been created or updated. \\ + \midrule + \textbf{Actors} & Human user. \\ + \midrule + \textbf{Description} & + 1. The user starts the trainer.\newline + 2. The trainer processes the input files.\newline + 3. The trainer trains a model with the games described in the input + files. \\ + \midrule + \textbf{Secondary scenarios} & + --- \\ + \midrule + \textbf{Exceptions} & + \textbf{No input provided}: An error message is shown. Go back to step 1 of + main scenario. \newline + \textbf{The input is not formatted properly}: An error message is shown. Go + back to step 1 of main scenario. \\ + \midrule + \textbf{Notes} & + A robustness diagram for this scenario is shown in + \fref{fig:useCase_trainTheEngine}.\\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{figure}[p] \begin{center} \includegraphics[width=\textwidth]{diagrams/useCase_useAsBackend.png} \caption{Use case: Use as backend for machine player.} @@ -893,36 +950,37 @@ The Testing Plan will include four types of tests: \subsubsection{Unitary Testing} -Tests for the Python code are developed using the unittest \cite{python_unittest} -testing framework. It has been chosen by virtue of being thoroughly documented -and widely used. +A unitary testing suite provides test coverage at a base level, checking the +correct behaviour of each specific code piece. -The coverage of unit testing is checked with Coverage.py \cite{python_coverage}, -which can by itself run the unittest tests and generate coverage reports based -on the results. +\subsubsection{Integration Testing} -The script used to run the tests is shown on \lref{lst:test} and its output on -\lref{lst:testOutput}. +The interaction between components will be tested to check their correct +behaviour on the various situations where they have to work together: -% Maybe put an example report here? -\begin{listing}[h] - \inputminted{bash}{listings/test.sh} - \caption{Script to run the tests.} - \label{lst:test} -\end{listing} +\begin{itemize} +\item The Engine module is able to use the Game module to handle a match. +\item The Training module is able to use the Engine module and its definitions + of neural networks. +\end{itemize} -\begin{listing}[h] - \inputminted[fontsize=\footnotesize]{text}{listings/testOutput.txt} - \caption{Unitary testing output.} - \label{lst:testOutput} -\end{listing} +\subsubsection{System Testing} -\subsubsection{Integration Testing} +The working of the system as a whole has to be tested. Mainly: -\subsubsection{System Testing} +\begin{itemize} -\subsubsection{Usability Testing} + \item A game of Go can be played by a human by using the Game module. + \item A game of Go can be played against the computer by using the + \acrshort{gtp} interface of the Engine module. + \item The \acrshort{gtp} interface is compatible with at least one existing + GUI that claims compatibility with this protocol. + \item A training session can be started and it produces some results to be + evaluated independent of this test step. -% Game playing +\end{itemize} + +\subsubsection{Usability Testing} -% Using the engine with an existing GUI +Some people representing the final user will be confronted with the interfaces +of the system to test its usability. diff --git a/doc/tex/systemDesign.tex b/doc/tex/systemDesign.tex index 7a5db7b..78fb2dc 100644 --- a/doc/tex/systemDesign.tex +++ b/doc/tex/systemDesign.tex @@ -1,5 +1,8 @@ \section{System Design} +This section explains the design of the component systems of \program{}, the +algorithms used and how they are to be implemented. + \subsection{Class Diagram} The full class diagram is shown in \fref{fig:fullClasses}. @@ -15,7 +18,7 @@ The full class diagram is shown in \fref{fig:fullClasses}. \end{figure} The design of each system of the diagram is explained after this section -together with diagrams for each subsystem, since the full class diagram can be +along with diagrams for each subsystem, since the full class diagram can be too big to be comfortably analyzed. \subsection{Game} @@ -30,14 +33,14 @@ too big to be comfortably analyzed. \end{figure} A regular Go match is composed of a list of moves. But since game review and -variants exploration is an important part of Go learning, \program{} and most +exploration of variants is an important part of Go learning, \program{} and most playing and analysis existing programs allow for navigation back and forth through the board states of a match and for new variants to be created from each of these board states. Therefore, a match is represented as a tree of moves. The -\texttt{GameMove} class has the information about a specific move and also a reference to -the previous move and to a list of following moves, implementing this tree -structure and allowing for navigating both forward and backwards in the move -history. +\texttt{GameMove} class has the information about a specific move and also a +reference to the previous move and to a list of following moves, implementing +this tree structure and allowing for navigating both forward and backwards in +the move history. The state of the board at any given move must be stored so liberties, captures count and legality of moves can be addressed, so it is represented with the @@ -62,11 +65,11 @@ These classes and their relationships can be seen in \fref{fig:game}. \end{center} \end{figure} -An implementation of \acrshort{gtp}, that is, the piece of software which offers the \acrshort{gtp} -interface to other applications. It is designed to be used by a software -controller but can also be directly run, mostly for debugging purposes. Its -design is shown in \fref{fig:engine}. The core of the engine is related with -three components, each with a separate responsibility: +This will be an implementation of \acrshort{gtp}, that is, the piece of software +which offers the \acrshort{gtp} interface to other applications. It is designed +to be used by a software controller but can also be directly run, mostly for +debugging purposes. Its design is shown in \fref{fig:engine}. The core of the +engine is related with three components, each with a separate responsibility: \begin{itemize} @@ -80,8 +83,8 @@ three components, each with a separate responsibility: \item The \texttt{DecisionAlgorithm} component is responsible of analyzing the match and generate moves. The engine core uses it when a decision - has to be made by the AI, such as when a move needs to be generated by - the engine. + has to be made by the \acrshort{ai}, such as when a move needs to be + generated by the engine. \end{itemize} @@ -100,30 +103,33 @@ The engine can be started with the executable \texttt{imagocli.py}. \subsubsection{Monte Carlo Tree Search Explained} Monte Carlo Tree Search is an algorithm that can be useful for exploring -decision trees. It was used by AlphaGo in conjunction with neural networks as +decision trees. It has a history of use in Go engines, not being able to reach +strong levels of play until it was paired with neural networks by AlphaGo as explained in the AlphaGo 2016 paper \cite{natureAlphaGo2016}. -The algorithm assigns a score to each explored node based on how likely the -player who makes the corresponding move is to win and updates this score on each -exploration cycle. +The algorithm assigns a score to each explored node of the game tree based on +how likely the player who makes the corresponding move is to win and updates +this score on each exploration cycle. The exploration of the tree has 4 steps: \begin{enumerate} - \item \textbf{Selection}: The most promising move with unexplored children - is selected for exploration. Unexplored children are viable moves which - are not yet part of the tree. + \item \textbf{Selection}: The most promising move with at least one + unexplored children is selected for exploration. Unexplored children are + viable moves which are not yet part of the tree. \item \textbf{Expansion}: An unexplored children of the selected move is added to the tree. This children is selected at random. \item \textbf{Simulation}: The score of the new move is evaluated by playing - random matches from it. + different matches from it. How this matches are played varies: they can + be totally random, but here is where AlphaGo introduces one of its + neural networks so these simulation matches are more valuable. - \item \textbf{Backpropagation}: The score of the new move, as well as its - previous moves up to the root of the tree, is updated based on the - results of the simulation. + \item \textbf{Backpropagation}: The scores of the new move and its previous + moves up to the root of the tree are updated based on the results of the + simulation. \end{enumerate} @@ -132,7 +138,7 @@ the perspective of the player which has to make the move. The implementation of the algorithm will use the existing \texttt{GameMove} class from the Game System to access the game logic it needs, such as to get the -possible children from a node or to simulate random games. +available children from a node or to simulate random games. \subsubsection{Neural Networks Explained} @@ -156,13 +162,13 @@ Several kinds of layers have been used in this project: \item \textbf{Dense layers}, which connects each of its nodes to each of the nodes of the previous layers. - \item \textbf{Convolutional layers}, which process their input by applying - a filter function to adjacent values. In the case of this project, the + \item \textbf{Convolutional layers}, which process their input by applying a + filter function to adjacent values. In the case of this project, the board is filtered by grouping its vertices in 3x3 squares. The aim of these layers is to detect patterns in the input, such as curves, edges - or more complex shapes, so they are used a lot on neural networks - processing images. They are used in this project because a configuration - of the Go board is not that different from a two-dimensional image. + or more complex shapes, so they are used a lot on image processing. They + are used in this project because a configuration of the Go board is not + that different from a two-dimensional image. \item \textbf{Max pooling layers}, which process their input in a similar way to convolutional layers, but reducing the size of the input by @@ -178,19 +184,19 @@ Several kinds of layers have been used in this project: Combinations of these layers have been used to define two neural networks. First, a network using mainly dense layers as an example of a more general -purpose design of a network. +purpose and baseline design of a network. Then, a network with convolutional and max pooling layers to compare the -approach used on image processing to the more general one and studying its -utility on the analysis of the Go board. +approach used on image processing to the more general one and study its utility +on the analysis of the Go board. These networks have been implemented on the \texttt{DenseNeuralNetwork} and \\ \texttt{ConvNeuralNetwork} classes, respectively. The networks have been designed to process boards of size 9x9, which is the introductory size to the game. It is the easiest both for the hardware to handle -and for the analysis of results while keeping able to support meaningful -matches. +and for the analysis of results while still being big enough to support +meaningful matches. Both networks have the same design for their input and output. @@ -269,16 +275,16 @@ respectively. \end{figure} Neural networks can be powerful machine learning algorithms, but they have to be -trained first so they can provide meaningful results. For a Go AI it makes sense -to have its algorithms trained on Go games. There exists a common text format to -store Go games: \acrshort{sgf}. If the system is able to process \acrshort{sgf} files, it can provide -the games stored on them to the neural networks for training. And so the need -for an \acrshort{sgf} parser arises. - -To parse \acrshort{sgf} files a lexer and parser have been implemented using -PLY.\@ The result of the parsing is an AST (Annotated Syntax Tree) reflecting -the contents of the text input, each node with zero or more properties, and with -the ability to convert themselves and their corresponding subtree into a +trained first so they can provide meaningful results. For a Go \acrshort{ai} it +makes sense to have its algorithms trained on Go games. There exists a common +text format to store Go games: \acrshort{sgf}. If the system is able to process +\acrshort{sgf} files, it can provide the games stored on them to the neural +networks for training. And so the need for an \acrshort{sgf} parser arises. + +To parse \acrshort{sgf} files a lexer and parser have been implemented using PLY +\cite{ply}. The result of the parsing is an \acrfull{ast} reflecting the +contents of the text input, each node with zero or more properties, and with the +ability to convert themselves and their corresponding subtree into a \texttt{GameMove} tree. This is done for the root node, since from the \acrshort{sgf} specification there are some properties only usable in the root node, like those which specify general game information and properties such as @@ -290,23 +296,23 @@ solved. These components are shown in \fref{fig:trainingModule}. \begin{itemize} - \item \textbf{\acrshort{sgf}}: Provides a high-level method to convert a path to a \acrshort{sgf} + \item \textbf{\texttt{\acrshort{sgf}}}: Provides a high-level method to convert a path to a \acrshort{sgf} file to a \texttt{GameMove} tree. - \item \textbf{sgfyacc}: The implementation of a \acrshort{sgf} parser using PLY. Takes - the tokens generated by \textbf{sgflex} and creates an \texttt{ASTNode} + \item \textbf{\texttt{sgfyacc}}: The implementation of a \acrshort{sgf} parser using PLY. Takes + the tokens generated by \texttt{sgflex} and creates an \texttt{ASTNode} tree from them. - \item \textbf{sgflex}: The implementation of a \acrshort{sgf} lexer using + \item \textbf{\texttt{sgflex}}: The implementation of a \acrshort{sgf} lexer using PLY.\@ Takes text input and generates the tokens of the \acrshort{sgf} language from them. - \item \textbf{ASTNode}: The AST resulting from the parsing of a + \item \textbf{\texttt{ASTNode}}: The AST resulting from the parsing of a \acrshort{sgf} file. Has a method to convert it to a tree of \texttt{GameMove} and so obtain the contents of the \acrshort{sgf} in the internal representation used by the project's systems. - \item \textbf{Property}: The representation of a property of an + \item \textbf{\texttt{Property}}: The representation of a property of an \texttt{ASTNode} tree. Each property is made of a name and one or more values and this class helps handling this specific situation. @@ -327,6 +333,134 @@ The training can be started with the executable \texttt{train.py}. % \end{center} %\end{figure} -%\subsection{Technical Testing Plan Specification} +\subsection{Technical Testing Plan Specification} + +This section lists and explains the exact testing plan. + +\subsubsection{Unitary Testing} + +Tests for the Python code will be developed using the unittest +\cite{python_unittest} testing framework. It has been chosen by virtue of being +thoroughly documented and widely used. + +The coverage of unit testing will be checked with Coverage.py +\cite{python_coverage}, which can by itself run the unittest tests and generate +coverage reports based on the results. + +\subsubsection{Integration Testing} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.4\linewidth}p{0.5\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Engine and Game modules}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} \\ + \midrule + The GTP interface of the engine is used to play a match & The module handles + the game and can show its state. \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.4\linewidth}p{0.5\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Training and Engine module}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} \\ + \midrule + The training process is started & The training uses the network defined on + the Engine module. \\ + \bottomrule +\end{tabular} + +\subsubsection{System Testing} + +These are the tests to check the correct working of the system as a whole. The +tests are grouped by the interface they are run against. + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.4\linewidth}p{0.5\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Game interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} \\ + \midrule + Play a game of Go with no engine & The game can be played until the end. \\ + \midrule + Provide a wrong move & The interface shows it is wrong and the game + continues without a change of state. \\ + \midrule + Close the game & The interface closes. \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.4\linewidth}p{0.5\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Engine interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} \\ + \midrule + Ask for the available commands & The interface outputs the available + commands. \\ + \midrule + Provide a move & The state of the engine updates with the new move. \\ + \midrule + Ask for a move & The engine suggests a move without changing the state of + the current game. \\ + \bottomrule +\end{tabular} + +\vspace{\interclassSpace} + +\begin{tabular}{p{0.4\linewidth}p{0.5\linewidth}} + \toprule + \multicolumn{2}{c}{\textbf{Training interface}} \\ + \midrule + \textbf{Test} & \textbf{Expected behaviour} \\ + \midrule + Provide some games to train on & A neural network model is created. \\ + \midrule + Start the training without providing games & An error message is shown and + the execution terminated. \\ + \bottomrule +\end{tabular} + +\subsubsection{Usability Testing} + +Two different human users will be exposed to the interfaces of the project and +asked to answer a questionary about their experience. The aim of this process is +to identify and address any problems end users could have when using the +system. + +As the training of the neural networks is part of the preparation of the system, +its usability will not be tested for end users. + +These are the questions provided to the testers. + +\begin{itemize} -%TODO + \item Playing a game against a human: + \begin{itemize} + \item Were you able to start the interface? + \item How hard to understand was the interface of the game? + \end{itemize} + + \item Playing a game against the engine: + \begin{itemize} + \item Were you able to start the interface? + \item How hard to understand was the interface of the game? + \item How strong did you find the engine? + \end{itemize} + + \item Playing a game against the interface through a third-party GUI: + \begin{itemize} + \item Were you able to start the interface? + \item Did you find any problems when setting up the engine? + \end{itemize} + +\end{itemize} |