Skip to content

lukas-kuhn/reinforced-tictactoe

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Reinforced TicTacToe

Reinforced TicTacToe

This project has been created as a demonstration and is completely open source.

The opponent AI is based on a reinforcement learning algorithm (On-policy first-visit Monte Carlo) that has been trained on 1 million matches against a random opponent.

On Policy First Visit MC

From Sutton and Barto (2018) Reinforcement Learning: An Introduction, chapter 5.4

The algorithm had no model of TicTacToe and discovered the best winning strategy play-by-play by discounting a given reward (-1 for a loss / 0 for a draw / 1 for a win) over all previous in-game actions after each episode.

Training

The calculation (training) of the Q-ActionValues has been done in Python and can be found here: reinforced_tictactoe.py. The file contains a class for the TicTacToe game as well as the algorithm used to train the AI.

Website

The website has been build with React and Javascript and utilizes the Q-ActionValues generated in training as an opponent AI. The source code for the website can be found here: /src. It consists of the main game board and also shows the user the top three contemplated moves.

Reinforced TicTacToe

About

Reinforcement Learning for TicTacToe

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors