Grid world problem - Reinforcement Learning

Published: at 03:57 PM

Training and applying RL methods to solve a grid world environment. The agent should learn to navigate to the goal efficiently.

Github

Environment

The environment is a 4x4 grid in which the agent has to reach the goal state from the initial state. The same grid layout is used for both the deterministic and the stochastic environment.

Action: The agent has 4 possible actions in any given state.
up=0; down=1; right=2; left=3
State: The 4x4 grid has 16 states, one per cell.
Rewards: Rewards are defined at 5 positions in this environment (4 intermediate rewards plus the goal). The position and value of each reward:
(2,0): -3, (1,2): -4, (1,0): 2, (3,1): 5, (3,3): 20
The goal position (3,3) has the highest reward value (20).
Objective: To reach the goal state.
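
The definition above can be sketched as a small Python class. This is only an illustration, not the project's actual code, and it rests on several assumptions the post does not state: positions are read as (row, col) with row 0 at the top, a move into a wall leaves the agent where it is, every cell without a listed reward gives 0, a reward is granted each time its cell is entered, and the episode ends at the goal. The stochastic variant is modelled with a hypothetical slip_prob parameter that replaces the chosen action with a random one. The start position (0,0) and goal (3,3) are the ones used in this post.

```python
import random

GRID_SIZE = 4
START, GOAL = (0, 0), (3, 3)

# Action encoding from the post: up=0, down=1, right=2, left=3.
# Assumption: positions are (row, col), with row 0 at the top of the grid.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, 1), 3: (0, -1)}

# Reward positions and values as listed in the post; other cells give 0 (assumption).
REWARDS = {(2, 0): -3, (1, 2): -4, (1, 0): 2, (3, 1): 5, (3, 3): 20}


class GridWorld:
    """Hypothetical 4x4 grid world with deterministic and stochastic modes."""

    def __init__(self, stochastic=False, slip_prob=0.2, seed=None):
        self.stochastic = stochastic
        self.slip_prob = slip_prob  # assumed noise model, not from the post
        self.rng = random.Random(seed)
        self.pos = START

    @staticmethod
    def state_index(pos):
        # Flatten (row, col) into one of the 16 state indices, 0..15.
        return pos[0] * GRID_SIZE + pos[1]

    def reset(self):
        self.pos = START
        return self.state_index(self.pos)

    def step(self, action):
        # In stochastic mode the chosen action is sometimes replaced at random.
        if self.stochastic and self.rng.random() < self.slip_prob:
            action = self.rng.randrange(len(MOVES))
        d_row, d_col = MOVES[action]
        # Clamp to the grid so that walking into a wall is a no-op (assumption).
        row = min(max(self.pos[0] + d_row, 0), GRID_SIZE - 1)
        col = min(max(self.pos[1] + d_col, 0), GRID_SIZE - 1)
        self.pos = (row, col)
        reward = REWARDS.get(self.pos, 0)
        done = self.pos == GOAL
        return self.state_index(self.pos), reward, done
```

Under these assumptions, GridWorld().step(1) from the start moves the agent down to (1,0) and collects the reward of 2, while GridWorld(stochastic=True) uses the same grid with noisy transitions.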

Algorithms

Q-learning and SARSA are used to solve both the deterministic and the stochastic environment. The deterministic environment is a 4x4 grid world in which the agent starts at the initial position (0,0) and needs to reach the goal state (3,3), collecting rewards on the way. Both negative and positive rewards are defined in the grid: the agent should stay away from the cells with negative rewards, collect the positive rewards, and reach the goal state.
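
To make the difference between the two algorithms concrete, here is a minimal sketch of the standard tabular update rules. The table layout (16 states by 4 actions), the epsilon-greedy helper, and the values of alpha and gamma are assumptions chosen for illustration; the post does not give the hyperparameters or the exploration strategy actually used.

```python
import random

N_STATES, N_ACTIONS = 16, 4  # 4x4 grid, four actions (from the post)


def make_q_table():
    # One row per state, one entry per action, initialised to zero.
    return [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def epsilon_greedy(q, state, epsilon, rng):
    # Assumed exploration strategy: random action with probability epsilon,
    # otherwise the action with the highest current Q-value.
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    row = q[state]
    return row.index(max(row))


def q_learning_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    # Off-policy: bootstrap from the best action in the next state.
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.9):
    # On-policy: bootstrap from the action actually chosen in the next state.
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
```

The only difference is the bootstrap term: Q-learning uses the maximum over next actions, while SARSA uses the value of the next action the agent actually takes. That is why SARSA tends to learn more cautious paths near negative rewards when exploration or transition noise is present.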

Analysis

Screenshots

Read more - Project Report