Small World Experiment - Reinforcement Learning

Summary

In this project, I modelled the Small World Experiment (Stanley Milgram, 1967) using a Reinforcement Learning algorithm known as Q-Learning. The aim was to create a model that learns, over multiple iterations, the shortest path between two people through their chain of close connections. I represented the network of people as a graph in which the nodes are people and the edges are friendships. In this experiment, the model has to find the shortest path between Person A and Person P, who do not know each other. The network also contains dead (terminal) nodes: if the model reaches one of these nodes, the episode ends and the model restarts from the starting node. Check out the video for a live demonstration of the model learning over iterations!
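To make the setup above concrete, here is a minimal sketch of tabular Q-Learning on a small friendship graph. It is an illustration under assumptions, not the project's actual code: the graph, the dead nodes X and Y, the reward values (+10 for reaching P, -10 for a dead node, -1 per step), the hyperparameters, and the names GRAPH, train and greedy_path are all hypothetical.

```python
import random

# Hypothetical friendship graph (not the project's network): person -> friends.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D", "X"],
    "C": ["A", "E"],
    "D": ["B", "P"],
    "E": ["C", "Y"],
    "X": [],  # dead node: reaching it ends the episode
    "Y": [],  # dead node: reaching it ends the episode
    "P": [],  # target person
}
START, GOAL = "A", "P"
DEAD = {"X", "Y"}


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-Learning."""
    rng = random.Random(seed)
    # One Q-value per (person, friend) pair, initialised to zero.
    q = {s: {a: 0.0 for a in friends} for s, friends in GRAPH.items()}
    for _ in range(episodes):
        state = START  # every episode restarts from Person A
        while state != GOAL and state not in DEAD:
            actions = GRAPH[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            # Assumed reward scheme: bonus at the goal, penalty at dead
            # nodes, and a small cost per step to favour short paths.
            if action == GOAL:
                reward = 10.0
            elif action in DEAD:
                reward = -10.0
            else:
                reward = -1.0
            # Terminal nodes have no outgoing actions, so their future value is 0.
            future = max(q[action].values(), default=0.0)
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action  # moving to a friend makes them the new state
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued friend from START until a terminal node."""
    path = [START]
    while path[-1] != GOAL and path[-1] not in DEAD and len(path) <= max_steps:
        state = path[-1]
        path.append(max(q[state], key=q[state].get))
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

With this toy graph and these assumed settings, the greedy path read from the learned table should settle on the three-hop route from A through B and D to P, avoiding both dead nodes.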