Graduation Year

2024

Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts

Department

Physics

Second Department

Linguistics and Cognitive Science

Reader 1

Adam Landsberg

Reader 2

Brian Keeley

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2024 Katherine L Graham

Abstract

The predictive mind theory proposes that the brain processes information efficiently and accurately by making predictions about future stimuli. Bayesian brain theory suggests that the brain uses Bayesian probability models to make these predictions, while the free-energy minimization hypothesis proposes that predictions serve to minimize energy, or uncertainty, ensuring accurate perception. Vertechi et al. (2020) explored animal participants' use of a stimulus-bound strategy versus an inference-based strategy to solve a Markov decision process (MDP) in a two-site environment in which exactly one site is active at a time. On each trial, the active site switches to the other site with a fixed probability and remains active with the complementary probability. This setup served as the basis for my experiment, in which I deployed three types of model-free artificial neural networks (ANNs) in the two-site MDP environment: a deep Q-network (DQN), proximal policy optimization (PPO), and recurrent PPO (RPPO) with a long short-term memory (LSTM) architecture. Each agent was tested in three environments with varying probabilities of active-site switching and reward allocation.
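The two-site switching dynamics described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the thesis code: the parameter names `p_switch` and `p_reward` and their values are assumptions standing in for the (unspecified) per-environment probabilities.

```python
import random

class TwoSiteEnv:
    """Minimal sketch of the two-site switching environment.

    Exactly one of two sites is active at a time. After each trial
    the active site switches with probability p_switch and stays put
    with probability 1 - p_switch. A correct guess is rewarded with
    probability p_reward. Parameter names/values are illustrative.
    """

    def __init__(self, p_switch=0.3, p_reward=0.9, seed=None):
        self.p_switch = p_switch
        self.p_reward = p_reward
        self.rng = random.Random(seed)
        self.active = self.rng.randint(0, 1)  # index of the active site

    def step(self, guess):
        # A correct guess is rewarded stochastically.
        correct = guess == self.active
        reward = 1 if correct and self.rng.random() < self.p_reward else 0
        # The active site then switches with probability p_switch.
        if self.rng.random() < self.p_switch:
            self.active = 1 - self.active
        return reward

# Example episode: an agent that always guesses site 0.
env = TwoSiteEnv(p_switch=0.3, p_reward=0.9, seed=0)
rewards = [env.step(guess=0) for _ in range(10)]
```

A model-free agent such as a DQN would be trained against `step`, observing only its own guesses and rewards, which is what makes inferring the hidden switching structure nontrivial.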

The data showed that all agents except one in the medium environment failed to learn with accuracy above the rate expected of an agent limited to 1-back memory. In the medium and difficult environments, the DQN was the best performer, followed closely by the RPPO. In past studies, DQN agents were outperformed by PPO agents, which is inconsistent with our findings. However, our findings are consistent with Vertechi et al.'s (2020) prediction that a model-free, stimulus-bound agent's learning of the environment would degrade depending on the frequency at which rewards were given. These findings also suggest that animals must rely on at least a mixture of model-free and model-based processing when solving problems and performing other cognitive tasks.
