Memory Learning under Partial Observability

Date: Thu, 25 Jun 2026 | 17:00 - 19:00
Location: Seminar Room 2
Speaker: Peter Koepernik (OpenAI)
Event Price: Free
Booking: Not required

Title: Memory Learning under Partial Observability

Abstract: When a reinforcement learning agent has access only to partial observations of its environment, optimal decision-making generally requires retaining and using information from the past. This work characterizes the properties a learned memory representation must satisfy for an optimal policy to be expressible as a function of that representation. Building on this, we introduce an auxiliary training objective that encourages deep reinforcement learning agents to learn such memory functions. Empirical results across a diverse set of environments demonstrate that this approach can substantially improve performance under partial observability.

Bio: Peter is a Research Scientist at OpenAI working on sub-quadratic attention mechanisms to improve long-context performance of large language models. He recently completed a DPhil in Statistics at Oxford, with research in probability theory, stochastic analysis, numerical SDE methods, and reinforcement learning under partial observability. More broadly, he is interested in how mathematical approaches can help make machine learning algorithms more scalable, robust, and useful.