
How My Interests Developed

March 9, 2026

Explaining my research interests and what they mean to me.

notes, personal, research

Right now, there are two main areas I'm interested in. One is the study of memory: what it is, how it contributes to learning and intelligent decision-making, and how we can model and coordinate it in artificial intelligence systems. The other is probabilistic inference, which I think is a useful perspective on modeling uncertainty, and happens to have nice ties to reinforcement learning, which is useful given the current LLM RLHF/RLVR paradigm. These might seem rather disjoint, but they are both important and interesting areas to work in, and they have some nice connections to each other as well.

From the memory side of things, it's a bit cliché, but my interest started with taking some cognitive science and psychology courses in my undergrad. I had never really thought about it before, but it made sense that working memory and long-term memory should exist and work together: predicting what information is needed to solve a problem and holding that information in mind. The idea of memory resurfaced when I was at EPFL for Summer 2025, where I learned about state space models (SSMs). At the time, Mamba and Mamba-2 were all the rage, and everyone wanted a sequence model that scales linearly in the sequence length. But preceding these papers was an interesting theory paper called HiPPO, which framed memory as a problem of online function approximation. The SSMs of the time were already quite good at modeling long sequences, and HiPPO gave me a new perspective on how to think about memory more mechanistically. And as I took a class in Bayesian statistics in my last year, I just kept thinking about memory.
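To make the mechanistic view concrete, here is a minimal sketch of the recurrence these models build on: a diagonal linear state space layer whose fixed-size hidden state acts as a compressed memory of the sequence so far. The decay rates and input scale below are toy values of my choosing, not the principled HiPPO-derived dynamics.

```python
# A minimal (diagonal) linear state space recurrence. The hidden state is a
# fixed-size summary ("memory") of everything seen so far; toy parameters,
# not the HiPPO matrices.

def ssm_scan(xs, decays, input_scale=1.0):
    """Run h_t = decay * h_{t-1} + input_scale * x_t for each channel."""
    states = [0.0 for _ in decays]
    history = []
    for x in xs:
        states = [d * h + input_scale * x for d, h in zip(states and decays, states)] if False else \
                 [d * h + input_scale * x for d, h in zip(decays, states)]
        history.append(list(states))
    return history

# Each channel forgets at its own rate: the fast channel (decay 0.5) tracks
# recent inputs, the slow channel (decay 0.99) retains a long-range trace.
hist = ssm_scan([1.0, 0.0, 0.0, 0.0], decays=[0.5, 0.99])
```

The point of HiPPO was to choose these dynamics so that the state optimally reconstructs the input history under a chosen measure, rather than just decaying it exponentially.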

Memory is the one thing that has consistently resurfaced over my research journey. I also think that at the frontier of LLM research and engineering, the solutions aren't quite where they should be. Yes, modern LLMs are extremely good at reasoning problems, and agent harnesses are getting better by the day, but ultimately "memory" still boils down to complex context engineering. I still occasionally find that if an agent doesn't write everything down, or writes too much down, it can forget or hallucinate things from earlier in the conversation.

On another front, I study probabilistic inference. Probabilistic inference is important for developing robust AI systems in general, given that there will always be some measure of uncertainty in making real-world inferences, and it is also a useful perspective for daily life. One flavour of this is Bayes' Theorem: when you are uncertain about something, it helps to think about what you initially believed, what new information you have, and how that information should change your belief. I started in this area with my Markov chain Monte Carlo research with Jeffrey Rosenthal, and I spent more time on the topic under Roger Grosse, who was highly influential for me. Working with him, I internalized that probabilistic inference is an extremely general framework for LLM research, given that almost any desired property can be framed as a target probability distribution we would like to sample from. I was also somewhat mind-blown when I first studied the connection between probabilistic inference and RL: finding an optimal policy to maximize rewards can be mathematically formulated as inferring the most likely trajectory in a specific probabilistic graphical model.
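That prior-plus-evidence loop is easy to write down. A hypothetical coin example, with made-up numbers: we start fairly confident the coin is fair, then watch three heads in a row shift the belief.

```python
# Bayes' Theorem as a belief update: posterior ∝ prior × likelihood.
# Hypotheses and probabilities below are illustrative, not from any dataset.

def bayes_update(prior, likelihoods):
    """Return the normalized posterior over hypotheses after one observation."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnormalized.values())
    return {h: p / z for h, p in unnormalized.items()}

prior = {"fair": 0.9, "biased": 0.1}          # what we initially believed
lik_heads = {"fair": 0.5, "biased": 0.9}      # P(heads | hypothesis)

posterior = prior
for _ in range(3):                            # observe three heads in a row
    posterior = bayes_update(posterior, lik_heads)
```

Each observation reuses the previous posterior as the new prior, which is exactly the "what did I believe, what did I see, what should I believe now" loop in prose form.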

Just as reinforcement learning and control can be elegantly framed as inference problems, human memory (and some might argue AI as well) is fundamentally a process of probabilistic inference. Memory is highly dynamic and malleable, and is essentially designed to infer what happened in the past to optimally predict the future in a noisy, uncertain world.
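One classic formalization of memory as inference is filtering, where "memory" is literally a posterior belief over a latent state, sharpened by predictions and noisy observations. Here is a minimal one-dimensional Kalman-filter step; the noise parameters are toy values of my choosing.

```python
# One predict-update cycle of a 1-D Kalman filter: the belief (mean, var)
# is a memory that degrades over time and is repaired by noisy evidence.
# Noise levels are illustrative toy values.

def kalman_step(mean, var, obs, process_noise=0.1, obs_noise=1.0):
    # Predict: uncertainty grows while the world changes unobserved.
    var = var + process_noise
    # Update: blend memory with evidence, weighted by relative certainty.
    gain = var / (var + obs_noise)
    mean = mean + gain * (obs - mean)
    var = (1.0 - gain) * var
    return mean, var

# Starting from a vague belief, repeated noisy-free observations of 5.0
# pull the remembered state toward 5.0 while shrinking uncertainty.
mean, var = 0.0, 10.0
for _ in range(20):
    mean, var = kalman_step(mean, var, 5.0)
```

The same structure, remembering exactly as much of the past as is useful for predicting the future, is what makes me think the two threads belong together.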

So these two threads of memory and probabilistic inference might seem different, but maybe in some high-dimensional conceptual manifold, they are two sides of the same coin. My research would sit at their intersection: building systems that can maintain and update structured memories under uncertainty, and using the language of probabilistic inference to reason about how and what to remember. I think there is a lot of exciting work to be done here, especially as language models become more capable but continue to struggle with the kinds of long-horizon, memory-intensive tasks that come naturally to humans. I'm optimistic that by grounding memory in principled inference rather than ad-hoc engineering, we might make meaningful progress on some of the harder open problems in AI.