I'm an Electrical Engineering PhD student at the Aalto Robot Learning Lab, Finland.
I have always been fascinated by robots interacting with the physical world, with other robots, and with humans.
Motivated by this theme, I started by exploring reinforcement learning algorithms. Recently, I have become interested in imitation learning for humanoid control via diffusion models.
I have done several works on multi-agent reinforcement learning, curriculum learning, and model-based reinforcement learning. Representative papers are highlighted.
To overcome the relative overgeneralization problem in multi-agent learning, we propose enabling optimism in multi-agent policy gradient methods by reshaping advantages.
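One way to picture the advantage-reshaping idea: downweight negative advantages so that an agent is not overly penalized when a teammate's exploratory action ruins the joint return. This is a minimal sketch of that intuition, not the paper's exact method; the function name and the `leniency` parameter are illustrative assumptions.

```python
import numpy as np

def optimistic_advantages(advantages, leniency=0.5):
    """Reshape advantages optimistically (hypothetical sketch).

    Negative advantages are scaled down by `leniency` in [0, 1]:
    1.0 keeps them unchanged, 0.0 removes them entirely, so the
    policy gradient leans toward the optimistic interpretation of
    a bad joint outcome.
    """
    adv = np.asarray(advantages, dtype=float)
    # Keep positive advantages as-is; shrink negative ones.
    return np.where(adv >= 0.0, adv, leniency * adv)
```

With `leniency=0.5`, an advantage of `-2.0` contributes only `-1.0` to the gradient, while positive advantages pass through untouched.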
We propose multi-agent correlated policy factorization under CTDE to overcome the asymmetric learning failure that arises when individual policies are naively distilled from a joint policy.
We identify two flaws in existing reward-based curriculum learning algorithms when the number of agents is used as the curriculum in MARL.
Instead, we propose a learning-progress metric as a new optimization objective, generating curricula that maximize the agents' learning progress.
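A generic way to instantiate this idea is to score each candidate task (here, a number of agents) by how much the agents' performance on it improved between training rounds, and to sample the next task in proportion to that score. The concrete metric below (absolute return improvement) and the function names are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def learning_progress(returns_now, returns_prev):
    """Per-task learning progress as absolute return improvement
    between two evaluation rounds (a hypothetical instantiation)."""
    return np.abs(np.asarray(returns_now) - np.asarray(returns_prev))

def sample_task(progress, rng=np.random.default_rng(0)):
    """Sample the next curriculum task (e.g. an agent count)
    in proportion to its estimated learning progress."""
    p = progress / progress.sum()
    return int(rng.choice(len(progress), p=p))
```

Tasks where performance is stagnant (zero progress) are never sampled, steering the curriculum toward settings the agents are actively learning from.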
We show that in many multi-agent systems where agents are weakly coupled, partial observation can still enable near-optimal decision making. Moreover, on a mobile robot manipulator, we show that partial observation of other agents can improve robustness to agent failure.
We propose a simple but effective model-based reinforcement learning algorithm that relies only on a latent dynamics model trained with latent temporal consistency.
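The core training signal can be sketched as follows: the latent predicted by the dynamics model for the next step should match the encoding of the actually observed next state. This is a minimal sketch of a latent temporal-consistency loss under assumed `enc` and `dyn` functions, not the paper's full algorithm.

```python
import numpy as np

def consistency_loss(enc, dyn, obs, act, next_obs):
    """Latent temporal-consistency objective (illustrative sketch).

    `enc` maps an observation to a latent vector; `dyn` predicts the
    next latent from the current latent and action. The loss is the
    mean squared error between the predicted next latent and the
    encoding of the observed next state (whose gradient would be
    stopped in a full implementation).
    """
    z = enc(obs)                      # encode current observation
    z_pred = dyn(z, act)              # predict next latent
    z_target = enc(next_obs)          # stop-gradient target in practice
    return float(np.mean((z_pred - z_target) ** 2))
```

Because the loss never decodes back to observations, the model only has to be predictive in latent space, which is what makes such objectives cheap to train.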