I am a postdoctoral researcher in the Department of Computer Science at Aalto University, Finland.
I am supervised by Prof. Juho Kannala and Prof. Arno Solin.
My PhD research, supervised by Prof. Joni Pajarinen, focused on reinforcement learning, imitation learning, and their applications in robotics and general decision-making.
Currently, my research interests include robot perception, particularly 3D scene understanding and physics-based dynamics modeling.
I have worked on several projects spanning generative modeling, reinforcement learning, imitation learning, and curriculum learning. Representative papers are highlighted. Please see my Google Scholar profile for more details.
Diffusion models often generate spatially inconsistent images. Instead of injecting heuristic physical or geometric priors into the generation process, we propose sparsely supervised diffusion, a principled method that mitigates this problem by compressing the excessive correlations induced by limited data samples. The method is simple yet effective and can be implemented in a few lines of code.
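As a rough illustration only (the paper's actual mechanism may differ), one way to supervise a diffusion loss sparsely is to score the denoising error on a random subset of spatial locations rather than on every pixel; the function name, `keep_ratio` parameter, and masking scheme below are all my assumptions, not the paper's:

```python
import numpy as np

def sparse_diffusion_loss(pred_noise, true_noise, keep_ratio=0.25, rng=None):
    """Illustrative sketch: compute the denoising MSE on only a random
    sparse subset of locations, so the model is not forced to fit every
    pairwise pixel correlation present in a small training set."""
    rng = np.random.default_rng(rng)
    mask = rng.random(pred_noise.shape) < keep_ratio  # sparse supervision mask
    diff = (pred_noise - true_noise) ** 2
    return diff[mask].mean() if mask.any() else 0.0
```

With `keep_ratio=1.0` this reduces to the standard full MSE objective, which is why such a change fits in a few lines of a typical training loop.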
Video generation is usually performed in a well-structured latent space given by a VAE. We propose a latent-compressed VAE that removes the high-frequency components of the video latent while offloading high-frequency reconstruction to the decoder. In this way, the VAE encodes the video into a diffusion-friendly latent and improves video generation quality.
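A minimal sketch of the idea of removing high-frequency latent content, assuming (hypothetically) that the compression is a simple spectral truncation along the latent's last axis; the real encoder is learned, so this is only an intuition aid:

```python
import numpy as np

def compress_latent(z, factor=2):
    """Illustrative sketch: drop the upper part of the latent's frequency
    spectrum, leaving a smooth, low-frequency signal that is easier for a
    diffusion model to fit; the decoder is then responsible for restoring
    high-frequency detail."""
    Z = np.fft.rfft(z, axis=-1)
    Z[..., Z.shape[-1] // factor:] = 0.0  # zero out high-frequency bins
    return np.fft.irfft(Z, n=z.shape[-1], axis=-1)
```

A constant (pure low-frequency) latent passes through unchanged, while a Nyquist-frequency oscillation is removed entirely.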
We identify two flaws in existing reward-based curriculum learning algorithms when the number of
agents is used as the curriculum variable in multi-agent reinforcement learning (MARL).
Instead, we propose a learning progress metric as a new optimization objective, which generates
curricula that maximize the agents' learning progress.
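To make the objective concrete, here is a toy sketch (my own simplification, not the paper's algorithm) of picking the next number of agents by recent learning progress, measured here as the change in average return between the last two evaluation windows:

```python
def pick_num_agents(returns_history, candidates):
    """Illustrative sketch: choose the curriculum task (number of agents)
    whose recent returns changed the most, i.e. where learning progress
    is highest, rather than the task with the highest raw reward."""
    def progress(hist):
        if len(hist) < 2:
            return float("inf")  # unexplored tasks get priority
        return abs(hist[-1] - hist[-2])
    return max(candidates, key=lambda n: progress(returns_history.get(n, [])))
```

Unlike a reward-based criterion, this prefers tasks where performance is still moving, and it never permanently abandons a task just because its current return is low.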
We propose multi-agent correlated policy factorization under CTDE (centralized training with
decentralized execution) to overcome the asymmetric learning failure that arises when individual
policies are naively distilled from a joint policy.
To overcome the relative overgeneralization problem in multi-agent learning, we propose enabling
optimism in multi-agent policy gradient methods by reshaping advantages.
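One common way to build optimism into a policy gradient, shown here purely as a hedged sketch (the paper's reshaping may be different), is to down-weight negative advantages so an agent is not overly punished when a low return is caused by teammates' exploratory actions:

```python
def optimistic_advantage(adv, kappa=0.5):
    """Illustrative sketch: pass positive advantages through unchanged,
    but scale negative ones by kappa < 1, biasing each agent toward an
    optimistic estimate of a joint action's value."""
    return adv if adv >= 0 else kappa * adv
```

With `kappa = 1` this recovers the standard advantage; smaller values make the update more optimistic.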
We propose a bi-level optimization framework to address the issue of physically infeasible motion
data in humanoid imitation learning.
The method alternates between optimizing the robot's policy and modifying the reference motions,
while using latent-space regularization to preserve the original motion patterns.
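The alternating scheme can be sketched as a simple loop; the callables `policy_update` and `motion_update` below are placeholders of my own, standing in for the actual inner policy optimization and the regularized motion correction:

```python
def bilevel_imitation(policy_update, motion_update, reference, steps=10):
    """Illustrative sketch of the alternation: fix the reference motion
    and improve the tracking policy, then fix the policy and adjust the
    motion toward physical feasibility, with the update expected to keep
    the motion close to the original reference (regularization)."""
    policy = None
    motion = reference
    for _ in range(steps):
        policy = policy_update(policy, motion)              # inner level
        motion = motion_update(policy, motion, reference)   # outer level
    return policy, motion
```

Passing the original `reference` into `motion_update` is what lets the outer step trade off feasibility against staying faithful to the demonstrated motion.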
We show that in many multi-agent systems where agents are weakly coupled, partial observation can
still enable near-optimal decision making. Moreover, on a mobile manipulator, we show that partial
observation of agents can improve robustness to agent failure.
We propose a simple but effective model-based reinforcement learning algorithm that relies only on a
latent dynamics model trained with latent temporal consistency.
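The core training signal of latent temporal consistency can be sketched in a few lines; the function names and the absence of a target network here are my simplifications of the general idea, not the algorithm itself:

```python
import numpy as np

def temporal_consistency_loss(encode, dynamics, obs, action, next_obs):
    """Illustrative sketch: train the latent dynamics model by matching
    its predicted next latent against the encoding of the actually
    observed next state, with no pixel-space reconstruction at all.
    (In practice the target encoding usually uses a stop-gradient or an
    EMA copy of the encoder.)"""
    z = encode(obs)
    z_pred = dynamics(z, action)
    z_next = encode(next_obs)
    return float(np.mean((z_pred - z_next) ** 2))
```

Because the loss lives entirely in latent space, the model never needs a decoder, which keeps the method simple.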