Moritz Reuss
I am a PhD student in the Intuitive Robots Lab (IRL) at the Karlsruhe Institute of Technology (KIT), Germany.
My research focuses on robotics and machine learning, supervised by Rudolf Lioutikov.
Previously, I obtained my Master's degree in Mechanical Engineering at KIT, where I wrote my thesis at Bosch under the supervision of Gerhard Neumann.
During my studies I interned at Audi AG, IPG Automotive, and the FZI Research Center for Information Technology.
Email / CV / Google Scholar / Github / LinkedIn
Research
My primary research goal is to build intelligent embodied agents that assist people in their everyday lives and communicate with them intuitively.
A key challenge on the way to this goal is learning from multimodal, uncurated human demonstrations without rewards.
I therefore work on novel methods that exploit this multimodality to learn versatile behavior.
Representative papers are highlighted.
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Moritz Reuss*,
Jyothish Pari*,
Pulkit Agrawal,
Rudolf Lioutikov
ICLR 2025
Project Page / Code / arXiv
We propose Mixture-of-Denoising Experts (MoDE), a novel generalist policy for guided behavior generation that outperforms dense transformer-based Diffusion Policies while using fewer parameters and less compute.
MoDE introduces a novel routing strategy that conditions expert selection on the current noise level of the diffusion process.
We evaluate MoDE on four established imitation learning benchmarks, where it consistently outperforms dense transformer architectures and state-of-the-art baselines on CALVIN and LIBERO.
Pretrained on a subset of OXE for just 3 days on 6 GPUs, MoDE surpasses OpenVLA and Octo on SIMPLER.
In addition, MoDE achieves higher average performance with 90% fewer FLOPs, 20% faster inference, and 40% fewer parameters than the dense transformer diffusion policy.
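A minimal sketch of the core idea of noise-conditioned routing: the router sees only an embedding of the current noise level, so expert selection depends on the denoising stage rather than on token content. The module names, dimensions, and the soft (dense) gating below are illustrative assumptions, not MoDE's exact implementation.

```python
import torch
import torch.nn as nn

class NoiseConditionedMoE(nn.Module):
    """Illustrative MoE layer whose routing depends only on the noise level."""

    def __init__(self, d_model=256, n_experts=4, d_ff=1024):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The router only embeds the diffusion noise level sigma, so the
        # expert choice is a function of the denoising stage, not the token.
        self.noise_embed = nn.Sequential(nn.Linear(1, d_model), nn.SiLU())
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x, sigma):
        # x: (batch, seq, d_model); sigma: (batch,) noise levels
        gates = torch.softmax(self.router(self.noise_embed(sigma[:, None])), dim=-1)
        out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, S, D, E)
        return (out * gates[:, None, None, :]).sum(-1)           # weighted mix

x, sigma = torch.randn(2, 8, 256), torch.tensor([0.9, 0.1])
print(NoiseConditionedMoE()(x, sigma).shape)  # torch.Size([2, 8, 256])
```

In practice a sparse top-k routing would let the model skip inactive experts at inference, which is where the FLOP savings come from; the dense mixture above just keeps the sketch short.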
Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models
Nils Blank,
Moritz Reuss,
Marcel Ruehle,
Ömer Erdinç Yağmurlu,
Fabian Wenzel,
Oier Mees,
Rudolf Lioutikov
Conference on Robot Learning (CoRL), 2024
Oral @ 2nd Workshop on Mobile Manipulation and Embodied Intelligence at ICRA 2024
Paper Link
We introduce a novel approach to automatically label uncurated, long-horizon robot teleoperation data at scale in a zero-shot manner, without any human intervention.
We use a combination of pretrained vision-language foundation models to detect objects in a scene, propose possible tasks, and segment tasks from large datasets of unlabeled interaction data, and then train language-conditioned policies on the relabeled data.
Our initial experiments show that this enables training language-conditioned policies on unlabeled, unstructured datasets that match policies trained with oracle human annotations.
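A hedged sketch of this pipeline; detect_objects, propose_tasks, and segment_episode are hypothetical placeholders for the pretrained foundation models, shown here as stubs so the flow runs end to end.

```python
def detect_objects(frame):
    # stub for an open-vocabulary object detector (a VLM in the paper)
    return ["red block", "drawer"]

def propose_tasks(objects):
    # stub for a language model proposing feasible tasks from detected objects
    return [f"pick up the {objects[0]}", f"open the {objects[1]}"]

def segment_episode(frames, tasks):
    # stub: split the long-horizon episode into per-task segments
    mid = len(frames) // 2
    return [(0, mid, tasks[0]), (mid, len(frames), tasks[1])]

def label_episode(frames):
    """Turn one unlabeled teleop episode into (segment, instruction) pairs."""
    tasks = propose_tasks(detect_objects(frames[0]))
    return [{"frames": frames[s:e], "instruction": t}
            for s, e, t in segment_episode(frames, tasks)]

print(label_episode(list(range(10))))
```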
Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals
Moritz Reuss,
Ömer Erdinç Yağmurlu,
Fabian Wenzel,
Rudolf Lioutikov
Robotics: Science and Systems (RSS), 2024
Oral @ Workshop on Language and Robot Learning (LangRob) @ CoRL 2023
Project Page / Code / arXiv
We present a novel diffusion policy for learning from uncurated, reward-free offline data with sparse language labels.
Our method, the Multimodal Diffusion Transformer (MDT), learns complex, long-horizon behaviors and sets a new state of the art on the challenging CALVIN benchmark.
MDT uses a novel transformer architecture for diffusion policies that leverages pretrained vision and language foundation models and aligns multimodal goal specifications in the latent space of the transformer encoder.
Two novel self-supervised auxiliary objectives help MDT follow goals specified via language or images.
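To illustrate what aligning multimodal goal specifications in a shared latent space can look like, here is a simple InfoNCE-style contrastive loss between image-goal and language-goal latents. This is an assumption-laden sketch, not MDT's exact auxiliary objectives.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(img_goal_z, lang_goal_z, temperature=0.07):
    # img_goal_z, lang_goal_z: (batch, d) latents describing the *same* goals
    img = F.normalize(img_goal_z, dim=-1)
    lang = F.normalize(lang_goal_z, dim=-1)
    logits = img @ lang.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(img.size(0))     # matching pairs lie on the diagonal
    # symmetric cross-entropy pulls paired image/language goals together
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

z_img, z_lang = torch.randn(4, 128), torch.randn(4, 128)
print(contrastive_alignment_loss(z_img, z_lang).item())
```

The appeal of such an objective is that a policy trained this way can be commanded with either an image or a language goal at test time, since both map into one latent space.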
Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations
Xiaogang Jia,
Denis Blessing,
Xinkai Jiang,
Moritz Reuss,
Atalay Donat,
Rudolf Lioutikov,
Gerhard Neumann
ICLR 2024
OpenReview
We introduce D3IL, a novel set of simulation benchmark environments and datasets tailored for imitation learning.
D3IL is uniquely designed to challenge and evaluate models on their ability to learn and replicate diverse, multimodal human behaviors. The environments encompass multiple sub-tasks and object manipulations, providing a rich diversity of behavioral data that is often lacking in other datasets. We also introduce practical metrics to effectively quantify a model's capacity to capture and reproduce this diversity. Extensive evaluations of state-of-the-art methods on D3IL provide insightful benchmarks, guiding the development of future imitation learning algorithms that can generalize complex human behaviors.
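As one concrete, hypothetical example of such a diversity metric, the entropy of the empirical distribution over discrete behaviors (e.g. which sub-task ordering a rollout used) rewards policies that cover many behaviors; the metrics in the paper differ in detail.

```python
import math
from collections import Counter

def behavior_entropy(behavior_labels):
    """Entropy (in bits) of the empirical distribution over behavior labels."""
    counts = Counter(behavior_labels)
    n = len(behavior_labels)
    return sum(c / n * math.log2(n / c) for c in counts.values())

# A policy that always produces the same behavior scores 0;
# uniform coverage of all behaviors maximizes the score.
print(behavior_entropy(["A"] * 10))                # 0.0
print(behavior_entropy(["A", "B", "C", "D"] * 5))  # 2.0
```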
Goal Conditioned Imitation Learning using Score-based Diffusion Policies
Moritz Reuss,
Maximilian Li,
Xiaogang Jia,
Rudolf Lioutikov
Robotics: Science and Systems (RSS), 2023
Best Paper Award @ Workshop on Learning from Diverse, Offline Data (L-DOD) @ ICRA 2023
Project Page / Code / arXiv
We present a novel policy representation, called BESO, for goal-conditioned imitation learning using score-based diffusion models.
BESO effectively learns goal-directed, multimodal behavior from uncurated, reward-free offline data.
On several challenging benchmarks, our method outperforms current policy representations by a wide margin.
BESO can also be used as a standard policy for imitation learning and achieves state-of-the-art performance
with only 3 denoising steps.
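To show why so few denoising steps can suffice, here is a minimal Euler sampler for a goal-conditioned denoising model in the spirit of score-based diffusion policies; the function names, signature, and noise schedule are assumptions for illustration, not BESO's actual API.

```python
import torch

@torch.no_grad()
def sample_action(denoiser, state, goal, n_steps=3,
                  sigma_max=1.0, sigma_min=1e-3, act_dim=7):
    """Few-step Euler sampling: start from noise, step toward the data manifold."""
    sigmas = torch.linspace(sigma_max, sigma_min, n_steps + 1)
    a = torch.randn(state.shape[0], act_dim) * sigma_max  # pure-noise action
    for i in range(n_steps):
        denoised = denoiser(a, state, goal, sigmas[i])    # predicts clean action
        d = (a - denoised) / sigmas[i]                    # score-based direction
        a = a + d * (sigmas[i + 1] - sigmas[i])           # Euler step (sigma shrinks)
    return a

# Dummy denoiser so the sketch runs end to end.
denoiser = lambda a, s, g, sigma: torch.zeros_like(a)
print(sample_action(denoiser, torch.randn(2, 10), torch.randn(2, 16)).shape)
```

With a well-trained denoiser, each step removes a large fraction of the remaining noise, which is what makes very short schedules viable compared to the hundreds of steps used by early diffusion models.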