Five papers from RLLAB are accepted to IROS 2024

[2024.07.01]

The following papers have been accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024):

  • RNR-Nav: A Real-World Visual Navigation System Using Renderable Neural Radiance Maps by Minsoo Kim, Obin Kwon, Howoong Jun, and Songhwai Oh

    • Abstract: We propose a novel visual localization and navigation framework for real-world environments, utilizing a mapping method that directly integrates observed visual information into a bird's-eye-view map. While the renderable neural radiance map (RNR-Map) [1], a grid map consisting of visual information represented as latent codes at each grid cell, shows considerable promise in simulated settings, its deployment in real-world scenarios poses previously unexplored challenges. RNR-Map projects multiple vectors into a single latent code, resulting in information loss under suboptimal conditions. To address these issues, our enhanced RNR-Map for real-world robots, RNR-Map++, incorporates strategies to mitigate information loss, such as a weighted map and positional encoding. For robust real-time localization, we integrate a particle filter into the correlation-based localization framework using RNR-Map++ without a rendering procedure. Consequently, we establish a real-world robot system for visual navigation utilizing RNR-Map++, which we call “RNR-Nav.” Experimental results demonstrate that the proposed methods significantly enhance rendering quality and localization robustness compared to previous approaches. In real-world navigation tasks, RNR-Nav achieves a success rate of 84.4%, a 68.8% improvement over the methods of the original RNR-Map paper.
    • Video
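The rendering-free, correlation-based localization described in the RNR-Nav abstract can be pictured as a standard particle filter whose measurement model scores how well the current observation correlates with the latent map at each pose hypothesis. Below is a minimal sketch of one predict-update-resample cycle, assuming a generic `correlation_score(pose)` function; the paper's actual map interface and noise models are not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, motion, correlation_score):
    """One predict-update-resample cycle.
    particles: (N, 3) array of (x, y, yaw) pose hypotheses
    motion: (dx, dy, dyaw) odometry estimate
    correlation_score: maps a pose to a map-vs-observation correlation score
    """
    # Predict: propagate each particle through the motion model with noise.
    particles = particles + motion + rng.normal(0.0, 0.05, particles.shape)
    # Update: weight particles by how well the observation correlates with
    # the latent map at each hypothesized pose (no rendering required).
    weights = weights * np.array([correlation_score(p) for p in particles])
    weights = weights / weights.sum()
    # Resample only when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

The pose estimate at any time is the weighted mean of the particles; resampling only when the effective sample size drops keeps the filter from collapsing prematurely.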
  • Unsupervised 3D Part Decomposition via Leveraged Gaussian Splatting by Jae Goo Choy, Geonho Cha, Hogun Kee, and Songhwai Oh

    • Abstract: We propose a novel unsupervised method for motion-based 3D part decomposition of articulated objects from a single monocular video of a dynamic scene. In contrast to existing unsupervised methods relying on optical flow or tracking techniques, our approach addresses this problem without additional information by leveraging Gaussian splatting. We generate a series of Gaussians from a monocular video and then analyze the relationships between the Gaussians to decompose the dynamic scene into motion-based parts. To decompose dynamic scenes consisting of articulated objects, we design an articulated deformation field suited to the movement of articulated objects, and to effectively understand the relationships between Gaussians of different shapes, we propose a 3D reconstruction loss using 3D occupied voxel maps generated from the Gaussians. Experimental results show that our method outperforms existing approaches in 3D part decomposition for articulated objects and achieves competitive image synthesis performance.
    • Video
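The 3D occupied-voxel maps that the reconstruction loss above compares can be illustrated with a simple voxelization of Gaussian centers into a binary grid. This is only a sketch of the representation; the grid bounds, resolution, and the decision to use centers alone are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gaussians_to_voxel_map(centers, bounds, resolution):
    """Mark each voxel containing at least one Gaussian center as occupied.
    centers: (N, 3) array of Gaussian center positions
    bounds: (lo, hi) arrays giving the axis-aligned scene extent
    resolution: number of voxels per axis
    """
    lo, hi = bounds
    grid = np.zeros((resolution,) * 3, dtype=bool)
    # Map world coordinates into integer voxel indices.
    idx = ((centers - lo) / (hi - lo) * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid
```

Two such grids (one from the Gaussians, one from a reference) can then be compared voxel-wise, which is the kind of shape-agnostic overlap a reconstruction loss needs.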
  • Gradual Receptive Expansion Using Vision Transformer for Online 3D Bin Packing by Minjae Kang, Hogun Kee, Yoseph Park, Junseok Kim, Jaeyeon Jeong, Geunje Cheon, Jaewon Lee, and Songhwai Oh

    • Abstract: The bin packing problem (BPP) is a challenging combinatorial optimization problem with a number of practical applications. This paper deals with online 3D-BPP, a BPP variant with direct ties to real-world situations, in which the packer must immediately decide a loading position as items continually arrive. We propose a novel reinforcement learning algorithm named GREViT, the first to utilize a vision transformer for solving online 3D-BPP. With the proposed gradual receptive expansion technique, GREViT overcomes a limitation inherent in learning-based methods, which excel only in the bin sizes they were trained on. As a result, GREViT surpasses existing BPP algorithms in packing ratio across various bin sizes. The ability of GREViT to tackle real-world tasks is validated by its successful application to a real robot solving online 3D-BPP. The attached video demonstrates GREViT undertaking 3D-BPP in both simulated and real-world environments.
    • Video
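As background for the online 3D-BPP setting GREViT addresses, the bin state is commonly summarized as a top-down heightmap, and a placement is feasible only if the item rests on level support inside the bin. A minimal sketch of that feasibility check follows; this is a common formulation of the environment, not GREViT's learned policy or state encoding.

```python
import numpy as np

def try_place(heightmap, item, pos, bin_height):
    """Place item (w, d, h) with its corner at pos (x, y) if it fits flat.
    Returns True and updates the heightmap on success, False otherwise."""
    w, d, h = item
    x, y = pos
    region = heightmap[x:x + w, y:y + d]
    if region.shape != (w, d):
        return False  # item sticks out of the bin footprint
    base = region.max()
    if not np.all(region == base) or base + h > bin_height:
        return False  # support is uneven, or the item exceeds bin height
    heightmap[x:x + w, y:y + d] = base + h
    return True
```

An online packer repeatedly evaluates such placements for the arriving item and commits to one immediately, which is what makes the problem harder than its offline counterpart.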
  • Renderable Street View Map-Based Localization: Leveraging 3D Gaussian Splatting for Street-Level Positioning by Howoong Jun, Hyeonwoo Yu, and Songhwai Oh

    • Abstract: In this paper, we introduce a new method for street-level localization that is the first to utilize 3D Gaussian splatting for the street-level localization problem. Robust localization from street-level real-world images, such as street view imagery, is a major issue for autonomous vehicles, augmented reality (AR) navigation, and outdoor mobile robots. The objective is to determine the position and orientation of a query image that matches a street view database composed of RGB images. However, given the limited information available in street view images, accurately determining the location solely from this data presents a significant challenge. To address this challenge, we propose a novel method called renderable street view map-based localization (RSM-Loc), which augments 2D street view images into a renderable map using 3D Gaussian splatting. Upon receiving a query RGB image without geometry information, the proposed method renders 2D images from a pre-made renderable map and compares pose similarities between the rendered images and the query image. Through iterations of this process, the proposed method eventually estimates the pose of the given query image. The experimental results demonstrate that RSM-Loc outperforms neural-field-based localization baselines. Additionally, we conduct a deep analysis of the proposed method to show that it can serve as a new concept for the street-level localization problem.
    • Video
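The iterative render-and-compare loop the RSM-Loc abstract describes can be sketched generically: score candidate poses by rendering from the map, compare each rendering to the query, then resample new candidates around the best pose. All function names here (`render_fn`, `similarity_fn`) are assumptions standing in for the paper's components.

```python
import numpy as np

rng = np.random.default_rng(0)

def localize(query_image, render_fn, similarity_fn, candidate_poses, n_iters=3):
    """Estimate the query pose by iterative render-and-compare refinement."""
    best_pose = candidate_poses[0]
    for _ in range(n_iters):
        # Render an image from the map at each candidate pose and score it
        # against the query image.
        scores = [similarity_fn(render_fn(p), query_image) for p in candidate_poses]
        best_pose = candidate_poses[int(np.argmax(scores))]
        # Refine: resample new candidates around the current best pose.
        candidate_poses = [best_pose + rng.normal(0.0, 0.05, best_pose.shape)
                           for _ in range(len(candidate_poses))]
    return best_pose
```

The key property is that no geometry is needed from the query side: only a renderable map and an image-similarity measure drive the estimate.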
  • Safe CoR: A Dual-Expert Approach to Integrating Imitation Learning and Safe Reinforcement Learning Using Constraint Rewards by Hyeokjin Kwon, Gunmin Lee, Junseo Lee, and Songhwai Oh

    • Abstract: In the realm of autonomous agents, ensuring safety and reliability in complex and dynamic environments remains a paramount challenge. Safe reinforcement learning addresses these concerns by introducing safety constraints, but still struggles in intricate settings such as complex driving situations. To overcome these challenges, we present the safe constraint reward (Safe CoR) framework, a novel method that utilizes two types of expert demonstrations: reward expert demonstrations focusing on performance optimization and safe expert demonstrations prioritizing safety. By exploiting a constraint reward (CoR), our framework guides the agent to balance the performance goal of reward maximization with safety constraints. We test the proposed framework in diverse environments, including Safety Gym, MetaDrive, and the real-world Jackal platform. The proposed framework improves algorithm performance by 39% and reduces constraint violations by 88% on the real-world Jackal platform, demonstrating its efficacy. Through this innovative approach, we expect significant advancements in real-world performance, leading to transformative effects in the realm of safe and reliable autonomous agents.
    • Video
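The dual-expert idea in the Safe CoR abstract can be caricatured as a shaped reward that trusts the reward expert while the safety cost budget holds and shifts toward the safe expert once it is exceeded. The weighting scheme below is purely an illustrative assumption, not the paper's formulation of the constraint reward.

```python
import numpy as np

def constraint_reward(env_reward, action, reward_expert_action, safe_expert_action,
                      cost, cost_limit, alpha=0.5):
    """Shaped reward: agreement with the reward expert dominates while the
    safety cost budget holds; trust shifts toward the safe expert otherwise."""
    agree_reward = -float(np.linalg.norm(action - reward_expert_action))
    agree_safe = -float(np.linalg.norm(action - safe_expert_action))
    w = 1.0 if cost <= cost_limit else alpha  # shift trust when unsafe
    return env_reward + w * agree_reward + (1.0 - w) * agree_safe
```

Training on such a blended signal is one simple way to let demonstrations from two experts pull the policy toward both high reward and low constraint violation.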