Two papers from RLLAB are accepted to IROS 2019


The following papers have been accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019):

  • Deep Predictive Autonomous Driving Using Multi-Agent Joint Trajectory Prediction and Traffic Rules by Kyunghoon Cho, Timothy Ha, Gunmin Lee, and Songhwai Oh
    • Abstract: Autonomous driving is a challenging problem because the autonomous vehicle must understand a complex and dynamic environment. This understanding consists of predicting the future behavior of nearby vehicles and recognizing predefined rules. It is observed that not all rules carry equal weight, and the priority of the rules may change depending on the situation or the driver’s driving style. In this work, we jointly reason about both the future trajectories of vehicles and the degree of satisfaction of each rule in a deep learning framework. Joint reasoning allows modeling interactions between vehicles and leads to better prediction results. A rule is represented as a signal temporal logic (STL) formula, and a robustness slackness, a margin to the satisfaction of the rule, is predicted for both the autonomous vehicle and other vehicles, in addition to future trajectories. The learned robustness slackness decides which rule should be prioritized in a given situation for the autonomous vehicle, and filters out invalid predicted trajectories for surrounding vehicles. The predicted information from the deep learning framework is used in model predictive control (MPC), which allows the autonomous vehicle to navigate efficiently and safely. We demonstrate the feasibility of our approach on the publicly available NGSIM datasets; the proposed method shows a driving style similar to that of a human driver and accounts for rule-related safety through future prediction of the surrounding vehicles.
    • Video:
  • Soft Action Particle Deep Reinforcement Learning for a Continuous Action Space by Minjae Kang, Kyungjae Lee, and Songhwai Oh
    • Abstract: Recent advances in actor-critic methods in deep reinforcement learning have made it possible to solve several continuous control problems. However, existing actor-critic algorithms require a large number of parameters to model the policy and value functions, which can lead to overfitting and makes hyperparameters difficult to tune. In this paper, we introduce a new off-policy actor-critic algorithm that significantly reduces the number of parameters compared to existing actor-critic algorithms without a loss in performance. The proposed method replaces the actor network with a set of action particles that employ few parameters. The policy distribution is then represented using a state-action value network together with the action particles. During the learning phase, to improve the performance of the policy distribution, the locations of the action particles are updated to maximize state-action values. To enhance exploration and stabilize convergence, we add perturbations to the action particles during training. In experiments, we validate the proposed method in MuJoCo environments and empirically show that it achieves similar or better performance than the state-of-the-art actor-critic method with fewer parameters.
    • Video:
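
To make the action-particle idea from the second abstract more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not give the exact policy form or update rule, so the softmax over particle values, the plain gradient-ascent step, the Gaussian perturbation, and the toy quadratic value function (`toy_q`, `toy_q_grad`) are all assumptions made for illustration. In the paper a learned state-action value network plays the role of `toy_q`.

```python
import numpy as np


def toy_q(state, actions):
    # Hypothetical stand-in for a learned state-action value network:
    # the value is highest when the action equals the (scalar) state.
    return -(actions - state) ** 2


def toy_q_grad(state, actions):
    # Analytic dQ/da for the toy value function above.
    return -2.0 * (actions - state)


def particle_policy(state, particles, temperature=1.0):
    # Assumed policy form: a softmax over the values of the particles.
    q = toy_q(state, particles)
    z = (q - q.max()) / temperature  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def update_particles(state, particles, lr=0.1, noise_std=0.0, rng=None):
    # Move each particle uphill on Q; optionally perturb it for exploration.
    step = lr * toy_q_grad(state, particles)
    if noise_std > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        step = step + rng.normal(0.0, noise_std, size=particles.shape)
    # Keep actions inside an assumed bounded action range [-1, 1].
    return np.clip(particles + step, -1.0, 1.0)


state = 0.3
particles = np.linspace(-1.0, 1.0, 5)  # five scalar action particles
initial_probs = particle_policy(state, particles)

for _ in range(50):
    particles = update_particles(state, particles)  # noise-free for clarity
```

In this toy setting the only "actor" parameters are the five particle locations, and the noise-free updates pull every particle toward the action that maximizes the toy value function. Setting `noise_std` above zero adds the kind of perturbation the abstract mentions for exploration.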