LfD with STL - 1 | Aniruddh Puranic

In this paper, we propose a learning-from-demonstrations (LfD) framework that uses high-level tasks expressed in Signal Temporal Logic (STL), together with user demonstrations, to extract reward functions and control policies via reinforcement learning. The paper shows how such a framework can learn non-Markovian (temporal) rewards and overcome some issues with inverse reinforcement learning methods.
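To give a rough sense of why STL-based rewards are temporal rather than per-state, here is a minimal sketch of the standard quantitative (robustness) semantics for the "eventually" and "always" operators. It is an illustration only, not the reward-extraction method from the paper. The helper names, trajectories, and thresholds are all hypothetical.

```python
from typing import Callable, Sequence

# A predicate maps a signal value to a real-valued margin:
# positive means satisfied, negative means violated.
Predicate = Callable[[float], float]


def robustness_eventually(signal: Sequence[float], predicate: Predicate) -> float:
    """Robustness of F(p): the best margin achieved at any time step."""
    return max(predicate(x) for x in signal)


def robustness_always(signal: Sequence[float], predicate: Predicate) -> float:
    """Robustness of G(p): the worst margin over all time steps."""
    return min(predicate(x) for x in signal)


# Hypothetical demonstration: distances (in meters) sampled along one trajectory.
distance_to_goal = [4.0, 3.0, 1.5, 0.25]
distance_to_obstacle = [2.0, 1.25, 1.0, 1.5]

# F(distance_to_goal < 0.5): eventually get within 0.5 m of the goal.
reach_goal = robustness_eventually(distance_to_goal, lambda d: 0.5 - d)

# G(distance_to_obstacle > 0.5): always stay more than 0.5 m from the obstacle.
avoid_obstacle = robustness_always(distance_to_obstacle, lambda d: d - 0.5)

# Conjunction of two formulas takes the minimum of their robustness values.
spec_robustness = min(reach_goal, avoid_obstacle)

print(reach_goal, avoid_obstacle, spec_robustness)
```

Each value depends on the whole trajectory rather than on a single state, which is what makes a reward derived from such a specification non-Markovian.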

This paper was presented at the Conference on Robot Learning (CoRL) 2020.

Paper
Video