research
overview of my research and selected projects
NEURO-SYMBOLIC AI
Learning reward functions and control policies that satisfy temporal-logic specifications
Designing dense or “informative” reward functions for Reinforcement Learning (RL) is a highly non-trivial task, and errors in reward design can lead to unsafe and undesirable learned control behaviors. This work introduces a neuro-symbolic learning-from-demonstrations (LfD) framework that uses high-level task specifications expressed in Signal Temporal Logic (STL), together with user demonstrations, to extract reward functions and control policies via reinforcement learning. The LfD-STL framework enables an agent to learn non-Markovian/temporal rewards and overcome critical safety and performance issues that arise with inverse reinforcement learning methods. The framework was initially developed for discrete, deterministic environments and later generalized to continuous spaces and stochastic environments via Gaussian processes and neural-network modeling.
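To make the reward construction concrete: the quantitative semantics of STL assign every trajectory a robustness degree measuring how strongly it satisfies a specification. The sketch below is a toy illustration, not the LfD-STL implementation; the trajectory, predicates, and weights are invented. It shows only how “always” and “eventually” robustness values over a finite trace can be combined into a non-Markovian episode reward.

```python
import numpy as np

def rob_always(signal, margin):
    """Robustness of G(margin > 0): the worst-case margin along the trace."""
    return min(margin(x) for x in signal)

def rob_eventually(signal, margin):
    """Robustness of F(margin > 0): the best margin attained along the trace."""
    return max(margin(x) for x in signal)

# A 1-D trajectory that must always stay above 0 (safety)
# and eventually exceed 0.8 (goal-reaching).
traj = np.array([0.1, 0.3, 0.5, 0.9, 0.7])
safety = rob_always(traj, lambda x: x - 0.0)    # rho(G(x > 0))
goal = rob_eventually(traj, lambda x: x - 0.8)  # rho(F(x > 0.8))

# A non-Markovian episode reward can weight the robustness degrees,
# e.g., prioritizing safety over goal-reaching:
reward = 2.0 * safety + 1.0 * goal
print(f"safety: {safety:.2f}, goal: {goal:.2f}, reward: {reward:.2f}")
```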
Learning to improve/extrapolate beyond demonstrator performance
Generally, a machine learning model’s performance is determined by the quality and quantity of the data it is trained on. Noisy data and limited human demonstrations, both common in robotic settings, therefore make it challenging to learn optimal behaviors. This work on neuro-symbolic apprenticeship learning implements temporal logic-guided reinforcement learning from demonstrations to automatically improve robot safety and performance via self-monitoring and adaptation. The capabilities of the framework are exhibited on a variety of mobile-navigation, fixed-base manipulation, and mobile-manipulation tasks in the NVIDIA Isaac simulator. This paper has been accepted at IROS 2024 and will be presented in Abu Dhabi this October. Additional details can be found in the supplemental document.
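The loop below is a heavily simplified sketch of the self-monitoring idea: a random hill-climb over a 1-D setpoint stands in for the actual RL algorithm, and the specification, policy parameterization, and constants are all invented. The mechanism it illustrates is the essential one, though: rollouts are promoted only when their STL robustness exceeds the best demonstration, which is what lets the agent extrapolate beyond demonstrator performance.

```python
import numpy as np

rng = np.random.default_rng(0)

def robustness(traj):
    # Toy spec: G(-1 < x < 1) AND F(x > 0.9); robustness is the worse of the two.
    safety = np.min(np.minimum(traj + 1.0, 1.0 - traj))
    goal = np.max(traj - 0.9)
    return min(safety, goal)

def rollout(setpoint):
    # Stand-in "policy": a trajectory sampled around a learned setpoint.
    return setpoint + 0.05 * rng.standard_normal(20)

# Noisy, suboptimal demonstrations: neither reliably reaches x > 0.9.
demos = [rollout(0.6), rollout(0.7)]
best = max(robustness(d) for d in demos)
demo_best = best

setpoint = 0.7  # initialize near the demonstrations
for _ in range(200):
    candidate = setpoint + 0.05 * rng.standard_normal()  # policy-update step
    score = robustness(rollout(candidate))
    if score > best:                        # self-monitoring against the specs:
        setpoint, best = candidate, score   # adopt only verified improvements

print(f"demo robustness: {demo_best:.2f} -> learned robustness: {best:.2f}")
```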
References
- Puranic, A. G., Deshmukh, J. V., & Nikolaidis, S. (2024). Signal Temporal Logic-Guided Apprenticeship Learning. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
- Puranic, A. G., Deshmukh, J. V., & Nikolaidis, S. (2021). Learning From Demonstrations Using Signal Temporal Logic in Stochastic and Continuous Domains. IEEE Robotics and Automation Letters (RA-L). Presented at IROS, 6(4), 6250–6257. https://doi.org/10.1109/LRA.2021.3092676
- Puranic, A. G., Deshmukh, J. V., & Nikolaidis, S. (2021). Learning from Demonstrations Using Signal Temporal Logic. Proceedings of the 2020 Conference on Robot Learning (CoRL), 155, 2228–2242. https://proceedings.mlr.press/v155/puranic21a.html
INTERPRETABLE/EXPLAINABLE AI (xAI)
Generating explainable temporal logic graphs from human data
Understanding and evaluating human demonstrations and learned robot behaviors plays a critical role in optimizing control policies; without it, a robot may infer incorrect reward functions that lead to undesirable or unsafe control policies. The prior LfD-STL framework required demonstrators to explicitly specify their preferences by ranking the STL specifications, with the ranked specifications represented as a directed acyclic graph (DAG) capturing preferences and dependencies. To alleviate this manual burden, we automatically infer the specification DAG from demonstrations via our novel Performance Graph Learning (PeGLearn) algorithm. We demonstrate through a user study in CARLA, a simulated driving environment, that PeGLearn facilitates explainability for AI-based systems, and we integrate human feedback (annotations) in a robot-assisted surgical domain to model the behaviors of surgeons according to their expertise. Additional details can be found in the supplemental document.
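The sketch below conveys the core graph-construction idea in a simplified form; the robustness table and specification names are fabricated, and PeGLearn’s actual per-demonstration graph aggregation is more involved. Specifications that demonstrators satisfy more strongly become ancestors in the DAG of those they satisfy weakly.

```python
import numpy as np

# Hypothetical robustness table: rows = demonstrations, columns = STL specs
# (in practice these scores come from evaluating each demonstration trace
# against each specification).
specs = ["G(no_collision)", "G(speed < limit)", "F(reach_goal)"]
rho = np.array([[0.9, 0.5, 0.2],
                [0.8, 0.6, -0.1],
                [0.7, 0.4, 0.3]])

# Specs that demonstrators satisfy more strongly are treated as
# higher-priority; ordering them induces the edges of the performance DAG.
mean_rho = rho.mean(axis=0)
order = np.argsort(-mean_rho)
edges = [(specs[order[i]], specs[order[i + 1]]) for i in range(len(order) - 1)]
for parent, child in edges:
    print(f"{parent}  ->  {child}")
```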
Learning (mining) specifications from temporal data
Autonomous cyber-physical systems such as self-driving cars, unmanned aerial vehicles, general-purpose robots, and medical devices can often be modeled as systems of heterogeneous components. Understanding the behavior of such components, especially those that incorporate deep learning, at an abstract level is thus a significant challenge. Our work seeks to answer: given a requirement on the system output behaviors, what are the assumptions on the model environment, i.e., the inputs to the model, that guarantee that the corresponding output traces satisfy the output requirement? We develop techniques involving decision-tree classifiers, counterexample-guided learning, optimization, enumeration, and parameter mining to extract STL specifications that explain system behaviors.
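As a toy illustration of the decision-tree route (the input features, the requirement, and the ground-truth region are all invented; in the actual work the labels come from simulating the model and checking its output traces against the requirement), a shallow tree over the input space yields axis-aligned constraints that read off directly as a mined environment assumption:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)

# Hypothetical setup: model inputs are (initial_speed, road_slope), and the
# output requirement (say, "settling time < 5 s") holds only on part of the
# input space. The labels below stand in for checking simulated output traces.
X = rng.uniform(low=[0.0, -10.0], high=[30.0, 10.0], size=(500, 2))
y = (X[:, 0] < 20.0) & (X[:, 1] < 5.0)

# A shallow decision tree yields axis-aligned input constraints that can be
# read off as a mined environment assumption (a precondition on the inputs).
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["initial_speed", "road_slope"]))
```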
References
- Puranic, A. G., Deshmukh, J. V., & Nikolaidis, S. (2023). Learning Performance Graphs From Demonstrations via Task-Based Evaluations. IEEE Robotics and Automation Letters (RA-L). Oral Presentation at ICRA, 8(1), 336–343. https://doi.org/10.1109/LRA.2022.3226072
- Mohammadinejad, S., Deshmukh, J. V., Puranic, A. G., Vazquez-Chanlatte, M., & Donzé, A. (2020). Interpretable Classification of Time-Series Data Using Efficient Enumerative Techniques. Proceedings of the 23rd International Conference on Hybrid Systems: Computation and Control (HSCC). https://doi.org/10.1145/3365365.3382218
- Mohammadinejad, S., Deshmukh, J. V., & Puranic, A. G. (2020). Mining Environment Assumptions for Cyber-Physical System Models. 2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS), 87–97. https://doi.org/10.1109/ICCPS48487.2020.00016
COMPUTER VISION
Evaluating the quality of vision-based perception algorithms
Computer vision is one of the major perception components of a cyber-physical system, with numerous applications in autonomous vehicles, industrial/factory robotics, medical devices, etc. Checking the correctness and ensuring the robustness of perception algorithms, such as those based on deep convolutional neural networks, is a major challenge. Conventionally, perception algorithms are tested by comparing their performance to ground-truth labels, which requires a laborious annotation process. We propose the use of Timed Quality Temporal Logic (TQTL) as a formal language to express desirable spatio-temporal properties of a perception algorithm processing a video, offering an alternative metric that can provide useful information even in the absence of ground-truth labels.
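The toy check below gives the flavor of such a property in a simplified, hand-written form; TQTL itself additionally reasons over object IDs, bounding boxes, and timing bounds, and the detections here are fabricated. The property, checkable without any ground-truth labels, says a high-confidence detection should not vanish in the very next frame:

```python
# Hypothetical per-frame detections from an object detector:
# each frame is a list of (class_label, confidence) pairs.
video = [
    [("car", 0.90), ("pedestrian", 0.85)],
    [("car", 0.88)],                        # the pedestrian flickers out
    [("car", 0.91), ("pedestrian", 0.70)],
]

def has_pedestrian(frame, thresh):
    return any(c == "pedestrian" and p >= thresh for c, p in frame)

# Simplified sanity property, checkable without ground truth:
# G( high-confidence pedestrian at t  =>  some pedestrian at t+1 ).
violations = [t for t in range(len(video) - 1)
              if has_pedestrian(video[t], 0.8)
              and not has_pedestrian(video[t + 1], 0.5)]
print("property violated at frames:", violations)  # -> [0]
```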
Vision-based metric for evaluating a surgeon’s performance
Due to the lack of instrument force feedback during robot-assisted surgery, tissue-handling technique is an important aspect of surgical performance to assess. We develop a vision-based machine learning algorithm for object detection and distance prediction to measure needle entry point deviation in tissue during robotic suturing as a proxy for tissue trauma.
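The metric itself is simple once the entry points are localized; the sketch below uses fabricated pixel coordinates in place of the outputs of the object-detection and distance-prediction models:

```python
import numpy as np

# Hypothetical pipeline outputs (pixel coordinates): the intended entry
# points marked on tissue vs. the detected actual needle entry points.
target_points = np.array([[120, 340], [200, 355], [285, 338]])
actual_points = np.array([[124, 346], [210, 351], [286, 352]])

# Needle entry-point deviation per suture throw: the Euclidean distance
# between intended and actual entry, used as a proxy for tissue trauma.
deviation_px = np.linalg.norm(actual_points - target_points, axis=1)
print("per-throw deviation (px):", deviation_px.round(1))
print("mean deviation (px):", round(float(deviation_px.mean()), 1))
```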
References
- Balakrishnan, A., Puranic, A. G., Qin, X., Dokhanchi, A., Deshmukh, J. V., Ben Amor, H., & Fainekos, G. (2019). Specifying and Evaluating Quality Metrics for Vision-based Perception Systems. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1433–1438. https://doi.org/10.23919/DATE.2019.8715114
- Puranic, A., Chen, J., Nguyen, J., Deshmukh, J., & Hung, A. (2019). MP35-04 Automated Evaluation of Instrument Force Sensitivity During Robotic Suturing Utilizing Vision-Based Machine Learning. Journal of Urology, 201(Supplement 4), e505–e506. https://doi.org/10.1097/01.JU.0000555994.79498.94