‘Safe Deep Reinforcement Learning by Verifying Task-Level Properties’

“Cost functions are commonly employed in Safe Deep Reinforcement Learning (DRL). However, the cost is typically encoded as an indicator function due to the difficulty of quantifying the risk of policy decisions in the state space. Such an encoding requires the agent to visit numerous unsafe states to learn a cost-value function to drive the learning process toward safety. … In this paper, we investigate an alternative approach that uses domain knowledge to quantify the risk in the proximity of such states by defining a violation metric.”
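To make the contrast in the abstract concrete, here is a minimal sketch of an indicator cost next to a proximity-based risk estimate. It is not the paper's violation metric or its verification procedure. The function names, the sampling scheme, and the toy one-dimensional task are assumptions made for illustration only.

```python
import random
from typing import Callable, List, Sequence

State = Sequence[float]


def indicator_cost(state: State, is_unsafe: Callable[[State], bool]) -> float:
    """Sparse cost: non-zero only once the agent is already in an unsafe state."""
    return 1.0 if is_unsafe(state) else 0.0


def estimate_violation(
    state: State,
    policy: Callable[[State], str],
    violates: Callable[[State, str], bool],
    radius: float = 0.1,
    n_samples: int = 1000,
    seed: int = 0,
) -> float:
    """Hypothetical stand-in for a violation metric: the fraction of states
    sampled uniformly in a box around `state` where the policy's action
    breaks a hand-written task-level property."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        nearby: List[float] = [x + rng.uniform(-radius, radius) for x in state]
        if violates(nearby, policy(nearby)):
            hits += 1
    return hits / n_samples


# Toy task (assumed): the state is the distance to an obstacle.
def is_unsafe(state: State) -> bool:
    return state[0] <= 0.0  # collision


def always_forward(state: State) -> str:
    return "forward"  # deliberately careless policy


def too_close_and_forward(state: State, action: str) -> bool:
    # Property (assumed): within 0.5 of the obstacle, do not move forward.
    return state[0] < 0.5 and action == "forward"


near = [0.45]
print(indicator_cost(near, is_unsafe))  # 0.0: no collision has happened yet
print(estimate_violation(near, always_forward, too_close_and_forward))
```

In this toy setting the indicator cost stays at zero until a collision occurs, while the sampled estimate is already positive near the obstacle. That is the kind of dense signal the abstract attributes to quantifying risk in the proximity of unsafe states.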

Read the paper and see the full list of authors on arXiv.
