# Inverse Reinforcement Learning

Given the $(s, a, R)$ triple, where $R$ is the reward from expert annotations, learn the function $f(s, a) \rightarrow R$.

Presupposition: the reward function provides the most succinct and transferable definition of a task.

---

## References

1. Inverse Reinforcement Learning from Preferences: https://danieltakeshi.github.io/2021/04/01/inverse-rl-prefs/
2. Berkeley Lecture on IRL: https://people.eecs.berkeley.edu/~pabbeel/cs287-fa15/slides/lecture9-inverseRL.pdf
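
## Example

Taking the definition at the top of this note literally, learning $f(s, a) \rightarrow R$ from annotated $(s, a, R)$ triples can be treated as a supervised regression problem. The following is a minimal sketch under assumptions that are not in the note itself: states and actions are small discrete sets, each $(s, a)$ pair is encoded as a one-hot feature, and the fit is ordinary least squares. The function names (`one_hot_features`, `fit_reward`) and the toy data are hypothetical and exist only for illustration.

```python
import numpy as np


def one_hot_features(states, actions, n_states, n_actions):
    """Encode each (s, a) pair as a one-hot vector of length n_states * n_actions."""
    states = np.asarray(states)
    actions = np.asarray(actions)
    phi = np.zeros((len(states), n_states * n_actions))
    # Row i gets a single 1 at the flat index of its (s, a) pair.
    phi[np.arange(len(states)), states * n_actions + actions] = 1.0
    return phi


def fit_reward(states, actions, rewards, n_states, n_actions):
    """Least-squares fit of a tabular reward f(s, a) from annotated triples."""
    phi = one_hot_features(states, actions, n_states, n_actions)
    targets = np.asarray(rewards, dtype=float)
    w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    # Reshape the weight vector into a table indexed as f[s, a].
    return w.reshape(n_states, n_actions)


# Hypothetical toy annotations: (s, a, R) triples with 2 states and 2 actions.
states = [0, 0, 1, 1, 0]
actions = [0, 1, 0, 1, 0]
rewards = [1.0, 0.0, 0.5, 2.0, 3.0]

f = fit_reward(states, actions, rewards, n_states=2, n_actions=2)
print(f)
```

With one-hot features, the least-squares estimate for each observed $(s, a)$ pair is the mean of its annotated rewards, so the pair $(0, 0)$, annotated with 1.0 and 3.0, is assigned 2.0. A pair that never appears in the data receives 0 from the minimum-norm solution, which is an artifact of this encoding rather than a meaningful estimate.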