

{"id":164,"date":"2022-08-22T07:54:21","date_gmt":"2022-08-22T05:54:21","guid":{"rendered":"https:\/\/project.inria.fr\/avbot\/?page_id=164"},"modified":"2022-08-22T10:20:43","modified_gmt":"2022-08-22T08:20:43","slug":"chris-reinke","status":"publish","type":"page","link":"https:\/\/project.inria.fr\/avbot\/workshop\/chris-reinke\/","title":{"rendered":"Deep Transfer Reinforcement Learning for Social Robotics by Dr. Chris Reinke"},"content":{"rendered":"<p><strong>Title<\/strong>: Deep Transfer Reinforcement Learning for Social Robotics<\/p>\n<p><strong>Abstract<\/strong>: Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. They reevaluate the expected return of previously learned policies in a new target task and to transfer their knowledge. A limiting factor of the SF framework is its assumption that rewards linearly decompose into successor features and a reward weight vector. We propose a novel SF mechanism, \u03be-learning, based on learning the cumulative discounted probability of successor features. Crucially, \u03be-learning allows to reevaluate the expected return of policies for general reward functions. We introduce two \u03be-learning variations, prove its convergence, and provide a guarantee on its transfer performance. Experimental evaluations based on \u03be-learning with function approximation demonstrate the prominent advantage of \u03be-learning over available mechanisms not only for general reward functions, but also in the case of linearly decomposable reward functions.<\/p>\n<p><strong>Preprint<\/strong>:\u00a0<a href=\"https:\/\/arxiv.org\/abs\/2110.15701\">https:\/\/arxiv.org\/abs\/2110.15701<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Title: Deep Transfer Reinforcement Learning for Social Robotics Abstract: Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. 
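The following sketch illustrates the idea that ξ-learning replaces the dot product with, reconstructed only from the abstract's one-line description (learning the cumulative discounted probability of successor features). The feature discretization and all names here are assumptions made for this sketch, not the paper's construction.

```python
import numpy as np

# Idea: instead of assuming r = phi . w, learn the cumulative discounted
# probability of encountering each (here: discretized) feature outcome,
# then reevaluate a policy under *any* reward defined over features.

n_feature_bins = 6    # discretized feature outcomes phi_1, ..., phi_6
n_actions = 5

rng = np.random.default_rng(1)

# xi[a, k] ~ sum_t gamma^t P(phi_t = phi_k | s, a, pi): the cumulative
# discounted probability that following pi after action a yields feature k.
# Random placeholders standing in for learned estimates.
xi = rng.random((n_actions, n_feature_bins))

# A general (possibly nonlinear) reward over feature outcomes -- something
# the linear decomposition r = phi . w cannot represent.
def reward(k: int) -> float:
    return float(np.cos(k) ** 2)

r = np.array([reward(k) for k in range(n_feature_bins)])

# Expected discounted return per action under the general reward:
# Q(s, a) = sum_k r(phi_k) * xi(s, a, phi_k).
q_values = xi @ r
print("Q-values under the general reward:", q_values)
```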