In the near future, intelligent and autonomous systems will become more ubiquitous and pervasive in applications such as autonomous robotics, design of intelligent personal assistants, and management of energy smart grids. Although very diverse, these applications call for the development of decision-making systems able to interact with and manage open-ended, uncertain, and partially known environments. This will require increasing the autonomy of ICT systems, which will have to continuously learn from data, improve their performance over time, and quickly adapt to changes.

EXTRA-LEARN is directly motivated by the evidence that one of the key features that allows humans to accomplish complicated tasks is their ability to build knowledge from past experience and transfer it while learning new tasks. We believe that integrating transfer of learning into machine learning algorithms will dramatically improve their performance and enable them to solve complex tasks. We identify the reinforcement learning (RL) framework as the most suitable candidate for this integration. RL formalizes the problem of learning an optimal control policy from the experience directly collected from an unknown environment. Nonetheless, practical limitations of current algorithms have encouraged research to focus on how to integrate prior knowledge into the learning process. Although this improves the performance of RL algorithms, it dramatically reduces their autonomy. In this project we pursue a paradigm shift from designing RL algorithms that incorporate prior knowledge to methods able to incrementally discover, construct, and transfer "prior" knowledge in a fully automatic way. More specifically, three main elements of RL algorithms would significantly benefit from transfer of knowledge.

  1. For every new task, RL algorithms need to explore the environment for a long time, which leads to slow learning in large environments. Transfer learning would enable RL algorithms to dramatically reduce the exploration of each new task by exploiting its resemblance to tasks solved in the past.
  2. RL algorithms evaluate the quality of a policy by computing its state-value function. Whenever the number of states is too large, approximation is needed. Since approximation may cause instability, designing suitable approximation schemes is particularly critical. While this is currently done by a domain expert, we propose to perform this step automatically by constructing features that incrementally adapt to the tasks encountered over time. This would significantly reduce human supervision and increase the accuracy and stability of RL algorithms across different tasks.
  3. In order to deal with complex environments, hierarchical RL solutions have been proposed, where state representations and policies are organized over a hierarchy of subtasks. This requires a careful definition of the hierarchy, which, if not properly constructed, may lead to very poor learning performance. The ambitious goal of transfer learning is to automatically construct a hierarchy of skills, which can be effectively reused over a wide range of similar tasks.
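To make the second element concrete, the following is a minimal sketch of policy evaluation with a linear state-value function V(s) ≈ w · φ(s), learned by TD(0). The chain environment, the fixed policy, and the one-hot feature map `phi` are illustrative assumptions, not part of the EXTRA-LEARN proposal; a transfer method as described above would instead construct the features automatically and share them across related tasks.

```python
import numpy as np

def phi(state, n_states):
    """One-hot features for illustration; a transfer approach would
    construct features incrementally, adapted to the tasks seen so far."""
    f = np.zeros(n_states)
    f[state] = 1.0
    return f

def td0_policy_evaluation(n_states=5, episodes=200, alpha=0.1, gamma=0.9, seed=0):
    """Evaluate a fixed 'move right' policy on a simple chain MDP.

    States 0..n_states-1; the last state is terminal and yields reward 1.
    V(s) is approximated as w @ phi(s) and updated by the TD(0) rule
      w += alpha * (r + gamma * V(s') - V(s)) * phi(s).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(n_states)
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Fixed policy: attempt to move right; succeeds with prob. 0.9.
            s_next = s + 1 if rng.random() < 0.9 else s
            r = 1.0 if s_next == goal else 0.0
            bootstrap = 0.0 if s_next == goal else gamma * (w @ phi(s_next, n_states))
            td_error = r + bootstrap - w @ phi(s, n_states)
            w += alpha * td_error * phi(s, n_states)
            s = s_next
    return w

weights = td0_policy_evaluation()
print(weights)  # value estimates increase toward the goal state
```

With one-hot features this reduces to tabular TD(0); the point of the feature-construction research line is precisely to replace `phi` with learned features that remain stable under approximation and transfer across tasks.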

Providing transfer solutions for each of these elements sets our objectives and defines the research lines of EXTRA-LEARN. The major short-term impact of the project will be a significant advancement of the state of the art in transfer and RL, with the development of a novel generation of transfer RL algorithms, whose improved performance will be evaluated in a number of test beds and validated by a rigorous theoretical analysis. In the long term, we envision decision-making support systems where transfer learning takes advantage of the massive amount of data available from many different tasks (e.g., users) to construct high-level knowledge that allows sophisticated reasoning and learning in complex domains, with a dramatic impact on a wide range of fields, from robotics to healthcare, from energy to transportation.
