Implementation

ML3RI is organised into four work packages:

WP1: Robust perception of multi-person scenarios

The aim of WP1 is to develop learning methods that provide robust perception capabilities, with special emphasis on (i) how to properly fuse auditory and visual data, (ii) how to handle the combinatorial observation-to-person assignment problem, and (iii) how to handle a time-varying number of persons. The work package covers the design and implementation of auditory and DNN cue extractors, the inference of individual and collective behavioural cues with probabilistic models, the joint learning of the probabilistic and DNN parameters, and the handling of a time-varying number of persons. Four tasks are foreseen:

  • T1.1 Design and extraction of auditory and visual cues
  • T1.2 Inference of individual and collective cues
  • T1.3 Hybrid probabilistic/deep learning
  • T1.4 Time-varying number of persons
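
To illustrate why the observation-to-person assignment problem mentioned in WP1 is combinatorial, here is a minimal sketch. It is not the method developed in ML3RI: the function name `best_assignment`, the 2D positions and the Euclidean cost are all assumptions made for illustration. It enumerates every one-to-one assignment, so the number of candidates grows as n! with the number of persons.

```python
from itertools import permutations
from math import factorial, hypot


def best_assignment(observations, persons):
    """Exhaustively search one-to-one observation-to-person assignments.

    Hypothetical toy setting: observations and persons are 2D points and
    the cost of an assignment is the sum of Euclidean distances.
    Returns (assignment, cost), where assignment[i] is the index of the
    person matched to observation i.
    """
    if len(observations) != len(persons):
        raise ValueError("this sketch assumes as many observations as persons")

    best_cost, best_perm = float("inf"), None
    # Every permutation of person indices is one candidate assignment.
    for perm in permutations(range(len(persons))):
        cost = sum(
            hypot(ox - persons[p][0], oy - persons[p][1])
            for (ox, oy), p in zip(observations, perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost


# Illustrative data only: three persons and three noisy, shuffled detections.
persons = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
observations = [(3.9, 0.1), (0.1, -0.1), (2.1, 0.0)]

assignment, cost = best_assignment(observations, persons)
n_candidates = factorial(len(persons))
print("assignment:", assignment)
print("candidate assignments examined:", n_candidates)
```

With three persons only six candidates are examined, but ten persons already give more than three million. The sketch also fixes the number of persons, whereas in the wild observations can be missing or spurious and the number of persons varies over time (T1.4).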

WP2: Reinforcement learning for multi-person robot interaction in the wild

Behaviour synthesis is very challenging, since the space of all possible observations, actions and states is huge. Indeed, apart from the individual and collective cues and their associated observations, the set of robot actions includes gestures, navigation, speech, and head and body movements, to name a few. The objective of WP2 is to provide automatic tools so that robots learn how to coordinate their actions with the perceived scene. The work is split into three tasks:

  • T2.1 Deep structured reinforcement learning
  • T2.2 Constrained deep reinforcement learning
  • T2.3 Joint learning of the action and prediction models

WP3: Generation of multi-modal behavioural data

DNNs are high-performance, robust methodologies for many classification, recognition and decision-making tasks. However, their success strongly relies on the existence of huge training datasets. Multi-person interactions in the wild are so rich (and thus complex in data variability) that brute-force learning is not expected to work, as discussed in WP1 and WP2. Even if structuring the perception model (WP1), constraining the action policy search or jointly training the action and perception models (WP2) can help train the DNN and DQN used in ML3RI, we believe that it is still necessary to reduce as much as possible the amount of scenario-specific data required for fine-tuning the parameters of the overall architecture. Indeed, gathering multi-person interactions in real situations is a very tedious process. One must conceive the scenario, find and prepare the participants and recording devices, acquire and post-process the data, and finally annotate it. Therefore, any effort to reduce the amount of required data will have a very positive impact on the development of interactive systems. Three tasks are foreseen:

  • T3.1 Preliminary data collection
  • T3.2 Learning to generate data
  • T3.3 Cycling generation, action and perception

WP4: Experimental validation

Experimental validation of the proposed methodology is the means to demonstrate short-term progress and to validate the achievements of ML3RI. The final aim is to evaluate the usefulness of the developed robot learning skills. Specifically, we are interested in multi-person robot interactions in the wild, and we will therefore conduct experiments in conditions that are as realistic as possible. The experimental validation campaign will be carried out on two different robotic platforms: AV-TurtleBot and ARI from PAL Robotics. Three tasks are foreseen:

  • T4.1 Single person experiments
  • T4.2 Multi-person experiments with AV-TurtleBot
  • T4.3 Multi-person experiments with ARI
