The impact of ML3RI is manifold. Indeed, by developing the work-flow proposed in the project, we will understand which automatic machine perception and decision-taking models and associated algorithms are suitable for robots immersed in multi-person conversations in the wild. From a methodological perspective we will introduce structured, hybrid probabilistic/deep learning models for perception and action that are trainable with very few and noisy scenario-specific data. In this regard, we will create and foster research around a new sub-domain of machine learning aiming to investigate how to jointly learn structured perception and action models. This is important because current methods are limited in performance (they are too shallow) or in interpretation (they are unstructured). This opens the door to investigate dialogue modeling in multi-person conversations in the wild, which enriches the current span of possible experimental scenarios. Moreover, the project will allow the possibility to investigate semantics and knowledge representation for robotic platforms, currently unable to converse with multiple persons at the same time.