PANDORA Project

The recent major advances in Artificial Intelligence are in large part due to progress in Machine Learning on Deep Neural Networks, which achieve state-of-the-art performance across a wide range of application areas. Such networks have a large number of parameters that interact in intricate ways; this gives them the power to learn complicated concepts, but it also makes them very difficult to interpret and explain, which strongly limits their applicability in practice, for example in health care. The explainability of graph neural networks (GNNs) has recently attracted a lot of research attention. Existing work mostly focuses on explaining individual neurons or on learning interpretable input/output mappings, rather than on explaining what actually goes on inside the network.

In Pandora, our hypothesis is that a GNN performs well because it has learned important concepts from the data. These concepts deserve to be brought to the attention of experts, whether to enable new scientific breakthroughs or to detect biases in the training data. Our research hypothesis is that this knowledge can be recovered by introspecting trained GNN models.

With Pandora, we propose to characterize, gain insight into, and explain in easily understandable terms the inner workings of GNNs. In a nutshell, we propose to discover statistically significant patterns of neural co-activation in order to determine how networks encode concepts over multiple neurons, identify information shared between classes, trace information through the network, and, overall, determine how networks perceive the world. Using these patterns, we aim to characterize under which conditions a prediction made by the network is to be trusted and, ultimately, to learn trustworthy GNNs that are explicitly explainable in terms of patterns. To assess the usefulness of our work, we will apply it to a variety of use cases in chemoinformatics, the social web, and the semantic web.
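To make the idea of co-activation patterns concrete, the following is a minimal sketch, not the project's actual method: given a matrix of hidden-layer activations (one row per input graph, one column per neuron), it binarizes the activations and enumerates neuron sets that frequently fire together. The random activation matrix, the binarization rule, and the support threshold are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

# Stand-in for activations of one GNN hidden layer: rows are input graphs,
# columns are neurons. In practice these would be collected from a trained model.
rng = np.random.default_rng(0)
activations = rng.random((200, 16))

# A neuron is considered "active" on a graph if it exceeds its mean activation.
binary = activations > activations.mean(axis=0)

def frequent_coactivations(binary, min_support=0.2, max_size=3):
    """Return neuron sets that are jointly active in at least `min_support` of the inputs."""
    _, n_neurons = binary.shape
    patterns = {}
    for size in range(2, max_size + 1):
        for neurons in combinations(range(n_neurons), size):
            support = binary[:, list(neurons)].all(axis=1).mean()
            if support >= min_support:
                patterns[neurons] = support
    return patterns

# Report the five most frequent co-activation patterns.
for neurons, support in sorted(frequent_coactivations(binary).items(),
                               key=lambda kv: -kv[1])[:5]:
    print(f"neurons {neurons} co-activate in {support:.0%} of graphs")
```

A statistically rigorous version would additionally test each pattern's support against a null model before reporting it, and relate the surviving patterns to class labels or domain concepts.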