Machine learning (ML) is ubiquitous in AI-based services and data-oriented scientific fields, but it raises serious privacy concerns when models are trained on personal data. The starting point of PRIDE is that personal data should belong to the individual who produces it. This requires revisiting ML algorithms so that they can learn from many decentralized personal datasets while preventing the reconstruction of raw data. Differential Privacy (DP) provides a strong notion of protection, but current decentralized ML algorithms cannot learn useful models under DP.
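As a minimal illustration of the kind of protection DP offers (not an algorithm proposed by PRIDE), the sketch below releases a noisy average of per-user values via the standard Gaussian mechanism: each contribution is clipped to bound its influence, and noise calibrated to that bound is added before release. The clipping bound and noise multiplier are illustrative choices.

```python
import numpy as np

def dp_mean(values, clip=1.0, noise_multiplier=1.0, rng=None):
    """Release an approximately differentially private mean of per-user values.

    Each user's contribution is clipped to bound the sensitivity of the mean,
    then Gaussian noise scaled to that sensitivity is added (the Gaussian
    mechanism; parameters here are illustrative, not prescribed by PRIDE).
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    clipped = np.clip(values, -clip, clip)
    # If one user changes their value, the mean moves by at most 2 * clip / n.
    sensitivity = 2.0 * clip / len(values)
    noise = rng.normal(0.0, noise_multiplier * sensitivity)
    return clipped.mean() + noise

# Example: a noisy, privacy-preserving average over 1000 users' values.
print(dp_mean(np.random.default_rng(0).uniform(-2, 2, size=1000)))
```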
The goal of PRIDE is to develop theoretical and algorithmic tools that enable differentially private ML methods to operate on decentralized datasets, through three complementary objectives:
- Prove that decentralized learning protocols naturally amplify DP guarantees (the setting is illustrated by the sketch after this list);
- Propose algorithms at the intersection of decentralized ML and secure multi-party computation;
- Design data-adaptive communication schemes to speed up convergence on heterogeneous datasets.
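To make the decentralized setting concrete, here is a toy sketch, not one of the PRIDE algorithms: each node takes a gradient step on its own data, perturbs it with locally added Gaussian noise, and then averages its parameters with its neighbors in a gossip step. All names, the ring topology, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

def decentralized_dp_step(models, datasets, neighbors, lr=0.1,
                          clip=1.0, sigma=0.5, rng=None):
    """One round of a toy decentralized DP protocol for linear regression.

    Each node i computes a clipped gradient on its local data, adds Gaussian
    noise locally, takes a gradient step, then averages its parameters with
    its neighbors (gossip). Illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    updated = []
    for i, w in enumerate(models):
        X, y = datasets[i]
        grad = X.T @ (X @ w - y) / len(y)                        # local least-squares gradient
        grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # clip to bound sensitivity
        grad += rng.normal(0.0, sigma * clip, size=grad.shape)   # local Gaussian noise
        updated.append(w - lr * grad)
    # Gossip step: each node averages with its neighbors (and itself).
    return [np.mean([updated[j] for j in neighbors[i] + [i]], axis=0)
            for i in range(len(models))]

# Example: 4 nodes on a ring, each holding its own small local dataset.
rng = np.random.default_rng(0)
d = 3
data = [(rng.normal(size=(50, d)), rng.normal(size=50)) for _ in range(4)]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
models = [np.zeros(d) for _ in range(4)]
for _ in range(20):
    models = decentralized_dp_step(models, data, ring, rng=rng)
```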