This internship may lead to a PhD thesis.
In the current model, the data series collected by the “things” of the IoT end up on a centralized server, where they are analyzed and queried. However, systematically centralizing data series harms both users’ privacy and sustainability. This explains the growing interest in transposing traditional data management functionalities directly into smart objects. Relational database operations such as selection, projection and join have recently been proposed for smart objects equipped with large Flash memory, as well as more complex processing such as facial recognition and full-text keyword search (see for example Anciaux et al., PVLDB 2015).
The Priv’IoT project aims to change the way sensory data is managed in the IoT by putting data owners at the center of the architecture. It will allow data owners to better understand and control the use of the private sensory data collected by their various devices, or “things” (e.g., smart watch, smartphone). Informed users will be aware of privacy risks and able to act on them through appropriate adaptive interfaces. To reach its goals, Priv’IoT securely stores, analyzes and protects the sensory data generated by a given user’s things in a secure and trustworthy device named the PrivaBox, before the collected data is shared with applications.
To improve both sustainability and privacy in the IoT, we will transpose existing techniques for efficiently managing data series in real time into new personal data management techniques for time series, placed under the control of the data owner and linked to a set of IoT devices or smart objects. The idea is to organize, from a secure home box, a storage and indexing layer able to keep part of the data (seldom-used or private data) and the corresponding indexes in the smart objects, at the network edges (in the spirit of the Trusted Cells approach of Anciaux et al., CIDR 2013), and thus satisfy energy and privacy constraints. Indexing will be applied selectively to the data as they are queried, following the principles of Database Cracking (Idreos et al., CIDR 2007) and Adaptive Indexing (Zoumpatianos et al., 2016), in order to concentrate resources on the useful part of the data. We will also study embedded data compression, aggregation, degradation and aging techniques in relation to privacy and energy constraints, to minimize the data volume transmitted to the home cloud while preserving the expected uses of the data.
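To make the query-driven indexing idea concrete, here is a minimal sketch in C of the basic "crack in two" step behind Database Cracking: each range query physically reorganizes the column around its bounds, so only data that is actually queried gets organized. The function names (`crack_in_two`, `range_select`) and the use of plain `int` readings are illustrative assumptions, not part of the project or of PlugDB. The sketch also leaves out the cracker index that a real implementation would keep to remember earlier split positions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: partition col[lo..hi) in place so that every value
 * smaller than pivot precedes every value greater than or equal to pivot.
 * Returns the index of the first value >= pivot (hi if there is none). */
size_t crack_in_two(int *col, size_t lo, size_t hi, int pivot)
{
    assert(lo <= hi);
    size_t split = lo;
    for (size_t i = lo; i < hi; i++) {
        if (col[i] < pivot) {
            int tmp = col[i];
            col[i] = col[split];
            col[split] = tmp;
            split++;
        }
    }
    return split;
}

/* Hypothetical helper: answer the range query low <= v < high by cracking
 * the column twice. On return, the qualifying values are stored
 * contiguously in col[*first..*last). */
void range_select(int *col, size_t n, int low, int high,
                  size_t *first, size_t *last)
{
    assert(low <= high);
    *first = crack_in_two(col, 0, n, low);       /* values < low move left   */
    *last = crack_in_two(col, *first, n, high);  /* values >= high move right */
}
```

Calling `range_select(col, n, 5, 22, &first, &last)` on an unsorted array of readings leaves the values in [5, 22) contiguous in `col[first..last)`. A later query over a nearby range would then only need to repartition one of the three resulting pieces, provided the split positions are remembered.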
The goal of this master’s project is to propose solutions to securely store and index data series (more precisely, time series) in a decentralized and individual way, such that data and data management are pushed to the edge of the network (e.g., into the “things” themselves or into a home box). To demonstrate the novel capabilities resulting from the project, we plan to build two end-to-end demonstrators that span from data collection to the implementation of a real application. The demonstrators will target the quantified-self domain, with data collected by domestic IoT objects and by quantified-self devices, while the PlugDB platform (see https://project.inria.fr/plugdb/en/) developed by our team will be used as the PrivaBox.
This internship may lead to a CIFRE PhD thesis with Hippocad (http://hippocad.com/), a company providing digital platforms to ease the coordination of medical and social care at home. Hippocad develops a home box (http://humansbox.com/) which acts both as a secure safe for the patient’s personal data (including data produced by IoT objects and quantified-self devices) and as a node of a P2P network connecting the patient’s home to service providers.
Required skills: databases, storage and indexing schemes, database encryption and C programming.
Advisors: Nicolas Anciaux and Iulian Sandu Popa.
Location: PETRUS team https://www.inria.fr/en/teams/petrus (formerly SMIS https://www.inria.fr/equipes/smis), located at UVSQ, 45 avenue des Etats Unis – 78035 Versailles (http://tinyurl.com/comeUVSQ)
References:
N. Anciaux, P. Bonnet, L. Bouganim, B. Nguyen, I. Sandu Popa, and P. Pucheral. Trusted Cells: A Sea Change for Personal Data Services. In CIDR, 2013.
N. Anciaux, S. Lallali, I. Sandu Popa, and P. Pucheral. A Scalable Search Engine for Mass Storage Smart Objects. PVLDB 8(9): 910-921, 2015.
S. Idreos, M. L. Kersten, and S. Manegold. Database Cracking. In CIDR, 2007.
K. Zoumpatianos, S. Idreos, and T. Palpanas. ADS: The Adaptive Data Series Index. The VLDB Journal, 1-24, 2016.