CoLearn

Every minute, 500 hours of video are uploaded to YouTube, and 240,000 images are added to Facebook. Since humans cannot possibly process and view this mass of data in its entirety, advanced machine learning methods are needed to sort, organize, and recommend content to users. As a crucial first step, however, the data must be transmitted from users to a server for storage and processing. The conventional communication framework assumes that the server should completely reconstruct the data, possibly with some distortion. Instead, this project aims to develop a novel communication paradigm that treats learning performance as one of the key criteria in the design of the communication system.

As described in the scheme above, the source-channel code design will jointly address two concurrent criteria: data reconstruction quality and learning performance.

The project will therefore develop an information-theoretic analysis to understand the fundamental limits of such systems, and design novel coding techniques that allow both learning and data reconstruction from the coded data. It will then apply its main findings to two applications:

Video content analysis in the compressed domain
Acoustic signal classification from underwater sensors

Presentation