

{"id":66,"date":"2021-10-04T11:16:23","date_gmt":"2021-10-04T09:16:23","guid":{"rendered":"https:\/\/project.inria.fr\/colearn\/?page_id=66"},"modified":"2023-09-19T17:19:41","modified_gmt":"2023-09-19T15:19:41","slug":"results","status":"publish","type":"page","link":"https:\/\/project.inria.fr\/colearn\/results\/","title":{"rendered":"Results"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Information-Theoretic bounds for Regression<\/h2>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:100%\">\n<p>In the context of goal-oriented communications, we consider a source that is compressed and transmitted to a receiver, which aims to perform regression using both the received compressed data and side information available only at the decoder, as illustrated in the coding scheme below.<\/p>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><a href=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/regression_scheme-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"403\" height=\"155\" src=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/regression_scheme-1.png\" alt=\"\" class=\"wp-image-202\" srcset=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/regression_scheme-1.png 403w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/regression_scheme-1-300x115.png 300w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/regression_scheme-1-150x58.png 150w\" sizes=\"auto, (max-width: 403px) 100vw, 403px\" \/><\/a><figcaption>Coding scheme for 
regression<\/figcaption><\/figure><\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:55%\">\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><a href=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/rate-generalisation_error.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/rate-generalisation_error.png\" alt=\"\" class=\"wp-image-165\" width=\"313\" height=\"228\" srcset=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/rate-generalisation_error.png 768w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/rate-generalisation_error-300x219.png 300w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/rate-generalisation_error-150x109.png 150w\" sizes=\"auto, (max-width: 313px) 100vw, 313px\" \/><\/a><figcaption>Rate versus generalization-error regions for linear regression in the non-asymptotic regime, for different block lengths n.<\/figcaption><\/figure><\/div>\n<\/div>\n<\/div>\n\n\n\n<p>For parametric regression performed on compressed observations, we derive new achievable information-theoretic (rate, generalization-error) regions in the asymptotic regime, and show that the proposed bounds are much tighter than existing ones [1]. This result is then extended to the non-asymptotic regime using tools from finite block-length information theory [2]. <\/p>\n\n\n\n<p>We next investigate the trade-off between data reconstruction, evaluated through the distortion, and the regression task, evaluated through the generalization error. Existing works show that for various learning tasks (hypothesis testing [4], visual perception [3], etc.), there is a trade-off in terms of coding rate between reconstruction and learning performance. 
In the case of regression, we propose an achievable scheme that asymptotically attains both the minimum generalization error and the optimal reconstruction rate-distortion region, which shows that there is in fact no trade-off between reconstruction and regression. We confirm this result in the non-asymptotic regime by evaluating the correlation between the different terms of the dispersion matrix. <\/p>\n\n\n\n<p><em>[1] M. Raginsky, \u201cLearning from compressed observations,\u201d in 2007 IEEE Information Theory Workshop, 2007, pp. 420\u2013425.<br>[2] S. Watanabe, S. Kuzuoka, and V. Y. F. Tan, \u201cNonasymptotic and second-order achievability bounds for coding with side-information,\u201d IEEE Transactions on Information Theory, vol. 61, no. 4, pp. 1574\u20131605, 2015.<br>[3] Y. Blau and T. Michaeli, \u201cRethinking lossy compression: The rate-distortion-perception tradeoff,\u201d in International Conference on Machine Learning. PMLR, 2019, pp. 675\u2013685.<br>[4] G. Katz, P. Piantanida, and M. Debbah, \u201cDistributed binary detection with lossy data compression,\u201d IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 5207\u20135227, 2017.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Learning on entropy-coded images with CNNs<\/h2>\n\n\n\n<p>We start from the observation that the volume of generated visual data is so massive (500 hours of video uploaded to YouTube every minute in 2022) that these data can no longer be viewed by humans and are instead processed by machines. For the tasks performed by these machines (object detection, tracking, scene recognition, etc.), reconstructing the entire image is not necessary. 
Therefore, we are interested in processing data directly in the compressed domain, specifically within the binary stream generated by an encoder that includes an entropy coding function.<\/p>\n\n\n\n<p>Indeed, in data compression algorithms (JPEG, MPEG, etc.), entropy coding is the only operation that actually reduces the size of the data; the other operations (transform, prediction) prepare the data for compression but do not shrink it. We therefore study the problem of classification in the entropy-compressed domain when the classification is performed by a CNN, one of the state-of-the-art approaches for this task.<\/p>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><a href=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"967\" height=\"209\" src=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1.png\" alt=\"\" class=\"wp-image-184\" srcset=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1.png 967w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1-300x65.png 300w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1-768x166.png 768w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/colearn_website_coding-1-150x32.png 150w\" sizes=\"auto, (max-width: 967px) 100vw, 967px\" \/><\/a><\/figure><\/div>\n<\/div><\/div>\n\n\n\n<p>The problem is challenging due to the characteristics of entropy codes: the encoded symbols have variable lengths, and the coding function is not fixed but depends on the data. 
Our contributions are as follows:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Contrary to intuition, we demonstrate that it is possible to learn from entropy-coded data (albeit with an expected loss of performance).<\/li><li>We identify two properties that a convolutional neural network must satisfy to perform well on such data.<\/li><li>These properties qualitatively explain the loss of performance observed when working with data in compressed form.<\/li><li>We introduce a quantitative metric that predicts the loss of performance when data is entropy-coded.<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><a href=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/20230919fig_colearn_website-metric.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/20230919fig_colearn_website-metric.png\" alt=\"\" class=\"wp-image-177\" width=\"439\" height=\"355\" srcset=\"https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/20230919fig_colearn_website-metric.png 564w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/20230919fig_colearn_website-metric-300x243.png 300w, https:\/\/project.inria.fr\/colearn\/files\/2023\/09\/20230919fig_colearn_website-metric-150x121.png 150w\" sizes=\"auto, (max-width: 439px) 100vw, 439px\" \/><\/a><\/figure><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1754,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-66","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/pages\/66","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/users\/1754"}],"replies":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/comments?post=66"}],"version-history":[{"count":14,"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/pages\/66\/revisions"}],"predecessor-version":[{"id":203,"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/pages\/66\/revisions\/203"}],"wp:attachment":[{"href":"https:\/\/project.inria.fr\/colearn\/wp-json\/wp\/v2\/media?parent=66"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}