

{"id":551,"date":"2017-11-19T15:42:04","date_gmt":"2017-11-19T14:42:04","guid":{"rendered":"https:\/\/project.inria.fr\/dalhis\/?page_id=551"},"modified":"2017-11-19T16:06:22","modified_gmt":"2017-11-19T15:06:22","slug":"results-2016-2017","status":"publish","type":"page","link":"https:\/\/project.inria.fr\/dalhis\/research\/results-2016-2017\/","title":{"rendered":"Results (2016-2017)"},"content":{"rendered":"<p>The work of the associate team in 2016-2018 centered around two areas: distributed Infrastructure support for workflow and data management and deep partnerships with scientific collaborations.<\/p>\n<h3><strong>Distributed Infrastructure Support for Workflow and Data Management<\/strong><\/h3>\n<p><strong>Energy-efficient data-intensive workflow execution:\u00a0<\/strong> We explored a way for energy aware<br \/>\nHPC cloud users to reduce their footprint on cloud infrastructures by reducing the size of the virtual<br \/>\nresources they are asking for. A user who agrees to reduce her impact on the environment can choose<br \/>\nto lose in performance by executing her application on less resources on the infrastructure. The unused<br \/>\nresources are free for another application and thus favor a better consolidation of the whole system. The<br \/>\nbetter the consolidation, the lower the electrical consumption. The proposed system offers three execution<br \/>\nmodes based on: Big, Medium and Little. An algorithm selects the size of the VMs for executing each<br \/>\ntask of the workflows depending on the selected execution mode. Medium mode executes using the user\u2019s<br \/>\nnormal VM resources for each workflow stage, Little mode reduces the VMs by one size for the workflow<br \/>\nand Big increases the VMs by one size for the workflow.<br \/>\nWe evaluated the impact of the proportion of users selecting the Big, Medium or Little mode on a data<br \/>\ncenter\u2019s energy consumption. 
Our evaluations used three kinds of scientific workflows, energy consumption measurements for the execution of these workflows on a real platform, and traces of jobs submitted to a production HPC center. We evaluated by simulation the data center energy consumption for different proportions of users selecting the three available modes. A paper on this work was accepted and presented at PDP 2017.<\/p>\n<p><strong>Building a workflow for data analysis for anomaly detection:\u00a0<\/strong> We worked on building a workflow for anomaly detection in HPC environments using statistical data. An initial analysis of traffic to and from NERSC data nodes was conducted to determine if, and to what extent, a percentage of this traffic follows normal patterns. We focused on the following characteristics: number of connections, connection frequency, port range and size of transfers. We were able to identify specific patterns that define a customizable model of normal behavior and to flag hosts that deviate from this model. We plan to expand our model to other types of traffic and to include more characteristics such as the number of packet retransmissions, filename path and username.<\/p>\n<p><strong>Data integrity in HPC systems:\u00a0<\/strong> Amir Teshome\u2019s 3-month internship at LBNL (April-June 2017) under the supervision of Sean Peisert focused on data integrity in High Performance Computing (HPC) systems. Such systems are used by scientists to solve complex science and engineering problems. Data integrity, one element of the security triad, can simply be defined as the absence of improper data alterations: its goal is to ensure that data is written correctly as intended and read from disk, memory or network exactly as it was written. 
Ensuring such consistency in HPC scientific workflows is challenging, and failing to do so may falsify the result of an experiment. During the internship we studied where in a workflow data integrity can be affected, what the existing solutions are, and how we can leverage them to obtain better security and performance on next-generation HPC supercomputers. In general, data integrity in HPC environments can be affected at the source (e.g. the experimental setup), in the network, at processing time and finally in storage. Existing solutions include error correction codes (ECC) at the memory level, checksums at the network level, and different levels (and types) of replication depending on the sensitivity of applications. Replication can be done at the memory level (e.g. replicating MPI processes) or by replicating the entire HPC computation (e.g. running replicated computations at different HPC centers).<\/p>\n<p><strong>Design of a Cloud Approach for Dataset Integration:<\/strong>\u00a0 Next-generation scientific discoveries lie at the boundaries of datasets, e.g., across multiple science disciplines, institutions, and spatial and temporal scales. This task is carried out in the context of Deduce. The DST team worked on a) identifying data change characteristics from a number of different domains, b) developing an elastic resource framework for data integration called E-HPC that manages a dynamic resource pool, able to grow and shrink, for data workflows, and c) comparing and contrasting existing approaches for real-time analyses on HPC. A paper on E-HPC was accepted and presented at WORKS 2017, a workshop held in conjunction with SC|17. 
We also developed a Deduce framework that allows a user to compare two data versions and presents filesystem and metadata changes to the user.<br \/>\nComplementarily, the Myriads team worked on evaluating data processing environments deployed in clouds. We performed a thorough comparative analysis of four data stream processing platforms &#8211; Apache Flink, Spark Streaming, Apache Storm, and Twitter Heron &#8211; chosen based on their potential to process both streams and batches in real time. The goal of the work is to guide the choice of a resource-efficient adaptive streaming platform for a given application. For the comparative performance analysis of the chosen platforms, we experimented using 8-node clusters on the Grid\u20195000 experimentation testbed and selected a wide variety of applications, ranging from a conventional benchmark (word count) to a sensor-based IoT application (air quality monitoring) and statistical batch processing (flight delay analysis).<\/p>\n<p>&nbsp;<\/p>\n<h3><strong>Deep partnerships with scientific collaborations<\/strong><\/h3>\n<p><strong>AmeriFlux and FLUXNET<\/strong><\/p>\n<p><strong>Data exploration:<\/strong> The carbon flux datasets from AmeriFlux (Americas) and FLUXNET (global) comprise long-term time series data and other measurements at each tower site. There are over 800 flux towers around the world collecting this data. The non-time-series measurements include information critical to performing analysis on a site\u2019s data; examples include canopy height, species distribution, soil properties, leaf area, instrument heights, etc. These measurements are reported as a variable group, where the value plus information such as the method of measurement are reported together. 
Each variable group has a different number and type of reported parameters. The current output format is a normalized file, which users have found difficult to use. Our earlier work in the associate team focused on building user interfaces to specify the data. This year we jointly worked on developing a Jupyter Notebook that would serve as a tool for users to read in and explore the data in a personalized, tutorial-type environment. We developed two notebooks; the next step is to start user testing on them.<\/p>\n<p><strong>Mobile application for reliable collection of field data for FLUXNET:<\/strong> Continuing from the initial usability feedback gathered in 2015 on the application interface designs, we decided on the mobile application workflow for implementation. We developed a first prototype using the PhoneGap2 platform, which provided two advantages: (1) reuse of some of the existing HTML, CSS and JavaScript web resources, and (2) the same code base generates mobile applications for the iOS, Android and Windows platforms simultaneously. The main functionality realized in the prototype is that the user can download all the required site data by logging in through the application, and then view\/edit the data at the tower site (even in offline mode). The next logical step is to develop synchronization and validation of the data held locally in the application against the servers.<\/p>\n<p><strong>Astroparticle Physics<\/strong><\/p>\n<p>The Large Synoptic Survey Telescope will soon produce an unprecedented catalog of celestial objects. Physicists in the US and in France will be able to exploit this vast amount of data through a Data Access Center, an end-to-end integrated system for data management, analysis, and visualization. 
In 2017, members of Fred Suter\u2019s team investigated the use of Jupyter notebooks as the main interface for data exploration and analysis. A first prototype has already been used in several training sessions for physicists. Discussions with people at Berkeley Lab should help this prototype evolve into a production tool suited to user needs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The work of the associate team in 2016-2017 centered around two areas: distributed infrastructure support for workflow and data management, and deep partnerships with scientific collaborations. Distributed Infrastructure Support for Workflow and Data Management Energy-efficient data-intensive workflow execution:\u00a0 We explored a way for energy-aware HPC cloud users to reduce their footprint on cloud infrastructures &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"https:\/\/project.inria.fr\/dalhis\/research\/results-2016-2017\/\">Continue reading<\/a><\/p>\n","protected":false},"author":267,"featured_media":0,"parent":267,"menu_order":1,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-551","page","type-page","status-publish","hentry","nodate","item-wrap"],"_links":{"self":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/551","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/users\/267"}],"replies":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/comments?post=551"}],"version-history":[{"count":7,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/551\/revisions"}],"predecessor-version":[{"id":564,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/55
1\/revisions\/564"}],"up":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/267"}],"wp:attachment":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/media?parent=551"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}