{"id":554,"date":"2017-11-19T15:46:39","date_gmt":"2017-11-19T14:46:39","guid":{"rendered":"https:\/\/project.inria.fr\/dalhis\/?page_id=554"},"modified":"2017-11-19T16:12:30","modified_gmt":"2017-11-19T15:12:30","slug":"workplan-2017","status":"publish","type":"page","link":"https:\/\/project.inria.fr\/dalhis\/research\/workplan-2017\/","title":{"rendered":"Workplan 2017"},"content":{"rendered":"<h3>Distributed Infrastructure Support for Workflow and Data Management<\/h3>\n<p><strong>Design of a Cloud Approach for Dataset integration (Task 1):\u00a0<\/strong> Next-generation scientific discoveries<br \/>\nare at the boundaries of datasets, e.g., across multiple science disciplines, institutions and spatial<br \/>\nand temporal scales. Today, data integration processes and methods are largely ad-hoc or manual.<br \/>\nA generalized resource infrastructure that integrates knowledge of the data and the processing tasks<br \/>\nbeing performed by the user in the context of the data and resource lifecycle is needed.<br \/>\nClouds provide an important infrastructure platform that can be leveraged by including knowledge<br \/>\nfor distributed data integration, and that will be the focus of this research area. In 2017, we will work<br \/>\non the cloud system design, leveraging the work done at LBNL on the design of the E-HPC system. We<br \/>\nwill further work on real-time data processing platforms, targeting elasticity in the context of clouds to<br \/>\ndynamically adjust resource allocation to demand.<\/p>\n<p><strong>Data analysis for anomaly detection during workflow execution (Task 2):<\/strong> We plan to expand<br \/>\nthe work done in 2016 to detect anomalies during the execution of scientific workflows. 
We will use several<br \/>\nscientific computing workflows as exemplars, study the places where integrity failures could occur, and<br \/>\nexamine points anywhere in the workflow, including the scientific instruments, networks, and HPC<br \/>\nsystems, where provenance data could tell us more about the existence or cause of the failure. Where<br \/>\nprovenance data can be captured, we will develop systems to capture and analyze that data to build<br \/>\na proof of concept. Where provenance data cannot currently be captured, we will recommend ways in<br \/>\nwhich hardware or software designs could be altered in future implementations to capture such data. If<br \/>\ntime allows, we will prove the necessity and\/or sufficiency of such data to demonstrate completeness of<br \/>\nthe approach. Amir Teshome Wonjiga\u2019s internship will be related to this task.<\/p>\n<p><strong>Dynamic workflow execution on HPC platforms (Task 3):<\/strong> Today, GinFlow is not designed to be<br \/>\ndeployed over an HPC environment. In particular, there are no proper scheduling strategies to optimise the<br \/>\nmapping of the tasks over compute nodes. One direction we will explore is to make GinFlow HPC-ready.<br \/>\nIn particular, we plan to devise strategies to dynamically assign CPU power to tasks as the execution<br \/>\nmoves forward in the graph of tasks. On the one hand, this would make GinFlow more HPC-ready,<br \/>\nand in particular deployable over the NERSC computing platform, and on<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Distributed Infrastructure Support for Workflow and Data Management Design of a Cloud Approach for Dataset integration (Task 1):\u00a0 Next-generation scientific discoveries are at the boundaries of datasets, e.g., across multiple science disciplines, institutions and spatial and temporal scales. Today, data integration processes and methods are largely ad-hoc or manual. 
A generalized resource infrastructure that integrates &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"https:\/\/project.inria.fr\/dalhis\/research\/workplan-2017\/\">Continue reading<\/a><\/p>\n","protected":false},"author":267,"featured_media":0,"parent":267,"menu_order":4,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":"","_members_access_role":[],"_members_access_error":""},"class_list":["post-554","page","type-page","status-publish","hentry","nodate","item-wrap"],"_links":{"self":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/554","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/users\/267"}],"replies":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/comments?post=554"}],"version-history":[{"count":1,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/554\/revisions"}],"predecessor-version":[{"id":555,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/554\/revisions\/555"}],"up":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/pages\/267"}],"wp:attachment":[{"href":"https:\/\/project.inria.fr\/dalhis\/wp-json\/wp\/v2\/media?parent=554"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}