DALHIS (Since 2013) Data analysis on large-scale heterogeneous infrastructures for science
Principal Investigators:
- Dr. Christine Morin, MYRIADS project-team, Inria Rennes – Bretagne Atlantique
- Dr. Deb Agarwal, Lawrence Berkeley National Laboratory, University of California Berkeley
Research objectives:
Research areas span:
- Programming environment for scientific data analysis workflows: An integrated capability that will allow users to easily compose their workflows in a programming environment such as Python and execute them on diverse high performance computing (HPC) and cloud resources.
- Adaptive orchestration layer: The adaptation model will use real-time data mining to support elasticity, fault tolerance, energy efficiency, and provenance.
- Infrastructure support for HPC, clusters and cloud systems: The research will determine how to provide execution environments that allow users to seamlessly execute their dynamic data analysis workflows in various research environments and scales.
Scientific achievements:
- Evaluation of the performance/energy efficiency trade-off of Hadoop running on physical and virtual clusters in two deployment modes: collocated data and compute services, and dedicated data nodes separated from compute nodes.
- Development and evaluation of chemical runtime support for the TIGRES high-level specification of scientific workflows.
- Evaluation of the FRIEDA flexible, robust, intelligent data management framework for deploying data-intensive scientific applications in clouds.
Publications and Awards:
- 1 Journal article, 1 Book chapter, 1 Conference paper, 3 Workshop papers
- Deb Agarwal is the recipient of a 2015 Inria International chair.
Selected publication:
Eugen Feller, Lavanya Ramakrishnan, Christine Morin, Performance and Energy Efficiency of Big Data Applications in Cloud Environments: A Hadoop Case Study, Journal of Parallel and Distributed Computing (JPDC), 2015.