List of accepted papers
- Alex Weaver, Krishna Kavi, Pranathi Vasireddy and Gayatri Mehta
  Memory-Side Acceleration and Sparse Compression for Quantized Packed Convolutions
- Alexander Goponenko, Kenneth Lamar, Christina Peterson, Benjamin Allan, Jim Brandt and Damian Dechev
  Metrics for Packing Efficiency and Fairness of HPC Cluster Batch Job Scheduling
- Alexander van der Grinten, Geert Custers, Duy Le Thanh and Henning Meyerhenke
  An MPI-Parallel Algorithm for Static and Dynamic Top-k Harmonic Centrality
- Aravind Sankaran and Paolo Bientinesi
  A Test for FLOPs as a Discriminant for Linear Algebra Algorithms
- Arthur Krause, Paulo Santos and Philippe Navaux
  Avoiding Unnecessary Caching with History-Based Preemptive Bypassing
- Brady Testa, Samira Mirbagher and Daniel Jiménez
  Dynamic Set Stealing to Improve Cache Performance
- Daniel Wladdimiro, Luciana Arantes, Pierre Sens and Nicolas Hidalgo
  A predictive approach for dynamic replication of operators in distributed stream processing systems
- Elvis Rojas, Diego Pérez and Esteban Meneses
  Exploring the Effects of Silent Data Corruption in Distributed Deep Learning Training
- Emmanuel Agullo, Marek Felšöci, Amina Guermouche, Hervé Mathieu, Guillaume Sylvand and Bastien Tagliaro
  Study of the processor and memory power consumption of coupled sparse/dense solvers
- Erhan Tezcan, Tuğba Torun, Fahrican Koşar, Kamer Kaya and Didem Unat
  Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection
- Fareed Qararyah, Muhammad Waqar Azhar and Pedro Trancoso
  FiBHA: Fixed Budget Hybrid CNN Accelerator
- Guillaume Didier, Clémentine Maurice, Antoine Geimer and Walid J. Ghandour
  Characterizing Prefetchers using CacheObserver
- Hammurabi Mendes, Bryce Wiedenbeck and Aidan O’Neill
  Seriema: RDMA-based Remote Invocation with a Case-Study on Monte-Carlo Tree Search
- Hao Wu, Pangbo Sun, Jiangming Jin, Yifan Gong and Ziyue Jiang
  TCUDA: A QoS-based GPU Sharing Framework for Autonomous Navigation Systems
- Igor Fontana de Nardin, Patricia Stolf and Stephane Caux
  Analyzing Power Decisions in Data Center Powered by Renewable Sources
- James Almgren-Bell, Nader Al Awar, Dilip Geethakrishnan, Milos Grigoric and George Biros
  A Multi-GPU Python Solver for Low-Temperature Non-Equilibrium Plasmas
- Javier Garcia Blas, Javier Fernandez, Jesús Carretero, Fabrizio Marozzo, Domenico Talia, Daniel Martin de Blas and Alberto Fernandez-Pena
  Convergence of HPC and Big Data in extreme-scale data analysis through the DCEx programming model
- Jing Chen, Madhavan Manivannan, Bhavishya Goel, Mustafa Abduljabbar and Miquel Pericàs
  STEER: Asymmetry-aware Energy Efficient Task Scheduler for Cluster-based Multicore Architectures
- João Fabrício Filho, Isaías Bittencourt Felzmann and Lucas Wanner
  Approximate Memory with Protected Static Allocation
- João Vieira, Nuno Roma, Gabriel Falcao and Pedro Tomás
  gem5-ndp: Near-Data Processing Architecture Simulation From Low Level Caches to DRAM
- Jonathas Silveira, Lucas Castro, Victor Araújo, Rodrigo Zeli, Daniel Lazari, Marcelo Guedes, Rodolfo Azevedo and Lucas Wanner
  Prof5: A RISC-V profiler tool
- Manuel F. Dolz, Héctor Martínez, Pedro Alonso-Jorda and Enrique S. Quintana-Orti
  Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor
- Matheus Bernardino and Alfredo Goldman
  Parallelizing Git Checkout: a Case Study of I/O Parallelism
- Maxim Moraru, Adrien Roussel, Hugo Taboada, Christophe Jaillet, Marc Perache and Michael Krajecki
  Performance improvements of parallel applications thanks to MPI-4.0 hints
- Miguel Gomes Xavier, Carlos Henrique da Costa Cano Cano, Vinícius Meyer and César A. F. De Rose
  IntP: Quantifying cross-application interference via system-level instrumentation
- Odin Ugedal and Rakesh Kumar
  Mitigating Unnecessary Throttling in Linux CFS Bandwidth Control
- Omar Shaaban, Jimmy Aguilar Mena, Vicenç Beltran, Paul Carpenter, Eduard Ayguade and Jesus Labarta Mancho
  Automatic aggregation of subtask accesses for nested OpenMP-style tasks
- Rafaela Brum, Lúcia Drummond, Luciana Arantes, Maria Clicia Castro and Pierre Sens
  Optimizing Execution Time and Costs of Cross-Silo Federated Learning Applications with Datasets on different Cloud Providers
- Samuel Cajahuaringa, Leandro N. Zanotto, Daniel L. Z. Caetano, Sandro Rigo, Hervé Yviquel, Munir S. Skaf and Guido Araujo
  Ion-Molecule Collision Cross-Section Simulation using Linked-cell and Trajectory Parallelization
- Samuel Ferraz, Vinicius Dias, Carlos H. C. Teixeira, George Teodoro and Wagner Meira Jr.
  Efficient Strategies for Graph Pattern Mining Algorithms on GPUs
- Sandra Catalan, Francisco D. Igual, Rafael Rodríguez-Sánchez, José R. Herrero and Enrique S. Quintana-Orti
  NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors
- Thierry Arrabal, Lucas Betencourt, Eddy Caron and Laurent Lefevre
  Setting up an experimental framework for immersion cooling system and analysis
- Vanderlei Pereira, Márcio Castro and Odorico Mendizabal
  Strategies for Fault-Tolerant Tightly-coupled HPC Workloads Running on Low-Budget Spot Cloud Infrastructures
- Yang Chen, Feng Zhang, Yinhao Hong, Yunpeng Chai, Wei Lu, Hong Chen, Xiaoyong Du, Peipei Wang, Le Mi, Jintao Li, Xilin Tang, Yanliang Zhou, Peng Zhang, Fengyi Chen, Pengfei Li and Yu Li
  Managing Petabytes of Data