ICPP 2020
ICPP 2020 Program (all times are EDT / GMT-4)


Tuesday, August 18th


11:00am-11:15am

Opening Remarks

11:15am-11:55am

Keynote-1

12:05pm-12:45pm

Best-Paper Candidates
Huffman Coding with Gap Arrays for GPU Acceleration
CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
GOSH: Embedding Big Graphs on Small Hardware

12:55pm-1:25pm

Distributed Systems
CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
Safe, Fast Sharing of memcached as a Protected Library
DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms

Edge Learning and Inference
ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference
FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare
Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN

Memory Systems
An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System

1:35pm-2:05pm

Fault-Tolerance
Algorithm-Based Checkpoint-Recovery for the Conjugate Gradient Method
Robustness of the Young/Daly formula for stochastic iterative applications
Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms

Scheduling and Placement in Networks
Cooperative Game for Multiple Chargers with Dynamic Network Topology
Optimizing Flow Bandwidth Consumption with Traffic-diminishing Middlebox Placement
Towards High-Efficiency Data Centers via Job-Aware Network Scheduling

Systems for Machine Learning
DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
E-LAS: Design and Analysis of Completion-Time Agnostic Scheduling for Distributed Deep Learning Cluster
ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs

Wednesday, August 19th


11:00am-11:40am

Keynote-2

11:50am-12:20pm

Graph Processing and Concurrent Data Structures
Graffix: Efficient Graph Processing with a Tinge of GPU-Specific Approximations
Optimizing Linearizable Bulk Operations on Data Structures
GraBi: Communication-Efficient and Workload-Balanced Partitioning for Bipartite Graphs

Large-Scale Applications on Supercomputers
Large-scale Simulations of Peridynamics on Sunway Taihulight Supercomputer
Toward Large-Scale Image Segmentation on Summit
SWMapper: Scalable Read Mapper on SunWay TaihuLight

Machine Learning for Computing
An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing

12:30pm-1:00pm

Performance Tools and Methodology
Generating Robust Parallel Programs via Model Driven Prediction of Compiler Optimizations for Non-determinism
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Automatic Identification and Precise Attribution of DRAM Bandwidth Contention

Storage Reliability & Memory Security
An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm
First Time Miss: Low Overhead Mitigation For Shared Memory Cache Side Channels
A Rack-aware Pipeline Repair Scheme for Erasure-coded Distributed Storage Systems

Supporting Efficient Machine Learning
Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures
Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity

1:10pm-1:40pm

Data Center Networking
AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
PS: Periodic Strategy for the 40-100Gbps Energy Efficient Ethernet
Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric

Parallel Algorithms I
Prune the Unnecessary: Parallel Pull-Push Louvain Algorithms with Automatic Edge Pruning
Fast Spectral Graph Layout on Multicore Platforms
Revisiting Sparse Dynamic Programming for the 0/1 Knapsack Problem

Parallel and Distributed Machine Learning
Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks
Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning

Thursday, August 20th


11:00am-11:40am

Keynote-3

11:50am-12:20pm

Heterogeneous Systems
Balancing Graph Processing Workloads Using Work Stealing on Heterogeneous CPU-FPGA Systems
Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors
Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines

Performance Evaluation and Characterization
Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads
The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms

Routing and Mapping in Networks
XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
On Network Locality in MPI-Based HPC Applications
DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention

12:30pm-1:00pm

Microarchitecture and Power Management
A GPU Register File using Static Data Compression
HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics

Parallel Algorithms II
Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Efficient Block Algorithms for Parallel Sparse Triangular Solve
Selective Coflow Completion for Time-sensitive Distributed Applications with Poco

Resource Management on the Cloud
Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Rendering Server Allocation for MMORPG Players in Cloud Gaming
Impact of Memory DoS Attacks on Cloud Applications and Real-Time Detection Schemes

1:10pm-1:40pm

GPU-Accelerated Applications
Parallel Shift-Invert Spectrum Slicing on Distributed Architectures with GPU Accelerators
Detailed Analysis and Optimization of CUDA K-means Algorithm
Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures

1:10pm-1:50pm

Data Centers and the Edge
OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Reducing Latency in Multi-Tenant Data Centers via Cautious Congestion Watch
URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks

Storage and I/O Optimization
OPS: Optimized Shuffle Management System for Apache Spark
SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Scalable Coordination of Hierarchical Parallelism
Mass: Workload-Aware Storage Policy for OpenStack Swift

Created 2020-6-25 8:56