ICPP 2020 Program (all times are EDT / GMT-4)


Author Index


A
Aananthakrishnan, Sriram · Prune the Unnecessary: Parallel Pull-Push Louvain Algorithms with Automatic Edge Pruning
Abad, Pablo · SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads
Abdelrahman, Tarek · Balancing Graph Processing Workloads Using Work Stealing on Heterogeneous CPU-FPGA Systems
Agostini, Matthew · Balancing Graph Processing Workloads Using Work Stealing on Heterogeneous CPU-FPGA Systems
Akella, Venkatesh · HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
Akyildiz, Taha Atahan · GOSH: Embedding Big Graphs on Small Hardware
Alabsi Aljundi, Amro · GOSH: Embedding Big Graphs on Small Hardware
Alibhai, Shakeel · A Rack-aware Pipeline Repair Scheme for Erasure-coded Distributed Storage Systems
Alkabani, Yousra · DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics
Amarasinghe, Saman · How to Make Sparse Fast
Amini Salehi, Mohsen · The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms
Angerd, Alexandra · A GPU Register File using Static Data Compression

B
Bacik, Josef · The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms
Bao, Wei · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Barker, Kevin · Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines
Bensaou, Brahim · Reducing Latency in Multi-Tenant Data Centers via Cautious Congestion Watch

C
Cai, Shangming · CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
Cai, Wentong · Rendering Server Allocation for MMORPG Players in Cloud Gaming
Cai, Xiaoqing · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Cai, Zhiping · FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare
Cao, Qiang · GraBi: Communication-Efficient and Workload-Balanced Partitioning for Bipartite Graphs
SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Castro, Fernando · Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors
Chai, Qifei · Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System
Chau, Sid Chi-Kin · Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks
Chen, Huaming · Saec: Similarity-Aware Embedding Compression in Recommendation Systems
Chen, Li · E-LAS: Design and Analysis of Completion-Time Agnostic Scheduling for Distributed Deep Learning Cluster
FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare
Chen, Mengqiang · Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning
Chen, Quan · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Chen, Wuhui · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Chen, Yang · Optimizing Flow Bandwidth Consumption with Traffic-diminishing Middlebox Placement
Chen, Yu · Mass: Workload-Aware Storage Policy for OpenStack Swift
Chen, Zheng · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
Cheng, Yuchen · OPS: Optimized Shuffle Management System for Apache Spark
Chuah, Mooi Choo · Impact of Memory DoS Attacks on Cloud Applications and Real-Time Detection Schemes
Chung, Jae-Won · ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference
Curtis-Maury, Matthew · Scalable Coordination of Hierarchical Parallelism

D
Deng, Fan · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Deng, Jing · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Denninnart, Chavit · The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms
Devadas, Vinay · Scalable Coordination of Hierarchical Parallelism
Dinh, Canh T. · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Dong, Yuanyuan · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Du, Xiaoyong · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Du, Yishu · Robustness of the Young/Daly formula for stochastic iterative applications
Duan, Kaiyue · Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Duan, Xiaohui · SWMapper: Scalable Read Mapper on SunWay TaihuLight

E
El-Ghazawi, Tarek · DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics
Ellingwood, Nathan · Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures

F
Fan, Pingzhi · Selective Coflow Completion for Time-sensitive Distributed Applications with Poco
Farrens, Matthew · HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
Feng, Dan · Mass: Workload-Aware Storage Policy for OpenStack Swift
CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Fröning, Holger · On Network Locality in MPI-Based HPC Applications

G
Gansterer, Wilfried N. · Algorithm-Based Checkpoint-Recovery for the Conjugate Gradient Method
Gao, Hongyun · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Gao, Yiqin · Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms
Gavrilovska, Ada · Generating Robust Parallel Programs via Model Driven Prediction of Compiler Optimizations for Non-determinism
Ge, Rong · Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines
Ghatrehsamani, Davood · The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms
Gómez Flores, Wilfrido · Towards Parallelization of a Texture Description Algorithm for Breast Lesion Classification using OpenMP and CUDA
Gong, Ruihao · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Gong, Xiaoli · DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Gong, Yifan · EPMA: Efficient Partial Message Access in IoT Era
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Gregorio, Jose Angel · SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads
Guo, Minyi · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Guo, Song · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Guo, Yeting · FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare

H
H. Tran, Nguyen · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Han, Li · Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms
Han, Qingchang · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
He, Bingsheng · CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
He, Bo · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
He, Ligang · Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks
He, Tian · AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
He, Xubin · A Rack-aware Pipeline Repair Scheme for Erasure-coded Distributed Storage Systems
Hedayati, Mohammad · Safe, Fast Sharing of memcached as a Protected Library
Helm, Christian · Automatic Identification and Precise Attribution of DRAM Bandwidth Contention
Herrero, Jose Angel · SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads
Hinkle, Jacob · Toward Large-Scale Image Segmentation on Summit
Hong, Zicong · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Hovland, Paul · Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures
Hu, Jinbin · AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
Hu, Peng · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Hu, Yongmin · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Hu, Zhenbo · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Hua, Yu · An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
Huang, Fangting · An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
Huang, Jianming · An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
Huang, Jiawei · AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
Hueckelheim, Jan · Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures

I
Inaba, Yoko · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Ito, Yasuaki · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Huffman Coding with Gap Arrays for GPU Acceleration

J
Jaya, Iryanto · Rendering Server Allocation for MMORPG Players in Cloud Gaming
Ji, Bo · Optimizing Flow Bandwidth Consumption with Traffic-diminishing Middlebox Placement
Jia, Xiaohua · Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks
Jiang, Hong · GraBi: Communication-Efficient and Workload-Balanced Partitioning for Bipartite Graphs
Jiang, Wanchun · PS: Periodic Strategy for the 40-100Gbps Energy Efficient Ethernet
Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric
Jiang, Zhang · DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Jiang, Ziyue · EPMA: Efficient Partial Message Access in IoT Era
Jin, Jiangming · EPMA: Efficient Partial Message Access in IoT Era
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Jin, Sian · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity

K
Kasagi, Akihiko · Huffman Coding with Gap Arrays for GPU Acceleration
Katsuki, Ryota · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Kaya, Kamer · GOSH: Embedding Big Graphs on Small Hardware
Kim, Jae-Yun · ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference
Kirmani, Shad · Fast Spectral Graph Layout on Multicore Platforms
Kjellqvist, Chris · Safe, Fast Sharing of memcached as a Protected Library
Kratochvíl, Miroslav · Detailed Analysis and Optimization of CUDA K-means Algorithm
Kruliš, Martin · Detailed Analysis and Optimization of CUDA K-means Algorithm

L
Leng, Jingwen · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Levonyak, Markus · Algorithm-Based Checkpoint-Recovery for the Conjugate Gradient Method
Li, Ang · Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines
Li, Chao · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Li, Junyu · Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks
Li, Keqiu · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Li, Xin · SWMapper: Scalable Read Mapper on SunWay TaihuLight
Li, Xinyuan · Large-scale Simulations of Peridynamics on Sunway Taihulight Supercomputer
Li, Yusen · Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System
Rendering Server Allocation for MMORPG Players in Cloud Gaming
Li, Zhaoyi · AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
Li, Zhuozhao · Impact of Memory DoS Attacks on Cloud Applications and Real-Time Detection Schemes
Li, Zongpeng · An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
Liang, Weifa · Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks
Liao, Jianxin · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Liao, Kaiqin · PS: Periodic Strategy for the 40-100Gbps Energy Efficient Ethernet
Lim, Seung-Hwan · Toward Large-Scale Image Segmentation on Summit
Lin, Chi · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Lin, Jieyu · Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
Liu, Bing · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Liu, Chang · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Liu, Cong · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Liu, Fang · FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare
Liu, Jing · Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms
Liu, Jingning · CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Liu, Qi · A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Liu, Shuyang · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Liu, Tong · A Rack-aware Pipeline Repair Scheme for Erasure-coded Distributed Storage Systems
Liu, Wei · EPMA: Efficient Partial Message Access in IoT Era
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Liu, Weifeng · CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Efficient Block Algorithms for Parallel Sparse Triangular Solve
Liu, Weiguo · SWMapper: Scalable Read Mapper on SunWay TaihuLight
Liu, Xiaoguang · Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Liu, Ximing · DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Liu, Yang · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Liu, Yanqiang · OPS: Optimized Shuffle Management System for Apache Spark
Liu, Zixia · Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing
Llort, German · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
Lowe-Power, Jason · HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
Lu, Youyou · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Lu, Zhengyang · Efficient Block Algorithms for Parallel Sparse Triangular Solve
Luan, Zhongzhi · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Lui, John C.S. · An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
Lunga, Dalton · Toward Large-Scale Image Segmentation on Summit
Luo, Qiong · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Luo, Shouxi · Selective Coflow Completion for Time-sensitive Distributed Applications with Poco

M
M. Abdelmoniem, Ahmed · Reducing Latency in Multi-Tenant Data Centers via Cautious Congestion Watch
Ma, Tao · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Ma, Yu · Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks
Madduri, Kamesh · Fast Spectral Graph Layout on Multicore Platforms
Mao, Rui · Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks
Marbach, Trent · Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Marchal, Loris · Robustness of the Young/Daly formula for stochastic iterative applications
McCamant, Stephen · First Time Miss: Low Overhead Mitigation For Shared Memory Cache Side Channels
Meneses Viveros, Amilcar · Towards Parallelization of a Texture Description Algorithm for Breast Lesion Classification using OpenMP and CUDA
Meng, Xiangxu · SWMapper: Scalable Read Mapper on SunWay TaihuLight
Mercadal, Estanislao · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
Mishra, Ashirbad · Fast Spectral Graph Layout on Multicore Platforms
Moon, Soo-Mook · ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference
Munera, Adrian · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
Mururu, Girish · Generating Robust Parallel Programs via Model Driven Prediction of Compiler Optimizations for Non-determinism

N
Nakano, Koji · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Huffman Coding with Gap Arrays for GPU Acceleration
Narayanan, Sri Hari Krishna · Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures
Nasre, Rupesh · Graffix: Efficient Graph Processing with a Tinge of GPU-Specific Approximations
Nguyen, Tuan Dung · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Nie, Lihai · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Nitta, Christopher · HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
Niu, Yuyao · Efficient Block Algorithms for Parallel Sparse Triangular Solve

O
O'Brien, Francis · Balancing Graph Processing Workloads Using Work Stealing on Heterogeneous CPU-FPGA Systems

P
Pachajoa, Carlos · Algorithm-Based Checkpoint-Recovery for the Conjugate Gradient Method
Pacher, Christina · Algorithm-Based Checkpoint-Recovery for the Conjugate Gradient Method
Pal, Lisa · Towards Parallelization of a Texture Description Algorithm for Breast Lesion Classification using OpenMP and CUDA
Pallez, Guillaume · Robustness of the Young/Daly formula for stochastic iterative applications
Pande, Santosh · Generating Robust Parallel Programs via Model Driven Prediction of Compiler Optimizations for Non-determinism
Peng, Jiaxin · DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics
Petrini, Fabrizio · Prune the Unnecessary: Parallel Pull-Push Louvain Algorithms with Automatic Edge Pruning
Prajapati, Nirmal · Revisiting Sparse Dynamic Programming for the 0/1 Knapsack Problem
Prieto, Pablo · SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads
Prieto-Matias, Manuel · Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors
Puente, Valentin · SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads

Q
Qi, Qi · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Qi, Zhengwei · OPS: Optimized Shuffle Management System for Apache Spark
Qian, Depei · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Qiu, Xiaoyu · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Quan, Gang · Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing
Quiñones, Eduardo · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver

R
Rajamanickam, Sivasankaran · Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures
Rajopadhye, Sanjay · Revisiting Sparse Dynamic Programming for the 0/1 Knapsack Problem
Ramkrishnan, Kartik · First Time Miss: Low Overhead Mitigation For Shared Memory Cache Side Channels
Ravichandran, Kaushik · Generating Robust Parallel Programs via Model Driven Prediction of Compiler Optimizations for Non-determinism
Ren, Rui · OPS: Optimized Shuffle Management System for Apache Spark
Ren, Shenyuan · Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks
Robert, Yves · Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms
Robustness of the Young/Daly formula for stochastic iterative applications
Rodriguez, Matthew A. · Optimizing Linearizable Bulk Operations on Data Structures
Royuela, Sara · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
Ruan, Chang · Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric

S
Saez, Juan Carlos · Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors
Schanen, Michel · Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures
Schmidt, Bertil · SWMapper: Scalable Read Mapper on SunWay TaihuLight
Schulte, Michael · Challenges and Opportunities for Extreme-Scale Computing
Scott, Michael L. · Safe, Fast Sharing of memcached as a Protected Library
Seal, Sudip · Toward Large-Scale Image Segmentation on Summit
Sen, Tanmoy · Impact of Memory DoS Attacks on Cloud Applications and Real-Time Detection Schemes
Shao, Airan · An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm
Shen, Haiying · A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Impact of Memory DoS Attacks on Cloud Applications and Real-Time Detection Schemes
Sheng, Feng · GraBi: Communication-Efficient and Workload-Balanced Partitioning for Bipartite Graphs
Shi, Jiuchen · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Shi, Yang · Towards High-Efficiency Data Centers via Job-Aware Network Scheduling
Sifat, Tarequl Islam · Revisiting Sparse Dynamic Programming for the 0/1 Knapsack Problem
Singh, Somesh · Graffix: Efficient Graph Processing with a Tinge of GPU-Specific Approximations
Sintorn, Erik · A GPU Register File using Static Data Compression
Song, Zhuo · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Sorger, Volker · DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics
Spear, Michael F. · Optimizing Linearizable Bulk Operations on Data Structures
Stasiak, Andrzej · Prune the Unnecessary: Parallel Pull-Push Louvain Algorithms with Automatic Edge Pruning
Stenström, Per · A GPU Register File using Static Data Compression
Straube, Kramer · HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
Su, Jiya · CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Sultana, Abeda · E-LAS: Design and Analysis of Completion-Time Agnostic Scheduling for Distributed Deep Learning Cluster
Sun, Chao · Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System
Sun, Haifeng · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Sun, Shuai · DNNARA: A Deep Neural Network Accelerator using Residue Arithmetic and Integrated Photonics
Sun, Yu · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Susanto, Hengky · Reducing Latency in Multi-Tenant Data Centers via Cautious Congestion Watch

T
Tabaru, Tsuguchika · Huffman Coding with Gap Arrays for GPU Acceleration
Takafuji, Daisuke · Huffman Coding with Gap Arrays for GPU Acceleration
Tang, Shanjiang · Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System
Tao, Dingwen · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Tatekawa, Masaru · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Taura, Kenjiro · Automatic Identification and Precise Attribution of DRAM Bandwidth Contention
Tian, Zhao · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Tithi, Jesmin Jahan · Prune the Unnecessary: Parallel Pull-Push Louvain Algorithms with Automatic Edge Pruning
Tong, Wei · Mass: Workload-Aware Storage Policy for OpenStack Swift
CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Tsaris, Aristeidis · Toward Large-Scale Image Segmentation on Summit

V
Vivien, Frédéric · Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms

W
Wang, Chengning · CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Wang, Dali · Toward Large-Scale Image Segmentation on Summit
Wang, Dongsheng · CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm
Wang, Gang · Improving Load Balance via Resource Exchange in Large-Scale Search Engines
Wang, Haixia · CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm
Wang, Haoyu · A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Wang, Huanbin · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Wang, Jian · Saec: Similarity-Aware Embedding Compression in Recommendation Systems
Wang, Jianxin · PS: Periodic Strategy for the 40-100Gbps Energy Efficient Ethernet
Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric
AMRT: Anti-ECN Marking to Improve Utilization of Receiver-driven Transmission in Data Center
Wang, Jingyu · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Wang, Lei · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Wang, Lipeng · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Wang, Liqiang · Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing
Wang, Rui · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Wang, Rujia · CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Wang, Shucheng · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Wang, Wenwen · DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Wang, Yanfei · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Wang, Yi · Jeor: Accelerate Linear Algebra Operation in SSDs
Wang, Zhanye · CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
Wang, Zike · Mass: Workload-Aware Storage Policy for OpenStack Swift
Wang, Zizhong · An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm
Wartel, Franck · Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver
Wei, Xueliang · CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Wen, Mei · Towards High-Efficiency Data Centers via Job-Aware Network Scheduling
Williams-Young, David B. · Parallel Shift-Invert Spectrum Slicing on Distributed Architectures with GPU Accelerators
Wu, Chunghsuan · OPS: Optimized Shuffle Management System for Apache Spark
Wu, Guowei · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Wu, Hao · EPMA: Efficient Partial Message Access in IoT Era
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Wu, Jie · Optimizing Flow Bandwidth Consumption with Traffic-diminishing Middlebox Placement
Wu, Ruofan · CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Wu, Weigang · Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning
Wu, Xiaorui · Saec: Similarity-Aware Embedding Compression in Recommendation Systems
Jeor: Accelerate Linear Algebra Operation in SSDs

X
Xia, Wen · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Xiao, Danyang · Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning
Xiao, Nong · FEEL: A Federated Edge Learning System for Efficient and Privacy-Preserving Mobile Healthcare
Xing, Huanlai · Selective Coflow Completion for Time-sensitive Distributed Applications with Poco
Xu, Fei · E-LAS: Design and Analysis of Completion-Time Agnostic Scheduling for Distributed Deep Learning Cluster
Xu, Hong · Saec: Similarity-Aware Embedding Compression in Recommendation Systems
Jeor: Accelerate Linear Algebra Operation in SSDs
OPS: Optimized Shuffle Management System for Apache Spark
Xu, Jie · A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Xu, Kai · SWMapper: Scalable Read Mapper on SunWay TaihuLight
Xu, Wenzheng · Reliability Augmentation of Requests with Service Function Chain Requirements in Mobile Edge-Cloud Networks

Y
Y. Zomaya, Albert · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Yamamoto, Naoya · Huffman Coding with Gap Arrays for GPU Acceleration
Yamazaki, Ichitaro · Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures
Yan, Shengen · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Yan, Yulong · PS : Periodic Strategy for the 40-100Gbps Energy Efficient Ethernet
Yan, Zijie · Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning
Yang, Baichen · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Yang, Bin · OPS: Optimized Shuffle Management System for Apache Spark
Yang, Chao · Parallel Shift-Invert Spectrum Slicing on Distributed Architectures with GPU Accelerators
Yang, Hailong · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Yang, Puyuan · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Yang, Yong · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Yang, Ziwei · Cooperative Game for Multiple Chargers with Dynamic Network Topology
Yao, Jie · SeRW: Adaptively Separating Read and Write upon SSDs of Hybrid Storage Server in Clouds
Yasudo, Ryota · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Yazane, Takashi · Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs
Ye, Huang · Large-scale Simulations of Peridynamics on Sunway Taihulight Supercomputer
Ye, Liuqing · CCHL: Compression-Consolidation Hardware Logging for Efficient Failure-Atomic Persistent Memory Updates
Ye, Songgao · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Yelick, Kathy · Genomic Analysis and Learning at Scale: Mapping Irregular Computations to Advanced Architectures
Yew, Pen · First Time Miss : Low Overhead Mitigation For Shared Memory Cache Side Channels
DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Yu, Ce · Balancing Fairness and Efficiency for Cache Sharing in Semi-external Memory System
Yu, Fengwei · Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures
Yu, Hongfang · Selective Coflow Completion for Time-sensitive Distributed Applications with Poco
Yuan, Rui · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Yuan, Xu · E-LAS: Design and Analysis of Completion-Time Agnostic Scheduling for Distributed Deep Learning Cluster

Z
Zahn, Felix · On Network Locality in MPI-Based HPC Applications
Zhai, Antonia · First Time Miss : Low Overhead Mitigation For Shared Memory Cache Side Channels
Zhai, Jidong · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications
Zhan, Yufeng · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Zhang, Chenyang · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
Zhang, Chunyuan · Towards High-Efficiency Data Centers via Job-Aware Network Scheduling
Zhang, Feng · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
Zhang, Hequan · DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
Zhang, Honglin · Saec: Similarity-Aware Embedding Compression in Recommendation Systems
Zhang, Jian · Large-scale Simulations of Peridynamics on Sunway Taihulight Supercomputer
Zhang, Jianting · SkyChain: A Deep Reinforcement Learning-Empowered Dynamic Blockchain Sharding System
Zhang, Qi · Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
Zhang, Sai Qian · Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
Zhang, Tao · Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric
Zhang, Wei · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Zhang, Weizhe · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Zhang, Xueying · An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
Zhang, Zheng · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Zhao, Laiping · XShot: Light-weight Link Failure Localization using Crossed Probing Cycles in SDN
Zhao, Ziyi · DQEMU: A Scalable Emulator with Retargetable DBT on Distributed Platforms
Zheng, Kevin · A Reinforcement Learning Based System for Minimizing Cloud Storage Service Cost
Zheng, Ningxin · URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds
Zheng, Wenli · OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment
Zhou, Amelie Chi · ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs
Zhou, Bing B. · Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms
Zhou, Jieying · Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning
Zhou, Ruiting · An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
Zhou, Wen · An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
Zhou, Zhi · An Online Learning-Based Task Offloading Framework for 5G Small Cell Networks
Zhuang, Zirui · DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Zou, Pengfei · Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines
Zou, Xiangyu · Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity
Zuo, Pengfei · An Efficient Wear-level Architecture using Self-adaptive Wear Leveling

Created 2020-8-17 12:2