ICA3PP 2023

The 23rd International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2023)
Tianjin, China
20-22 October 2023

Keynote Speakers

Lixin Gao

University of Massachusetts, Amherst, USA

Data Parallel Frameworks for Accelerating Machine Learning Algorithms

The advances in sensing, storage, and networking technology have created huge collections of high-volume, high-dimensional data. Making sense of these data is critical for companies and organizations to make better business decisions, and brings convenience to our daily life. Recent advances in data mining, machine learning, and applied statistics have led to a flurry of data analytic techniques that typically require an iterative refinement process. However, the massive amount of data involved and potentially numerous iterations required make performing data analytics in a timely manner challenging. In this talk, we present a series of data parallel frameworks that accelerate iterative machine learning algorithms for massive data.

About Lixin Gao

Lixin Gao is a University Distinguished Professor of Electrical and Computer Engineering at the University of Massachusetts at Amherst. She received a Ph.D. degree in Computer Science from the University of Massachusetts at Amherst. Her research interests include online social networks, Internet routing, network virtualization, and cloud computing. Between May 1999 and January 2000, she was a visiting researcher at AT&T Research Labs and DIMACS. She was an Alfred P. Sloan fellow from 2003 to 2005, and received an NSF CAREER Award in 1999 and a fellowship from Harvard University Radcliffe Institute in 2021. She won the best paper award from IEEE INFOCOM 2010, and the test-of-time award in ACM SIGMETRICS 2010. Her paper in ACM Cloud Computing 2011 was honored with "Paper of Distinction". She is a fellow of IEEE and ACM.

Baochun Li

Department of Electrical and Computer Engineering at the University of Toronto, Canada

The Rise (and Ultimate Fall) of Federated Learning

Even with the meteoric rise of federated learning in the past five years, with thousands of papers in the literature, relatively few are concerned with its practical relevance. As one of the few practical paradigms that preserves data privacy when training a shared machine learning model in a decentralized fashion, federated learning should eventually be deployed in the real world. Or will it? It depends on whether it is really able to preserve data privacy, and whether performance claims in existing papers are to be trusted, especially in this day and age of fine-tuning large language models.

In this talk, I will share a few of our recent experiences that attempted to answer these questions. I first show that privacy may be quite well protected in federated learning, and claims in the existing literature along the lines of gradient leakage attacks may not necessarily be valid. I will then share some of our recent results on evaluating existing performance-enhancing works in federated learning in a fair and reproducible way. Using Plato, an open-source framework that we developed, we show that many of the existing claims may be challenging to verify with reproducible experiments. Despite intensifying interest in federated learning, we believe that better privacy-preserving distributed training paradigms need to be architected, especially when large language models are involved.

About Baochun Li

Baochun Li received his B.Engr. degree from the Department of Computer Science and Technology, Tsinghua University, China, in 1995 and his M.S. and Ph.D. degrees from the Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, in 1997 and 2000, respectively. Since 2000, he has been with the Department of Electrical and Computer Engineering at the University of Toronto, where he is currently a Professor. He has held the Bell Canada Endowed Chair in Computer Engineering since August 2005. His current research interests include cloud computing, security and privacy, distributed machine learning, federated learning, and networking.

Dr. Li has co-authored more than 450 research papers, with a total of over 24000 citations, an H-index of 87 and an i10-index of 322, according to Google Scholar Citations. He was the recipient of the IEEE Communications Society Leonard G. Abraham Award in the Field of Communications Systems in 2000, the Multimedia Communications Best Paper Award from the IEEE Communications Society in 2009, the University of Toronto McLean Award in 2009, and the Best Paper Award from IEEE INFOCOM in 2023. He is a Fellow of the Canadian Academy of Engineering and a Fellow of IEEE.

Laurence T. Yang

Hainan University, China

Cyber-Physical-Social Intelligence

The booming growth and rapid development in embedded systems, wireless communications, sensing techniques and emerging support for cloud computing and social networks have enabled researchers and practitioners to create a wide variety of Cyber-Physical-Social Systems (CPSS) that reason intelligently, act autonomously, and respond to the users' needs in a context- and situation-aware manner, namely Cyber-Physical-Social Intelligence. It is the integration of computation, communication and control with the physical world, human knowledge and sociocultural elements. It is a novel emerging computing paradigm and has attracted wide attention from both industry and academia in recent years.

This talk will present our latest research on Cyber-Physical-Social Intelligence. Corresponding case studies in some typical applications will be shown to demonstrate the feasibility and flexibility.

About Laurence T. Yang

Laurence T. Yang received his B.E. in Computer Science and Technology and his B.Sc. in Applied Physics, both from Tsinghua University, China, and his Ph.D. in Computer Science from the University of Victoria, Canada. He is the Academic Vice-President and Dean of the School of Computer Science and Technology, Hainan University, China. His research includes Cyber-Physical-Social Intelligence. He has published 300+ papers in this area in top IEEE/ACM Transactions, with total citations of 36691 and an H-index of 96, including 8 and 40 papers ranked as top 0.1% and top 1% highly-cited ESI papers, respectively.

His recent honors and awards include Member of Academia Europaea, the Academy of Europe (2021), the John B. Stirling Medal (2021) from the Engineering Institute of Canada, the IEEE Sensor Council Technical Achievement Award (2020), the IEEE Canada C. C. Gotlieb Computer Medal (2020), Clarivate Analytics (Web of Science Group) Highly Cited Researcher (2019, 2020, 2022), Fellow of the Institution of Engineering and Technology (2020), Fellow of the Institute of Electrical and Electronics Engineers (2020), Fellow of the Engineering Institute of Canada (2019), and Fellow of the Canadian Academy of Engineering (2017).

Kun Tan

Distributed and Parallel Software Lab, 2012 Labs, Huawei, China

Towards General-Purpose Serverless Computing in Cloud

Serverless computing is emerging as the next cloud computing paradigm due to its ease of programming, rapid elasticity, fine-grained billing, and NoOps. However, while many cloud providers have rolled out various serverless offerings, ranging from Function-as-a-Service (FaaS) to serverless backend services (e.g., serverless databases), application developers still face many challenges in building their cloud-native applications purely with those serverless offerings. In this talk, we will first outline the challenges and hurdles for developers to build serverless applications. Then, we present an effort at Huawei that designs and implements the first general-purpose serverless platform, named YuanRong. YuanRong has been deployed in Huawei Cloud for more than two years and serves tens of thousands of customers inside and outside the company. We will discuss the design rationale and implementation details, and share some of our experiences and lessons from the deployment in the cloud.

About Kun Tan

Kun Tan is Chief Expert and Head of the Distributed and Parallel Software Lab, 2012 Labs, Huawei. Before joining Huawei, he was a Research Manager at Microsoft Research Asia (MSRA), focusing on wireless and networking research. He has published over 100 refereed papers on systems and networking in top venues such as ACM SIGCOMM, Mobicom, and NSDI. He also holds over 50 patents. He won the USENIX Test-of-Time Award in 2019 and the Best Paper Award at NSDI 2009. His research interests include distributed systems, high-performance networking, and cloud computing.

Ahmed Louri

George Washington University, USA

Challenges and Solutions in Modern Machine Learning Accelerator Design

The use of Artificial Intelligence and Machine Learning (ML) is surging now across many applications, driven by advancements in computing hardware and big data. However, as contemporary ML models and datasets continue to expand in size and complexity, so do their computing requirements. Specialized hardware architectures in the form of accelerators have emerged as the prevailing solution to meet such escalating requirements in the post-Moore era. Nevertheless, sustaining the performance scaling of current accelerators poses serious challenges including diversified computing and communication requirements, the presence of non-Euclidean data structures, and inherent technology limitations, among others.

This talk will cover our recent efforts in addressing these major challenges in modern ML accelerator design from both architecture and technology perspectives. I will first provide an overview of current trends in computing and highlight the most pressing challenges in the field of ML accelerators. I will discuss the imperative of flexible accelerators to efficiently orchestrate data movement and improve data locality while adapting to varying Deep Neural Network (DNN) and Graph Neural Network (GNN) applications and data characteristics. I will next present novel flexible and scalable accelerator architectures deploying various architectural and technology solutions, including flexible interconnects, chiplet integration, and silicon photonics, to address the challenges and speed up both ML training and inference with significant performance, energy efficiency, and scalability advantages. The incorporation of these techniques is a notable development with a major impact on future ML design. I will conclude with future research directions in this area.

About Ahmed Louri

Ahmed Louri is the David and Marilyn Karlgaard Endowed Chair Professor of Electrical and Computer Engineering at George Washington University and a Fellow of the Institute of Electrical and Electronics Engineers (IEEE). He is also the Director of the High-Performance Computing Architectures and Technologies Laboratory (HPCAT, https://hpcat.seas.gwu.edu/). Dr. Louri received a Ph.D. degree in Computer Engineering from the University of Southern California, Los Angeles, California in 1988. From 2010 to 2013, Dr. Louri served as a Program Director in the National Science Foundation’s (NSF) Directorate for Computer and Information Science and Engineering. He directed the core computer architecture program and was on the management team of several cross-cutting programs, including Cyber-Physical Systems; Expeditions in Computing; Computing Research Infrastructure; Secure and Trustworthy Cyberspace; Failure-Resistant Systems; Science, Engineering and Education for Sustainability; and Cyber-Discovery Initiative, among others.

Dr. Louri conducts research in the broad area of computer architecture and parallel computing, with emphasis on interconnection networks, scalable parallel computing systems, versatile and flexible computing systems, and power-efficient, reliable & secure Network-on-Chips (NoCs) for multicore architectures. Recently, he has been concentrating on energy-efficient, reliable, and high-performance many-core architectures; accelerator-rich reconfigurable heterogeneous architectures; secure network-on-chips for multicores and SoCs; approximate computing and communications; machine learning techniques for efficient computing, memory, and interconnect systems; heterogeneous manycore architectures & chiplet-based designs; emerging interconnect technologies (photonic, wireless, RF, hybrid) for multi-core architectures and chip multiprocessors (CMPs); future parallel computing models and architectures (including convolutional neural networks, deep neural networks, and approximate computing); and cloud-computing and data centers. He has published more than 200 refereed journal articles and peer-reviewed conference papers and is the co-inventor of several US and international patents.

Dr. Louri is the recipient of the IEEE Computer Society 2020 Edward J. McCluskey Technical Achievement Award, "for pioneering contributions to the solution of on-chip and off-chip communication problems for parallel computing and manycore architectures." The IEEE Computer Society Edward J. McCluskey Technical Achievement Award is given for outstanding and innovative contributions to the fields of computer and information science and engineering or computer technology, usually within the past 10 to 15 years. Contributions must have significantly promoted technical progress in the field. He is also the recipient of the Office of Vice President for Research Distinguished Researcher Award, George Washington University. This award is given to one scholar who has made significant contributions in research and scholarship to the university and society.

Dr. Louri is the Editor-in-Chief of the IEEE Transactions on Computers (2019 – 2023), the flagship journal of the IEEE Computer Society. He is also currently serving as associate editor for the IEEE Transactions on Sustainable Computing (2016 – present) and IEEE Transactions on Cloud Computing (2020 – present). He previously served on the editorial boards for IEEE Transactions on Computers (2011 – 2016), IEEE Transactions on Emerging Technologies for Computing (2015 – 2019), and Cluster Computing, the Journal of Networks, Software Tools and Applications (2000 – 2010). Since January 2016, he has served on the steering committee for IEEE Transactions on Sustainable Computing. He also served as a guest editor for special issues in the Journal of Parallel and Distributed Computing (2010). Dr. Louri’s recent IEEE CS committee service includes serving as chair of the IEEE Computer Society Edward J. McCluskey Technical Achievement award, member of the IEEE Fellow Committee (January 2022 through 31 December 2023), chair of the IEEE Computer Architecture Letters Editor-in-Chief Search Committee (2021), vice-chair for the IEEE CS Fellow Evaluation Committee (2021), chair for the IEEE Transactions on Cloud Computing Editor-in-Chief Search Committee (2019), chair for the IEEE CS Fellow Evaluation Committee (2019), an evaluator on the IEEE CS Fellow Evaluation Committee (2012), vice-chair for the IEEE CS Fellow Evaluation Committee (2017, 2018, 2021), and a member of the IEEE CS Computer Entrepreneur Award Committee, among several other assignments and appointments. For over 30 years, Dr. Louri has served and continues to serve on the executive and technical program committees of numerous international conferences and symposia, and is often invited to be the keynote speaker at various conferences.

Hai Jin

Huazhong University of Science and Technology, China

Dataflow-Based Highly Efficient Graph Processing Accelerator

With the rapid growth of big data, it is harder and harder to process this ever-growing data with traditional computer architecture. Dataflow-based architecture provides a new way to tackle this challenge. This talk first briefly introduces the challenges in processing big data and the difficulties in graph computing, then presents some research results we have obtained in recent years in using dataflow for graph computing. Finally, it outlines some future directions for dataflow architecture, including its use in graph computing.

About Hai Jin

Hai Jin is a Chair Professor of computer science and engineering at Huazhong University of Science and Technology (HUST) in China. Jin received his PhD in computer engineering from HUST in 1994. In 1996, he was awarded a German Academic Exchange Service fellowship to visit the Technical University of Chemnitz in Germany. Jin worked at The University of Hong Kong between 1998 and 2000, and as a visiting scholar at the University of Southern California between 1999 and 2000. He was awarded the Excellent Youth Award from the National Science Foundation of China in 2001.

Jin is a Fellow of IEEE, Fellow of CCF, and a life member of the ACM. He has co-authored more than 20 books and published over 900 research papers. His research interests include computer architecture, parallel and distributed computing, big data processing, data storage, and system security.