
SIGMOD 2021: Keynote Talks

Utilizing (and Designing) Modern Hardware for Data-Intensive Computations: The Role of Abstraction

Speaker: Kenneth A. Ross (Columbia University)


Modern information-intensive systems, including data management systems, operate on data that is mostly resident in RAM. As a result, the data management community has shifted focus from I/O optimization to addressing performance issues higher in the memory hierarchy.

In this keynote, I will give a personal perspective of these developments, illustrated by work from my group at Columbia University. I will use the concept of abstraction as a lens through which various kinds of optimizations for modern hardware platforms can be understood and evaluated. Through this lens, some “cute implementation tricks” can be seen as much more than mere implementation details.

I will discuss abstractions at various granularities, from single lines of code to whole programming/query languages. I will touch on software and hardware design for data-intensive computations. I will also discuss data processing in a conventional programming language, and how the data management community might contribute to the design of compilers.


Kenneth Ross is a Professor in the Computer Science Department at Columbia University in New York City. His research interests touch on various aspects of database systems, including query processing and architecture-sensitive database system design. He also has an interest in computational biology, including the study of autoimmunity. Professor Ross received his PhD from Stanford University in 1991. He has received several awards, including a Packard Foundation Fellowship, a Sloan Foundation Fellowship, and an NSF Young Investigator award.

Deep Data Integration

Speaker: Wang-Chiew Tan (Facebook AI)


In recent years, we have witnessed the widespread adoption of deep learning techniques as avant-garde solutions to a variety of computational problems. In data integration, deep learning has helped establish several state-of-the-art results on long-standing problems, including information extraction, entity matching, data cleaning, and table understanding. In this talk, I will reflect on the strengths of deep learning and how they have helped move the needle forward in data integration. I will also discuss some of the challenges associated with deep-learning-based solutions and describe opportunities for the data management community.


Wang-Chiew Tan is a research scientist at Facebook AI. Prior to joining Facebook AI, she was at Megagon Labs, where she led research efforts on building advanced technologies to enhance search by experience. Before Megagon Labs, she was a Professor of Computer Science at the University of California, Santa Cruz, and also spent two years at IBM Research - Almaden. Her research interests include data integration and exchange, data provenance, and natural language processing. She is a co-recipient of the 2014 ACM PODS Alberto O. Mendelzon Test-of-Time Award, the 2018 ICDT Test-of-Time Award, and the 2020 Alonzo Church Award. She received the 2019 VLDB Women in Database Research Award and is a Fellow of the ACM.

Complexity Theory and Efficient Algorithms for Big Data Computation with Limited Computing Resources

Speaker: Jianzhong Li (Shenzhen Institute of Advanced Technology and Harbin Institute of Technology)


With the explosive growth of available data in recent years, big data research has attracted much attention from both academic and industrial researchers, and many significant advances have been achieved. Nevertheless, fundamental research results still fall far short of actual needs: a number of key issues remain unresolved, considerable work remains to be done, and a complete theory of big data computation has yet to be established. This talk focuses on the theoretical aspects of big data research, especially the complexity theory and efficient algorithms of big data computation. First, big data computation is formally defined. Then, the challenges and research issues of big data computation are discussed. Finally, a number of research results on the complexity theory and efficient algorithms of big data computation, achieved by the speaker's group, are surveyed.


Jianzhong Li is a chair professor at the Shenzhen Institute of Advanced Technology and a professor at Harbin Institute of Technology, China. He worked at the University of California, Berkeley as a visiting scholar in 1985. From 1986 to 1987 and from 1992 to 1993, he was a scientist in the Information Research Group in the Department of Computer Science at Lawrence Berkeley National Laboratory, USA. He was also a visiting professor at the University of Minnesota, Minneapolis, USA, from 1991 to 1992 and from 1998 to 1999. His current research interests include big data computation and wireless sensor networks. He has published more than 400 papers in refereed journals and conference proceedings, such as VLDB Journal, Algorithmica, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Parallel and Distributed Systems, SIGMOD, VLDB, ICDE, INFOCOM, MobiHoc, and SenSys. His papers have been cited more than 20,000 times, and his H-index is 65. He has served on the program committees of major computer science conferences, including SIGMOD, VLDB, ICDE, INFOCOM, ICDCS, and WWW. He has also served on the editorial boards of distinguished journals, such as IEEE Transactions on Knowledge and Data Engineering, and has reviewed papers for various journals and conference proceedings.
