2021 Whova User Guide Videos: Youtube | Bilibili
We have prepared two similar runs each day for the convenience of people in different time zones. Note that all times have been adjusted to your local time zone.


Beijing Time



THURSDAY (JUNE 24, 2021, Beijing Time)


24 JUN

08:00 - 09:00

SIGMOD Award Talks

Session Chair: Fatma Ozcan; Vanessa Braganholo

Speakers: Alon Halevy; Huanchen Zhang

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

09:00 - 09:30

Sponsor Talk of Ant Group

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

09:30 - 11:00

SIGMOD Curated Session:
Systems and ML


Session Chair:
Carsten Binnig
Ryan Marcus

Multimedia II Hall 1 (多二1厅)

Zoom Link
Youtube Live
Bilibili Live

Keynote
Slot 1: Ryan Marcus
Slot 2: Carsten Binnig

Slot 1: ML for Systems

Demonstrating UDO: A Unified Approach for Optimizing Transaction Code, Physical Design, and System Parameters via Reinforcement Learning

Junxiong Wang (Cornell University)*; Immanuel Trummer (Cornell); Debabrota Basu (Inria)

Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra

Shangyu Luo (Rice University)*; Dimitrije Jankov (Rice University); Binhang Yuan (Rice University); Chris Jermaine (Rice University)

Dendrite: Bolt-on Adaptivity for Data Systems

Brad Glasbergen (University of Waterloo)*; Fangyu Wu (University of Waterloo); Khuzaima Daudjee (University of Waterloo)

Slot 2: Learned DBMS Components 2.0: From Workload-Driven to Zero-Shot Learned DBMS Components

Tuplex: Data Science in Python at Native Code Speed

Leonhard Spiegelberg (Brown University)*; Rahul V Yesantharao (MIT); Malte Schwarzkopf (Brown University); Tim Kraska (MIT)

Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks

Hakbin Kim (Inha University); Dong-Wan Choi (Inha University)*

SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging

Svetlana Sagadeeva (Graz University of Technology); Matthias Boehm (Graz University of Technology)*

Algorithms for a Topology-aware Massively Parallel Computation Model

Xiao Hu, Paraschos Koutris and Spyros Blanas

Hybrid Evaluation for Distributed Iterative Matrix Computation

Zihao Chen (East China Normal University); Chen Xu (East China Normal University)*; Juan Soto (TU Berlin); Volker Markl (Technische Universität Berlin); Weining Qian (East China Normal University); Aoying Zhou (East China Normal University)

Efficient String Sort with Multi-Character Encoding and Adaptive Sampling

Wen Jin (Independent Researcher)*; Weining Qian (East China Normal University); Aoying Zhou (East China Normal University)

VSS: A Storage System for Video Analytics

Brandon Haynes (Gray Systems Lab, Microsoft)*; Maureen Daum (University of Washington); Dong He (University of Washington); Amrita Mazumdar (University of Washington); Magdalena Balazinska (UW); Alvin Cheung (University of California, Berkeley); Luis Ceze (University of Washington and OctoML)

SIGMOD Curated Session:
NL Querying and Recommendations


Session Chair:
Georgia Koutrika
Abdul H Quamar

Multimedia II Hall 3 (多二3厅)

Zoom Link
Youtube Live
Bilibili Live

Slot 1: Understanding Answers in NL

Towards Enhancing Database Education: Natural Language Generation Meets Query Execution Plans

Weiguo Wang (Xidian University); Sourav S Bhowmick (Nanyang Technological University)*; Hui Li (Xidian University); Shafiq Joty (Nanyang Technological University); Siyuan Liu (Nanyang Technological University); Peng Chen (Xidian University)

Putting Things into Context: Rich Explanations for Query Answers using Join Graphs

Chenjie Li (Illinois Institute of Technology); Zhengjie Miao (Duke University); Qitian Zeng (Illinois Institute of Technology); Boris Glavic (Illinois Institute of Technology)*; Sudeepa Roy (Duke University, USA)

ARM-Net: Adaptive Relation Modeling Network for Structured Data

Shaofeng Cai (National University of Singapore); Kaiping Zheng (National University of Singapore); Gang Chen (Zhejiang University); H. V. Jagadish (University of Michigan); Beng Chin Ooi (NUS)*; Meihui Zhang (Beijing Institute of Technology)

To not miss the forest for the trees - A holistic approach for explaining missing answers over nested data

Ralf Diestelkämper (University of Stuttgart); Seokki Lee (University of Cincinnati); Melanie Herschel (Universität Stuttgart); Boris Glavic (Illinois Institute of Technology)*

TSExplain: Surfacing Evolving Explanations for Time Series

Yiru Chen (Columbia University)*; Silu Huang (Microsoft)

Demonstrating Robust Voice Querying with MUVE: Optimally Visualizing Results of Phonetically Similar Queries

Ziyun Wei (Cornell University)*; Immanuel Trummer (Cornell); Connor Anderson (Cornell University)

Slot 2: Querying in NL

Proportionality in Spatial Keyword Search

Georgios Kalamatianos (Uppsala University); George Fakas (Uppsala University)*; Nikos Mamoulis (University of Ioannina)

Scalable and Usable Relational Learning With Automatic Language Bias

Jose Picado (Oregon State University); Arash Termehchy (Oregon State University)*; Alan Fern (Oregon State University); Sudhanshu Pathak (Oregon State University); Praveen Ilango (Oregon State University); John Davis (Oregon State University)

QuTE: Answering Quantity Queries from Web Tables

Vinh Thinh Ho (Max Planck Institute for Informatics)*; Koninika Pal (Max Planck Institute for Informatics); Gerhard Weikum (Max-Planck-Institut für Informatik)

Marrying Top-k with Skyline Queries: Relaxing the Preference Input while Producing Output of Controllable Size

Kyriakos Mouratidis (Singapore Management University)*; Keming Li (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology)

PyExplore: Query Recommendations for Data Exploration without Query Logs

Apostolos Glenis (UNIPI)*; Georgia Koutrika (ATHENA Research Center)

An In-Depth Benchmarking of Text-to-SQL Systems

Orest Gkini (Athena Research Center); Theofilos Belmpas (Athena Research Center); Georgia Koutrika (Athena Research Center)*; Yannis Ioannidis (University of Athens)

SIGMOD Curated Session:
Data Curation and Integration


Session Chair:
Sunita Sarawagi
Rachel Pottinger

Multimedia II Hall 5 (多二5厅)

Zoom Link
Youtube Live
Bilibili Live

Slot 1: The many faces of Entity Resolution, Matching, and Canonicalization

BEER: Blocking for Effective Entity Resolution

Sainyam Galhotra (University of Massachusetts Amherst)*; Donatella Firmani (Roma Tre University); Barna Saha (University of California, Berkeley); Divesh Srivastava (AT&T Labs Research)

TENET: Joint Entity and Relation Linking with Coherence Relaxation

Xueling Lin (Hong Kong University of Science and Technology)*; Lei Chen (Hong Kong University of Science and Technology); Chaorui Zhang (Huawei)

Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples

Peng Li (GATECH); Xiang Cheng (GATECH); Xu Chu (GATECH); Yeye He (Microsoft Research)*; Surajit Chaudhuri (Microsoft)

Joint Open Knowledge Base Canonicalization and Linking

Yinan Liu (Nankai University)*; Wei Shen (Nankai University); Yuanfei Wang (Nankai University); Jianyong Wang (Tsinghua University); Zhenglu Yang (Nankai University); Xiaojie Yuan (Nankai University)

Medical Entity Disambiguation using Graph Neural Networks

Alina Vretinaris (IBM Germany); Chuan Lei (IBM Research - Almaden)*; Vasilis Efthymiou (FORTH-ICS); Xiao Qin (IBM Research); Fatma Ozcan (Google)

Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts

Dong Deng (Rutgers University - New Brunswick)*

Slot 2: The many uses of Schemas and Constraints

Reducing Ambiguity in Json Schema Discovery

William Spoth (University at Buffalo)*; Oliver A Kennedy (University at Buffalo, SUNY); Ying Lu (Oracle); Beda Hammerschmidt (Oracle); Zhen Hua Liu (Oracle)

BullFrog: Online Schema Evolution via Lazy Evaluation

Souvik Bhattacherjee (University of Maryland, College Park); Gang Liao (University of Maryland); Michael Hicks (University of Maryland, College Park); Daniel J Abadi (UMD)*

GRIP: Constraint-based Explanation of Missing Answers for Graph Queries

Qi Song (Amazon.com)*; Hanchao Ma (Case Western Reserve University); Peng Lin (Washington State University); Yinghui Wu (Case Western Reserve University)

DataMingler: A Novel Approach to Data Virtualization

Damianos Chatziantoniou (Athens University of Economics and Business)*; Verena Kantere (National Technical University of Athens)

Slot 3: Data Generation and Benchmarking

Synthesizing Linked Data Under Cardinality and Integrity Constraints

Amir Gilad (Duke University)*; Shweta Patwa (Duke University); Ashwin Machanavajjhala (Duke)

Benchmarking Approximate Consistent Query Answering

Marco Calautti, Marco Console and Andreas Pieris

SIGMOD Tutorial:
Cohesive Subgraph Search over Big Heterogeneous Information Networks: Applications, Challenges, and Solutions


Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live
Youtube Video
Bilibili Video

Presenters: Yixiang Fang (The Chinese University of Hong Kong, Shenzhen); Kai Wang (University of New South Wales); Xuemin Lin (University of New South Wales); Wenjie Zhang (University of New South Wales) 
Abstract: With the advent of a wide spectrum of recent applications, querying heterogeneous information networks (HINs) has received a great deal of attention from both academia and industry. HINs involve objects (vertices) and links (edges) that are classified into multiple types; examples include bibliography networks, knowledge networks, and user-item networks in e-business. An important component of these HINs is the cohesive subgraph, that is, a subgraph whose vertices are densely connected internally. Searching cohesive subgraphs over HINs has found many real applications, such as community search, event organization, and friend recommendation. Consequently, how to design effective cohesive subgraph models and how to efficiently search cohesive subgraphs on large HINs have become important research topics in the era of big data. In this tutorial, we first highlight the importance of cohesive subgraph search over HINs in various applications and the unique challenges that need to be addressed. Subsequently, we conduct a thorough review of existing work on cohesive subgraph search over HINs. Then, we analyze and compare the models and solutions in these works. Finally, we point out new research directions. We believe that this tutorial not only helps researchers gain a better understanding of existing cohesive subgraph search models and solutions, but also provides them with insights for future study.


24 JUN

11:00 - 11:30

SIGMOD Tutorial:
Querying in the age of Graph Databases and Knowledge Graphs


Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live
Youtube Video
Bilibili Video

Presenters: Marcelo Arenas (PUC Chile); Claudio Gutierrez (Universidad de Chile, Chile); Juan Sequeda (data.world)
Abstract: Graphs have become the best way we know of representing knowledge. The computing community has investigated and developed the support for managing graphs by means of digital technology. Graph databases and Knowledge graphs surface as the most successful solutions to this program. This tutorial will provide a conceptual map of the data management tasks underlying these developments, paying particular attention to data models and query languages for graphs.


24 JUN

11:30 - 12:30


24 JUN

12:30 - 13:00

Break


24 JUN

13:00 - 13:15

Sponsor Talk of eBay

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

13:15 - 13:30

Sponsor Talk of Enmotech

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

13:30 - 13:45

Sponsor Talk of Didi

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

13:45 - 14:00

Sponsor Talk of Oushu

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

14:00 - 15:00

SIGMOD Keynote:
Deep Data Integration


Session Chair: Divesh Srivastava

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live
Youtube Video
Bilibili Video

Speaker: Wang-Chiew Tan (Facebook AI)
Abstract: We are witnessing the widespread adoption of deep learning techniques as avant-garde solutions to different computational problems in recent years. In data integration, the use of deep learning techniques has helped establish several state-of-the-art results in long-standing problems, including information extraction, entity matching, data cleaning, and table understanding. In this talk, I will reflect on the strengths of deep learning and how it has helped move the needle forward in data integration. I will also discuss a few challenges associated with solutions based on deep learning techniques and describe some opportunities for the data management community.


24 JUN

15:00 - 15:15

Sponsor Talk of Baidu

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

15:15 - 15:30

Sponsor Talk of ByteDance

Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live


24 JUN

15:30 - 17:00

SIGMOD Curated Session:
ML-based Data Management


Session Chair:
Fatma Ozcan
Guoliang Li

Multimedia II Hall 1 (多二1厅)

Zoom Link
Youtube Live
Bilibili Live

Slot 1:

Bao: Making Learned Query Optimization Practical

Ryan C Marcus (MIT)*; Parimarjan Negi (MIT CSAIL); Hongzi Mao (MIT CSAIL); Nesime Tatbul (Intel Labs and MIT); Mohammad Alizadeh (MIT CSAIL); Tim Kraska (MIT)

Learned Cardinality Estimation for Similarity Queries

Ji Sun (Tsinghua University); Guoliang Li (Tsinghua University)*; Nan Tang (Qatar Computing Research Institute, HBKU)

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

Peizhi Wu (Nanyang Technological University)*; Gao Cong (Nanyang Technological University)

SIA: Optimizing Queries using Learned Predicates

Qi Zhou (Georgia Institute of Technology)*; Joy Arulraj (Georgia Tech); Shamkant Navathe (Georgia Institute of Technology); William Harris (Galois Inc); Jinpeng Wu (Alibaba)

Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload

Johan Zhi Kang Kok (Grab)*; Gaurav Gaurav (Grab); Sienyi Tan (Grab); Feng Cheng (Grab); Shixuan Sun (National University of Singapore); Bingsheng He (National University of Singapore)

Steering Query Optimizers: A Practical Take on Big Data Workloads

Parimarjan Negi (MIT CSAIL)*; Matteo Interlandi (Microsoft); Ryan Marcus (MIT CSAIL); Mohammad Alizadeh (Massachusetts Institute of Technology); Tim Kraska (MIT); Marc Friedman (Microsoft); Alekh Jindal (Microsoft)

Slot 2:

Scalable Multi-Query Execution using Reinforcement Learning

Panagiotis Sioulas (EPFL)*; Anastasia Ailamaki (EPFL)

ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases

Xinyi Zhang (Peking University); Hong Wu (Alibaba); Tieying Zhang (Alibaba Group); Chang Zhuo (Peking University); Shuowei Jin (Alibaba Group); Jian Tan (Alibaba); Feifei Li (Alibaba Group); Bin Cui (Peking University)*

Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond

Zhengjie Miao (Duke University)*; Yuliang Li (Megagon Labs); Xiaolan Wang (Megagon Labs)

MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems

Lin Ma (Carnegie Mellon University)*; William Zhang (Carnegie Mellon University); Jie Jiao (Carnegie Mellon University); Wuwen Wang (Carnegie Mellon University); Matthew Butrovich (Carnegie Mellon University); Wan Shen Lim (Carnegie Mellon University); Prashanth Menon (Carnegie Mellon University); Andrew Pavlo (Carnegie Mellon University)

Expand your Training Limits! Generating Training Data for ML-based Data Management

Francesco Ventura (Politecnico di Torino)*; Zoi Kaoudi (TU Berlin); Jorge Arnulfo Quiane Ruiz (TU Berlin); Volker Markl (Technische Universität Berlin)

Learning-Aided Heuristics Design for Storage System

Yingtian Tang (Huawei Noah's Ark Lab); Han Lu (Huawei Noah's Ark Lab); Xijun Li (Huawei Noah's Ark Lab)*; Lei Chen (Huawei Noah's Ark Lab); Mingxuan Yuan (Huawei); Jia Zeng (Huawei Noah's Ark Lab)

Slot 3 Mini-keynotes:

Database Systems 2.0
Johannes Gehrke (Microsoft)

Towards AI-Native Query Optimization
Olga Papaemmanouil (Brandeis University)

SIGMOD Curated Session:
Transactions and Blockchain


Session Chair: Alan Fekete

Multimedia II Hall 3 (多二3厅)

Zoom Link
Youtube Live
Bilibili Live

Slot 1: Transactions

Attaining Workload Scalability and Strong Consistency for Replicated Databases with Hihooi

Michael Georgiou (Cyprus University of Technology); Michael Panayiotou (Cyprus University of Technology); Lambros Odysseos (Cyprus University of Technology); Aristodemos Paphitis (Cyprus University of Technology); Michael Sirivianos (Cyprus University of Technology); Herodotos Herodotou (Cyprus University of Technology)*

TardisDB: Extending SQL to Support Versioning

Maximilian E Schüle (Technical University of Munich)*; Josef Schmeißer (Technical University of Munich); Thomas Blum (TUM); Alfons Kemper (TUM); Thomas Neumann (TUM)

Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows

Pedro Silvestre (TU Delft); Marios Fragkoulis (TU Delft)*; Diomidis Spinellis (TU Delft); Asterios Katsifodimos (TU Delft)

Rethink the Scan in MVCC Databases

Jongbin Kim (Hanyang University); Kihwang Kim (Hanyang University); Hyunsoo Cho (Hanyang University); Jaeseon Yu (Hanyang University); Sooyong Kang (Hanyang University); Hyungsoo Jung (Hanyang University)*

Releasing Locks As Early As You Can: Reducing Contention of Hotspots by Violating Two-Phase Locking

Zhihan Guo (University of Wisconsin-Madison)*; Kan Wu (University of Wisconsin-Madison); Cong Yan (Microsoft Research); Xiangyao Yu (University of Wisconsin-Madison)

Blockchains vs. Distributed Databases: Dichotomy and Fusion

Pingcheng Ruan (National University of Singapore); Tien Tuan Anh Dinh (Singapore University of Technology and Design); Dumitrel Loghin (National University of Singapore); Meihui Zhang (Beijing Institute of Technology)*; Gang Chen (Zhejiang University); Qian Lin (ByteDance); Beng Chin Ooi (NUS)

Slot 2: Panel on Blockchain and the Database research community
Mo Sadoghi, Hank Korth, Anh Dinh, Mohammad Amiri, Amr El Abbadi, Divy Agrawal, Jeeta Chacko, Ruben Mayer

Slot 3: Blockchain

SharPer: Sharding Permissioned Blockchains Over Network Clusters

Mohammad Javad Amiri (University of Pennsylvania)*; Divy Agrawal (University of California, Santa Barbara); Amr El Abbadi (UC Santa Barbara)

Why Do My Blockchain Transactions Fail? A Study of Hyperledger Fabric

Jeeta Ann Chacko (Technical University of Munich)*; Ruben Mayer (Technical University of Munich); Hans-Arno Jacobsen (TUM)

DIV: Resolving the Dynamic Issues of Zero-knowledge Set Membership Proof in the Blockchain

Zihuan Xu (Hong Kong University of Science and Technology)*; Lei Chen (Hong Kong University of Science and Technology)

Do the Rich Get Richer? Fairness Analysis for Blockchain Incentives

Yuming Huang (National University of Singapore); Jing Tang (National University of Singapore)*; Qianhao Cong (National University of Singapore); Andrew Lim (National University of Singapore); Jianliang Xu (Hong Kong Baptist University)

A Byzantine Fault Tolerant Storage for Permissioned Blockchain

Xiaodong Qi (East China Normal University)*; Zhihao Chen (East China Normal University); Zhao Zhang (East China Normal University); Cheqing Jin (East China Normal University); Aoying Zhou (East China Normal University); Haizhen Zhuo (Ant Group); Quangqing Xu (Ant Group)

P^2B-Trace: Privacy-Preserving Blockchain-based Contact Tracing to Combat Pandemics

Zhe Peng (Hong Kong Baptist University)*; Cheng Xu (Hong Kong Baptist University); Haixin Wang (HKBU); Jinbin Huang (Hong Kong Baptist University); Jianliang Xu (Hong Kong Baptist University); Xiaowen Chu (Hong Kong Baptist University)

SIGMOD Curated Session:
Interactive Data Exploration


Session Chair:
Sourav Bhowmick
Nan Tang

Multimedia II Hall 5 (多二5厅)

Zoom Link
Youtube Live
Bilibili Live

Slot 1: Raw data Exploration

Keynote 1:
Interactive Scalable Visualizations for Data Discoveries and Interpretable AI
Polo Chau

RawVis: A System for Efficient In-situ Visual Analytics

Stavros Maroulis (Research Center ATHENA)*; Nikos Bikakis (Athena); George Papastefanatos (ATHENA Research Center); Panos Vassiliadis (University of Ioannina); Yannis Vassiliou (NTUA)

ExDRa: Exploratory Data Science on Federated Raw Data

Sebastian Baunsgaard (Graz University of Technology); Matthias Boehm (Graz University of Technology)*; Ankit Chaudhary (TU Berlin); Behrouz Derakhshan (DFKI); Stefan Geißelsöder (Siemens AG); Philipp Marian Grulich (Technische Universität Berlin); Michael Hildebrand (Siemens AG); Kevin Innerebner (Graz University of Technology); Volker Markl (Technische Universität Berlin); Claus Neubauer (Siemens AG); Sarah Osterburg (Siemens AG); Olga Ovcharenko (Graz University of Technology); Sergey Redyuk (TU Berlin); Tobias Rieger (Graz University of Technology); Alireza Rezaei Mahdiraji (DFKI); Sebastian Benjamin Wrede (Know-Center GmbH); Steffen Zeuch (Humboldt Universität zu Berlin)

DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python

Jinglin Peng (Simon Fraser University); Weiyuan Wu (Simon Fraser University)*; Brandon Lockhart (Simon Fraser University); Song Bian (The Chinese University of Hong Kong); Jing Nathan Yan (Cornell University); Linghao Xu (Simon Fraser University); Zhixuan Chi (Simon Fraser University); Jeffrey M Rzeszotarski (Cornell University); Jiannan Wang (Simon Fraser University)

Slot 2: Structured Data Exploration

CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning

Anna Fariha (University of Massachusetts Amherst)*; Ashish Tiwari (Microsoft); Alexandra Meliou (University of Massachusetts Amherst); Arjun Radhakrishna (Microsoft); Sumit Gulwani (Microsoft Research)

Interactive Search for One of the Top-k

Weicheng Wang (Hong Kong University of Science and Technology)*; Raymond Chi-Wing Wong (Hong Kong University of Science and Technology); Min Xie (Shenzhen Institute of Computing Sciences)

INCA: Inconsistency-Aware Data Profiling and Querying

Ousmane Issa (UCA, LIMOS)*; Angela Bonifati (Univ. of Lyon); Farouk Toumani (UCA, LIMOS)

MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis

Pingchuan Ma (HKUST)*; Rui Ding (Microsoft Research); Shi Han (Microsoft Research); Dongmei Zhang (Microsoft Research Asia)

An Ecosystem of Applications for Modeling Political Violence

Aline Bessa (New York University); Vito D'Orazio (University of Texas at Dallas)*; Sonia Castelo (New York University); Mike Shoemate (Harvard University); Aécio Santos (New York University); Juliana Freire (New York University); Remi Rampin (NYU)

Slot 3: Semistructured/Unstructured Data Exploration

Keynote 2:
Natural Language Exploration with Relational Databases in Chatbot
Wook-Shin Han

Boomerang: Proactive Insight-Based Recommendations for Guiding Conversational Data Analysis

Doris Lee (UC Berkeley); Abdul H Quamar (IBM Research Almaden)*; Eser Kandogan (Megagon Labs); Fatma Ozcan (Google)

Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks

Yuyu Luo (Tsinghua University); Nan Tang (Qatar Computing Research Institute, HBKU); Guoliang Li (Tsinghua University)*; Chengliang Chai (Tsinghua University); Wenbo Li (Tsinghua University); Xuedi Qin (Tsinghua University)

MIDAS: Towards Efficient and Effective Maintenance of Canned Patterns in Visual Graph Query Interfaces

Kai Huang (Fudan University); Huey Eng Chua (Nanyang Technological University); Sourav S Bhowmick (Nanyang Technological University)*; Byron Choi (Hong Kong Baptist University); Shuigeng Zhou (Fudan University)

Exploring Ratings in Subjective Databases

Sihem Amer-Yahia (CNRS); Tova Milo (Tel Aviv University); Brit Youngmann (Tel Aviv University)*

SIGMOD Tutorial:
Not your Grandpa's SSD: The Era of Co-Designed Storage Devices


Third International Conference Hall (国三)

Zoom Link
Youtube Live
Bilibili Live
Youtube Video
Bilibili Video

Presenters: Alberto Lerner (University of Fribourg, Switzerland); Philippe Bonnet (IT Univ Copenhagen, Denmark)
Abstract: The Solid-State Drive (SSD) landscape is in constant evolution. For years, this evolution was hidden behind the unchanging abstractions of block devices and POSIX I/O. However, these abstractions have become problematic. They hinder performance and no longer reduce software complexity. Such a state of affairs impacts the database community in at least two ways.
First, using SSDs through legacy interfaces that hide internal mechanisms invariably results in erratic performance. The blame often goes to SSDs' notoriously expensive garbage collection. In truth, several other complex processes result in non-linear effects in terms of latency and bandwidth. In this tutorial, we describe these processes and how they are implemented in modern devices. This knowledge will help system designers better choose SSDs and shape database workloads to match their performance characteristics.
Second, the inadequacy of the traditional I/O abstractions opens up an entire research field focused on the co-design of SSD and database management systems (DBMS). Such research aims at devising mechanisms and policies coupling the storage manager of a DBMS and SSD internals: e.g., placing an SSD FTL (its "brains") under the control of an application, changing SSD subsystems in response to the workload, or executing logic within an SSD on a database's behalf. In this tutorial, we describe the research opportunities and challenges through this continuum of DBMS/SSD co-design techniques, and present platforms supporting their simulation and prototyping.
We believe that those two areas (a more seamless integration of Database and Storage, and the study of SSD variations adapted to Database computations) are central to the development of the next generation of Database Systems. This (opinionated) survey will equip researchers and practitioners alike to enter the field.


24 JUN

17:00 - 18:30


24 JUN

18:30 - 20:00

Break


Second Run


24 JUN

20:00 - 21:00

SIGMOD Award Talks

Session Chair: Juliana Freire

Speakers: Ashwin Machanavajjhala; Alex Szalay

Zoom Link
Youtube Live


24 JUN

21:00 - 21:15

Sponsor Talk of SAP

Zoom Link
Youtube Live


24 JUN

21:15 - 21:30

Sponsor Talk of Snowflake

Zoom Link
Youtube Live


24 JUN

21:30 - 23:00

SIGMOD Curated Session:
Systems and ML


Session Chair:
Carsten Binnig
Ryan Marcus

Zoom Link
Youtube Live

Keynote
Slot 1: Ryan Marcus
Slot 2: Carsten Binnig

Slot 1: ML for Systems

Demonstrating UDO: A Unified Approach for Optimizing Transaction Code, Physical Design, and System Parameters via Reinforcement Learning

Junxiong Wang (Cornell University)*; Immanuel Trummer (Cornell); Debabrota Basu (Inria)

Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra

Shangyu Luo (Rice University)*; Dimitrije Jankov (Rice University); Binhang Yuan (Rice University); Chris Jermaine (Rice University)

Dendrite: Bolt-on Adaptivity for Data Systems

Brad Glasbergen (University of Waterloo)*; Fangyu Wu (University of Waterloo); Khuzaima Daudjee (University of Waterloo)

Slot 2: Learned DBMS Components 2.0: From Workload-Driven to Zero-Shot Learned DBMS Components

Tuplex: Data Science in Python at Native Code Speed

Leonhard Spiegelberg (Brown University)*; Rahul V Yesantharao (MIT); Malte Schwarzkopf (Brown University); Tim Kraska (MIT)

Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks

Hakbin Kim (Inha University); Dong-Wan Choi (Inha University)*

SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging

Svetlana Sagadeeva (Graz University of Technology); Matthias Boehm (Graz University of Technology)*

Algorithms for a Topology-aware Massively Parallel Computation Model

Xiao Hu, Paraschos Koutris and Spyros Blanas

Hybrid Evaluation for Distributed Iterative Matrix Computation

Zihao Chen (East China Normal University); Chen Xu (East China Normal University)*; Juan Soto (TU Berlin); Volker Markl (Technische Universität Berlin); Weining Qian (East China Normal University); Aoying Zhou (East China Normal University)

Efficient String Sort with Multi-Character Encoding and Adaptive Sampling

Wen Jin (Independent Researcher)*; Weining Qian (East China Normal University); Aoying Zhou (East China Normal University)

VSS: A Storage System for Video Analytics

Brandon Haynes (Gray Systems Lab, Microsoft)*; Maureen Daum (University of Washington); Dong He (University of Washington); Amrita Mazumdar (University of Washington); Magdalena Balazinska (UW); Alvin Cheung (University of California, Berkeley); Luis Ceze (University of Washington and OctoML)

SIGMOD Curated Session:
NL Querying and Recommendations


Session Chair:
Georgia Koutrika
Abdul H Quamar

Zoom Link
Youtube Live

Slot 1: Understanding Answers in NL

Towards Enhancing Database Education: Natural Language Generation Meets Query Execution Plans

Weiguo Wang (Xidian University); Sourav S Bhowmick (Nanyang Technological University)*; Hui Li (Xidian University); Shafiq Joty (Nanyang Technological University); Siyuan Liu (Nanyang Technological University); Peng Chen (Xidian University)

Putting Things into Context: Rich Explanations for Query Answers using Join Graphs

Chenjie Li (Illinois Institute of Technology); Zhengjie Miao (Duke University); Qitian Zeng (Illinois Institute of Technology); Boris Glavic (Illinois Institute of Technology)*; Sudeepa Roy (Duke University, USA)

ARM-Net: Adaptive Relation Modeling Network for Structured Data

Shaofeng Cai (National University of Singapore); Kaiping Zheng (National University of Singapore); Gang Chen (Zhejiang University); H. V. Jagadish (University of Michigan); Beng Chin Ooi (NUS)*; Meihui Zhang (Beijing Institute of Technology)

To not miss the forest for the trees - A holistic approach for explaining missing answers over nested data

Ralf Diestelkämper (University of Stuttgart); Seokki Lee (University of Cincinnati); Melanie Herschel (Universität Stuttgart); Boris Glavic (Illinois Institute of Technology)*

TSExplain: Surfacing Evolving Explanations for Time Series

Yiru Chen (Columbia University)*; Silu Huang (Microsoft)

Demonstrating Robust Voice Querying with MUVE: Optimally Visualizing Results of Phonetically Similar Queries

Ziyun Wei (Cornell University)*; Immanuel Trummer (Cornell); Connor Anderson (Cornell University)

Slot 2: Querying in NL

Proportionality in Spatial Keyword Search

Georgios Kalamatianos (Uppsala University); George Fakas (Uppsala University)*; Nikos Mamoulis (University of Ioannina)

Scalable and Usable Relational Learning With Automatic Language Bias

Jose Picado (Oregon State University); Arash Termehchy (Oregon State University)*; Alan Fern (Oregon State University); Sudhanshu Pathak (Oregon State University); Praveen Ilango (Oregon State University); John Davis (Oregon State University)

QuTE: Answering Quantity Queries from Web Tables

Vinh Thinh Ho (Max Planck Institute for Informatics)*; Koninika Pal (Max Planck Institute for Informatics); Gerhard Weikum (Max-Planck-Institut für Informatik)

Marrying Top-k with Skyline Queries: Relaxing the Preference Input while Producing Output of Controllable Size

Kyriakos Mouratidis (Singapore Management University)*; Keming Li (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology)

PyExplore: Query Recommendations for Data Exploration without Query Logs

Apostolos Glenis (UNIPI)*; Georgia Koutrika (ATHENA Research Center)

An In-Depth Benchmarking of Text-to-SQL Systems

Orest Gkini (Athena Research Center); Theofilos Belmpas (Athena Research Center); Georgia Koutrika (Athena Research Center)*; Yannis Ioannidis (University of Athens)

... ...

SIGMOD Curated Session:
Data Curation and Integration


Session Chair:
Renée Miller

Zoom Link
Youtube Live

Slot 1: The many faces of Entity Resolution, Matching, and Canonicalization

BEER: Blocking for Effective Entity Resolution

Sainyam Galhotra (University of Massachusetts Amherst)*; Donatella Firmani (Roma Tre University); Barna Saha (University of California, Berkeley); Divesh Srivastava (AT&T Labs Research)

TENET: Joint Entity and Relation Linking with Coherence Relaxation

Xueling Lin (Hong Kong University of Science and Technology)*; Lei Chen (Hong Kong University of Science and Technology); Chaorui Zhang (Huawei)

Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples

Peng Li (GATECH); Xiang Cheng (GATECH); Xu Chu (GATECH); Yeye He (Microsoft Research)*; Surajit Chaudhuri (Microsoft)

Joint Open Knowledge Base Canonicalization and Linking

Yinan Liu (Nankai University)*; Wei Shen (Nankai University); Yuanfei Wang (Nankai University); Jianyong Wang (Tsinghua University); Zhenglu Yang (Nankai University); Xiaojie Yuan (Nankai University)

Medical Entity Disambiguation using Graph Neural Networks

Alina Vretinaris (IBM Germany); Chuan Lei (IBM Research - Almaden)*; Vasilis Efthymiou (FORTH-ICS); Xiao Qin (IBM Research); Fatma Ozcan (Google)

Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts

Dong Deng (Rutgers University - New Brunswick)*

Slot 2: The many uses of Schemas and Constraints

Reducing Ambiguity in Json Schema Discovery

William Spoth (University at Buffalo)*; Oliver A Kennedy (University at Buffalo, SUNY); Ying Lu (Oracle); Beda Hammerschmidt (Oracle); Zhen Hua Liu (Oracle)

BullFrog: Online Schema Evolution via Lazy Evaluation

Souvik Bhattacherjee (University of Maryland, College Park); Gang Liao (University of Maryland); Michael Hicks (University of Maryland, College Park); Daniel J Abadi (UMD)*

GRIP: Constraint-based Explanation of Missing Answers for Graph Queries

Qi Song (Amazon.com)*; Hanchao Ma (Case Western Reserve University); Peng Lin (Washington State University); Yinghui Wu (Case Western Reserve University)

DataMingler: A Novel Approach to Data Virtualization

Damianos Chatziantoniou (Athens University of Economics and Business)*; Verena Kantere (National Technical University of Athens)

Slot 3: Data Generation and Benchmarking

Synthesizing Linked Data Under Cardinality and Integrity Constraints

Amir Gilad (Duke University)*; Shweta Patwa (Duke University); Ashwin Machanavajjhala (Duke)

Benchmarking Approximate Consistent Query Answering

Marco Calautti, Marco Console and Andreas Pieris

... ...

SIGMOD Tutorial:
Cohesive Subgraph Search over
Big Heterogeneous Information Networks:
Applications, Challenges, and Solutions


Zoom Link
Youtube Live
Youtube Video
Bilibili Video

Presenters: Yixiang Fang (The Chinese University of Hong Kong, Shenzhen); Kai Wang (University of New South Wales); Xuemin Lin (University of New South Wales); Wenjie Zhang (University of New South Wales) 
Abstract: With the advent of a wide spectrum of recent applications, querying heterogeneous information networks (HINs) has received a great deal of attention from both academia and industry. HINs involve objects (vertices) and links (edges) that are classified into multiple types; examples include bibliography networks, knowledge networks, and user-item networks in E-business. An important component of these HINs is the cohesive subgraph, or a subgraph containing vertices that are densely connected internally. Searching cohesive subgraphs over HINs has found many real applications, such as community search, event organization, and friend recommendation. Consequently, how to design effective cohesive subgraph models and how to efficiently search cohesive subgraphs on large HINs have become important research topics in the era of big data. In this tutorial, we first highlight the importance of cohesive subgraph search over HINs in various applications and the unique challenges that need to be addressed. Subsequently, we conduct a thorough review of existing works of cohesive subgraph search over HINs. Then, we analyze and compare the models and solutions in these works. Finally, we point out new research directions. We believe that this tutorial not only helps researchers to have a better understanding of existing cohesive subgraph search models and solutions, but also provides them with insights for future study.

... ...


24 JUN

23:00 - 23:30

SIGMOD Tutorial:
Querying in the age of Graph Databases and Knowledge Graphs


Zoom Link
Youtube Live
Youtube Video
Bilibili Video

Presenters: Marcelo Arenas (PUC Chile); Claudio Gutierrez (Universidad de Chile, Chile); Juan Sequeda (data.world)
Abstract: Graphs have become the best way we know of representing knowledge. The computing community has investigated and developed the support for managing graphs by means of digital technology. Graph databases and Knowledge graphs surface as the most successful solutions to this program. This tutorial will provide a conceptual map of the data management tasks underlying these developments, paying particular attention to data models and query languages for graphs.

... ...


24 JUN

23:30 - 00:30 (+1 day)

SIGMOD Panel:
Data Management to Social Science
and Back in the Future of Work


Session Chair:
Sihem Amer-Yahia
Senjuti Basu Roy

Zoom Link
Youtube Live


25 JUN

00:30 - 01:30

SIGMOD Round Table: Student Experiences on D&I in DB Conferences and Community

Session Chair: Pinar Tozun

Zoom Link
Youtube Live


25 JUN

01:30 - 02:00

Break


25 JUN

02:00 - 03:00

SIGMOD Keynote:
Deep Data Integration


Session Chair: Divesh Srivastava

Zoom Link
Youtube Live
Bilibili Live
Youtube Video
Bilibili Video

Speaker: Wang-Chiew Tan (Facebook AI)
Abstract: We are witnessing the widespread adoption of deep learning techniques as avant-garde solutions to different computational problems in recent years. In data integration, the use of deep learning techniques has helped establish several state-of-the-art results in long-standing problems, including information extraction, entity matching, data cleaning, and table understanding. In this talk, I will reflect on the strengths of deep learning and how that has helped move the needle forward in data integration. I will also discuss a few challenges associated with solutions based on deep learning techniques and describe some opportunities for the data management community.


25 JUN

03:00 - 03:30

Break


25 JUN

03:30 - 05:00

SIGMOD Curated Session:
ML-based Data Management


Session Chair:
Fatma Ozcan
Guoliang Li

Zoom Link
Youtube Live

Slot 1:

Bao: Making Learned Query Optimization Practical

Ryan C Marcus (MIT)*; Parimarjan Negi (MIT CSAIL); Hongzi Mao (MIT CSAIL); Nesime Tatbul (Intel Labs and MIT); Mohammad Alizadeh (MIT CSAIL); Tim Kraska (MIT)

Learned Cardinality Estimation for Similarity Queries

Ji Sun (Tsinghua University); Guoliang Li (Tsinghua University)*; Nan Tang (Qatar Computing Research Institute, HBKU)

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

Peizhi Wu (Nanyang Technological University)*; Gao Cong (Nanyang Technological University)

SIA: Optimizing Queries using Learned Predicates

Qi Zhou (Georgia Institute of Technology)*; Joy Arulraj (Georgia Tech); Shamkant Navathe (Georgia Institute of Technology); William Harris (Galois Inc); Jinpeng Wu (Alibaba)

Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload

Johan Zhi Kang Kok (Grab)*; Gaurav Gaurav (Grab); Sienyi Tan (Grab); Feng Cheng (Grab); Shixuan Sun (National University of Singapore); Bingsheng He (National University of Singapore)

Steering Query Optimizers: A Practical Take on Big Data Workloads

Parimarjan Negi (MIT CSAIL)*; Matteo Interlandi (Microsoft); Ryan Marcus (MIT CSAIL); Mohammad Alizadeh (Massachusetts Institute of Technology); Tim Kraska (MIT); Marc Friedman (Microsoft); Alekh Jindal (Microsoft)

Slot 2:

Scalable Multi-Query Execution using Reinforcement Learning

Panagiotis Sioulas (EPFL)*; Anastasia Ailamaki (EPFL)

ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases

Xinyi Zhang (Peking University); Hong Wu (Alibaba); Tieying Zhang (Alibaba Group); Chang Zhuo (Peking University); Shuowei Jin (Alibaba Group); Jian Tan (Alibaba); Feifei Li (Alibaba Group); Bin Cui (Peking University)*

Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond

Zhengjie Miao (Duke University)*; Yuliang Li (Megagon Labs); Xiaolan Wang (Megagon Labs)

MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems

Lin Ma (Carnegie Mellon University)*; William Zhang (Carnegie Mellon University); Jie Jiao (Carnegie Mellon University); Wuwen Wang (Carnegie Mellon University); Matthew Butrovich (Carnegie Mellon University); Wan Shen Lim (Carnegie Mellon University); Prashanth Menon (Carnegie Mellon University); Andrew Pavlo (Carnegie Mellon University)

Expand your Training Limits! Generating Training Data for ML-based Data Management

Francesco Ventura (Politecnico di Torino)*; Zoi Kaoudi (TU Berlin); Jorge Arnulfo Quiane Ruiz (TU Berlin); Volker Markl (Technische Universität Berlin)

Learning-Aided Heuristics Design for Storage System

Yingtian Tang (Huawei Noah's Ark Lab); Han Lu (Huawei Noah's Ark Lab); Xijun Li (Huawei Noah's Ark Lab)*; Lei Chen (Huawei Noah's Ark Lab); Mingxuan Yuan (Huawei); Jia Zeng (Huawei Noah's Ark Lab)

Slot 3 Mini-keynotes:

Database Systems 2.0
Johannes Gehrke (Microsoft)

Towards AI-Native Query Optimization
Olga Papaemmanouil (Brandeis University)

... ...

SIGMOD Curated Session:
Transactions and Blockchain


Session Chair: Alan Fekete

Zoom Link
Youtube Live

Slot 1: Transactions

Attaining Workload Scalability and Strong Consistency for Replicated Databases with Hihooi

Michael Georgiou (Cyprus University of Technology); Michael Panayiotou (Cyprus University of Technology); Lambros Odysseos (Cyprus University of Technology); Aristodemos Paphitis (Cyprus University of Technology); Michael Sirivianos (Cyprus University of Technology); Herodotos Herodotou (Cyprus University of Technology)*

TardisDB: Extending SQL to Support Versioning

Maximilian E Schüle (Technical University of Munich)*; Josef Schmeißer (Technical University of Munich); Thomas Blum (TUM); Alfons Kemper (TUM); Thomas Neumann (TUM)

Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows

Pedro Silvestre (TU Delft); Marios Fragkoulis (TU Delft)*; Diomidis Spinellis (TU Delft); Asterios Katsifodimos (TU Delft)

Rethink the Scan in MVCC Databases

Jongbin Kim (Hanyang University); Kihwang Kim (Hanyang University); Hyunsoo Cho (Hanyang University); Jaeseon Yu (Hanyang University); Sooyong Kang (Hanyang University); Hyungsoo Jung (Hanyang University)*

Releasing Locks As Early As You Can: Reducing Contention of Hotspots by Violating Two-Phase Locking

Zhihan Guo (University of Wisconsin-Madison)*; Kan Wu (University of Wisconsin-Madison); Cong Yan (Microsoft Research); Xiangyao Yu (University of Wisconsin-Madison)

Blockchains vs. Distributed Databases: Dichotomy and Fusion

Pingcheng Ruan (National University of Singapore); Tien Tuan Anh Dinh (Singapore University of Technology and Design); Dumitrel Loghin (National University of Singapore); Meihui Zhang (Beijing Institute of Technology)*; Gang Chen (Zhejiang University); Qian Lin (ByteDance); Beng Chin Ooi (NUS)

Slot 2: Panel on Blockchain and the Database research community
Mo Sadoghi, Hank Korth, Anh Dinh, Mohammad Amiri, Amr El Abbadi, Divy Agrawal, Jeeta Chacko, Ruben Mayer

Slot 3: Blockchain

SharPer: Sharding Permissioned Blockchains Over Network Clusters

Mohammad Javad Amiri (University of Pennsylvania)*; Divy Agrawal (University of California, Santa Barbara); Amr El Abbadi (UC Santa Barbara)

Why Do My Blockchain Transactions Fail? A Study of Hyperledger Fabric

Jeeta Ann Chacko (Technical University of Munich)*; Ruben Mayer (Technical University of Munich); Hans-Arno Jacobsen (TUM)

DIV: Resolving the Dynamic Issues of Zero-knowledge Set Membership Proof in the Blockchain

Zihuan Xu (Hong Kong University of Science and Technology)*; Lei Chen (Hong Kong University of Science and Technology)

Do the Rich Get Richer? Fairness Analysis for Blockchain Incentives

Yuming Huang (National University of Singapore); Jing Tang (National University of Singapore)*; Qianhao Cong (National University of Singapore); Andrew Lim (National University of Singapore); Jianliang Xu (Hong Kong Baptist University)

A Byzantine Fault Tolerant Storage for Permissioned Blockchain

Xiaodong Qi (East China Normal University)*; Zhihao Chen (East China Normal University); Zhao Zhang (East China Normal University); Cheqing Jin (East China Normal University); Aoying Zhou (East China Normal University); Haizhen Zhuo (Ant Group); Quangqing Xu (Ant Group)

P^2B-Trace: Privacy-Preserving Blockchain-based Contact Tracing to Combat Pandemics

Zhe Peng (Hong Kong Baptist University)*; Cheng Xu (Hong Kong Baptist University); Haixin Wang (HKBU); Jinbin Huang (Hong Kong Baptist University); Jianliang Xu (Hong Kong Baptist University); Xiaowen Chu (Hong Kong Baptist University)

... ...

SIGMOD Curated Session:
Interactive Data Exploration


Session Chair:
Sourav Bhowmick
Nan Tang

Zoom Link
Youtube Live

Slot 1: Raw Data Exploration

Keynote 1:
Interactive Scalable Visualizations for Data Discoveries and Interpretable AI
Polo Chau

RawVis: A System for Efficient In-situ Visual Analytics

Stavros Maroulis (Research Center ATHENA)*; Nikos Bikakis (Athena); George Papastefanatos (ATHENA Research Center); Panos Vassiliadis (University of Ioannina); Yannis Vassiliou (NTUA)

ExDRa: Exploratory Data Science on Federated Raw Data

Sebastian Baunsgaard (Graz University of Technology); Matthias Boehm (Graz University of Technology)*; Ankit Chaudhary (TU Berlin); Behrouz Derakhshan (DFKI); Stefan Geißelsöder (Siemens AG); Philipp Marian Grulich (Technische Universität Berlin); Michael Hildebrand (Siemens AG); Kevin Innerebner (Graz University of Technology); Volker Markl (Technische Universität Berlin); Claus Neubauer (Siemens AG); Sarah Osterburg (Siemens AG); Olga Ovcharenko (Graz University of Technology); Sergey Redyuk (TU Berlin); Tobias Rieger (Graz University of Technology); Alireza Rezaei Mahdiraji (DFKI); Sebastian Benjamin Wrede (Know-Center GmbH); Steffen Zeuch (Humboldt Universität zu Berlin)

DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python

Jinglin Peng (Simon Fraser University); Weiyuan Wu (Simon Fraser University)*; Brandon Lockhart (Simon Fraser University); Song Bian (The Chinese University of Hong Kong); Jing Nathan Yan (Cornell University); Linghao Xu (Simon Fraser University); Zhixuan Chi (Simon Fraser University); Jeffrey M Rzeszotarski (Cornell University); Jiannan Wang (Simon Fraser University)

Slot 2: Structured Data Exploration

CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning

Anna Fariha (University of Massachusetts Amherst)*; Ashish Tiwari (Microsoft); Alexandra Meliou (University of Massachusetts Amherst); Arjun Radhakrishna (Microsoft); Sumit Gulwani (Microsoft Research)

Interactive Search for One of the Top-k

Weicheng Wang (Hong Kong University of Science and Technology)*; Raymond Chi-Wing Wong (Hong Kong University of Science and Technology); Min Xie (Shenzhen Institute of Computing Sciences)

INCA: Inconsistency-Aware Data Profiling and Querying

Ousmane Issa (UCA, LIMOS)*; Angela Bonifati (Univ. of Lyon); Farouk Toumani (UCA, LIMOS)

MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis

Pingchuan Ma (HKUST)*; Rui Ding (Microsoft Research); Shi Han (Microsoft Research); Dongmei Zhang (Microsoft Research Asia)

An Ecosystem of Applications for Modeling Political Violence

Aline Bessa (New York University); Vito D'Orazio (University of Texas at Dallas)*; Sonia Castelo (New York University); Mike Shoemate (Harvard University); Aécio Santos (New York University); Juliana Freire (New York University); Remi Rampin (NYU)

Slot 3: Semistructured/Unstructured Data Exploration

Keynote 2:
Natural Language Exploration with Relational Databases in Chatbot
Wook-Shin Han

Boomerang: Proactive Insight-Based Recommendations for Guiding Conversational Data Analysis

Doris Lee (UC Berkeley); Abdul H Quamar (IBM Research Almaden)*; Eser Kandogan (Megagon Labs); Fatma Ozcan (Google)

Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks

Yuyu Luo (Tsinghua University); Nan Tang (Qatar Computing Research Institute, HBKU); Guoliang Li (Tsinghua University)*; Chengliang Chai (Tsinghua University); Wenbo Li (Tsinghua University); Xuedi Qin (Tsinghua University)

MIDAS: Towards Efficient and Effective Maintenance of Canned Patterns in Visual Graph Query Interfaces

Kai Huang (Fudan University); Huey Eng Chua (Nanyang Technological University); Sourav S Bhowmick (Nanyang Technological University)*; Byron Choi (Hong Kong Baptist University); Shuigeng Zhou (Fudan University)

Exploring Ratings in Subjective Databases

Sihem Amer-Yahia (CNRS); Tova Milo (Tel Aviv University); Brit Youngmann (Tel Aviv University)*

... ...

SIGMOD Tutorial:
Not your Grandpa's SSD: The Era of Co-Designed Storage Devices


Zoom Link
Youtube Live
Youtube Video
Bilibili Video

Presenters: Alberto Lerner (University of Fribourg, Switzerland); Philippe Bonnet (IT Univ Copenhagen, Denmark)
Abstract: The Solid-State Drive (SSD) landscape is in constant evolution. For years, this evolution was hidden behind the unchanging abstractions of block devices and POSIX I/O. However, these abstractions have become problematic. They hinder performance and no longer reduce software complexity. Such a state of affairs impacts the database community in at least two ways.
First, using SSDs through legacy interfaces that hide internal mechanisms invariably results in erratic performance. The blame often goes to SSDs' notoriously expensive garbage collection. In truth, several other complex processes result in non-linear effects in terms of latency and bandwidth. In this tutorial, we describe these processes and how they are implemented in modern devices. This knowledge will help system designers better choose SSDs and shape database workloads to match their performance characteristics.
Second, the inadequacy of the traditional I/O abstractions opens up an entire research field focused on the co-design of SSD and database management systems (DBMS). Such research aims at devising mechanisms and policies coupling the storage manager of a DBMS and SSD internals: e.g., placing an SSD FTL (its "brains") under the control of an application, changing SSD subsystems in response to the workload, or executing logic within an SSD on a database's behalf. In this tutorial, we describe the research opportunities and challenges through this continuum of DBMS/SSD co-design techniques, and present platforms supporting their simulation and prototyping.
We believe that those two areas (a more seamless integration of Database and Storage, and the study of SSD variations adapted to Database computations) are central to the development of the next generation of Database Systems. This (opinionated) survey will equip researchers and practitioners alike to enter the field.

... ...


25 JUN

05:00 - 06:30


25 JUN

06:30 - 07:00

Break


25 JUN

07:00 - 07:30

Sponsor Talk of Facebook

Zoom Link
Youtube Live


25 JUN

07:30 - 09:00

Break