We prepare two similar runs each day for the convenience of people in different time zones. Note that all times have been adjusted to your local time zone.
Dynamic Structural Clustering on Graphs Boyu Ruan (University of Queensland); Junhao Gan (University of Melbourne)*; Hao Wu (University of Melbourne); Anthony Wirth (The University of Melbourne)
Graph Iso/Auto-morphism: A Divide-&-Conquer Approach Can Lu (The Chinese University of Hong Kong)*; Jeffrey Xu Yu (Chinese University of Hong Kong); Zhiwei Zhang (Hong Kong Baptist University); Hong Cheng (Chinese University of Hong Kong)
APAN: Asynchronous Propagation Attention Network for Real-time Temporal Graph Embedding Xuhong Wang (Shanghai Jiao Tong University); Ding Lyu (Shanghai Jiao Tong University); Mengjian Li (Ant Group); Yang Xia (Ant Group); Qi Yang (Ant Group); Xinwen Wang (Ant Group); Xinguang Wang (Ant Group); Ping Cui (Shanghai Jiao Tong University); Yupu Yang (Shanghai Jiao Tong University); Bowen Sun (Ant Group); Zhenyu Guo (Ant Group)*
HUGE: An Efficient and Scalable Subgraph Enumeration System Zhengyi Yang (University of New South Wales)*; Longbin Lai (Alibaba Corporation); Xuemin Lin (University of New South Wales); Kongzhang Hao (University of New South Wales); Wenjie Zhang (University of New South Wales)
PG-Keys: Keys for Property Graphs Renzo Angles (Universidad de Talca); Angela Bonifati (Univ. of Lyon); Stefania Dumbrava (ENSIIE); George Fletcher (Eindhoven University of Technology)*; Keith Hare (JCC Consulting, Inc.); Jan Hidders (Birkbeck, University of London); Victor Lee (TigerGraph); Bei Li (Google); Leonid Libkin (University of Edinburgh); Wim Martens (University of Bayreuth); Filip Murlak (University of Warsaw, Poland); Josh Perryman (Interos, Inc.); Ognjen Savkovic (Free University of Bozen-Bolzano); Michael Schmidt (Amazon Web Services); Juan Sequeda (data.world); Sławek Staworko (University of Lille); Dominik Tomaszuk (University of Bialystok)
Cache-Efficient Fork-Processing Patterns on Large Graphs Shengliang Lu (National University of Singapore)*; Shixuan Sun (National University of Singapore); Johns Paul (NUS); Yuchen Li (Singapore Management University); Bingsheng He (National University of Singapore)
VeriDB: An SGX-based Verifiable Database Wenchao Zhou (Georgetown University)*; Yifan Cai (University of Pennsylvania); Yanqing Peng (University of Utah); Sheng Wang (Alibaba Group); Ke Ma (Shanghai Jiaotong University); Feifei Li (Alibaba Group)
Properties of Inconsistency Measures for Databases Ester Livshits (Technion)*; Rina Kochirgan (Technion); Segev Tsur (Technion); Ihab F Ilyas (U. of Waterloo); Benny Kimelfeld (Technion); Sudeepa Roy (Duke University, USA)
Structural Generalizability: The Case of Similarity Search Yodsawalai Chodpathumwan (University of Illinois); Arash Termehchy (Oregon State University)*; Stephen Ramsey (Oregon State University); Aayam Shrestha (Oregon State University); Amy Glen (Oregon State University); Zheng Liu (Oregon State University)
FastVer: Making Data Integrity a Commodity Arvind Arasu (Microsoft)*; Badrish Chandramouli (Microsoft Research); Johannes Gehrke (Microsoft); Esha Ghosh (Microsoft); Donald Kossmann (Microsoft Research); Jonathan Protzenko (Microsoft); Ravi Ramamurthy (Microsoft); Tahina Ramananandro (Microsoft); Aseem Rastogi (Microsoft); Srinath Setty (Microsoft Research); Nikhil Swamy (Microsoft); Alexander van Renen (TUM); Min Xu (University of Chicago)
Instance-Optimized Data Layouts for Cloud Analytics Workloads Jialin Ding (MIT)*; Umar Farooq Minhas (Microsoft Research); Badrish Chandramouli (Microsoft Research); Chi Wang (Microsoft Research); Yinan Li (Microsoft Research); Ying Li (Microsoft); Donald Kossmann (Microsoft Research); Johannes Gehrke (Microsoft); Tim Kraska (MIT)
PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers Wei Cao (Alibaba); Yingqiang Zhang (Alibaba Group); Jimmy Yang (Alibaba Group); Feifei Li (Alibaba Group)*; Zongzhi Chen (Alibaba Group); Qingda Hu (Alibaba Group); Zhenjun Liu (Alibaba); Sheng Wang (Alibaba Group); Xuntao Cheng (Alibaba Group); Jing Fang (Alibaba Group); Bo Wang (Alibaba Group); Yuhui Wang (Alibaba Group); Haiqing Sun (Alibaba Group); Ze Yang (Alibaba Group); Zhushi Cheng (Alibaba); Sen Chen (Alibaba Group); Jian Wu (Alibaba Group); Wei Hu (Alibaba Group); Jianwei Zhao (Alibaba Group); Yusong Gao (Alibaba Cloud); Songlu Cai (Alibaba Group); Yunyang Zhang (Alibaba Group); Jiawang Tong (Alibaba Group)
Bringing Cloud-Native Storage to SAP IQ Mohammed Abouzour (SAP); Gunes Aluc (SAP Labs)*; Ivan Bowman (SAP); Xi Deng (SAP); Nandan Marathe (SAP); Sagar Ranadive (SAP); Muhammed Sharique (SAP); John C. Smirnios (SAP)
Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka Guozhang Wang (Confluent Inc.)*; Lei Chen (Bloomberg L.P.); Ayusman Dikshit (Expedia Group); Jason Gustafson (Confluent Inc.); Boyang Chen (Confluent Inc.); Matthias J Sax (Confluent Inc.); John Roesler (Confluent Inc.); Sophie Blee-Goldman (Confluent Inc.); Bruno Cadonna (Confluent Inc.); Apurva Mehta (Confluent Inc.); Varun Madan (Confluent Inc.); Jun Rao (Confluent Inc.)
FoundationDB: A Distributed Unbundled Transactional KeyValue Store Jingyu Zhou (Apple)*; Meng Xu (Apple); Alexander Shraer (Apple); Alex Miller (Apple); Bala Namasivayam (Apple); Evan Tschannen (Apple); Rusty Sears (Apple); John Leach (Apple); Dave Rosenthal (Apple); Will Wilson (antithesis.com); Ben Collins (antithesis.com); David Scherer (antithesis.com); Steve Atherton (Apple); Andrew Beamon (Apple); Xin Dong (Apple); Alec Grieser (Apple); Young Liu (Apple); Alvin Moore (Apple); Bhaskar Muppana (Apple); Xiaoge Su (Apple); Vishesh Yadav (Apple)
LogStore: A Cloud-Native and Multi-Tenant Log Database Wei Cao (Alibaba); Xiaojie Feng (Alibaba Cloud); Boyuan Liang (Alibaba Group); Tianyu Zhang (Alibaba Group); Yusong Gao (Alibaba Cloud)*; Yunyang Zhang (Alibaba Group); Feifei Li (Alibaba Group)
QuiCK: a Queuing System in CloudKit Kfir Lev-Ari (Apple)*; Yizuo Tian (Apple); Alexander Shraer (Apple); Chris Douglas (Apple); Hao Fu (Apple); Andrey Andreev (Apple); Kevin Beranek (Apple); Scott Dugas (Apple); Alec Grieser (Apple); Jeremy Hemmo (Apple)
KEA: Tuning an Exabyte-Scale Data Infrastructure Yiwen Zhu (Microsoft)*; Subru Krishnan (Microsoft); Konstantinos Karanasos (Microsoft); Isha Tarte (Microsoft); Conor Power (Microsoft); Abhishek Modi (Microsoft); Manoj Kumar (Microsoft); Deli Zhang (Microsoft); Kartheek Muthyala (Microsoft); Nick Jurgens (Microsoft); Sarvesh Sakalanaga (Salesforce); Sudhir Darbha (Microsoft); Minu Iyer (Microsoft); Ankita Agarwal (Microsoft); Carlo Curino (Microsoft)
Presenters: Mohammad Javad Amiri (University of Pennsylvania); Divyakant Agrawal (University of California Santa Barbara); Amr El Abbadi (University of California Santa Barbara) Abstract: The unique features of blockchains such as immutability, transparency, provenance, and authenticity have been used by many large-scale data management systems to deploy a wide range of distributed applications including supply chain management, healthcare, and crowdworking in permissioned settings. Unlike permissionless settings, e.g., Bitcoin, where the network is public and anyone can participate without a specific identity, a permissioned blockchain system consists of a set of known, identified nodes that might not fully trust each other. While the characteristics of permissioned blockchains are appealing to a wide range of large-scale data management systems, these systems have to satisfy four main requirements: confidentiality, verifiability, performance, and scalability. Various approaches have been developed in industry and academia to satisfy these requirements with varying assumptions and costs. The focus of this tutorial is on presenting many of these techniques while highlighting the trade-offs among them. We demonstrate the practicality of such techniques in real life by presenting three different applications, i.e., supply chain management, large-scale databases, and multi-platform crowdworking environments, and show how those techniques can be utilized to meet the requirements of such applications.
Speaker: Jianzhong Li (Shenzhen Institute of Advanced Technology and Harbin Institute of Technology) Abstract: With the explosive growth of available data in recent years, big data research has attracted much attention from both academic and industrial researchers, and many significant advances have been achieved. Nevertheless, the fundamental research results are far from the actual needs, a number of key issues remain unresolved, considerable work remains to be accomplished, and a complete theory of big data computation needs to be established. This talk focuses on the theoretical aspects of big data research, especially the complexity theory and efficient algorithms of big data computation. First, big data computation is formally defined. Then, the challenges and the research issues of big data computation are discussed. Finally, a number of research results on the complexity theory and efficient algorithms of big data computation, achieved by the speaker’s group, are surveyed.
Top-K Deep Video Analytics: A Probabilistic Approach Ziliang Lai (Chinese University of Hong Kong)*; Chenxia Han (Chinese University of Hong Kong); Chris Liu (Chinese University of Hong Kong); Pengfei Zhang (Chinese University of Hong Kong); Eric Lo (Chinese University of Hong Kong); Ben Kao (University of Hong Kong)
Evaluating Temporal Queries Over Video Feeds Yueting Chen (York University)*; Xiaohui Yu (York University); Nick Koudas (University of Toronto); Ziqiang Yu (Yantai University)
Consistent and Flexible Selectivity Estimation for High-Dimensional Data Yaoshu Wang (Shenzhen Institute of Computing Sciences, Shenzhen University); Chuan Xiao (Osaka University); Jianbin Qin (Shenzhen Institute of Computing Sciences, Shenzhen University)*; Rui Mao (Shenzhen Institute of Computing Sciences, Shenzhen University); Makoto Onizuka (Osaka University); Wei Wang (University of New South Wales); Rui Zhang (University of Melbourne, Australia); Yoshiharu Ishikawa (Nagoya University)
Milvus: A Purpose-Built Vector Data Management System Jianguo Wang (Purdue University)*; Xiaomeng Yi (Zilliz); Rentong Guo (Zilliz); Hai Jin (Zilliz); Peng Xu (Zilliz); Shengjun Li (Zilliz); Xiangyu Wang (Zilliz); Xiangzhou Guo (Zilliz); Chengming Li (Zilliz); Xiaohai Xu (Zilliz); Kun Yu (Zilliz); Yuxing Yuan (Zilliz); Yinghao Zou (Zilliz); Jiquan Long (Zilliz); Yudong Cai (Zilliz); Zhenxiang Li (Zilliz); Zhifeng Zhang (Zilliz); Yihua Mo (Zilliz); Jun Gu (Zilliz); Ruiyi Jiang (Zilliz); Yi Wei (Zilliz); Chao Xie (Zilliz)
Spatial Independent Range Sampling Dong Xie (University of Utah)*; Jeff Phillips (University of Utah); Michael Matheny (Amazon); Feifei Li (University of Utah)
On m-Impact Regions and Standing Top-k Influence Problems Bo Tang (Southern University of Science and Technology); Kyriakos Mouratidis (Singapore Management University)*; Mingji Han (Southern University of Science and Technology)
Crosstown Foundry: A Scalable Data-driven Journalism Platform for Hyper-local News Luciano Nocera (University of Southern California)*; Giorgos Constantinou (University of Southern California); Luan V Tran (University of Southern California); Seon Ho Kim (University of Southern California); Gabriel Kahn (University of Southern California); Cyrus Shahabi (Computer Science Department. University of Southern California)
Minimizing the Regret of an Influence Provider Yipeng Zhang (RMIT University); Yuchen Li (Singapore Management University); Zhifeng Bao (RMIT University)*; Baihua Zheng (Singapore Management University); H. V. Jagadish (University of Michigan)
AutoAI-TS: AutoAI for Time Series Forecasting Syed Yousaf Shah (IBM T.J Watson Research Center)*; Dhaval Patel (IBM TJ Watson Research Center); Long Vu (IBM TJ Watson Research Center); Xuan-Hong Dang (IBM T.J Watson Research Center); Bei Chen (IBM Research); Peter Kirchner (IBM Research); Horst Samulowitz (IBM Research); David Wood (IBM Research); Gregory Bramble (IBM Research); Wesley M Gifford (IBM T. J. Watson Research Center); Venkata Sitaramagiridharganesh Ganapavarapu (IBM Research); Roman Vaculin (IBM Research); Petros Zerfos (IBM T.J Watson Research Center)
Self-adaptive Graph Traversal on GPUs Mo Sha (National University of Singapore)*; Yuchen Li (Singapore Management University); Kian-Lee Tan (National University of Singapore)
Efficient and Effective Algorithms for Revenue Maximization in Social Advertising Kai Han (University of Science and Technology of China)*; Benwei Wu (University of Science and Technology of China); Jing Tang (National University of Singapore); Shuang Cui (University of Science and Technology of China); Cigdem Aslay (Aarhus University); Laks V.S. Lakshmanan (The University of British Columbia)
Presenters: George Katsogiannis-Meimarakis (Athena Research Center, Greece); Georgia Koutrika (Athena Research Center, Greece) Abstract: Data is a prevalent part of every business and scientific domain, but its explosive volume and increasing complexity make data querying challenging even for experts. For this reason, numerous text-to-SQL systems have been developed that enable querying relational databases using natural language. The recent advances in deep neural networks, along with the creation of two large datasets specifically made for training text-to-SQL systems, have paved the path for a novel and very promising research area. The purpose of this tutorial is a deep dive into this area, covering state-of-the-art techniques for natural language representation in neural networks, benchmarks that sparked research and competition, recent text-to-SQL systems using deep learning techniques, as well as open problems and research opportunities.
Presenters: Abdul Wasay (Harvard University, USA); Subarna Chatterjee (Harvard University, USA); Stratos Idreos (Harvard University) Abstract: Deep neural networks enable numerous and diverse applications of machine learning. We present a tutorial on deep learning, highlighting the data systems nature of neural networks and research opportunities for the data management community. In particular, we focus on three critical aspects: 1) classic tradeoffs and design problems in neural networks which can be enriched if seen through a systems and data management perspective, e.g., thinking critically about storage, data movement, and computation; 2) classic data systems design problems which can be reconsidered if neural networks are considered a viable design option, e.g., to replace or help system components that make decisions such as a database optimizer; 3) important ethics considerations for the application of neural networks in critical human-facing problems in society and how these also link to data management and performance. While these are a diverse set of rich topics, their combination offers additional rich opportunities for future research. The tutorial is designed to be accessible to data management researchers with no background in neural networks. This tutorial can be offered in either a 1.5-hour or a 3-hour version.
Dynamic Structural Clustering on Graphs Boyu Ruan (University of Queensland); Junhao Gan (University of Melbourne)*; Hao Wu (University of Melbourne); Anthony Wirth (The University of Melbourne)
Graph Iso/Auto-morphism: A Divide-&-Conquer Approach Can Lu (The Chinese University of Hong Kong)*; Jeffrey Xu Yu (Chinese University of Hong Kong); Zhiwei Zhang (Hong Kong Baptist University); Hong Cheng (Chinese University of Hong Kong)
APAN: Asynchronous Propagation Attention Network for Real-time Temporal Graph Embedding Xuhong Wang (Shanghai Jiao Tong University); Ding Lyu (Shanghai Jiao Tong University); Mengjian Li (Ant Group); Yang Xia (Ant Group); Qi Yang (Ant Group); Xinwen Wang (Ant Group); Xinguang Wang (Ant Group); Ping Cui (Shanghai Jiao Tong University); Yupu Yang (Shanghai Jiao Tong University); Bowen Sun (Ant Group); Zhenyu Guo (Ant Group)*
HUGE: An Efficient and Scalable Subgraph Enumeration System Zhengyi Yang (University of New South Wales)*; Longbin Lai (Alibaba Corporation); Xuemin Lin (University of New South Wales); Kongzhang Hao (University of New South Wales); Wenjie Zhang (University of New South Wales)
PG-Keys: Keys for Property Graphs Renzo Angles (Universidad de Talca); Angela Bonifati (Univ. of Lyon); Stefania Dumbrava (ENSIIE); George Fletcher (Eindhoven University of Technology)*; Keith Hare (JCC Consulting, Inc.); Jan Hidders (Birkbeck, University of London); Victor Lee (TigerGraph); Bei Li (Google); Leonid Libkin (University of Edinburgh); Wim Martens (University of Bayreuth); Filip Murlak (University of Warsaw, Poland); Josh Perryman (Interos, Inc.); Ognjen Savkovic (Free University of Bozen-Bolzano); Michael Schmidt (Amazon Web Services); Juan Sequeda (data.world); Sławek Staworko (University of Lille); Dominik Tomaszuk (University of Bialystok)
Cache-Efficient Fork-Processing Patterns on Large Graphs Shengliang Lu (National University of Singapore)*; Shixuan Sun (National University of Singapore); Johns Paul (NUS); Yuchen Li (Singapore Management University); Bingsheng He (National University of Singapore)
VeriDB: An SGX-based Verifiable Database Wenchao Zhou (Georgetown University)*; Yifan Cai (University of Pennsylvania); Yanqing Peng (University of Utah); Sheng Wang (Alibaba Group); Ke Ma (Shanghai Jiaotong University); Feifei Li (Alibaba Group)
Properties of Inconsistency Measures for Databases Ester Livshits (Technion)*; Rina Kochirgan (Technion); Segev Tsur (Technion); Ihab F Ilyas (U. of Waterloo); Benny Kimelfeld (Technion); Sudeepa Roy (Duke University, USA)
Structural Generalizability: The Case of Similarity Search Yodsawalai Chodpathumwan (University of Illinois); Arash Termehchy (Oregon State University)*; Stephen Ramsey (Oregon State University); Aayam Shrestha (Oregon State University); Amy Glen (Oregon State University); Zheng Liu (Oregon State University)
FastVer: Making Data Integrity a Commodity Arvind Arasu (Microsoft)*; Badrish Chandramouli (Microsoft Research); Johannes Gehrke (Microsoft); Esha Ghosh (Microsoft); Donald Kossmann (Microsoft Research); Jonathan Protzenko (Microsoft); Ravi Ramamurthy (Microsoft); Tahina Ramananandro (Microsoft); Aseem Rastogi (Microsoft); Srinath Setty (Microsoft Research); Nikhil Swamy (Microsoft); Alexander van Renen (TUM); Min Xu (University of Chicago)
Instance-Optimized Data Layouts for Cloud Analytics Workloads Jialin Ding (MIT)*; Umar Farooq Minhas (Microsoft Research); Badrish Chandramouli (Microsoft Research); Chi Wang (Microsoft Research); Yinan Li (Microsoft Research); Ying Li (Microsoft); Donald Kossmann (Microsoft Research); Johannes Gehrke (Microsoft); Tim Kraska (MIT)
PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers Wei Cao (Alibaba); Yingqiang Zhang (Alibaba Group); Jimmy Yang (Alibaba Group); Feifei Li (Alibaba Group)*; Zongzhi Chen (Alibaba Group); Qingda Hu (Alibaba Group); Zhenjun Liu (Alibaba); Sheng Wang (Alibaba Group); Xuntao Cheng (Alibaba Group); Jing Fang (Alibaba Group); Bo Wang (Alibaba Group); Yuhui Wang (Alibaba Group); Haiqing Sun (Alibaba Group); Ze Yang (Alibaba Group); Zhushi Cheng (Alibaba); Sen Chen (Alibaba Group); Jian Wu (Alibaba Group); Wei Hu (Alibaba Group); Jianwei Zhao (Alibaba Group); Yusong Gao (Alibaba Cloud); Songlu Cai (Alibaba Group); Yunyang Zhang (Alibaba Group); Jiawang Tong (Alibaba Group)
Bringing Cloud-Native Storage to SAP IQ Mohammed Abouzour (SAP); Gunes Aluc (SAP Labs)*; Ivan Bowman (SAP); Xi Deng (SAP); Nandan Marathe (SAP); Sagar Ranadive (SAP); Muhammed Sharique (SAP); John C. Smirnios (SAP)
Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka Guozhang Wang (Confluent Inc.)*; Lei Chen (Bloomberg L.P.); Ayusman Dikshit (Expedia Group); Jason Gustafson (Confluent Inc.); Boyang Chen (Confluent Inc.); Matthias J Sax (Confluent Inc.); John Roesler (Confluent Inc.); Sophie Blee-Goldman (Confluent Inc.); Bruno Cadonna (Confluent Inc.); Apurva Mehta (Confluent Inc.); Varun Madan (Confluent Inc.); Jun Rao (Confluent Inc.)
FoundationDB: A Distributed Unbundled Transactional KeyValue Store Jingyu Zhou (Apple)*; Meng Xu (Apple); Alexander Shraer (Apple); Alex Miller (Apple); Bala Namasivayam (Apple); Evan Tschannen (Apple); Rusty Sears (Apple); John Leach (Apple); Dave Rosenthal (Apple); Will Wilson (antithesis.com); Ben Collins (antithesis.com); David Scherer (antithesis.com); Steve Atherton (Apple); Andrew Beamon (Apple); Xin Dong (Apple); Alec Grieser (Apple); Young Liu (Apple); Alvin Moore (Apple); Bhaskar Muppana (Apple); Xiaoge Su (Apple); Vishesh Yadav (Apple)
LogStore: A Cloud-Native and Multi-Tenant Log Database Wei Cao (Alibaba); Xiaojie Feng (Alibaba Cloud); Boyuan Liang (Alibaba Group); Tianyu Zhang (Alibaba Group); Yusong Gao (Alibaba Cloud)*; Yunyang Zhang (Alibaba Group); Feifei Li (Alibaba Group)
QuiCK: a Queuing System in CloudKit Kfir Lev-Ari (Apple)*; Yizuo Tian (Apple); Alexander Shraer (Apple); Chris Douglas (Apple); Hao Fu (Apple); Andrey Andreev (Apple); Kevin Beranek (Apple); Scott Dugas (Apple); Alec Grieser (Apple); Jeremy Hemmo (Apple)
KEA: Tuning an Exabyte-Scale Data Infrastructure Yiwen Zhu (Microsoft)*; Subru Krishnan (Microsoft); Konstantinos Karanasos (Microsoft); Isha Tarte (Microsoft); Conor Power (Microsoft); Abhishek Modi (Microsoft); Manoj Kumar (Microsoft); Deli Zhang (Microsoft); Kartheek Muthyala (Microsoft); Nick Jurgens (Microsoft); Sarvesh Sakalanaga (Salesforce); Sudhir Darbha (Microsoft); Minu Iyer (Microsoft); Ankita Agarwal (Microsoft); Carlo Curino (Microsoft)
Presenters: Mohammad Javad Amiri (University of Pennsylvania); Divyakant Agrawal (University of California Santa Barbara); Amr El Abbadi (University of California Santa Barbara) Abstract: The unique features of blockchains such as immutability, transparency, provenance, and authenticity have been used by many large-scale data management systems to deploy a wide range of distributed applications including supply chain management, healthcare, and crowdworking in permissioned settings. Unlike permissionless settings, e.g., Bitcoin, where the network is public and anyone can participate without a specific identity, a permissioned blockchain system consists of a set of known, identified nodes that might not fully trust each other. While the characteristics of permissioned blockchains are appealing to a wide range of large-scale data management systems, these systems have to satisfy four main requirements: confidentiality, verifiability, performance, and scalability. Various approaches have been developed in industry and academia to satisfy these requirements with varying assumptions and costs. The focus of this tutorial is on presenting many of these techniques while highlighting the trade-offs among them. We demonstrate the practicality of such techniques in real life by presenting three different applications, i.e., supply chain management, large-scale databases, and multi-platform crowdworking environments, and show how those techniques can be utilized to meet the requirements of such applications.
Top-K Deep Video Analytics: A Probabilistic Approach Ziliang Lai (Chinese University of Hong Kong)*; Chenxia Han (Chinese University of Hong Kong); Chris Liu (Chinese University of Hong Kong); Pengfei Zhang (Chinese University of Hong Kong); Eric Lo (Chinese University of Hong Kong); Ben Kao (University of Hong Kong)
Evaluating Temporal Queries Over Video Feeds Yueting Chen (York University)*; Xiaohui Yu (York University); Nick Koudas (University of Toronto); Ziqiang Yu (Yantai University)
Consistent and Flexible Selectivity Estimation for High-Dimensional Data Yaoshu Wang (Shenzhen Institute of Computing Sciences, Shenzhen University); Chuan Xiao (Osaka University); Jianbin Qin (Shenzhen Institute of Computing Sciences, Shenzhen University)*; Rui Mao (Shenzhen Institute of Computing Sciences, Shenzhen University); Makoto Onizuka (Osaka University); Wei Wang (University of New South Wales); Rui Zhang (University of Melbourne, Australia); Yoshiharu Ishikawa (Nagoya University)
Milvus: A Purpose-Built Vector Data Management System Jianguo Wang (Purdue University)*; Xiaomeng Yi (Zilliz); Rentong Guo (Zilliz); Hai Jin (Zilliz); Peng Xu (Zilliz); Shengjun Li (Zilliz); Xiangyu Wang (Zilliz); Xiangzhou Guo (Zilliz); Chengming Li (Zilliz); Xiaohai Xu (Zilliz); Kun Yu (Zilliz); Yuxing Yuan (Zilliz); Yinghao Zou (Zilliz); Jiquan Long (Zilliz); Yudong Cai (Zilliz); Zhenxiang Li (Zilliz); Zhifeng Zhang (Zilliz); Yihua Mo (Zilliz); Jun Gu (Zilliz); Ruiyi Jiang (Zilliz); Yi Wei (Zilliz); Chao Xie (Zilliz)
Spatial Independent Range Sampling Dong Xie (University of Utah)*; Jeff Phillips (University of Utah); Michael Matheny (Amazon); Feifei Li (University of Utah)
On m-Impact Regions and Standing Top-k Influence Problems Bo Tang (Southern University of Science and Technology); Kyriakos Mouratidis (Singapore Management University)*; Mingji Han (Southern University of Science and Technology)
Crosstown Foundry: A Scalable Data-driven Journalism Platform for Hyper-local News Luciano Nocera (University of Southern California)*; Giorgos Constantinou (University of Southern California); Luan V Tran (University of Southern California); Seon Ho Kim (University of Southern California); Gabriel Kahn (University of Southern California); Cyrus Shahabi (Computer Science Department. University of Southern California)
Minimizing the Regret of an Influence Provider Yipeng Zhang (RMIT University); Yuchen Li (Singapore Management University); Zhifeng Bao (RMIT University)*; Baihua Zheng (Singapore Management University); H. V. Jagadish (University of Michigan)
AutoAI-TS: AutoAI for Time Series Forecasting Syed Yousaf Shah (IBM T.J Watson Research Center)*; Dhaval Patel (IBM TJ Watson Research Center); Long Vu (IBM TJ Watson Research Center); Xuan-Hong Dang (IBM T.J Watson Research Center); Bei Chen (IBM Research); Peter Kirchner (IBM Research); Horst Samulowitz (IBM Research); David Wood (IBM Research); Gregory Bramble (IBM Research); Wesley M Gifford (IBM T. J. Watson Research Center); Venkata Sitaramagiridharganesh Ganapavarapu (IBM Research); Roman Vaculin (IBM Research); Petros Zerfos (IBM T.J Watson Research Center)
Self-adaptive Graph Traversal on GPUs Mo Sha (National University of Singapore)*; Yuchen Li (Singapore Management University); Kian-Lee Tan (National University of Singapore)
Efficient and Effective Algorithms for Revenue Maximization in Social Advertising Kai Han (University of Science and Technology of China)*; Benwei Wu (University of Science and Technology of China); Jing Tang (National University of Singapore); Shuang Cui (University of Science and Technology of China); Cigdem Aslay (Aarhus University); Laks V.S. Lakshmanan (The University of British Columbia)
Presenters: George Katsogiannis-Meimarakis (Athena Research Center, Greece); Georgia Koutrika (Athena Research Center, Greece) Abstract: Data is a prevalent part of every business and scientific domain, but its explosive volume and increasing complexity make data querying challenging even for experts. For this reason, numerous text-to-SQL systems have been developed that enable querying relational databases using natural language. The recent advances in deep neural networks, along with the creation of two large datasets specifically made for training text-to-SQL systems, have paved the path for a novel and very promising research area. The purpose of this tutorial is a deep dive into this area, covering state-of-the-art techniques for natural language representation in neural networks, benchmarks that sparked research and competition, recent text-to-SQL systems using deep learning techniques, as well as open problems and research opportunities.
Presenters: Abdul Wasay (Harvard University, USA); Subarna Chatterjee (Harvard University, USA); Stratos Idreos (Harvard University) Abstract: Deep neural networks enable numerous and diverse applications of machine learning. We present a tutorial on deep learning, highlighting the data systems nature of neural networks and research opportunities for the data management community. In particular, we focus on three critical aspects: 1) classic tradeoffs and design problems in neural networks which can be enriched if seen through a systems and data management perspective, e.g., thinking critically about storage, data movement, and computation; 2) classic data systems design problems which can be reconsidered if neural networks are considered a viable design option, e.g., to replace or help system components that make decisions such as a database optimizer; 3) important ethics considerations for the application of neural networks in critical human-facing problems in society and how these also link to data management and performance. While these are a diverse set of rich topics, their combination offers additional rich opportunities for future research. The tutorial is designed to be accessible to data management researchers with no background in neural networks. This tutorial can be offered in either a 1.5-hour or a 3-hour version.