SIGMOD/PODS Detailed Program
HILDA - Workshop on Human-In-the-Loop Data Analytics
Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Charlottenburg I/II/III
HILDA brings together researchers and practitioners to exchange ideas and results on human-data interaction. It explores how data management and analysis can be made more effective when taking into account the people who design and build these processes as well as those who are impacted by their results.
In HILDA 2022, we implemented a mentoring program (inspired by workshops such as PLATEAU) and are continuing it this year. Our focus is on promising and early-stage research, with a core component of the program being that each paper is assigned a mentor. More details on the process are below.
The theme for this edition of the workshop is HILDA and Large Language Models (LLMs). However, the workshop is not limited to this theme, and other topics are also of interest. We encourage research on guidelines and best practices for effective human-LLM collaboration. We also encourage research that questions the role of humans in traditional data pipelines with the emergence of LLMs.
Workshop Chairs
- Remco Chang (Tufts University)
- Kexin Rong (Georgia Institute of Technology)
- Roee Shraga (Worcester Polytechnic Institute)
NOVAS - Novel Optimizations for Visionary AI Systems
Time: Sunday, 22.06.2025, 08:30 - 12:30
Location: Köpenick I/II/III
https://www.novasworkshop.org/
We want to bridge the gap between "data management" and "generative AI" research. We are calling for work or early ideas that may be deemed innovative, controversial, or disruptive when considered from the perspective of more established research areas.
Workshop Organizers
- Gerardo Vitagliano (MIT)
- Chunwei Liu (MIT)
- Lei Cao (University of Arizona)
- Huan Sun (OSU)
- Paolo Papotti (EURECOM)
MIDAS - Workshop connecting academia and industry on Modern Integrated Database and AI Systems
Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Tegel
https://sites.google.com/view/midas2025/home
This one-day workshop is designed to foster meaningful collaboration between researchers and industry practitioners by identifying and addressing complex challenges in the field of Generative AI (GenAI) and Data. These challenges often require a longer-term research perspective while simultaneously needing to remain grounded in real-world constraints and operational scenarios.
The primary goals of this workshop are twofold:
For Researchers: To inform and shape their research agendas by exposing them to the most pressing, unsolved challenges encountered by industry professionals. This engagement will help ensure that academic research remains relevant and aligned with practical needs, ultimately accelerating the path from theoretical advancements to real-world applications.
For Practitioners: To gain fresh perspectives and cutting-edge insights from the research community on emerging or unresolved topics in GenAI and Data. By engaging with researchers, industry professionals can explore novel methodologies, validate ideas, and potentially adopt innovative solutions to enhance their work.
We envision this workshop as an interactive and collaborative platform where participants from both academia and industry can share insights, challenges, and advancements in the rapidly evolving domains of GenAI and Data. Through panel discussions, presentations, and breakout sessions, attendees will have the opportunity to:
Identify Key Industry Challenges: Engage in discussions that highlight the most pressing problems faced in real-world GenAI and data-driven applications.
Explore Long-Term Research Directions: Examine areas where foundational research can contribute to addressing these challenges.
Build Cross-Sector Partnerships: Establish meaningful connections between researchers and practitioners, fostering collaborations that can lead to impactful innovations.
Exchange Practical & Theoretical Insights: Leverage the diverse expertise of participants to bridge the gap between theoretical advancements and their practical implementation.
By bringing together a diverse group of experts, this workshop aims to create a dynamic space where ideas are exchanged, research is informed by industry needs, and groundbreaking solutions can emerge at the intersection of academic rigor and real-world application.
Workshop Co-Chairs
- Avrilia Floratou, Microsoft, USA
- Jignesh M. Patel, CMU, USA
- Subru Krishnan, Microsoft, Spain
aiDM - Workshop on Exploiting Artificial Intelligence Techniques for Data Management
Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Schöneberg I/II/III
Recently, the field of Artificial Intelligence (AI) has been experiencing a resurgence. AI broadly covers a wide swath of techniques, which include logic-based approaches, probabilistic graphical models, and machine learning approaches such as deep learning. Advances in specialized hardware capabilities (e.g., Graphics Processing Units (GPUs), Tensor Processing Units (TPUs), and Field-Programmable Gate Arrays (FPGAs)), the software ecosystem (e.g., programming languages such as Python, Data Science frameworks, and accelerated ML libraries), and systems infrastructure (e.g., cloud servers with AI accelerators) have led to widespread adoption of AI techniques in a variety of domains. Examples of such domains include image classification, autonomous driving, automatic speech recognition, and conversational systems (e.g., chatbots). AI solutions not only support multiple data types (e.g., images, speech, or text), but are also available in various configurations and settings, from personal devices to large-scale distributed systems.
Despite the widespread adoption of AI across diverse domains, its integration with data management systems remains in its infancy. Currently, most database management systems (DBMS) serve primarily as repositories for feeding input data to AI models and storing results. Recently, there has been increasing interest in using AI techniques within data management systems, including natural language interfaces to relational databases and machine learning techniques for query optimization and performance tuning. However, significant opportunities remain to harness the full potential of AI for enhancing data management workloads.
aiDM'24 is a one-day workshop that will bring together people from academia and industry to explore innovative ways to integrate AI techniques into data management systems. The workshop will focus on leveraging AI to enhance various components of data management systems, including user interfaces, tooling, performance optimizations, and support for new query types and workloads. Special attention will be given to transparently exploiting AI techniques, such as Generative AI frameworks, for enterprise-class data management workloads. We aim to identify key research areas and inspire new initiatives in this emerging and transformative field.
Workshop Program Chairs
- Manisha Luthra Agnihotri, TU Darmstadt and DFKI
- Renata Borovica-Gajic, School of Computing and Information Systems, The University of Melbourne
- Ryan Marcus, Department of Computer Science, University of Pennsylvania
LLM-DPM - Next Gen Data and Process Management: Large Language Models and Beyond
Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Tiergarten I/II/III
https://dbpmworkshop.github.io/
The Workshop of Next Gen Data and Process Management: Large Language Models and Beyond (LLM-DPM), held in conjunction with the 2025 ACM SIGMOD Conference in Berlin, Germany, will explore the transformative role of Explainable AI (XAI), Trustworthy AI, and Large Language Models (LLMs) in revolutionizing Data and Process Management systems. Organizations across industries rely on complex processes to deliver products, services, and outcomes. Understanding these processes is critical for uncovering inefficiencies, addressing bottlenecks, ensuring compliance, and driving operational excellence.
Process mining, which leverages event logs from data systems, has emerged as a powerful approach for visualizing workflows, identifying anomalies, and optimizing processes. However, traditional methods such as surveys and interviews remain costly, error-prone, and disconnected from real operations. This workshop aims to bridge this gap by examining how cutting-edge AI techniques, particularly LLMs, can advance process mining and data management.
The workshop will focus on the emerging role of explainable AI and LLMs in addressing long-standing challenges such as query interpretation, data augmentation, user interaction, system optimization, future process prediction, and actionable insights for proactive decision-making. Particular attention will be paid to accountability and fairness to ensure these advancements lead to transparent, equitable, and resilient systems.
Additionally, the workshop will tackle critical challenges in integrating AI into process mining, including data quality, scalability of analysis techniques, and the complexity of large datasets. Discussions will fill gaps left by main track topics, delving into specific use cases, risks, and technical innovations in data-centric environments. By fostering dialogue between researchers and practitioners, the workshop will provide advancements at the intersection of AI, process mining, and database systems, driving both research and enterprise adoption.
Organizers
- Faiza Allah Bukhsh (University of Twente)
- Paolo Ceravolo (University of Milan)
- Samira Maghool (University of Milan)
- Xu Chu (Celonis)
- Eugene Wu (Columbia University)
- Cong Yu (Celonis) - cong.yu@celonis.com
Tutorial 1: Advances in Designing Scalable Graph Neural Networks: The Perspective of Graph Data Management (part 1)
Time: Sunday, 22.06.2025, 09:00 - 10:30
Location: Bellevue
Ningyi Liao (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Xiaokui Xiao (National University of Singapore), Reynold Cheng (The University of Hong Kong)
Tutorial 1: Advances in Designing Scalable Graph Neural Networks: The Perspective of Graph Data Management (part 2)
Time: Sunday, 22.06.2025, 11:00 - 12:30
Location: Bellevue
Ningyi Liao (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Xiaokui Xiao (National University of Singapore), Reynold Cheng (The University of Hong Kong)
Tutorial 2: Supporting Human-Centric Data Exploration Through Semantics and Natural Language Interaction
Time: Sunday, 22.06.2025, 13:30 - 15:00
Location: Bellevue
Vidya Setlur (Tableau Research)
Qdata - Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications
Time: Sunday, 22.06.2025, 13:30 - 17:00
Location: Köpenick I/II/III
https://itrummer.github.io/qdata/
Whereas quantum computing started out as a purely theoretical concept, the last few years have seen a “Cambrian explosion” of first-generation commercial quantum hardware culminating from decades of foundational research. Players, including the likes of Google, IBM, and Intel, as well as startup companies like IQM, D-Wave, IonQ, and Rigetti, are now producing hardware devices that implement quantum computing using various technologies. At the same time, the recent advances in quantum computing have inspired a new generation of classical hardware accelerators, offered commercially by providers such as Fujitsu, Toshiba, and 1Qubit, that mirror the interfaces and take inspiration from internal processes of quantum computers. These accelerators, including digital annealers, as well as GPU- and FPGA-based simulators of quantum computation, obtain approximate solutions for extremely large, combinatorial optimization problems quickly.
Using quantum computing and related technologies has become convenient and possible with standard IT interfaces. Several software frameworks have recently appeared that make solving a diverse range of problems using quantum computers easier. At the same time, multiple cloud providers nowadays offer quantum computing as a service, making the technology accessible to broad shares of the population. Taken together, these developments have recently spawned a flurry of research in various communities, ranging from operations research to machine learning, and aimed at analyzing the transformative potential of quantum computing for specific use cases.
The primary objective of the Q-Data workshop is to explore how quantum computing and related technologies can enhance data processing, data management, data analysis systems, and techniques. It also focuses on hybrid approaches that integrate both quantum and classical computing methodologies to enhance such data systems and techniques. This workshop will spur new research efforts in this emerging field and pave the way for building next-generation data-intensive systems with quantum computing support.
Workshop Chairs
- Ibrahim Sabek (University of Southern California, USA)
- Immanuel Trummer (Cornell University, USA)
Tutorial 3: Learned Indexes From the One-dimensional to the Multi-dimensional Spaces: Challenges, Techniques, and Opportunities
Time: Sunday, 22.06.2025, 15:30 - 17:00
Location: Bellevue
Abdullah Al-Mamun (Purdue University), Jianguo Wang (Purdue University), Walid G. Aref (Purdue University)
SIGMOD/PODS Warmup: Query Optimization Unleashed
Time: Sunday, 22.06.2025, 17:15 - 18:30
Location: Charlottenburg I/II/III
What happens when cutting-edge theory meets AI-driven intelligence and real-world database engineering?
Prepare for an electrifying session that will shatter conventional wisdom on query optimization and cardinality estimation. This is not just another academic discussion — this is a battle of ideas where the sharpest minds from theory, machine learning, and database systems go head-to-head to define the future of query performance.
Several domain experts will challenge each other’s methodologies in a high-intensity, cross-disciplinary debate moderated by Dan Suciu (University of Washington) and Volker Markl (BIFOLD & Technical University of Berlin):
- Theoretical Powerhouse: Can information-theoretic guaranteed cardinality upper bounds lead the way?
Speaker: Dan Olteanu (University of Zurich)
- Machine Learning Revolution: Is AI the future of query optimization?
Speaker: Carsten Binnig (Technical University of Darmstadt)
- Real-World Systems: What actually works?
Speaker: Viktor Leis (Technical University of Munich)
- Filling in the Gaps: What we don't talk about in research.
Speaker: Surajit Chaudhuri (Microsoft Research)
This is the session where database theory and systems really meet. Don’t just attend—be part of the revolution.
The event is organized by Floris Geerts, Benny Kimelfeld, and Volker Markl.
See the dedicated page for more details.
PODS Opening and Keynote
Time: Monday, 23.06.2025, 08:30 - 09:50
Location: Potsdam I/III
Session chair: Benny Kimelfeld
- Keynote: A fine-grained approach to algorithms and complexity
Virginia Vassilevska Williams (MIT)
Bio: Virginia Vassilevska Williams is a Professor at MIT EECS and CSAIL. She obtained her Ph.D. from Carnegie Mellon University in 2008. After research and postdoctoral positions at the IAS in Princeton, UC Berkeley and Stanford, she spent 3.5 years as an assistant professor at Stanford University before joining MIT in early 2017. She is the recipient of an NSF CAREER award, a Google Faculty Research Award, an Alfred P. Sloan Research Fellowship, a 2023 Simons Investigator Award and a FOCS 2024 Test of Time Award. In 2018 she gave an invited lecture at the International Congress of Mathematicians.
DaMoN - Data Management on New Hardware
Time: Monday, 23.06.2025, 10:00 - 18:30
Location: Charlottenburg I/II
The continued evolution of computing hardware and infrastructure imposes new challenges and bottlenecks to program performance. As a result, traditional database architectures that focus solely on I/O optimization increasingly fail to utilize hardware resources efficiently. Multi-core CPUs, GPUs, FPGAs, new memory and storage technologies (such as flash and non-volatile memory), and low-power hardware impose significant challenges for optimizing database performance. Consequently, exploiting the characteristics of modern hardware has become an essential topic of database systems research.
The goal is to make database systems adapt automatically to sophisticated hardware characteristics, thus maximizing performance transparently for applications. To achieve this goal, the data management community needs interdisciplinary collaboration with researchers from computer architecture, compilers, operating systems, and storage. This involves rethinking traditional data structures, query processing algorithms, and database software architectures to adapt to the advances in the underlying hardware infrastructure.
Chairs
- Carsten Binnig, TU Darmstadt, Germany
- Eric Sedlar, Oracle Labs
PODS Research 1: Gems of PODS & Test of Time Award
Time: Monday, 23.06.2025, 10:00 - 11:00
Location: Potsdam I/III
Session chair: Nicole Schweikardt
- Gems of PODS: Querying Graph Data: Where We Are and Where To Go - Wim Martens (University of Bayreuth)
- Test of Time Award: Joins via Geometric Resolutions: Worst-case and Beyond - Mahmoud Abo Khamis (RelationalAI, Inc), Hung Ngo (RelationalAI Inc.), Christopher Ré (Stanford University), Atri Rudra (University at Buffalo)
PODS Research 2: Join Evaluation & Maintenance (including the Best Paper Awards)
Time: Monday, 23.06.2025, 11:30 - 13:00
Location: Potsdam I/III
Session chair: Andreas Pieris
- Output-Optimal Algorithms for Join-Aggregate Queries (BPA) - Xiao Hu (University of Waterloo)
- Output-sensitive Conjunctive Query Evaluation (BPA) - Shaleen Deep (University of Wisconsin-Madison), Hangdong Zhao (University of Wisconsin, Madison), Austen Fan (UW Madison), Paraschos Koutris (University of Wisconsin-Madison)
- An Improved Fully Dynamic Algorithm for Counting 4-Cycles in General Graphs using Fast Matrix Multiplication - Sepehr Assadi (University of Waterloo), Vihan Shah (University of Waterloo)
- Insert-Only versus Insert-Delete in Dynamic Query Evaluation - Mahmoud Abo Khamis (RelationalAI, Inc), Ahmet Kara (University of Zurich), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington)
- Towards Update-Dependent Analysis of Query Maintenance - Xiao Hu (University of Waterloo), Qichen Wang (Hong Kong Baptist University)
- Fast Matrix Multiplication meets the Submodular Width - Mahmoud Abo Khamis (RelationalAI), Xiao Hu (University of Waterloo), Dan Suciu (University of Washington)
PODS Research 3: Explanation and Minimization
Time: Monday, 23.06.2025, 14:30 - 16:00
Location: Potsdam I/III
Session chair: Wolfgang Gatterbauer
- Circuits and Formulas for Datalog over Semirings - Austen Z. Fan (Department of Computer Sciences, University of Wisconsin-Madison), Paraschos Koutris (Department of Computer Sciences, University of Wisconsin-Madison), Sudeepa Roy (Department of Computer Science, Duke University)
- Improved Approximation Algorithms for Relational Clustering - Aryan Esmailpour (University of Illinois Chicago), Stavros Sintos (University of Illinois at Chicago)
- A Lower Bound on Unambiguous Context Free Grammars via Communication Complexity - Stefan Mengel (Univ. Artois, CNRS, Centre de Recherche en Informatique de Lens (CRIL)), Harry Vinall-Smeeth (Technische Universität Ilmenau)
- Minimizing Conjunctive Regular Path Queries (Distinguished Paper Award) - Diego Figueira (Univ. Bordeaux, CNRS, LaBRI), Rémi Morvan (Univ. Bordeaux, CNRS, LaBRI), Miguel Romero (Universidad Católica de Chile)
- Explaining k-Nearest Neighbors: Counterfactual and Abductive Explanations (Distinguished Paper Award) - Pablo Barcelo (Pontifical Catholic University of Chile), Alexander Kozachinskiy (Pontifical Catholic University of Chile), Miguel Romero (Pontifical Catholic University of Chile), Bernardo Subsercaseaux (Carnegie Mellon University), Jose Verschae (Pontifical Catholic University of Chile)
- Shapley Revisited: Tractable Responsibility Measures for Query Answers (Distinguished Paper Award) - Meghyn Bienvenu (CNRS, University of Bordeaux), Diego Figueira (CNRS, University of Bordeaux), Pierre Lafourcade (University of Bordeaux)
PODS Research 4: Database Queries, beyond Evaluation
Time: Monday, 23.06.2025, 16:30 - 18:30
Location: Potsdam I/III
Session chair: Bas Ketsman
- Smallest Synthetic Witnesses for Conjunctive Queries - Aryan Esmailpour (University of Illinois Chicago), Boris Glavic (University of Illinois, Chicago), Xiao Hu (University of Waterloo), Stavros Sintos (University of Illinois Chicago)
- Towards Tractability of the Diversity of Query Answers: Ultrametrics to the Rescue - Marcelo Arenas (PUC Chile), Timo Merkl (TU Vienna), Reinhard Pichler (TU Wien), Cristian Riveros (PUC Chile)
- Computing A Well-Representative Summary of Conjunctive Query Results - Pankaj Agarwal (Duke University), Aryan Esmailpour (University of Illinois at Chicago), Xiao Hu (University of Waterloo), Stavros Sintos (University of Illinois at Chicago), Jun Yang (Duke University)
- Resilience for Regular Path Queries: Towards a Complexity Classification - Antoine Amarilli (Inria Lille), Wolfgang Gatterbauer (Northeastern University), Neha Makhija (Northeastern University), Mikaël Monet (Inria Lille)
- Soft and Constrained Hypertree Width - Matthias Lanzinger (TU Wien), Cem Okulmus (Umeå University), Reinhard Pichler (TU Wien), Alexander Selzer (TU Wien), Georg Gottlob (University of Calabria)
- Efficient Algorithms for Cardinality Estimation and Conjunctive Query Evaluation With Simple Degree Constraints - Sungjin Im (University of California, Merced), Benjamin Moseley (Carnegie Mellon University), Hung Ngo (RelationalAI Inc.), Kirk Pruhs (University of Pittsburgh)
- Optimal (Multiway) Spatial Joins - Ru Wang (The Chinese University of Hong Kong), Yufei Tao (The Chinese University of Hong Kong)
PODS Business Meeting
Time: Monday, 23.06.2025, 20:00 - 21:00
Location: Charlottenburg III
SIGMOD Opening & Keynote 1 & Awards Talks 1
Time: Tuesday, 24.06.2025, 08:30 - 10:00
Location: Potsdam I & III
Session chair: NN
- Keynote: How to Build a Brain
Christos H. Papadimitriou (Columbia University)
Abstract: My previous talk at SIGMOD/PODS happened exactly three decades ago. That was the moment when the world of computation and the field of databases were transformed by the advent of the Internet. It was a change that went beyond mere paradigm. We realized that CS is not about the computer at all -- it is about computation, a sublime scientific phenomenon that permeates the universe. CS became a natural science because the Internet seemed to us as mysterious as the universe, the cell, the brain, the market, and we had to approach it through experiments, falsifiable theories, and new applied math. And, because the Internet is about people in a more intimate way than the computer was, at the same time CS became a social science. A new mode of CS research emerged, often referred to as the lens of computation: since computation underlies everything, when computer scientists look at challenging problems in other sciences, unexpected progress often ensues. CS researchers thought productively about game theory and economics, the quantum universe, phase transitions, biology and evolution, social phenomena and the law, the brain. Finally, nearly two decades after that moment in the 1990s, AI happened, a new powerful intellectual tsunami and irresistible frame of mind. In my talk I will contemplate this fascinating story, connecting its twists and turns with the subject of databases. I will conclude with a snapshot of my work over the past decade on understanding how the brain begets the mind: how the activity of individual neurons and synapses results in cognition, behavior, intelligence, and ultimately language, arguably the crowning achievement of the animal brain.
Bio: Christos H. Papadimitriou is the Donovan Family Professor of Computer Science at Columbia University. Before joining Columbia in 2017, he was a professor at UC Berkeley for the previous 22 years, and before that he taught at Harvard, MIT, NTU Athens, Stanford, and UCSD. He has written five textbooks and many articles on algorithms and complexity, and their applications to databases, optimization, control, AI, robotics, economics and game theory, the Internet, evolution, and the brain. He holds a PhD from Princeton (1976), and nine honorary doctorates, including from ETH, University of Athens, EPFL, and Univ. de Paris Dauphine. He is a member of the National Academy of Sciences of the US, the American Academy of Arts and Sciences, and the National Academy of Engineering, and he has received the Knuth prize, the Goedel prize, the Babbage award, the IEEE von Neumann medal, the von Neumann Theory Prize, the IEEE Women of the Edvac prize, as well as the 2018 Harvey Prize by Technion. He has also written fiction, including a New York Times bestseller.
- Award talks
Poster Session 1
Time: Tuesday, 24.06.2025, 10:30 - 11:30
- P1: Output-Optimal Algorithms for Join-Aggregate Queries (BPA)
Xiao Hu (University of Waterloo) - P2: Output-sensitive Conjunctive Query Evaluation (BPA)
Shaleen Deep (University of Wisconsin-Madison), Hangdong Zhao (University of Wisconsin, Madison), Austen Fan (UW Madison), Paraschos Koutris (University of Wisconsin-Madison) - P3: An Improved Fully Dynamic Algorithm for Counting 4-Cycles in General Graphs using Fast Matrix Multiplication
Sepehr Assadi (University of Waterloo), Vihan Shah (University of Waterloo) - P4: Insert-Only versus Insert-Delete in Dynamic Query Evaluation
Mahmoud Abo Khamis (RelationalAI, Inc), Ahmet Kara (University of Zurich), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington) - P5: Towards Update-Dependent Analysis of Query Maintenance
Xiao Hu (University of Waterloo), Qichen Wang (Hong Kong Baptist University) - P6: Fast Matrix Multiplication meets the Submodular Width
Mahmoud Abo Khamis (RelationalAI), Xiao Hu (University of Waterloo), Dan Suciu (University of Washington) - P7: Circuits and Formulas for Datalog over Semirings
Austen Z. Fan (Department of Computer Sciences, University of Wisconsin-Madison), Paraschos Koutris (Department of Computer Sciences, University of Wisconsin-Madison), Sudeepa Roy (Department of Computer Science, Duke University) - P8: Improved Approximation Algorithms for Relational Clustering
Aryan Esmailpour (University of Illinois Chicago), Stavros Sintos (University of Illinois at Chicago) - P9: A Lower Bound on Unambiguous Context Free Grammars via Communication Complexity
Stefan Mengel (Univ. Artois, CNRS, Centre de Recherche en Informatique de Lens (CRIL)), Harry Vinall-Smeeth (Technische Universität Ilmenau) - P10: Minimizing Conjunctive Regular Path Queries (Distinguished Paper Award)
Diego Figueira (Univ. Bordeaux, CNRS, LaBRI), Rémi Morvan (Univ. Bordeaux, CNRS, LaBRI), Miguel Romero (Universidad Católica de Chile) - P11: Explaining k-Nearest Neighbors: Counterfactual and Abductive Explanations (Distinguished Paper Award)
Pablo Barcelo (Pontifical Catholic University of Chile), Alexander Kozachinskiy (Pontifical Catholic University of Chile), Miguel Romero (Pontifical Catholic University of Chile), Bernardo Subsercaseaux (Carnegie Mellon University), Jose Verschae (Pontifical Catholic University of Chile) - P12: Shapley Revisited: Tractable Responsibility Measures for Query Answers (Distinguished Paper Award)
Meghyn Bienvenu (CNRS, University of Bordeaux), Diego Figueira (CNRS, University of Bordeaux), Pierre Lafourcade (University of Bordeaux) - P13: Smallest Synthetic Witnesses for Conjunctive Queries
Aryan Esmailpour (University of Illinois Chicago), Boris Glavic (University of illinois, Chicago), Xiao Hu (University of Waterloo), Stavros Sintos (University of Illinois Chicago) - P14: Towards Tractability of the Diversity of Query Answers: Ultrametrics to the Rescue
Marcelo Arenas (PUC Chile), Timo Merkl (TU Vienna), Reinhard Pichler (TU Wien), Cristian Riveros (PUC Chile) - P15: Computing A Well-Representative Summary of Conjunctive Query Results
Pankaj Agarwal (Duke University), Aryan Esmailpour (University of Illinois at Chicago), Xiao Hu (University of Waterloo), Stavros Sintos (University of Illinois at Chicago), Jun Yang (Duke University) - P16: Resilience for Regular Path Queries: Towards a Complexity Classification
Antoine Amarilli (Inria Lille), Wolfgang Gatterbauer (Northeastern University), Neha Makhija (Northeastern University), Mikaël Monet (Inria Lille) - P17: Soft and Constrained Hypertree Width
Matthias Lanzinger (TU Wien), Cem Okulmus (Umeå University), Reinhard Pichler (TU Wien), Alexander Selzer (TU Wien), Georg Gottlob (University of Calabria) - P18: Efficient Algorithms for Cardinality Estimation and Conjunctive Query Evaluation With Simple Degree Constraints
Sungjin Im (University of California, Merced), Benjamin Moseley (Carnegie Mellon University), Hung Ngo (RelationalAI Inc.), Kirk Pruhs (University of Pittsburgh) - P19: Optimal (Multiway) Spatial Joins
Ru Wang (The Chinese University of Hong Kong), Yufei Tao (The Chinese University of Hong Kong) - P20: A Quantum-Leap into Schema Matching: Beyond 1-to-1 Matchings
Luisa Gerlach, Tobias Köppl, Stefanie Scherzinger, Nicole Schweikardt and René Zander - P21: Optimal Dynamic Parameterized Subset Sampling
Junhao Gan (The University of Melbourne), Seeun William Umboh (The University of Melbourne), Hanzhi Wang (Renmin University of China), Anthony Wirth (The University of Sydney), Zhuo Zhang (The University of Melbourne) - P22: Perfect Sampling in Turnstile Streams Beyond Small Moments
David P. Woodruff (Carnegie Mellon University), Shenghao Xie (Texas A&M University), Samson Zhou (Texas A&M University) - P23: Robust Statistical Analysis on Streaming Data with Near-Duplicates in General Metric Spaces
Qin Zhang (Indiana University Bloomington) - P24: Efficient Algorithms for k-Clustering with Noisy and Exact Oracles
Sainyam Galhotra (Cornell University), Rahul Raychaudhury (Duke University), Stavros Sintos (University of Illinois at Chicago) - P25: On the adversarial robustness of Locality-Sensitive Hashing in Hamming space
Mikhail Makarov (EPFL), Michael Kapralov (EPFL), Christian Sohler (University of Cologne) - P26: A Theoretical Framework for Distribution-Aware Dataset Search
Aryan Esmailpour (University of Illinois Chicago), Sainyam Galhotra (Cornell University), Rahul Raychaudhury (Duke University), Stavros Sintos (University of Illinois Chicago) - P27: Private Synthetic Data Generation in Small Memory
Rayne Holland (CSIRO's Data61), Jason Xue (CSIRO's Data61), Chandra Thapa (CSIRO's Data61), Seyit Camtepe (CSIRO's Data61) - P28: Fully Dynamic Algorithms for Graph Databases with Edge Differential Privacy
Sofya Raskhodnikova (Boston University), Teresa Anna Steiner (University of Southern Denmark) - P29: Optimal Bounds for Private Minimum Spanning Trees via Input Perturbation
Rasmus Pagh (University of Copenhagen), Lukas Retschmeier (University of Copenhagen), Hao Wu (University of Waterloo), Hanwen Zhang (University of Copenhagen) - P30: Differentially Private Hierarchical Heavy Hitters
Ari Biswas (University of Warwick), Graham Cormode (Meta AI/ University Of Warwick), Yaron Kanza (AT&T Research), Divesh Srivastava (AT&T Research), Zhengyi Zhou (AT&T Research) - P31: Differentially Private Substring and Document Counting (Best Newcomer Award)
Giulia Bernardini (University of Trieste), Philip Bille (Technical University of Denmark), Inge Li Gørtz (Technical University of Denmark), Teresa Anna Steiner (University of Southern Denmark) - P32: Parallel Communication Obliviousness: One Round and Beyond
Yufei Tao (The Chinese University of Hong Kong), Ru Wang (The Chinese University of Hong Kong), Shiyuan Deng (the Chinese University of Hong Kong) - P33: Revisiting Weighted Information Extraction: A Simpler and Faster Algorithm for Ranked Enumeration
Pawel Gawrychowski (University of Wroclaw), Florin Manea (University of Göttingen), Markus L. Schmid (Humboldt Universität Berlin) - P34: Output-Sensitive Evaluation of Regular Path Queries
Mahmoud Abo Khamis (RelationalAI), Ahmet Kara (OTH Regensburg), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington) - P35: The Complexity of Maximal Common Subsequence Enumeration
Giovanni Buzzega (Università di Pisa), Alessio Conte (Università di Pisa), Yasuaki Kobayashi (Hokkaido University), Kazuhiro Kurita (Nagoya University), Giulia Punzi (Università di Pisa) - P36: Towards practical FPRAS for #NFA: Exploiting the Power of Dependence
Alexis de Colnet (Algorithms and Complexity Group, TU Wien), Kuldeep S. Meel (University of Toronto) - P37: Complex event recognition meets hierarchical conjunctive queries
Dante Pinto (Pontificia Universidad Católica de Chile), Cristian Riveros (Pontificia Universidad Católica de Chile) - P38: Complex event recognition under time constraints: towards a formal framework for efficient query evaluation
Julián García (Pontificia Universidad Católica de Chile), Cristian Riveros (Pontificia Universidad Católica de Chile) - P39: Restricted Chase Termination: You Want More than Fairness
David Carral (LIRMM, Inria, University of Montpellier, CNRS), Lukas Gerlach (Knowledge-Based Systems Group, TU Dresden), Lucas Larroque (DI ENS), Michaël Thomazo (Inria, DIENS, ENS, CNRS, PSL University) - P40: No Cliques Allowed: The Next Step Towards BDD/FC Conjecture
Lucas Larroque (DI ENS), Piotr Ostropolski-Nalewaja (University of Wrocław / TU Dresden), Michaël Thomazo (Inria, DIENS, ENS, CNRS, PSL University) - P41: Polynomial Time Convergence of the Iterative Evaluation of Datalogo Programs
Sungjin Im (University of California, Merced), Benjamin Moseley (Carnegie Mellon University), Hung Ngo (RelationalAI Inc.), Kirk Pruhs (University of Pittsburgh) - P42: Below and Above Why-Provenance for Datalog Queries
Marco Calautti (University of Milano), Ester Livshits (University of Edinburgh), Andreas Pieris (University of Edinburgh and University of Cyprus), Markus Schneider (University of Edinburgh) - P43: Rewriting Consistent Answers on Annotated Data
Phokion Kolaitis (University of California Santa Cruz and IBM Research), Nina Pardal (University of Southampton), Jonni Virtema (University of Sheffield), Jef Wijsen (Université de Mons) - P44: Computing Range Consistent Answers to Aggregation Queries via Rewriting
Aziz Amezian El Khalfioui (University of Mons), Jef Wijsen (University of Mons) - 1: Extending SQL to Return a Subdatabase
Joris Nix (Saarland University, Saarland Informatics Campus), Jens Dittrich (Saarland University, Saarland Informatics Campus) - 2: MAST: Towards Efficient Analytical Query Processing on Point Cloud Data
Jiangneng Li (Nanyang Technological University), Haitao Yuan (Nanyang Technological University), Gao Cong (Nanyang Technological University), Han Mao Kiah (Nanyang Technological University), Shuhao Zhang (Huazhong University of Science and Technology) - 3: Rethinking The Compaction Policies in LSM-trees
Hengrui Wang (Tsinghua University), Jiansheng Qiu (Tsinghua University), Fangzhou Yuan (Tsinghua University), Huanchen Zhang (Tsinghua University) - 4: OBIR-tree: Secure and Efficient Oblivious Index for Spatial Keyword Queries
Zikai Ye (Xidian University), Xiangyu Wang (Xidian University), Zesen Liu (Xidian University), Dan Zhu (Northwestern Polytechnical University), Jianfeng Ma (Xidian University) - 5: Fast and Scalable Data Transfer across Data Systems
Haralampos Gavriilidis (Technische Universität Berlin), Kaustubh Beedkar (Indian Institute of Technology Delhi), Matthias Boehm (Technische Universität Berlin), Volker Markl (Technische Universität Berlin) - 6: cuMatch: A GPU-based memory-efficient worst-case optimal join processing method for subgraph queries with complex patterns
Sungwoo Park (KAIST), Seyeon Oh (DGIST), Min-Soo Kim (KAIST) - 7: Agree to Disagree: Robust Anomaly Detection with Noisy Labels
Dennis Hofmann (Worcester Polytechnic Institute), Peter VanNostrand (WPI), Lei Ma (WPI), Huayi Zhang (WPI), Joshua DeOliveira (Worcester Polytechnic Institute), Lei Cao (University of Arizona), Elke Rundensteiner (Worcester Polytechnic Institute) - 8: DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training
Renjie Liu (Southern University of Science and Technology), Yichuan Wang (SJTU), Xiao Yan (Centre for Perceptual and Interactive Intelligence (CPII)), Zhenkun Cai (Amazon), Minjie Wang (Amazon), Haitian Jiang (New York University), Bo Tang (Southern University of Science and Technology), Jinyang Li (New York University) - 9: Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search
Jianyang Gao (Nanyang Technological University), Yutong Gou (Nanyang Technological University), Yuexuan Xu (Nanyang Technological University), Yongyi Yang (University of Michigan), Cheng Long (Nanyang Technological University), Raymond Chi-Wing Wong (Hong Kong University of Science and Technology) - 10: Minimum Spanning Tree Maintenance in Dynamic Graphs
Lantian Xu (University of Technology Sydney), Dong Wen (University of New South Wales), Lu Qin (UTS), Ronghua Li (Beijing Institute of Technology), Ying Zhang (University of Technology Sydney), Yang Lu (UTS), Xuemin Lin (Shanghai Jiaotong University) - 11: Multivariate Time Series Cleaning under Speed Constraints
Aoqian Zhang (Beijing Institute of Technology), Zexue Wu (Beijing Institute of Technology), Yifeng Gong (Beijing Institute of Technology), Ye Yuan (Beijing Institute of Technology), Guoren Wang (Beijing Institute of Technology) - 12: DIGRA: A Dynamic Graph Indexing for Approximate Nearest Neighbor Search with Range Filter
Mengxu Jiang (The Chinese University of Hong Kong), Zhi Yang (Huazhong University of Science and Technology), Fangyuan Zhang (The Chinese University of Hong Kong), Guanhao Hou (The Chinese University of Hong Kong), Jieming Shi (The Hong Kong Polytechnic University), Wenchao Zhou (Alibaba Group), Feifei Li (Alibaba Group), Sibo Wang (The Chinese University of Hong Kong) - 13: Accelerating Skyline Path Enumeration with a Core Attribute Index on Multi-attribute Graphs
Yuanyuan Zeng (Chinese University of Hong Kong, Shenzhen), Yixiang Fang (School of Data Science, The Chinese University of Hong Kong, Shenzhen), Wensheng Luo (School of Data Science, The Chinese University of Hong Kong, Shenzhen), Chenhao Ma (The Chinese University of Hong Kong, Shenzhen) - 14: SHARQ: Explainability Framework for Association Rules on Relational Data
Hadar Ben Efraim (Bar-Ilan University), Susan Davidson (University of Pennsylvania), Amit Somech (Bar-Ilan University) - 15: On Graph Representation for Attributed Hypergraph Clustering
Zijin Feng (The Chinese University of Hong Kong), Miao Qiao (The University of Auckland), Chengzhi Piao (Hong Kong Baptist University), Hong Cheng (Chinese University of Hong Kong) - 16: Aster: Enhancing LSM-structures for Scalable Graph Database
Dingheng Mo (Nanyang Technological University), Junfeng Liu (Nanyang Technological University), Fan Wang (Nanyang Technological University), Siqiang Luo (Nanyang Technological University) - 17: Common Neighborhood Estimation over Bipartite Graphs under Local Differential Privacy
Yizhang He (The University of New South Wales), Kai Wang (Shanghai Jiao Tong University), Wenjie Zhang (University of New South Wales), Xuemin Lin (Shanghai Jiaotong University), Ying Zhang (University of Technology Sydney) - 18: Yannakakis+: Practical Acyclic Query Evaluation with Theoretical Guarantees
Qichen Wang (Hong Kong Baptist University), Bingnan Chen (HKUST), Binyang Dai (Hong Kong University of Science and Technology), Ke Yi (Hong Kong Univ. of Science and Technology), Feifei Li (Alibaba Group), Liang Lin (Alibaba) - 19: Efficiently Counting Triangles in Large Temporal Graphs
Yuyang Xia (The Chinese University of Hong Kong, Shenzhen), Yixiang Fang (School of Data Science, The Chinese University of Hong Kong, Shenzhen), Wensheng Luo (School of Data Science, The Chinese University of Hong Kong, Shenzhen) - 20: Wait and See: A Delayed Transactions Partitioning Approach in Deterministic Database Systems for Better Performance
Yuan Sui (Northeastern University), Xiaochun Yang (Northeastern University), Bin Wang (Northeastern University), Yujie Zhang (Northeastern University), Baihua Zheng (Singapore Management University) - 21: GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models
Mengyi Yan (Beihang University), Yaoshu Wang (Shenzhen Institute of Computing Sciences, Shenzhen University), Yue Wang (Shenzhen Institute of Computing Sciences), Xiaoye Miao (Zhejiang University), Jianxin Li (Beihang University) - 22: Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs
Wenjing Deng (East China Normal University), Qiuyang Mang (The Chinese University of Hong Kong, Shenzhen), Chengyu Zhang (ETH Zurich), Manuel Rigger (National University of Singapore) - 23: Rapid Data Ingestion through DB-OS Co-design
Hyungsoo Jung (Seoul National University), Alan Fekete (University of Sydney), Minseok Yoon (Hanyang University), Kihwan Kim (Hanyang University), Kyungmin Lim (Hanyang University) - 24: Memento Filter: A Fast, Dynamic, and Robust Range Filter
Navid Eslami (University of Toronto), Niv Dayan (University of Toronto) - 25: Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search
Yuzheng Cai (Fudan University), Jiayang Shi (Fudan University), Yizhuo Chen (Fudan University), Weiguo Zheng (Fudan University) - 26: Tao: Improving Resource Utilization while Guaranteeing SLO in Multi-tenant Relational Database-as-a-Service
Haotian Liu (Southern University of Science and Technology), Runzhong Li (Southern University of Science and Technology), Ziyang Zhang (Southern University of Science and Technology), Bo Tang (Southern University of Science and Technology) - 27: An Elephant Under the Microscope: Analyzing the Interaction of Optimizer Components in PostgreSQL
Rico Bergmann (Technische Universität Dresden), Claudio Hartmann (Technische Universität Dresden), Dirk Habich (TU Dresden), Wolfgang Lehner (TU Dresden) - 28: Revisiting Graph Analytics Benchmarks
Yu Shao (East China Normal University), Lingkai Meng (Shanghai Jiao Tong University), Long Yuan (Nanjing University of Science and Technology), Longbin Lai (Alibaba Group), Peng Cheng (East China Normal University), Xue Li (Alibaba Group), Wenyuan Yu (Alibaba Group), Wenjie Zhang (University of New South Wales), Jingren Zhou (Alibaba Group), Xuemin Lin (Shanghai Jiaotong University) - 29: Multi-Level Graph Representation Learning Through Predictive Community-based Partitioning
Bo-Young Lim (Seoul National University of Science and Technology), Jeong-Ha Park (Seoul National University of Science and Technology), Kisung Lee (Louisiana State University), Hyuk-Yoon Kwon (Seoul National University of Science and Technology) - 30: B-Trees Are Back: Engineering Fast and Pageable Node Layouts
Marcus Müller (Technische Universität München), Lawrence Benson (TU München), Viktor Leis (Technische Universität München) - 31: LeaFi: Data Series Indexes on Steroids with Learned Filters
Qitong Wang (Harvard University), Ioana Ileana (Université Paris Cité), Themis Palpanas (Université Paris Cité) - 32: Largest Triangle Sampling for Visualizing Time Series in Database
Lei Rui (Tsinghua University), Xiangdong Huang (Tsinghua University), Shaoxu Song (Tsinghua University), Chen Wang (Tsinghua University, China), Jianmin Wang (Tsinghua University, China), Zhao Cao (Huawei Technologies Co., Ltd) - 33: SpareLLM: Automatically Selecting Task-Specific Minimum-Cost Large Language Models under Equivalence Constraint
Saehan Jo (Cornell University), Immanuel Trummer (Cornell University) - 34: Femur: A Flexible Framework for Fast and Secure Querying from Public Key-Value Store
Jiaoyi Zhang (Tsinghua University), Liqiang Peng (Alibaba Group), Mo Sha (Alibaba Group), Weiran Liu (Alibaba Group), Xiang Li (Tsinghua University), Sheng Wang (Alibaba Group), Feifei Li (Alibaba Group), Mingyu Gao (Tsinghua University), Huanchen Zhang (Tsinghua University) - 35: Nezha: An Efficient Distributed Graph Processing System on Heterogeneous Hardware
Pengjie Cui (Northeastern University), Haotian Liu (Southern University of Science and Technology), Dong Jiang (Northeastern University), Bo Tang (Southern University of Science and Technology), Ye Yuan (Beijing Institute of Technology) - 36: Fast Approximate Similarity Join in Vector Databases
Jiadong Xie (The Chinese University of Hong Kong), Jeffrey Xu Yu (Chinese University of Hong Kong), Yingfan Liu (Xidian University) - 37: Efficient and Accurate PageRank Approximation on Large Graphs
Siyue Wu (Shenzhen University), Dingming Wu (Shenzhen University), Junyi Quan (Shenzhen University), Tsz Nam Chan (Shenzhen University), Kezhong Lu (Shenzhen University) - 38: SWASH: A Flexible Communication Framework with Sliding Window-Based Cache Sharing for Scalable DGNN Training
Zhen Song (Northeastern University), Yu Gu (Northeastern University), Tianyi Li (Aalborg University), Yushuai Li (Aalborg University), Qing Sun (Northeastern University), Yanfeng Zhang (Northeastern University), Christian S. Jensen (Aalborg University), Ge Yu (Northeastern University) - 39: Efficient Dynamic Indexing for Range Filtered Approximate Nearest Neighbor Search
Fangyuan Zhang (The Chinese University of Hong Kong), Mengxu Jiang (The Chinese University of Hong Kong), Guanhao Hou (The Chinese University of Hong Kong), Jieming Shi (The Hong Kong Polytechnic University), Hua Fan (Alibaba Cloud), Wenchao Zhou (Alibaba Group), Feifei Li (Alibaba Group), Sibo Wang (The Chinese University of Hong Kong) - 40: Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System
Muhammad Imam Luthfi Balaka (University of Indonesia), David Alexander (University of Indonesia), Qiming Wang (The University of Chicago), Yue Gong (The University of Chicago), Adila Krisnadhi (Faculty of Computer Science Universitas Indonesia), Raul Castro Fernandez (The University of Chicago) - 41: AJOSC: Adaptive join order selection for continuous queries on data streams
Xinyi Ye (Peking University), Xiangyang Gou (University of New South Wales), Lei Zou (Peking University), Wenjie Zhang (University of New South Wales) - 42: Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation
Sai Sundaresan (Adobe Research), Shubham Agarwal (Adobe Research), Subrata Mitra (Adobe Research), Debabrata Mahapatra (Adobe Research), Archit Gupta (IIT Bombay), Rounak Sharma (Indian Institute of Technology Kanpur), Nirmal Joshua Kapu (Indian Institute of Technology Kanpur), Tong Yu (Adobe Research), Shiv Saini (Adobe Research) - 43: SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs
Suhyun Lee (Yonsei University), Chaemin Lim (Yonsei University), Jinwoo Choi (Yonsei University), Heelim Choi (Yonsei University), Chan Lee (Yonsei University), Yongjun Park (Yonsei University), Kwanghyun Park (Yonsei University), Hanjun Kim (Yonsei University), Youngsok Kim (Yonsei University) - 44: DFlush: DPU-Offloaded Flush for Disaggregated LSM-based Key-Value Stores
Chen Ding (Huazhong University of Science and Technology), Kai Lu (Huazhong University of Science and Technology), QuanYi Zhang (Huazhong University of Science and Technology), Zekun Ye (Huazhong University of Science and Technology), Ting Yao (Huawei Cloud Computing Technology Co., Ltd.), Daohui Wang (Huawei Cloud Computing Technology Co., Ltd.), Huatao Wu (Huawei), Jiguang Wan (Huazhong University of Science and Technology)
Demo Session A
Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Bellevue
- Demonstrating CatDB: LLM-based Generation of Data-centric ML Pipelines
Saeed Fathollahzadeh (Concordia University), Essam Mansour (Concordia University), Matthias Boehm (TU Berlin) - D-Bot: An LLM-Powered DBA Copilot
Zhaoyan Sun (Tsinghua University), Xuanhe Zhou (Shanghai Jiao Tong University), Jianming Wu (Tsinghua University), Wei Zhou (Huawei Company), Guoliang Li (Tsinghua University) - Demonstrating SQLBarber: Leveraging Large Language Models to Generate Customized, Constraint-aware, and Realistic SQL
Jiale Lao (Cornell University), Immanuel Trummer (Cornell University) - DANTE: Hybrid AI System for Context-Aware Interpretable Feature Engineering
Mohamed Bouadi (Université Paris Cité, LIPADE, SAP Labs Paris), Arta Alavi (SAP France), Salima Benbernou (Université Paris Cité, LIPADE), Mourad Ouziri (Université Paris Cité, LIPADE) - RTS+: Reliable Text to SQL
Kaiwen Chen (University of Toronto), Yueting Chen (Seattle University), Nick Koudas (University of Toronto), Xiaohui Yu (York University) - SwellDB: Dynamic Query-Driven Table Generation with Large Language Models
Victor Giannakouris (Cornell University), Immanuel Trummer (Cornell University) - Database as Runtime: Compiling LLMs to SQL for In-database Model Serving
Wenbo Sun (Delft University of Technology), Ziyu Li (Delft University of Technology), Rihan Hai (Delft University of Technology) - Prompt Editor: A Taxonomy-driven System for Guided LLM Prompt Development in Enterprise Settings
Jeffery Cao (Celonis Inc.), Lampros Flokas (Celonis Inc.), Yujian Xu (Celonis Inc.), Eugene Wu (Columbia University), Xu Chu (Celonis Inc.), Cong Yu (Celonis Inc.) - DataDazzle: Intelligent Data Exploration through Natural Language
Mike Xydas (Athena R.C.), Anna Mitsopoulou (Athena Research Center), George Katsogiannis-Meimarakis (Athena Research Center), Chris Tsapelas (Athena Research Center), Stavroula Eleftherakis (Athena Research Center), Antonis Mandamadiotis (Athena Research Center), Georgia Koutrika (Athena Research Center) - ScaleLLM: A technique for scalable LLM-augmented data systems
Paul Loh (University of Pennsylvania), Ashwin Alaparthi (University of Pennsylvania), Ryan Marcus (University of Pennsylvania) - PalimpChat: Declarative and Interactive AI analytics
Paul Loh (University of Pennsylvania), Ashwin Alaparthi (University of Pennsylvania), Ryan Marcus (University of Pennsylvania) - UNITQA: A Unified Automated Tabular Question Answering System with Multi-Agent Large Language Models [Industry]
Jun-Peng Zhu (East China Normal University & PingCAP), Peng Cai (East China Normal University), Kai Xu (PingCAP), Li Li (PingCAP), Yishen Sun (PingCAP), Shuai Zhou (PingCAP), Haihuang Su (PingCAP), Liu Tang (PingCAP), Qi Liu (PingCAP) - LLM-Matcher: a Name-Based Schema Matching Tool using Large Language Models
Marcel Parciak (Hasselt University), Brecht Vandevoort (Hasselt University), Frank Neven (Hasselt University), Liesbet M. Peeters (Hasselt University), Stijn Vansummeren (Hasselt University) - Sentence to Model: Cost-Effective Data Collection LLM Agent
Yael Einy (Tel Aviv University), Guy Dar (Tel Aviv University), Slava Novgorodov (Tel Aviv University), Tova Milo (Tel Aviv University) - OmniTune: A universal framework for query refinement via LLMs
Amit Somech (Bar-Ilan University), Yuval Moskovitch (Ben Gurion University), Eldar Hacohen (Bar-Ilan University) - Real Time Sentinel – An LLM Based PII detector
Bhushan Khaladkar (Striim Inc.) - Andromeda: Debugging Database Performance Issues with Retrieval-Augmented Large Language Models
Pengyi Wang (Renmin University of China), Sibei Chen (Renmin University of China), Ju Fan (Renmin University of China), Bin Wu (Alibaba Cloud Computing), Nan Tang (HKUST (GZ) / HKUST), Jian Tan (Alibaba Cloud Computing)
SIGMOD Research 1: Indexing
Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Potsdam I
Session chair: NN
- Efficient Indexing for Flexible Label-Constrained Shortest Path Queries in Road Networks - Libin Wang (Hong Kong University of Science and Technology)*, Raymond Chi-Wing Wong (Hong Kong University of Science and Technology)
- How to Grow an LSM-tree: Towards Bridging The Gap Between Theory and Practice - Dingheng Mo (Nanyang Technological University), Siqiang Luo (Nanyang Technological University)*, Stratos Idreos (Harvard)
- NEXT: A New Secondary Index Framework for LSM-based Data Storage - Jiachen Shi (Nanyang Technological University)*, Jingyi Yang (Nanyang Technological University), Gao Cong (Nanyang Technological University), Xiaoli Li (Institute for Infocomm Research, A*STAR, Singapore/Nanyang Technological University)
- A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach - Taiyi Wang (University of Cambridge)*, Liang Liang (Imperial College London), Guang Yang (Neo4j), Thomas Heinis (Imperial College), Eiko Yoneki (University of Cambridge)
- BT-Tree: A Reinforcement Learning Based Index for Big Trajectory Data - Tu Gu (Nanyang Technological University)*, Kaiyu Feng (Beijing Institute of Technology), Jingyi Yang (Nanyang Technological University), Gao Cong (Nanyang Technological University), Cheng Long (Nanyang Technological University), Rui Zhang (ruizhang.info)
SIGMOD Research 2: Graph Algorithms
Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Potsdam III
Session chair: NN
- B$\circledS X$: Subgraph Matching with Batch Backtracking Search - Yujie Lu (Fudan University), Zhijie Zhang (Fudan University), Weiguo Zheng (Fudan University)*
- Constant-time Connectivity Querying in Dynamic Graphs - Lantian Xu (University of Technology Sydney), Dong Wen (University of New South Wales)*, Lu Qin (UTS), Ronghua Li (Beijing Institute of Technology), Ying Zhang (University of Technology Sydney), Xuemin Lin (Shanghai Jiaotong University)
- A Local Search Approach to Efficient (k,p)-Core Maintenance - Chenghan Zhang (Wuhan University), Yuanyuan Zhu (Wuhan University)*, Lijun Chang (The University of Sydney)
- SBSC: A fast Self-tuned Bipartite proximity graph-based Spectral Clustering - Abdul Khan (PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur)*, Rashmi Maheshwari (IIITDM Jabalpur), Mohammad Maksood Akhter (PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur), Dr. Sraban Kumar Mohanty (IIIT Jabalpur)
- Subgroup Discovery with Small and Alternative Feature Sets - Jakob Bach (Karlsruhe Institute of Technology (KIT))*
PODS Research 5: Tutorial 1 (Albert Atserias) & Other Connections to Quantum Computing
Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Charlottenburg I/II
Session chair: Cristina Sirangelo
- Local-vs-Global Consistency of Annotated Relations - Albert Atserias (UPC)
- A Quantum-Leap into Schema Matching: Beyond 1-to-1 Matchings -
Luisa Gerlach (Humboldt-Universität zu Berlin), Tobias Köppl (Fraunhofer Fokus), Stefanie Scherzinger (Universität Passau), Nicole Schweikardt (Humboldt-Universität zu Berlin), René Zander (Fraunhofer Fokus)
SIGMOD Industry 1: Cloud Database Architecture
Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Tiergarten I/II/III
Session chair: NN
- Eigen+: Memory Over-Subscription for Alibaba Cloud Databases -
Ji You Li (Alibaba Cloud); Jiachi Zhang (Alibaba Cloud); Yuhang Liu (Alibaba Cloud); Wenchao Zhou (Alibaba Cloud)*; Xin Zhou (Alibaba Cloud); Fangyuan Zhou (Alibaba Cloud); Feifei Li (Alibaba Cloud)
- CloudJump II: Optimizing Cloud Databases for Shared Storage -
Zongzhi Chen (Alibaba Group)*; Xinjun Yang (Alibaba Group); Mo Sha (Alibaba Group); Feifei Li (Alibaba Group); Kang Wang (Alibaba Group); Zheyu Miao (Alibaba Group); Jie Xu (Alibaba Group); Jianfeng Wang (Alibaba-Inc); Sheng Wang (Alibaba Group)
- Unlocking the Potential of CXL for Disaggregated Memory in Cloud-Native Databases (best industry paper) -
Xinjun Yang (Alibaba Cloud Computing); Yingqiang Zhang (Alibaba Cloud Computing); Hao Chen (Alibaba Group)*; Feifei Li (Alibaba Cloud Computing); Gerry Fan (XConn Technologies); Yang Kong (Alibaba Cloud Computing); Bo Wang (Alibaba Cloud Computing);
- ABase: The Multi-Tenant NoSQL Serverless Database for Diverse and Dynamic Workloads in Large-scale Cloud Environments -
Rong Kang (ByteDance)*; Yanbin Chen (Bytedance); Ye Liu (Bytedance); Fuxin Jiang (Bytedance); Qingshuo Li (Bytedance); Miao Ma (Bytedance); Jian Liu (Bytedance); Guangliang Zhao (Bytedance); Tieying Zhang (Bytedance); Jianjun Chen (Bytedance); Lei Zhang (
- CockroachDB Serverless: Sub-second scaling from zero with multi-region cluster virtualization -
Jeff Swenson (Cockroach Labs); Andy Kimball (Cockroach Labs); Raphael Poss (Cockroach Labs); Rebecca Taft (Cockroach Labs)*; Jay Lim (Cockroach Labs); Adam Storm (Cockroach Labs); Sumeer Bhola (Cockroach Labs); Paul Bulkley-Logston (Cockroach Labs); Adity
- Adaptive and Efficient Log Parsing as a Cloud Service -
Zeyan Li (ByteDance Inc.)*; Jie Song (ByteDance Inc.); Tieying Zhang (Bytedance); Tao Yang (ByteDance Inc.); Xiongjun Ou (ByteDance Inc.); Yingjie Ye (ByteDance Inc.); Pengfei Duan (ByteDance Inc.); Muchen Lin (ByteDance Inc.); Jianjun Chen (Bytedance)
SIGMOD New Researcher Symposium
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Charlottenburg III
Demo Session B
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Bellevue
- ChARLES: Change-Aware Recovery of Latent Evolution Semantics in Relational Data
Shiyi He (University of Utah), Alexandra Meliou (University of Massachusetts Amherst), Anna Fariha (University of Utah) - Locator: Local Stability for Rankings
Felix Campbell (Ben-Gurion University of the Negev), Yuval Moskovitch (Ben-Gurion University of the Negev) - Catching up with Disorder: Dynamic Graphs with Out-of-Order Updates
Muhammad Khan (INRIA), Ioana Manolescu (INRIA), Angelos Christos Anadiotis (Oracle) - SecUREmatch: Integrating Clerical Review in Privacy-Preserving Record Linkage
Florens Rohde (University of Leipzig), Victor Christen (University of Leipzig), Erhard Rahm (University of Leipzig) - Qymera: Simulating Quantum Circuits using RDBMS
Tim Littau (TU Delft), Rihan Hai (TU Delft) - Finding What You're Looking For: A Distribution-Aware Dataset Search Engine in Action
Lennart Behme (Technische Universität Berlin), Leonard Geißler (Technische Universität Berlin), Pratham Agrawal (IIT Delhi), Emil Badura (Technische Universität Berlin), Benjamin Ueber (Technische Universität Berlin), Kaustubh Beedkar (IIT Delhi), Volker Markl (Technische Universität Berlin) - Mobility Stream Processing on NebulaStream and MEOS
Mariana Duarte (Universite libre de Bruxelles), Dwi P.A. Nugroho (Technische Universität Berlin, Bifold), Georges Tod (SNCB-NMBS Engineering), Evert Bevernage (SNCB-NMBS Engineering), Pieter Moelans (SNCB-NMBS Engineering), Emine Tas (SNCB-NMBS Engineering), Esteban Zimányi (Universite libre de Bruxelles), Mahmoud Sakr (Universite libre de Bruxelles), Steffen Zeuch (Technische Universität Berlin, Bifold), Volker Markl (Technische Universität Berlin, Bifold) - Grafixer: Enabling User-Centric Repairs for Property Graphs
Amedeo Pachera (Lyon 1 University), Angela Bonifati (Lyon 1 University), Andrea Mauri (Lyon 1 University) - Apache Wayang in Action: Enabling Data Systems Integration via a Unified Data Analytics Framework
Kaustubh Beedkar (IIT Delhi), Aurelien Bertrand (ITU Copenhagen), Haralampos Gavriilidis (TU Berlin), Augusto Fonseca (LNCC), Zoi Kaoudi (IT University of Copenhagen), Mingxi Liu (East China Normal University), Volker Markl (TU Berlin), Juri Petersen (IT University of Copenhagen), Fabio Porto (LNCC), Victor Ribeiro (LNCC), Mads Sejer Pedersen (IT University of Copenhagen), Lucas Tavares (LNCC), Michalis Vargiamis (Scalytics), Chen Xu (East China Normal University) - LpBound in Action: Cardinality Estimation with One-Sided Guarantees
Haozhe Zhang (University of Zurich), Christoph Mayer (University of Zurich), Mahmoud Abo Khamis (RelationalAI), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington) - ShapX Engine: A Demonstration of Shapley Value Approximations
Suchit Gupte (Ohio State University), John Paparrizos (The Ohio State University) - Doctopus: A System for Budget-aware Structural Data Extraction from Unstructured Documents
Yuanhao Zhong (Beijing Institute of Technology), Yuhao Deng (Beijing Institute of Technology), Chengliang Chai (Beijing Institute of Technology), Ruixin Gu (Beijing University of Technology), Ye Yuan (Beijing Institute of Technology), Guoren Wang (Beijing Institute of Technology), Lei Cao (University of Arizona) - Demo of Kishu: Time-Traveling for Computational Notebooks
Zhaoheng Li (University of Illinois at Urbana-Champaign), Supawit Chockchowwat (University of Illinois at Urbana-Champaign), Hanxi Fang (University of Illinois at Urbana-Champaign), Yongjoo Park (University of Illinois at Urbana-Champaign) - NeutronRAG: Towards Understanding the Effectiveness of RAG from a Data Retrieval Perspective
Peizheng Li (Northeastern University), Chaoyi Chen (Northeastern University), Hao Yuan (Northeastern University), Zhenbo Fu (Northeastern University), Hang Shen (Northeastern University), Xinbo Yang (Northeastern University), Qiange Wang (National University of Singapore), Xin Ai (Northeastern University), Yanfeng Zhang (Northeastern University), Yingyou Wen (Neusoft AI Magic Technology Research), Ge Yu (Northeastern University) - MiniClean: A Single-Machine System for Cleaning Big Graphs
Wenchao Bai (Southeast University), Wenfei Fan (University of Edinburgh), Jiahui Jin (Southeast University), Daji Li (Shenzhen Institute of Computing Sciences), Jian Li (Shenzhen Institute of Computing Sciences), Shuhao Liu (Shenzhen Institute of Computing Sciences), Mingliang Ouyang (Shenzhen Institute of Computing Sciences), Qiang Yuan (Shenzhen Institute of Computing Sciences) - CausaLens: A System for Summarizing Causal DAGs
Noam Chen (Technion), Anna Zeng (MIT), Michael Cafarella (MIT), Batya Kenig (Technion), Markos Markakis (MIT), Oren Mishali (Technion), Brit Youngmann (Technion), Babak Salimi (University of California, San Diego) - OIE: An Interpretable System for Outlier Explanation and Summarization
Jingzhe Xu (Beijing Institute of Technology), Yuhao Deng (Beijing Institute of Technology), Chengliang Chai (Beijing Institute of Technology), Zequn Li (Beijing Institute of Technology), Yuping Wang (Beijing Institute of Technology), Lei Cao (University of Arizona)
SIGMOD Research 3: Vector Search & Neighbor Search
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Potsdam I
Session chair: NN
- Graph-Based Vector Search: An Experimental Evaluation of the State-of-the-Art -
Ilias Azizi (Mohammed VI Polytechnic University)*, Karima Echihabi (Mohammed VI Polytechnic University), Themis Palpanas (Université Paris Cité)
- DEG: Efficient Hybrid Vector Search Using the Dynamic Edge Navigation Graph - Ziqi Yin (Nanyang Technological University)*, Jianyang Gao (Nanyang Technological University), Pasquale Balsebre (Nanyang Technological University), Gao Cong (Nanyang Technological University), Cheng Long (Nanyang Technological University)
- Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search -
Jiuqi Wei (Institute of Computing Technology, Chinese Academy of Sciences)*, Xiaodong Lee (ICT), Zhenyu Liao (Huazhong University of Science and Technology), Themis Palpanas (Université Paris Cité), Botao Peng (Institute of Computing Technology, Chinese Academy of Sciences)
- iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search - Yuexuan Xu (Nanyang Technological University), Jianyang Gao (Nanyang Technological University)*, Yutong Gou (Nanyang Technological University), Cheng Long (Nanyang Technological University), Christian S. Jensen (Aalborg University)
- MIRAGE-ANNS: Mixed Approach Graph-based Indexing for Approximate Nearest Neighbor Search -
Sairaj Voruganti (University of Waterloo)*, Tamer Özsu (University of Waterloo)
SIGMOD Research 4: Text-to-SQL & ML-infused Queries
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Potsdam III
Session chair: NN
- Automated Validating and Fixing of Text-to-SQL Translation with Execution Consistency - Yicun Yang (Shanghai Jiao Tong University)*, Zhaoguo Wang (Shanghai Jiao Tong University), Yu Xia (Shanghai Jiao Tong University), Zhuoran Wei (Shanghai Jiao Tong University), Haoran Ding (Shanghai Jiao Tong University), Ruzica Piskac (Yale University), Haibo Chen (Shanghai Jiao Tong University), Jinyang Li (New York University)
- Reliable Text-to-SQL with Adaptive Abstention - Kaiwen Chen (University of Toronto)*, Yueting Chen (Seattle University), Nick Koudas (University of Toronto), Xiaohui Yu (York University)
- SNAILS: Schema Naming Assessments for Improved LLM-Based SQL Inference - Kyle Luoma (University of California, San Diego)*, Arun Kumar (University of California, San Diego)
- Hydro: Adaptive Query Processing of ML Queries - Gaurav Tarlok Kakkar (Georgia Institute of Technology)*, Jiashen Cao (Georgia Tech), Aubhro Sengupta (Georgia Institute of Technology), Joy Arulraj (Georgia Tech), Hyesoon Kim (Georgia Tech)
- Mitigating the Impedance Mismatch between Prediction Query Execution and Database Engine - Chenyang Zhang (East China Normal University), Junxiong Peng (East China Normal University), Chen Xu (East China Normal University)*, Quanqing Xu (OceanBase, Ant Group), Chuanhui Yang (OceanBase)
PODS Research 6: Randomized Analysis and Data Structures
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Charlottenburg I/II
Session chair: Cristian Riveros
- Optimal Dynamic Parameterized Subset Sampling - Junhao Gan (The University of Melbourne), Seeun William Umboh (The University of Melbourne), Hanzhi Wang (Renmin University of China), Anthony Wirth (The University of Sydney), Zhuo Zhang (The University of Melbourne)
- Perfect Sampling in Turnstile Streams Beyond Small Moments - David P. Woodruff (Carnegie Mellon University), Shenghao Xie (Texas A&M University), Samson Zhou (Texas A&M University)
- Robust Statistical Analysis on Streaming Data with Near-Duplicates in General Metric Spaces -
Qin Zhang (Indiana University Bloomington)
- Efficient Algorithms for k-Clustering with Noisy and Exact Oracles - Sainyam Galhotra (Cornell University), Rahul Raychaudhury (Duke University), Stavros Sintos (University of Illinois at Chicago)
- On the adversarial robustness of Locality-Sensitive Hashing in Hamming space - Mikhail Makarov (EPFL), Michael Kapralov (EPFL), Christian Sohler (University of Cologne)
- A Theoretical Framework for Distribution-Aware Dataset Search - Aryan Esmailpour (University of Illinois Chicago), Sainyam Galhotra (Cornell University), Rahul Raychaudhury (Duke University), Stavros Sintos (University of Illinois Chicago)
SIGMOD Industry 2: Distributed Systems and Hybrid Workloads
Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Tiergarten I/II/III
Session chair: NN
- MaLT: A Framework for Managing Large Transactions in OceanBase -
Chenguang Fang (OceanBase, Ant Group); Chen Qian (OceanBase, Ant Group); Qi Yang (OceanBase, Ant Group); Zeyu Wang (OceanBase, Ant Group); Zhenkun Yang (OceanBase, Ant Group); Fanyu Kong (OceanBase, Ant Group); Quanqing Xu (OceanBase, Ant Group)*; Hui Ca
- Managed Resource Scaling in Amazon EMR -
Vishal Vyas (Amazon Web Services); Andrei Paduroiu (Amazon Web Services)*; Srikanth Kandula (Amazon Web Services); Hari Ohm Prasath Rajagopal (Amazon Web Services); Mukesh Punhani (Amazon Web Services); Marco Manzo (Amazon Web Services); Ankur Goyal (Amaz
- TXSQL: Lock Optimizations Towards High Contented Workloads -
Donghui Wang (Tencent Inc.); Yuxing Chen (Tencent)*; Chengyao Jiang (Tencent Inc.); Anqun Pan (Tencent Inc.); Wei Jiang (Tencent Inc.); Songli Wang (Tencent Inc.); Hailin Lei (Tencent Inc.); Chong Zhu (Tencent Inc.); Lixiong Zheng (Tencent Inc.); Wei Lu
- Experimental Evaluation of Optimizing Memory Consumption in SAP HANA using PEOopt -
Lukas Landgraf (TU Dresden)*; Florian Wolf (SAP SE); Wolfgang Lehner (TU Dresden)
- Enterprise Application-Database Co-Innovation for Hybrid Transactional/Analytical Processing: A Virtual Data Model and Its Query Optimization Needs -
Kihong Kim (SAP Labs Korea)*; Hyunwook Kim (SAP Labs Korea); Jin Su Lee (SAP Labs Korea); Taehyung Lee (SAP Labs Korea); Alexander Boehm (SAP SE); Norman May (SAP SE); Guido Moerkotte (Universität Mannheim); Daniel Ritter (SAP SE); Ralf Dentzer (SAP SE);
- Flux: Unifying Heterogeneous Infrastructure for Alibaba AnalyticDB -
Wei Li (Alibaba Cloud); Jiachi Zhang (Alibaba Cloud); Ye Yin (Alibaba Cloud); Yan Li (Alibaba Cloud); Zhanyang Zhu (Alibaba Cloud); Yuhao Li (Alibaba Cloud); Zhencan Peng (Alibaba Cloud); Lan Lu (Alibaba Cloud); Wenchao Zhou (Alibaba Cloud)*; Liang Lin (A
SIGMOD Panel 1: AI for Future Databases: A New Beginning or a Boulevard of Broken Dreams?
Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Potsdam I
- Organizers:
- Danica Porobic
- Carsten Binnig
- Panelists:
- Anastasia Ailamaki (EPFL)
- Surajit Chaudhuri (Microsoft)
- Tim Kraska (MIT & Amazon Web Services)
- Feifei Li (Alibaba Cloud)
- Aditya Parameswaran (UC Berkeley)
- Immanuel Trummer (Cornell University)
AI has opened new directions in database research, from learned components replacing traditional internals to LLMs enabling a new generation of database systems that allow querying data beyond tables. Yet, adoption in commercial databases has been incremental rather than a fundamental rethinking of modern data system stacks. In this panel, we thus bring together experts from academia and industry to discuss the tension between potential and reality in how AI shapes real-world database products. We will explore questions such as: What should an AI-ready database stack look like, incremental evolution or radical departure? What prevents AI from replacing traditional components like query optimizers, cost models, and indexes? What does it take for LLM-based innovations to move beyond impressive demos? Can we use LLMs for more than Text-to-SQL and LLM-UDFs? By tackling these questions, this panel will challenge assumptions in research, examine AI’s role in future databases, and ask: Is AI the key to overcoming core limitations, and will it thus enable a new generation of database systems, or is AI just another boulevard of broken (database) dreams?
SIGMOD Business Meeting
Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Charlottenburg III
SIGMOD Research 5: Machine Learning for Database Internals
Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Potsdam III
Session chair: NN
- How Good are Learned Cost Models, Really? Insights from Query Optimization Tasks - Roman Heinrich (DFKI Darmstadt & TU Darmstadt)*, Manisha Luthra (TU Darmstadt and DFKI), Johannes Wehrstein (TU Darmstadt), Harald Kornmayer (DHBW Mannheim), Carsten Binnig (TU Darmstadt)
- Low Rank Learning for Offline Query Optimization - Zixuan Yi (University of Pennsylvania)*, Yao Tian (The Hong Kong University of Science and Technology), Zack Ives (University of Pennsylvania), Ryan Marcus (University of Pennsylvania)
- Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve - Xu Chen (University of Electronic Science and Technology of China)*, Shuncheng Liu (University of Electronic Science and Technology of China), Tong Yuan (Huawei Technologies Co., Ltd.), Tao Ye (Huawei), Kai Zeng (Huawei Technologies Co. Ltd.), Han Su (University of Electronic Science and Technology of China), Kai Zheng (University of Electronic Science and Technology of China)
- SPACE: Cardinality Estimation for Path Queries Using Cardinality-Aware Sequence-based Learning - Mehmet Aytimur (University of Konstanz)*, Theodoros Chondrogiannis (University of Konstanz), Michael Grossniklaus (University of Konstanz)
- T3: Accurate and Fast Performance Prediction for Relational Database Systems With Compiled Decision Trees - Maximilian Rieger (TUM)*, Thomas Neumann (TUM)
SIGMOD Research 6: Community and Network Analysis
Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Tiergarten I/II/III
Session chair: NN
- Community Detection in Heterogeneous Information Networks Without Materialization - Jiaxin Jiang (National University of Singapore)*, Siyuan Yao (National University of Singapore), Yuhang Chen (National University of Singapore), Bingsheng He (National University of Singapore), Yudong Niu (Singapore Management University), Yuchen Li (Singapore Management University), Shixuan Sun (Shanghai Jiao Tong University), Yongchao Liu (Ant Group)
- Dual-Hierarchy Labelling: Scaling Up Distance Queries on Dynamic Road Networks - Muhammad Farhan (Australian National University)*, Henning Koehler (Massey University), Qing Wang (ANU)
- Cohesiveness-aware Hierarchical Compressed Index for Community Search over Attributed Graphs - Yuxiang Wang (Hangzhou Dianzi University)*, Zhangyang Peng (Hangzhou Dianzi University), Xiangyu Ke (Zhejiang University), Xiaoliang Xu (Hangzhou Dianzi University), Tianxing Wu (Southeast University), Yuan Gao (Hangzhou Dianzi University)
- Deep Overlapping Community Search via Subspace Embedding - Qing Sima (University of New South Wales), Jianke Yu (University of Technology Sydney), Xiaoyang Wang (University of New South Wales)*, Wenjie Zhang (University of New South Wales), Ying Zhang (University of Technology Sydney), Xuemin Lin (Shanghai Jiaotong University)
- A Lovász-Simonovits Theorem for Hypergraphs with Application to Local Clustering -
Raj Kamal (Indian Institute of Technology Delhi)*, Amitabha Bagchi (IIT Delhi)
Sponsor Session
Time: Tuesday, 24.06.2025, 16:30 - 18:30
Location: Bellevue
- Innovations in AWS Analytics -
Sudipto Das, Sr. PE, AWS Analytics
Abstract: Analytics is a fast-changing space with customers seeking to derive new and fast insights on various types of data. AWS Analytics has reinvigorated the analytics stack to cater to this next generation of analytics use cases, which power both traditional business analytics and the emerging class of AI and ML workloads. This talk will provide an overview of this reimagined stack, called the SageMaker platform, and delve into some of the innovations that power this new offering.
- Huawei Cloud GaussDB, a Better Way to Database -
Nikolaos Ntarmos, Director of Database Lab at the Edinburgh Research Centre of Huawei
Abstract: GaussDB is a distributed relational database from Huawei. It supports intra-city cross-AZ deployment with zero data loss. With a distributed architecture, GaussDB supports petabytes of storage and more than 1,000 nodes per DB instance. It is highly available, reliable, secure, and scalable, and provides services including quick deployment, backup, restoration, monitoring, and alarm reporting for enterprises. GaussDB continuously innovates with millions of customers through key technologies and comprehensive solutions.
PODS Research 7: Private Data Analysis (inc. Best Newcomer Award)
Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Charlottenburg I/II
Session chair: Dan Suciu
- Private Synthetic Data Generation in Small Memory - Rayne Holland (CSIRO's Data61), Jason Xue (CSIRO's Data61), Chandra Thapa (CSIRO's Data61), Seyit Camtepe (CSIRO's Data61)
- Fully Dynamic Algorithms for Graph Databases with Edge Differential Privacy - Sofya Raskhodnikova (Boston University), Teresa Anna Steiner (University of Southern Denmark)
- Optimal Bounds for Private Minimum Spanning Trees via Input Perturbation - Rasmus Pagh (University of Copenhagen), Lukas Retschmeier (University of Copenhagen), Hao Wu (University of Waterloo), Hanwen Zhang (University of Copenhagen)
- Differentially Private Hierarchical Heavy Hitters - Ari Biswas (University of Warwick), Graham Cormode (Meta AI / University of Warwick), Yaron Kanza (AT&T Research), Divesh Srivastava (AT&T Research), Zhengyi Zhou (AT&T Research)
- Differentially Private Substring and Document Counting (Best Newcomer Award) - Giulia Bernardini (University of Trieste), Philip Bille (Technical University of Denmark), Inge Li Gørtz (Technical University of Denmark), Teresa Anna Steiner (University of Southern Denmark)
SIGMOD Keynote 2 Awards Talks 2
Time: Wednesday, 25.06.2025, 08:30 - 10:30
Location: Potsdam I & III
- Keynote: The Case for Collaboration (Everything a Database Person really needs to know about Machine Learning) (Margo Seltzer) -
Margo Seltzer (University of British Columbia)
Title: The Case for Collaboration (Everything a Database Person really needs to know about Machine Learning)
Abstract: It's 2025, and the answer to every database performance or optimization problem is "machine learning". But what kinds of models are appropriate for these applications? I'm going to try to convince you that, as in good system design, "simpler is better". And, in this case, simpler has many benefits: simpler models are typically more efficient in both space and time, they are frequently transparently interpretable, and in many domains, they produce accuracy and generalization equivalent to the fanciest deep learning model you can build. At the same time, I'm going to explain what great collaboration looks like, how it can help you overcome imposter syndrome, and how it helps you find your own personal superpower.
Bio: Margo Seltzer is the Canada 150 Research Chair in Computer Systems and the Cheriton Family Chair in Computer Science at the University of British Columbia. Her research interests are in systems, construed quite broadly: systems for capturing and accessing data provenance, file systems, databases, transaction processing systems, storage and analysis of graph-structured data, and systems for constructing optimal and interpretable machine learning models. Dr. Seltzer was a co-founder and CTO of Sleepycat Software, the makers of Berkeley DB, the recipient of the 2021 ACM Software System Award and the 2020 ACM SIGMOD Systems Award. She is a member of the National Academy of Engineering and the American Academy of Arts and Sciences, a Sloan Foundation Fellow in Computer Science, an ACM Fellow, and a Fellow of the Royal Society of Canada. She is also recognized as an outstanding teacher and mentor.
- Award talks
SIGMOD Research 7: Transactions and Consistency
Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Potsdam I
Session chair: NN
- CRDV: Conflict-free Replicated Data Views (honorable mention for best research paper) -
Nuno Faria (INESCTEC & U. Minho)*, José Pereira (U. Minho & INESCTEC)
- Are database system researchers making correct assumptions about transaction workloads? - Cuong Nguyen (University of Maryland), Kevin Chen (8VC), Christopher DeCarolis (Independent), Daniel Abadi (UMD)*
- Low-Latency Transaction Scheduling via Userspace Interrupts: Why Wait or Yield When You Can Preempt? (best research paper) -
Kaisong Huang (Simon Fraser University)*, Jiatang Zhou (Simon Fraser University), Zhuoyue Zhao (University at Buffalo), Dong Xie (Penn State University), Tianzheng Wang (Simon Fraser University)
- Moving on From Group Commit: Autonomous Commit Enables High Throughput and Low Latency on NVMe SSDs - Lam-Duy Nguyen (Technische Universität München)*, Adnan Alhomssi (University of Erlangen-Nürnberg), Tobias Ziegler (Technische Universität München), Viktor Leis (Technische Universität München)
- Boosting OLTP Performance with Per-Page Logging on NVDIMM - Seongjae Moon (Sungkyunkwan University), Bohyun Lee (Technische Universität München), Jonghyeok Park (Korea University), Sang-Won Lee (Seoul National University)*
SIGMOD Research 8: Streams, Spatial and Modern Hardware
Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Potsdam III
Session chair: NN
- SuSe: Summary Selection for Regular Expression Subsequence Aggregation over Streams - Steven Purtzel (Humboldt-Universität zu Berlin)*, Matthias Weidlich (Humboldt-Universität zu Berlin)
- Pandora: An Efficient and Rapid Solution for Persistence-Based Tasks in High-Speed Data Streams - Weihe Li (University of Edinburgh)*
- SwiftSpatial: Spatial Joins on Modern Hardware - Wenqi Jiang (ETH Zurich)*, Oleh-Yevhen Khavrona (ETHZ), Martin Parvanov (ETH Zurich), Gustavo Alonso (ETHZ)
- GOLAP: A GPU-in-Data-Path Architecture for High-Speed OLAP - Nils Boeschen (TU Darmstadt)*, Tobias Ziegler (Technische Universität München), Carsten Binnig (TU Darmstadt)
- GPH: An Efficient and Effective Perfect Hashing Scheme for GPU Architectures - Jiaping Cao (Hong Kong Polytechnic University), Le Xu (The Hong Kong Polytechnic University (PolyU)), Man Lung Yiu (Hong Kong Polytechnic University), Jianbin Qin (Shenzhen Institute of Computing Sciences, Shenzhen University), Bo Tang (Southern University of Science and Technology)*
SIGMOD Industry 3: Query Optimization and Vector Databases
Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Bellevue
Session chair: NN
- AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference - Yangshen Deng (AlayaDB AI); Zhengxin You (AlayaDB AI); Long Xiang (AlayaDB AI); Qilong Li (Southern University of Science and Technology & AlayaDB AI); Peiqi Yuan (Southern University of Science and Technology & AlayaDB AI); Zhaoyang Hong (Southern Univer
- MicroNN: An On-device Disk-resident Updatable Vector Database - Jeffrey Pound (Apple); Floris Chabert (Apple); Arjun Bhushan (Apple); Ankur Goswami (Apple); Anil Pacaci (Apple); Shihabur Chowdhury (Apple)*
- OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML - Xuanhe Zhou (Shanghai Jiao Tong University); Wei Zhou (Shanghai Jiao Tong University); Liguo Qi (4Paradigm Inc.); Hao Zhang (4Paradigm Inc.); Dihao Chen (SF Express Inc.); Bingsheng He (National University of Singapore)*; Mian Lu (4Paradigm Inc.); Guolian
- Dynamic Pruning for Recursive Joins - Norifumi Nishikawa (Hitachi, Ltd.)*; Akira Shimizu (Hitachi, Ltd.); Akira Ito (Hitachi, Ltd.); Shinji Fujiwara (Hitachi, Ltd.); Yuto Hayamizu (The University of Tokyo); Masaru Kitsuregawa (The University of Tokyo); Kazuo Goda (The University of Tokyo)
- Including Bloom Filters in Bottom-up Optimization - Timothy Zeyl (Huawei)*; Qi Cheng (Huawei); Reza Pournaghi (Huawei); Jason Lam (Huawei); Weicheng Wang (Huawei); Calvin Wong (Huawei); Chong Chen (Huawei); Per-Ake Larson (Huawei)
- Query Decorrelation in the Fabric Data Warehouse - Nicolas Bruno (Microsoft)*; Cesar Galindo-Legaria (Microsoft); Milind Joshi (Microsoft)
SIGMOD DEI Panel
Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Charlottenburg III
Organizer: Donatella Firmani
Panelists:
PODS Research 8: Tutorial 2 & Parallelization Bounds for Queries
Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Charlottenburg I/II
Session chair: Zhewei Wei
- Lower Bounds for Conjunctive Query Evaluation -
Stefan Mengel (Univ. Artois, CNRS, Centre de Recherche en Informatique de Lens (CRIL))
- Parallel Communication Obliviousness: One Round and Beyond -
Yufei Tao (The Chinese University of Hong Kong), Ru Wang (The Chinese University of Hong Kong), Shiyuan Deng (The Chinese University of Hong Kong)
SIGMOD Research 9: Query Optimization
Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Potsdam I
Session chair: NN
- DPconv: Super-Polynomially Faster Join Ordering (honorable mention for best research paper) -
Mihail Stoian (UTN)*, Andreas Kipf (UTN)
- LpBound: Pessimistic Cardinality Estimation Using Lp-Norms of Degree Sequences (best research paper) -
Haozhe Zhang (University of Zurich), Christoph Mayer (University of Zurich), Mahmoud Abo Khamis (RelationalAI), Dan Olteanu (University of Zurich)*, Dan Suciu (University of Washington)
- Towards a Converged Relational-Graph Optimization Framework - Yunkai Lou (Alibaba Group), Longbin Lai (Alibaba Group)*, Bingqing Lyu (Alibaba Group), Yufan Yang (Alibaba Group), XiaoLi Zhou (Alibaba), Wenyuan Yu (Alibaba Group), Ying Zhang (University of Technology Sydney), Jingren Zhou (Alibaba Group)
- Debunking the Myth of Join Ordering: Toward Robust SQL Analytics - Junyi Zhao (Tsinghua University)*, Kai Su (Tsinghua University), Yifei Yang (University of Wisconsin, Madison), Xiangyao Yu (University of Wisconsin-Madison), Paraschos Koutris (University of Wisconsin-Madison), Huanchen Zhang (Tsinghua University)
- Logical and Physical Optimizations for SQL Query Execution over Large Language Models - Dario Satriani (UNIBAS), Enzo Veltri (Università della Basilicata)*, Donatello Santoro (Università della Basilicata), Sara Rosato (EURECOM), Simone Varriale (EURECOM), Paolo Papotti (EURECOM)
SIGMOD Research 10: Privacy in Data Management
Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Potsdam III
Session chair: NN
- Privacy and Accuracy-Aware AI/ML Model Deduplication - Hong Guan (Arizona State University)*, Lixi Zhou (Arizona State University), Lei Yu (Rensselaer Polytechnic Institute), Li Xiong (Emory University), Kanchan Chowdhury (Arizona State University), Lulu Xie (Arizona State University), Xusheng Xiao (Arizona State University), Jia Zou (Arizona State University)
- PoneglyphDB: Efficient Non-interactive Zero-Knowledge Proofs for Arbitrary SQL Queries Verification - Binbin Gu (UC Irvine)*, Faisal Nawab (University of California at Irvine), Juncheng Fang (University of California, Irvine)
- Computing Inconsistency Measures Under Differential Privacy - Shubhankar Mohapatra (University of Waterloo)*, Amir Gilad (The Hebrew University), Xi He (University of Waterloo), Benny Kimelfeld (Technion)
- PrivRM: A Framework for Range Mean Estimation under Local Differential Privacy - Liantong Yu (The Hong Kong Polytechnic University)*, Qingqing Ye (Hong Kong Polytechnic University), Rong Du (PolyU)
- Disclosure-compliant Query Answering - Rudi Poepsel-Lemaitre (Technische Universität Berlin)*, Kaustubh Beedkar (Indian Institute of Technology Delhi), Volker Markl (Technische Universität Berlin)
SIGMOD Industry 4: Graph Databases and ML
Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Bellevue
Session chair: NN
- GES: High-Performance Graph Processing Engine and Service in Huawei - Sen Gao (Huawei & SJTU)*; Jianwen Zhao (Huawei); Hao Zhang (Huawei); Shixuan Sun (Shanghai Jiao Tong University); Chen Liang (Huawei); Gongye Chen (Huawei); Wenliang Zhang (Huawei); Bo Ren (Huawei); Chao Liu (Huawei); Chengyi Zhang (Huawei); Quan Chen (Sh
- RedTAO: A Trillion-edge High-throughput Graph Store - Shihao Zhou (East China Normal University & Xiaohongshu)*; Qi Mao (Xiaohongshu); Yi Cheng (Xiaohongshu); Hongcheng Qi (Xiaohongshu); Yilun Huang (Xiaohongshu); Peng Cai (East China Normal University); Jun-Peng Zhu (East China Normal University)
- A Modular Graph-Native Query Optimization Framework - Bingqing Lyu (Alibaba Group)*; Xiaoli Zhou (Alibaba Group); Longbin Lai (Alibaba Group); Yufan Yang (Alibaba Group); Yunkai Lou (Alibaba Group); Wenyuan Yu (Alibaba Group); Ying Zhang (Zhejiang Gongshang University); Jingren Zhou (Alibaba Group)
- TigerVector: Supporting Vector Search in Graph Databases for Advanced RAGs - Shige Liu (Purdue University)*; Zhifang Zeng (TigerGraph); Li Chen (TigerGraph); Adil Ainihaer (TigerGraph); Arun Ramasami (TigerGraph); Songting Chen (TigerGraph); Yu Xu (TigerGraph); Mingxi Wu (TigerGraph); Jianguo Wang (Purdue University)
- Scheduling Data Processing Pipelines for Incremental Training on MLP-based Recommendation Models - Zihao Chen (East China Normal University); Chenyang Zhang (East China Normal University); Chen Xu (East China Normal University)*; Zhao Zhang (East China Normal University); Jiaqiang Wang (Tencent Inc.); Weining Qian (East China Normal University); Aoyi
- Rockhopper: A Robust Optimizer for Spark Configuration Tuning in Production Environment - Yiwen Zhu (Microsoft)*; Rathijit Sen (Microsoft); Brian Kroth (Microsoft); Sergiy Matusevych (Microsoft); Andreas Mueller (Microsoft); Tengfei Huang (Microsoft); Rahul Challapalli (Microsoft); Weihan Tang (Microsoft); Xin He (Microsoft); Mo Liu (Microsoft
PODS Research 9: Sequence-Based Queries
Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Charlottenburg I/II
Session chair: Jef Wijsen
- Revisiting Weighted Information Extraction: A Simpler and Faster Algorithm for Ranked Enumeration - Pawel Gawrychowski (University of Wroclaw), Florin Manea (University of Göttingen), Markus L. Schmid (Humboldt Universität Berlin)
- Output-Sensitive Evaluation of Regular Path Queries - Mahmoud Abo Khamis (RelationalAI), Ahmet Kara (OTH Regensburg), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington)
- The Complexity of Maximal Common Subsequence Enumeration - Giovanni Buzzega (Università di Pisa), Alessio Conte (Università di Pisa), Yasuaki Kobayashi (Hokkaido University), Kazuhiro Kurita (Nagoya University), Giulia Punzi (Università di Pisa)
- Towards practical FPRAS for #NFA: Exploiting the Power of Dependence - Alexis de Colnet (Algorithms and Complexity Group, TU Wien), Kuldeep S. Meel (University of Toronto)
- Complex event recognition meets hierarchical conjunctive queries - Dante Pinto (Pontificia Universidad Católica de Chile), Cristian Riveros (Pontificia Universidad Católica de Chile)
- Complex event recognition under time constraints: towards a formal framework for efficient query evaluation - Julián García (Pontificia Universidad Católica de Chile), Cristian Riveros (Pontificia Universidad Católica de Chile)
SIGMOD DEI birds of a feather
Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Charlottenburg III
Poster Session 2
Time: Wednesday, 25.06.2025, 16:00 - 17:30
- 46: Serf: Streaming Error-Bounded Floating-Point Compression
Ruiyuan Li (Chongqing University), Zechao Chen (Chongqing University), Ruyun Lu (Chongqing University), Xiaolong Xu (Chongqing University), Guangchao Yang (Chongqing University), Chao Chen (Chongqing University), Jie Bao (JDT), Yu Zheng (JD)
- 47: High-Throughput Ingestion for Video Warehouse: Comprehensive Configuration and Effective Exploration
Baiyan Zhang (Zhejiang University), Zepeng Li (Zhejiang University), Dongxiang Zhang (Zhejiang University), Huan Li (Zhejiang University), Kian-Lee Tan (National University of Singapore), Gang Chen (Zhejiang University)
- 48: In-Database Time Series Clustering
Yunxiang Su (Tsinghua University), Kenny Ye Liang (Tsinghua University), Shaoxu Song (Tsinghua University)
- 49: Tribase: A Vector Data Query Engine for Reliable and Lossless Pruning Compression using Triangle Inequalities
Qian Xu (Renmin University of China), Juan Yang (Tsinghua University), Feng Zhang (Renmin University of China), Junda Pan (Renmin University of China), Kang Chen (Tsinghua University), Youren Shen (Beijing HaiZhi XingTu Technology Co., Ltd.), Amelie Chi Zhou (Hong Kong Baptist University), Xiaoyong Du (Renmin University of China)
- 50: Auto-Test: Learning Semantic-Domain Constraints for Unsupervised Error Detection in Tables
Qixu Chen (HKUST), Yeye He (Microsoft Research), Raymond Chi-Wing Wong (Hong Kong University of Science and Technology), Weiwei Cui (Microsoft Research Asia), Song Ge (Microsoft Research Asia), Haidong Zhang (Microsoft Research Asia), Dongmei Zhang (Microsoft Research Asia), Surajit Chaudhuri (Microsoft)
- 51: Personalized Truncation for Personalized Privacy
Dajun Sun (Hong Kong University of Science and Technology), Wei Dong (Nanyang Technological University), Yuan Qiu (CNRS@CREATE & National University of Singapore), Ke Yi (Hong Kong Univ. of Science and Technology)
- 52: Faster and Efficient Density Decomposition via Proportional Response with Exponential Momentum
Quan Xue (The University of Hong Kong), T-H. Hubert Chan (The University of Hong Kong)
- 53: PDX: A Data Layout for Vector Similarity Search
Leonardo Kuffo (CWI), Elena Krippner (Technische Universität München), Peter Boncz (CWI)
- 54: Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization
Haoyang Li (Peking University), Fangcheng Fu (Peking University), Hao Ge (Peking University), Sheng Lin (Peking University), Wang Xuanyu (Peking University), Jiawen Niu (Peking University), Yujie Wang (Peking University), Hailin Zhang (Peking University), Xiaonan Nie (Peking University), Bin Cui (Peking University)
- 55: Camel: Efficient Compression of Floating-Point Time Series
Yuanyuan Yao (Zhejiang University), Lu Chen (Zhejiang University), Ziquan Fang (Zhejiang University), Yunjun Gao (Zhejiang University), Christian S. Jensen (Aalborg University), Tianyi Li (Aalborg University)
- 56: SecureXGB: A Secure and Efficient Multi-party Protocol for Vertical Federated XGBoost
Zongda Han (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications), Xiang Cheng (Beijing University of Posts and Telecommunications), Wenhong Zhao (Beijing University of Posts and Telecommunications), Jiaxin Fu (Beijing University of Posts and Telecommunications), Zhaofeng He (Beijing University of Posts and Telecommunications), Sen Su (Beijing University of Posts and Telecommunications)
- 57: Graph Edit Distance Estimation: A New Heuristic and A Holistic Evaluation of Learning-based Methods
Mouyi Xu (The University of Sydney), Lijun Chang (The University of Sydney)
- 58: Pluto: Sample Selection for Robust Anomaly Detection on Polluted Log Data
Lei Ma (WPI), Lei Cao (University of Arizona), Peter VanNostrand (WPI), Dennis Hofmann (Worcester Polytechnic Institute), Yao Su (Worcester Polytechnic Institute), Elke Rundensteiner (Worcester Polytechnic Institute)
- 59: Directional Queries: Making Top-k Queries More Effective in Discovering Relevant Results
Paolo Ciaccia (Università di Bologna), Davide Martinenghi (Politecnico di Milano)
- 60: Automatic Database Configuration Debugging using Retrieval-Augmented Language Models
Sibei Chen (Renmin University of China), Ju Fan (Renmin University of China), Bin Wu (Alibaba Group), Nan Tang (HKUST (GZ)), Chao Deng (Renmin University of China), Pengyi Wang (Renmin University of China), Ye Li (Alibaba), Jian Tan (Alibaba), Feifei Li (Alibaba Group), Jingren Zhou (Alibaba Group), Xiaoyong Du (Renmin University of China)
- 61: Relevance queries for interval data
Panagiotis Bouros (Johannes Gutenberg University Mainz), Nikos Mamoulis (University of Ioannina)
- 62: Disco: A Compact Index for LSM-trees
Wenshao Zhong (University of Illinois at Chicago), Chen Chen (University of Illinois at Chicago), Xingbo Wu (Microsoft Research), Jakob Eriksson (UIC)
- 63: InTime: Towards Performance Predictability In Byzantine Fault Tolerant Proof-of-Stake Consensus
Weijie Sun (The Hong Kong University of Science and Technology), Zihuan Xu (Shenzhen Institute of Computing Sciences), Wangze Ni (Hong Kong University of Science and Technology), Lei Chen (Hong Kong University of Science and Technology)
- 64: Atom: An Efficient Query Serving System for Embedding-based Knowledge Graph Reasoning with Operator-level Batching
Qihui Zhou (CUHK), Peiqi Yin (The Chinese University of Hong Kong), Xiao Yan (Centre for Perceptual and Interactive Intelligence (CPII)), Changji Li (CUHK), Guanxian Jiang (CUHK), James Cheng (CUHK)
- 65: MatCo: Computing Match Cover of Subgraph Query over Graph Data
Zhichao Shi (Hunan University), Youhuan Li (Hunan University), Ziming Li (Hunan University), Yuequn Dou (Hunan University), Xionghu Zhong (Hunan University), Lei Zou (Peking University)
- 66: U-DPAP: Utility-aware Efficient Range Counting on Privacy-preserving Spatial Data Federation
Yahong Chen (Central China Normal University), Xiaoyi Pang (Zhejiang University), Xiaoguang Li (Xidian University), Hanyi Wang (China Mobile (Suzhou) Software Technology Co., Ltd.), Ben Niu (Institute of Information Engineering, CAS, China), Shengnan Hu (Central China Normal University)
- 67: Shapley Value Estimation based on Differential Matrix
Junyuan Pang (Zhejiang University), Jian Pei (Duke University), Haocheng Xia (University of Illinois Urbana-Champaign), Xiang Li (Zhejiang University), Jinfei Liu (Zhejiang University)
- 68: Nested Parquet Is Flat, Why Not Use It? How To Scan Nested Data With On-the-Fly Key Generation and Joins.
Alice Rey (TUM), Maximilian Rieger (TUM), Thomas Neumann (TUM)
- 69: GTX: A Write-Optimized Latch-free Graph Data System with Transactional Support
Libin Zhou (Purdue University), Yeasir Rayhan (Purdue University), Lu Xing (Purdue University), Walid Aref (Purdue)
- 70: Online Detection of Anomalies in Temporal Knowledge Graphs with Interpretability
Jiasheng Zhang (University of Electronic Science and Technology of China), Rex Ying (Yale University), Jie Shao (University of Electronic Science and Technology of China) - 71: Styx: Transactional Stateful Functions on Streaming Dataflows
Kyriakos Psarakis (TU Delft), George Christodoulou (TU Delft), Georgios Siachamis (Inria), Marios Fragkoulis (Delft University of Technology), Asterios Katsifodimos (TU Delft) - 72: Accelerating Graph Indexing for ANNS on Modern CPUs
Mengzhao Wang (Zhejiang University), Haotian Wu (Zhejiang University), Xiangyu Ke (Zhejiang University), Yunjun Gao (Zhejiang University), Yifan Zhu (Zhejiang University), Wenchao Zhou (Alibaba Group) - 73: On the Feasibility and Benefits of Extensive Evaluation
Yujie Hui (The Ohio State University), Miao Yu (The Ohio State University), Hao Qi (UC Merced), Yifan Gan (The Ohio State University), Tianxi Li (The Ohio State University), Yuke Li (University of California, Merced), Xueyuan Ren (The Ohio State University), Sixiang Ma (The Ohio State University), Xiaoyi Lu (UC Merced), Yang Wang (The Ohio State University) - 74: Approximate DBSCAN under Differential Privacy
Yuan Qiu (CNRS@CREATE & National University of Singapore), Ke Yi (Hong Kong Univ. of Science and Technology) - 75: Data Enhancing for Machine Learning
Wenfei Fan (Univ. of Edinburgh), Xiaoyu Han (Fudan University), Weilong Ren (Shenzhen Institute of Computing Sciences), Zihuan XU (Shenzhen Institute of Computing Sciences) - 76: User-Centric Property Graph Repairs
Amedeo Pachera (Lyon 1 University), Angela Bonifati (Univ. of Lyon), Andrea Mauri (Université Claude Bernard Lyon 1) - 77: Cardinality Estimation of LIKE Predicate Queries using Deep Learning
Suyong Kwon (Seoul National University), Kyuseok Shim (Seoul National University), Woohwan Jung (Hanyang University) - 78: PQCache: Product Quantization-based KVCache for Long Context LLM Inference
Hailin Zhang (Peking University), Xiaodong Ji (Peking University), Yilin Chen (Beijing Institute of Technology), Fangcheng Fu (Peking University), Xupeng Miao (Purdue University), Xiaonan Nie (Peking University), Weipeng Chen (Baichuan Inc.), Bin Cui (Peking University) - 79: Connectivity-Oriented Property Graph Partitioning for Distributed Graph Pattern Query Processing
Min Shi (Hunan University), Peng Peng (Hunan University), Xu Zhou (Hunan University), Jiayu Liu (Hunan University), Guoqing Xiao (Hunan University), Kenli Li (Hunan University) - 80: HyperMR: Efficient Hypergraph-enhanced Matrix Storage on Compute-in-Memory Architecture
Yifan Wu (Zhejiang University), Ke Chen (Zhejiang University), Gang Chen (Zhejiang University), Dawei Jiang (Zhejiang University), Huan Li (Zhejiang University), Lidan Shou (Zhejiang University) - 81: PLM4NDV: Minimizing Data Access for Number of Distinct Values Estimation with Pre-trained Language Models
Xianghong Xu (ByteDance), Xiao He (ByteDance), Tieying Zhang (Bytedance), Lei Zhang (ByteDance), Rui Shi (ByteDance Inc.), Jianjun Chen (Bytedance) - 82: An Efficient and Exact Algorithm for Locally h-Clique Densest Subgraph Discovery
XIAOJIA XU (Renmin University of China), Haoyu Liu (Renmin University of China), Xiaowei Lv (Renmin University of China), Yongcai Wang (Renmin University of China), Deying Li (School of information, Renmin University of China) - 83: Understanding and Reusing Test Suites Across Database Systems
Suyang Zhong (National University of Singapore), Manuel Rigger (National University of Singapore) - 84: OpenSearch-SQL: Enhancing Text-to-SQL with Dynamic Few-shot and Consistency Alignment
Xiangjin Xie (Tsinghua University), Guangwei Xu (Alibaba Group), Lingyan Zhao (ZhaoLingyan), Ruijie Guo (Alibaba) - 85: Density Decomposition of Bipartite Graphs
Yalong Zhang (Beijing Institute of Technology), Ronghua Li (Beijing Institute of Technology), Qi Zhang (Beijing Institute of Technology), Hongchao Qin (Beijing Institute of Technology), Lu Qin (UTS), Guoren Wang (Beijing Institute of Technology) - 86: Schema-Based Query Optimisation for Graph Databases
Chandan Sharma (INRIA), Pierre Genevès (CNRS), Nils Gesbert (Grenoble INP), Nabil Layaïda (Inria) - 87: SPAS: Continuous Release of Data Streams under w-Event Differential Privacy
Xiaochen Li (Zhejiang University), Tianyu Li (National University of Singapore), Yitian Cheng (Xi'an Jiaotong University), Chen Gong (University of Virginia), Kui Ren (Zhejiang University), Zhan Qin (Zhejiang University), Tianhao Wang (University of Virginia) - 88: Efficiently Processing Joins and Grouped Aggregations on GPUs
Bowen Wu (ETH Zurich), Dimitrios Koutsoukos (ETHZ), Gustavo Alonso (ETHZ) - 89: HoneyComb: A Parallel Worst-Case Optimal Join on Multicores
Jiacheng Wu (University of Washington), Dan Suciu (University of Washington) - 90: Ultraverse: A System-Centric Framework for Efficient What-If Analysis for Database-Intensive Web Applications
Ronny Ko (Ohio State University and Osaka University), Chuan Xiao (Osaka University, Nagoya University), Makoto Onizuka (Osaka University), Zhiqiang Lin (The Ohio State University), Yihe Huang (Databricks) - 91: Accelerate Distributed Joins with Predicate Transfer
Yifei Yang (University of Wisconsin, Madison), Xiangyao Yu (University of Wisconsin-Madison) - 92: BPF-DB: A Kernel-Embedded Transactional Database Management System For eBPF Applications
Matthew Butrovich (Carnegie Mellon University), Samuel Arch (Carnegie Mellon University), Wan Shen Lim (Carnegie Mellon University), William Zhang (Carnegie Mellon University), Jignesh Patel (Carnegie Mellon University), Andrew Pavlo (Carnegie Mellon University) - 93: Athena: An Effective Learning-based Framework for Query Optimizer Performance Improvement
Runzhong LI (Southern University of Science and Technology), Qilong Li (Southern University of Science and Technology), Haotian Liu (Southern University of Science and Technology), Rui Mao (Shenzhen University), Qing Li (The Hong Kong Polytechnic University), Bo Tang (Southern University of Science and Technology) - 94: Centrum: Model-based Database Auto-tuning with Minimal Distributional Assumptions
Yuanhao Lai (Huawei), Pengfei Zheng (Huawei), Chenpeng Ji (Huawei), Yan Li (Huawei), Songhan Zhang (The Chinese University of Hong Kong, Shenzhen), Rutao Zhang (Huawei), Zhengang Wang (Huawei), Yunfei Du (Huawei) - 95: Incremental Rule Discovery in Response to Parameter Updates
Jiaye Zheng (ShanghaiTech University), Haoxian Chen (ShanghaiTech University), Wenfei Fan (Univ. of Edinburgh) - 96: Data Chunk Compaction in Vectorized Execution
Yiming Qiao (Tsinghua University), Huanchen Zhang (Tsinghua University) - 97: VEGA: An Active-tuning Learned Index with Group-Wise Learning Granularity
Meng Li (Nanjing University), Huayi Chai (Nanjing University), Siqiang Luo (Nanyang Technological University), Haipeng Dai (Nanjing University), Rong Gu (Nanjing University), Jiaqi Zheng (Nanjing University), Guihai Chen (Nanjing University) - 98: Bursting Flow Query on Large Temporal Flow Networks
Lyu Xu (Hong Kong Baptist University), Jiaxin Jiang (National University of Singapore), Byron Choi (Hong Kong Baptist University), Jianliang Xu (Hong Kong Baptist University), Bingsheng He (National University of Singapore) - 99: Cracking SQL Barriers: An LLM-based Dialect Translation System
Wei Zhou (Shanghai Jiao Tong University), Yuyang Gao (Tsinghua University), Xuanhe Zhou (Shanghai Jiao Tong University), Guoliang Li (Tsinghua University) - 100: Maximus: A Modular Accelerated Query Engine for Data Analytics on Heterogeneous Systems
Marko Kabić (ETH Zurich), Shriram Chandran (ETH Zürich), Gustavo Alonso (ETHZ) - 101: Zombie Hashing: Reanimating Tombstones in Graveyard
Benwei Shi (University of Utah), Yuvaraj Chesetti (Northeastern University), Jeff Phillips (University of Utah), Prashant Pandey (Northeastern University) - 102: Progressive entity resolution: a design space exploration
Jakub Maciejewski (National and Kapodistrian University of Athens), Konstantinos Nikoletos (National and Kapodistrian University of Athens), George Papadakis (University of Athens), Yannis Velegrakis (Utrecht University and University of Trento) - 103: Fast Maximum Common Subgraph Search: A Redundancy-Reduced Backtracking Approach
Kaiqiang Yu (Nanyang Technological University), Kaixin Wang (Beijing University of Technology), Cheng Long (Nanyang Technological University), Laks Lakshmanan (The University of British Columbia), Reynold Cheng (The University of Hong Kong, China) - 104: LICS: Towards Theory-Informed Effective Visual Abstraction of Property Graph Schemas
Kasidis Chantharojwong (Nanyang Technological University), Sourav S Bhowmick (Nanyang Technological University), Byron Choi (Hong Kong Baptist University) - 105: RLOMM: An Efficient and Robust Online Map Matching Framework with Reinforcement Learning
Minxiao Chen (Beijing University of Posts and Telecommunications), Haitao Yuan (Nanyang Technological University), Nan Jiang (Beijing University of Posts and Telecommunications), Zhihan Zheng (Beijing University of Posts and Telecommunications), Sai Wu (Zhejiang University), Ao Zhou (Beijing University of Posts and Telecommunications), Shangguang Wang (State Key Laboratory of Networking and Switching Technology) - 106: DISCES: Systematic Discovery of Event Stream Queries
Rebecca Sattler (Humboldt-Universität zu Berlin), Sarah Kleest-Meißner (Humboldt-Universität zu Berlin), Steven Lange (Humboldt University Berlin), Markus Schmid (HU Berlin), Nicole Schweikardt (HU Berlin), Matthias Weidlich (Humboldt-Universität zu Berlin) - 107: A Profit-Maximizing Data Marketplace with Differentially Private Federated Learning under Price Competition
Peng Sun (Hunan University), Liantao Wu (East China Normal University), Zhibo Wang (Zhejiang University), Jinfei Liu (Zhejiang University), Juan Luo (Hunan University), Wenqiang Jin (Hunan University) - 108: A Structured Study of Multivariate Time-Series Distance Measures
Jens d'Hondt (Eindhoven University of Technology), Haojun Li (The Ohio State University), Fan Yang (The Ohio State University), Odysseas Papapetrou (TU Eindhoven), John Paparrizos (The Ohio State University) - 109: Efficient Approximation Algorithms for Minimum Cost Seed Selection with Probabilistic Coverage Guarantee
Chen Feng (The Hong Kong Polytechnic University), Xingguang Chen (The Chinese University of Hong Kong), Qintian Guo (The Hong Kong University of Science and Technology), Fangyuan Zhang (The Chinese University of Hong Kong), Sibo Wang (The Chinese University of Hong Kong) - 110: Fast Hypertree Decompositions via Linear Programming: Fractional and Generalized
Anikait Mundhra (University of California, Santa Barbara), Vaishali Surianarayanan (University of California Santa Barbara), Daniel Lokshtanov (UC Santa Barbara), Ajaykrishnan E S (University of California Santa Barbara) - 111: H-Rocks CPU-GPU accelerated RocksDB on Persistent Memory
Shweta Pandey (Indian Institute of Science), Arkaprava Basu (Indian Institute Of Science) - 112: Sequoia: An Accessible and Extensible Framework for Privacy-Preserving Machine Learning over Distributed Data
Kaiqiang Xu (HKUST), Di Chai (HKUST), Junxue ZHANG (Hong Kong University of Science and Technology), Fan Lai (UIUC), Kai Chen (HKUST) - 113: SymphonyQG: towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search
Yutong Gou (Nanyang Technological University), Jianyang Gao (Nanyang Technological University), Yuexuan Xu (Nanyang Technological University), Cheng Long (Nanyang Technological University) - 114: Efficient Maximum s-Bundle Search via Local Vertex Connectivity
Yang Liu (Harbin Institute of Technology, Shenzhen), Hejiao Huang (Harbin Institute of Technology, Shenzhen), Kaiqiang Yu (Nanyang Technological University), Shengxin Liu (Harbin Institute of Technology, Shenzhen), Cheng Long (Nanyang Technological University) - 115: A Benchmark for Data Management in Microservices
Rodrigo Laigner (University of Copenhagen), Zhexiang Zhang (University of Copenhagen), Yijian Liu (University of Copenhagen), Leonardo Freitas Gomes (Amadeus), Yongluan Zhou (University of Copenhagen) - 116: Randomized Sketches for Quantile in LSM-tree based Store
Ziling Chen (Tsinghua University), Shaoxu Song (Tsinghua University) - 117: Optimizing LSM-trees via Active Learning
Weiping Yu (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Zihao Yu (Nanyang Technological University), Gao Cong (Nanyang Technological University) - 118: BCviz: A Linear-Space Index for Mining and Visualizing Cohesive Bipartite Subgraphs
Jianxiong Ye (Harbin Institute of Technology), Zhaonian Zou (Harbin Institute of Technology), Dandan Liu (Harbin Institute of Technology), Bin Yang (Harbin Institute of Technology), Xudong Liu (Harbin Institute of Technology) - 119: Practical DB-OS Co-Design with Privileged Kernel Bypass
Xinjing Zhou (Massachusetts Institute of Technology), Viktor Leis (Technische Universität München), Jinming Hu (DolphinDB), Xiangyao Yu (University of Wisconsin-Madison), Michael Stonebraker (MIT) - 120: SPARTAN: Data-Adaptive Symbolic Time-Series Approximation
Fan Yang (The Ohio State University), John Paparrizos (The Ohio State University) - 121: Integral Densest Subgraph Search on Directed Graphs
Yalong Zhang (Beijing Institute of Technology), Ronghua Li (Beijing Institute of Technology), Longlong Lin (Southwest University), Qi Zhang (Beijing Institute of Technology), Lu Qin (UTS), Guoren Wang (Beijing Institute of Technology) - 122: Using Process Calculus for Optimizing Data and Computation Sharing in Complex Stateful Parallel Computations
Zilu Tian (University of Zurich), Christoph Koch (EPFL, Switzerland), Dan Olteanu (University of Zurich) - 123: MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training
Pinxue Zhao (Peking University), Hailin Zhang (Peking University), Fangcheng Fu (Peking University), Xiaonan Nie (Peking University), Qibin Liu (Tencent.com), FANG YANG (Tencent), Yuanbo Peng (tencent.com), Dian Jiao (Tencent), Shuaipeng Li (Tencent), Jinbao Xue (Tencent), Yangyu Tao (Tencent), Bin Cui (Peking University) - 124: Mnemosyne: Dynamic Workload-Aware BF Tuning via Accurate Statistics in LSM trees
Zichen Zhu (Boston University), Yanpeng Wei (Tsinghua University), Ju Hyoung Mun (Brandeis University), Manos Athanassoulis (Boston University) - 125: Learned Offline Query Planning via Bayesian Optimization
Jeffrey Tao (University of Pennsylvania), Natalie Maus (University of Pennsylvania), Haydn Jones (University of Pennsylvania), Yimeng Zeng (University of Pennsylvania), Jacob Gardner (University of Pennsylvania), Ryan Marcus (University of Pennsylvania) - 126: Alsatian: Optimizing Model Search for Deep Transfer Learning
Nils Straßenburg (Hasso Plattner Institute, University of Potsdam), Boris Glavic (University of Illinois Chicago), Tilmann Rabl (Hasso Plattner Institute, University of Potsdam) - 127: Theoretically and Practically Efficient Maximum Defective Clique Search
Qiangqiang Dai (Beijing Institute of Technology), Ronghua Li (Beijing Institute of Technology), Donghang Cui (Beijing Institute of Technology), Guoren Wang (Beijing Institute of Technology) - 128: Two Birds with One Stone: Efficient Deep Learning over Mislabeled Data through Subset Selection
Yuhao Deng (Beijing Institute of Technology), Chengliang Chai (Beijing Institute of Technology), Kaisen Jin (Beijing Institute of Technology), Linan Zheng (University of Arizona), Lei Cao (University of Arizona/MIT), Ye Yuan (Beijing Institute of Technology), Guoren Wang (Beijing Institute of Technology) - 129: Galley: Modern Query Optimization for Sparse Tensor Programs
Kyle Deeds (University of Washington), Willow Ahrens (CSAIL MIT), Magdalena Balazinska (UW), Dan Suciu (University of Washington) - 130: Buffered Persistence in B+ Trees
Mingzhe Du (University of Rochester), Michael Scott (University of Rochester) - 131: An Adaptive Benchmark for Modeling User Exploration of Large Datasets
Joanna Purich (University of Maryland, College Park), Anthony Wise (University of Washington), Leilani Battle (University of Washington) - 132: Credible Intervals for Knowledge Graph Accuracy Estimation
Stefano Marchesin (Università di Padova), Gianmaria Silvello (University of Padova) - 133: Divide-and-Conquer: Scalable Shortest Path Counting on Large Road Networks
Muhammad Farhan (Australian National University), Henning Koehler (Massey University), Qing Wang (ANU) - 134: A Universal Sketch for Estimating Heavy Hitters and Per-Element Frequency Moments in Data Streams with Bounded Deletions
Liang Zheng (Southeast University), Qingjun Xiao (Southeast University), Xuyuan CAI (Hong Kong Polytechnic University) - 135: A Rank-Based Approach to Recommender System's Top-K Queries with Uncertain Scores
Coral Scharf (Technion - Israel Institute of Technology), Carmel Domshlak (Technion - Israel Institute of Technology), Avigdor Gal (Technion -- Israel Institute of Technology), Haggai Roitman (IBM Research Haifa) - 136: Adda: Towards Efficient in-Database Feature Generation via LLM-based Agents
Kuan Lu (Zhejiang University), Zhihui Yang (Zhejiang University), Sai Wu (Zhejiang University), Ruichen Xia (Zhejiang University), Dongxiang Zhang (Zhejiang University), Gang Chen (Zhejiang University) - 137: HotStuff-1: Linear Consensus with One-Phase Speculation
Dakai Kang (University of California, Davis), Suyash Gupta (University of Oregon), Dahlia Malkhi (UC Santa Barbara and Chainlink Labs), Mohammad Sadoghi (University of California, Davis) - 138: Dangers of List Processing in Querying Property Graphs
Amélie Gheerbrant (Université de Paris, IRIF), Leonid Libkin (University of Edinburgh & RelationalAI), Alexandra Rogova (IRIF, Université Paris Cité) - 139: Understanding the Black Box: A Deep Empirical Dive into Shapley Value Approximations for Feature Explanations
SUCHIT GUPTE (Ohio State University), John Paparrizos (The Ohio State University) - 140: Parallel kd-tree with Batch Updates
Ziyang Men (UC Riverside), Zheqi Shen (UC Riverside), Yan Gu (UC Riverside), Yihan Sun (University of California, Riverside) - 141: Computing Approximate Graph Edit Distance via Optimal Transport
Qihao Cheng (Tsinghua University), Da Yan (Indiana University Bloomington), Tianhao Wu (Tsinghua University), Zhongyi Huang (Tsinghua University), Qin Zhang (Indiana University Bloomington) - 142: Scalable Complex Event Processing on Video Streams
Chenxia Han (The Chinese University of Hong Kong), Chaokun Chang (The Chinese University of Hong Kong), Srijan Srivastava (Hong Kong Centre For Perceptual and Interactive Intelligence), YAO LU (NUS), Eric Lo (Chinese University of Hong Kong) - 143: Modyn: Data-Centric Machine Learning Pipeline Orchestration
Maximilian Böther (ETH Zurich), Ties Robroek (IT University of Copenhagen), Viktor Gsteiger (ETH Zurich), Robin Holzinger (Technical University of Munich), Xianzhe Ma (ETH Zurich), Pinar Tozun (IT University of Copenhagen), Ana Klimovic (ETH Zurich) - 144: FastPDB: Towards Bag-Probabilistic Queries at Interactive Speeds
Aaron Huber (SUNY Buffalo), Boris Glavic (University of Illinois Chicago), Oliver Kennedy (University at Buffalo, SUNY), Atri Rudra (University at Buffalo), Zhuoyue Zhao (University at Buffalo) - 145: RLER-TTE: An Efficient and Effective Framework for En Route Travel Time Estimation with Reinforcement Learning
Zhihan Zheng (Beijing University of Posts and Telecommunications), Haitao Yuan (Nanyang Technological University), Minxiao Chen (Beijing University of Posts and Telecommunications), Shangguang Wang (State Key Laboratory of Networking and Switching Technology) - 146: Revisiting the Design of In-Memory Dynamic Graph Storage
Jixian Su (Shanghai Jiao Tong University), Chiyu Hao (Shanghai Jiao Tong University), Shixuan Sun (Shanghai Jiao Tong University), Hao Zhang (Huawei), Sen Gao (National University of Singapore), Jiaxin Jiang (National University of Singapore), Yao Chen (National University of Singapore), Chenyi Zhang (Huawei Technologies), Bingsheng He (National University of Singapore), Minyi Guo (Shanghai Jiaotong University) - 147: PilotDB: Database-Agnostic Online Approximate Query Processing with A Priori Error Guarantees
Yuxuan Zhu (University of Illinois Urbana-Champaign), Tengjun Jin (UIUC), Stefanos Baziotis (University of Illinois at Urbana-Champaign), Chengsong Zhang (UIUC), Charith Mendis (University of Illinois at Urbana-Champaign), Daniel Kang (UIUC) - 148: High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance
Maximilian Kuschewski (Technische Universität München), Jana Giceva (TU Munich), Thomas Neumann (TUM), Viktor Leis (Technische Universität München) - 149: ISSD: Indicator Selection for Time Series State Detection
ChengYu Wang (NUDT), Tongqing Zhou (National University of Defense Technology), Lin Chen (National University of Defense Technology), Shan Zhao (Hefei University of Technology), Zhiping Cai (NUDT) - 150: TableDC: Deep Clustering for Tabular Data
Hafiz Tayyab Rauf (The University of Manchester), André Freitas (University of Manchester), Norman Paton (University of Manchester) - 151: Clementi: Efficient Load Balancing and Communication Overlap for Multi-FPGA Graph Processing
Feng Yu (National University of Singapore), Hongshi Tan (National University of Singapore), Xinyu Chen (Hong Kong University of Science and Technology (Guangzhou)), Yao Chen (National University of Singapore), Bingsheng He (National University of Singapore), Weng-Fai Wong (National University of Singapore) - 152: Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs
Reham Omar (Concordia University), Omij Mangukiya (Concordia University), Essam Mansour (Concordia University) - 153: A Cost-Effective LLM-based Approach to Identify Wildlife Trafficking in Online Marketplaces (Data-Driven Applications)
Juliana Silva Barbosa (NYU), Sunandan Chakraborty (Indiana University Indianapolis), Ulhas Gondhali (CUNY graduate center, John Jay College of Criminal Justice), Jennifer Jacquet (University of Miami), Gohar Petrossian (John Jay College of Criminal Justice), Kinshuk Sharma (Indiana University Indianapolis), Juliana Freire (New York University) - 154: GABoost: Graph Alignment Boosting via Local Optimum Escape
Wei Liu (Peking University), Wei Zhang (Peking University), Haiyan Zhao (Peking University), Zhi Jin (Key Lab of High Confidence Software Technologies (Peking University), Ministry o) - 155: Federated Heavy Hitter Analytics with Local Differential Privacy
Yuemin Zhang (The Hong Kong Polytechnic University), Qingqing Ye (Hong Kong Polytechnic University), Haibo Hu (Hong Kong Polytechnic University) - 156: RWalks: Random Walks as Attribute Diffusers for Filtered Vector Search
Anas AIT AOMAR (Mohammed VI Polytechnic University), Karima Echihabi (Mohammed VI Polytechnic University), Marco Arnaboldi (Oracle), Ioannis Alagiannis (Oracle), Damien Hilloulin (Oracle), Manal CHERKAOUI (Mohammed VI Polytechnic University) - 157: Dupin: A Parallel Framework for Densest Subgraph Discovery in Fraud Detection on Massive Graphs
Jiaxin Jiang (National University of Singapore), Siyuan Yao (National University of Singapore), Yuchen Li (Singapore Management University), Qiange Wang (National University of Singapore), Bingsheng He (National University of Singapore), Min Chen (Grab) - 158: Fair and Actionable Causal Prescription Ruleset
Benton Li (Cornell University), Nativ Levy (Technion), Brit Youngmann (Technion), Sainyam Galhotra (Cornell University), Sudeepa Roy (Duke University, USA) - 159: Robust Privacy-Preserving Triangle Counting under Edge Local Differential Privacy
Yizhang He (The University of New South Wales), Kai Wang (Shanghai Jiao Tong University), Wenjie Zhang (University of New South Wales), Xuemin Lin (Shanghai Jiaotong University), Ying Zhang (University of Technology Sydney), Wei Ni (CSIRO) - 160: Parallel k-Core Decomposition: Theory and Practice
Youzhe Liu (University of California, Riverside), Xiaojun Dong (University of California, Riverside), Yan Gu (UC Riverside), Yihan Sun (University of California, Riverside) - 161: CARINA: An Efficient CXL-Oriented Embedding Serving System for Recommendation Models
Peiqi Yin (The Chinese University of Hong Kong), Qihui Zhou (CUHK), Xiao Yan (Centre for Perceptual and Interactive Intelligence (CPII)), Chao Wang (The Chinese University of Hong Kong), Eric Lo (Chinese University of Hong Kong), Changji Li (CUHK), Lan Lu (University of Pennsylvania), Hua Fan (Alibaba Cloud), Wenchao Zhou (Alibaba Group), Ming-Chang YANG (The Chinese University of Hong Kong), James Cheng (CUHK) - 162: DataVinci: Learning Syntactic and Semantic String Repairs
Mukul Singh (Microsoft), José Cambronero Sanchez (Microsoft), Sumit Gulwani (Microsoft Research), Vu Le (Microsoft), Carina Negreanu (Robin AI), Arjun Radhakrishna (Microsoft), Gust Verbruggen (Microsoft) - 163: AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language Models
Runjie Yu (Huazhong University of Science and Technology), Weizhou Huang (Huazhong University of Science and Technology), Shuhan Bai (Huazhong University of Science and Technology), Jian Zhou (Huazhong University of Science and Technology), Fei Wu (Huazhong University of Science and Technology) - 164: PrivPetal: Relational Data Synthesis via Permutation Relations
Kuntai Cai (National University of Singapore), Xiaokui Xiao (National University of Singapore), Yin Yang (Hamad bin Khalifa University) - 165: RM^2: Answer Counting Queries Efficiently under Shuffle Differential Privacy
Qiyao Luo (OceanBase, Ant Group), Jianzhe Yu (Hong Kong University of Science and Technology), Wei Dong (Nanyang Technological University), Quanqing Xu (OceanBase, Ant Group), Chuanhui Yang (OceanBase), Ke Yi (Hong Kong Univ. of Science and Technology) - 166: Vectorizing Distributed Graph Computations made Automated
Wenyue Zhao (University of Edinburgh), Yang Cao (University of Edinburgh), Peter Buneman (The University of Edinburgh), Jia Li (Edinburgh Research Center, Central Software Institute, Huawei), Nikos Ntarmos (Edinburgh Research Center, Central Software Institute, Huawei) - 167: QURE: AI-Assisted and Automatically Verified UDF Inlining
Tarique Siddiqui (Microsoft Research), Arnd König (Microsoft), Jiashen Cao (Georgia Tech), Cong Yan (Microsoft Research), Shuvendu Lahiri (Microsoft Research) - 168: Finding Logic Bugs in Graph-processing Systems via Graph-cutting
Qiuyang Mang (The Chinese University of Hong Kong, Shenzhen), Jinsheng Ba (ETH Zurich), Pinjia He (The Chinese University of Hong Kong, Shenzhen), Manuel Rigger (National University of Singapore) - 169: Lossless Transformation of Knowledge Graphs to Property Graphs using Standardized Schemas
Kashif Rabbani (Aalborg University Denmark), Matteo Lissandrini (University of Verona), Angela Bonifati (Univ. of Lyon), Katja Hose (TU Wien) - 170: Self-Enhancing Video Data Management System for Compositional Events with Large Language Models
Enhao Zhang (University of Washington), Nicole Sullivan (University of Washington), Brandon Haynes (Microsoft Gray Systems Lab), Ranjay Krishna (University of Washington), Magdalena Balazinska (UW) - 171: Efficient and Accurate Differentially Private Cardinality Continual Releases
Dongdong Xie (Xi'an Jiaotong University), Pinghui Wang (Xi'an Jiaotong University), Quanqing Xu (OceanBase, Ant Group), Chuanhui Yang (OceanBase), Rundong Li (Xi'an Jiaotong University)
PODS Research 10: Infinity and Inconsistency
Time: Wednesday, 25.06.2025, 16:00 - 17:30
Location: Charlottenburg I/II
Session chair: Mahmoud Abo Khamis
- Restricted Chase Termination: You Want More than Fairness - David Carral (LIRMM, Inria, University of Montpellier, CNRS), Lukas Gerlach (Knowledge-Based Systems Group, TU Dresden), Lucas Larroque (DI ENS), Michaël Thomazo (Inria, DIENS, ENS, CNRS, PSL University)
- No Cliques Allowed: The Next Step Towards BDD/FC Conjecture - Lucas Larroque (DI ENS), Piotr Ostropolski-Nalewaja (University of Wrocław / TU Dresden), Michaël Thomazo (Inria, DIENS, ENS, CNRS, PSL University)
- Polynomial Time Convergence of the Iterative Evaluation of Datalog° Programs - Sungjin Im (University of California, Merced), Benjamin Moseley (Carnegie Mellon University), Hung Ngo (RelationalAI Inc.), Kirk Pruhs (University of Pittsburgh)
- Below and Above Why-Provenance for Datalog Queries - Marco Calautti (University of Milano), Ester Livshits (University of Edinburgh), Andreas Pieris (University of Edinburgh and University of Cyprus), Markus Schneider (University of Edinburgh)
- Rewriting Consistent Answers on Annotated Data - Phokion Kolaitis (University of California Santa Cruz and IBM Research), Nina Pardal (University of Southampton), Jonni Virtema (University of Sheffield), Jef Wijsen (Université de Mons)
- Computing Range Consistent Answers to Aggregation Queries via Rewriting - Aziz Amezian El Khalfioui (University of Mons), Jef Wijsen (University of Mons)
SIGMOD Keynote 3 & Awards Talks 3
Time: Thursday, 26.06.2025, 08:30 - 10:30
Location: Potsdam I & III
- Keynote: Fifty Years of Transaction Processing Research (Phil Bernstein)
Phil Bernstein (Microsoft)
Title: Fifty Years of Transaction Processing Research
Abstract: Fifty years ago, Jim Gray and his IBM colleagues published their first papers that defined the transaction abstraction and mechanisms to support it: two-phase locking for isolation and logging for atomicity and durability. Three years later, I published my first paper on the topic. By 1993, when Gray and Reuter published their now classic book, Transaction Processing: Concepts and Techniques, the transaction problem seemed to be solved. Yet research continues, as it should. There are several drivers of this research: algorithmic innovation (such as ARIES), invention of weakened isolation levels that have better performance (such as snapshot isolation), the advent of new platform architectures (such as multicore), changing requirements (such as prioritizing scalability over efficiency), and leveraging new mechanisms (such as RDMA). In this talk, I will recount some history of transaction research, explain why transaction research continues to this day, and speculate about its future.
Bio: Philip A. Bernstein is a Distinguished Scientist at Microsoft Research. Over the past 50 years, he has been a product architect at Microsoft and Digital Equipment Corp., a professor at Harvard University and Wang Institute of Graduate Studies, and a VP Software at Sequoia Systems. During that time, he has co-authored over 200 papers on database management and two books on the theory and implementation of transaction processing systems. He is a Fellow of the ACM and AAAS, a winner of the E.F. Codd SIGMOD Innovations Award, a member of the Washington State Academy of Sciences, and a member of the National Academy of Engineering. He received a B.S. degree from Cornell and M.Sc. and Ph.D. from University of Toronto. More details are at https://research.microsoft.com/~philbe.
- Award talks
Poster Session 3
Time: Thursday, 26.06.2025, 10:30 - 11:30
- 173: Efficient Indexing for Flexible Label-Constrained Shortest Path Queries in Road Networks
Libin Wang (Hong Kong University of Science and Technology), Raymond Chi-Wing Wong (Hong Kong University of Science and Technology) - 174: How to Grow an LSM-tree: Towards Bridging The Gap Between Theory and Practice
Dingheng Mo (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Stratos Idreos (Harvard) - 175: NEXT: A New Secondary Index Framework for LSM-based Data Storage
JIACHEN SHI (Nanyang Technological University), Jingyi Yang (Nanyang Technological University), Gao Cong (Nanyang Technological University), Xiaoli Li (Institute for Infocomm Research, ASTAR, Singapore/Nanyang Technological University) - 176: A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach
Taiyi Wang (University of Cambridge), Liang Liang (Imperial College London), Guang Yang (Neo4j), Thomas Heinis (Imperial College), Eiko Yoneki (University of Cambridge) - 177: BT-Tree: A Reinforcement Learning Based Index for Big Trajectory Data
Tu Gu (Nanyang Technological University), Kaiyu Feng (Beijing Institute of Technology), Jingyi Yang (Nanyang Technological University), Gao Cong (Nanyang Technological University), Cheng Long (Nanyang Technological University), Rui Zhang (ruizhang.info) - 178: B$\circledS X$ : Subgraph Matching with Batch Backtracking Search
Yujie Lu (Fudan University), Zhijie Zhang (Fudan University), Weiguo Zheng (Fudan University) - 179: Constant-time Connectivity Querying in Dynamic Graphs
Lantian Xu (University of Technology Sydney), Dong Wen (University of New South Wales), Lu Qin (UTS), Ronghua Li (Beijing Institute of Technology), Ying Zhang (University of Technology Sydney), Xuemin Lin (Shanghai Jiaotong University) - 180: A Local Search Approach to Efficient (k,p)-Core Maintenance
Chenghan Zhang (Wuhan University), Yuanyuan Zhu (Wuhan University), Lijun Chang (The University of Sydney) - 181: SBSC: A fast Self-tuned Bipartite proximity graph-based Spectral Clustering
Abdul Khan (PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur), Rashmi Maheshwari (IIITDM Jabalpur), Mohammad Maksood Akhter (PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur), Dr. Sraban Kumar Mohanty (IIIT Jabalpur) - 182: Subgroup Discovery with Small and Alternative Feature Sets
Jakob Bach (Karlsruhe Institute of Technology (KIT)) - 183: Graph-Based Vector Search: An Experimental Evaluation of the State-of-the-Art
Ilias Azizi (Mohammed VI Polytechnic University), Karima Echihabi (Mohammed VI Polytechnic University), Themis Palpanas (Université Paris Cité) - 184: DEG: Efficient Hybrid Vector Search Using the Dynamic Edge Navigation Graph
Ziqi Yin (Nanyang Technological University), Jianyang Gao (Nanyang Technological University), Pasquale Balsebre (Nanyang Technological University), Gao Cong (Nanyang Technological University), Cheng Long (Nanyang Technological University) - 185: Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search
Jiuqi Wei (Institute of Computing Technology, Chinese Academy of Sciences), Xiaodong Lee (ICT), Zhenyu Liao (Huazhong University of Science and Technology), Themis Palpanas (Université Paris Cité), Botao Peng (Institute of Computing Technology, Chinese Academy of Sciences) - 186: iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search
Yuexuan Xu (Nanyang Technological University), Jianyang Gao (Nanyang Technological University), Yutong Gou (Nanyang Technological University), Cheng Long (Nanyang Technological University), Christian S. Jensen (Aalborg University) - 187: MIRAGE-ANNS: Mixed Approach Graph-based Indexing for Approximate Nearest Neighbor Search
Sairaj Voruganti (University of Waterloo), Tamer Özsu (University of Waterloo) - 188: Automated Validating and Fixing of Text-to-SQL Translation with Execution Consistency
Yicun Yang (Shanghai Jiao Tong University), Zhaoguo Wang (Shanghai Jiao Tong University), Yu Xia (Shanghai Jiao Tong University), Zhuoran Wei (Shanghai Jiao Tong University), Haoran Ding (Shanghai Jiao Tong University), Ruzica Piskac (Yale University), Haibo Chen (Shanghai Jiao Tong University), Jinyang Li (New York University) - 189: Reliable Text-to-SQL with Adaptive Abstention
Kaiwen Chen (University of Toronto), Yueting Chen (Seattle University), Nick Koudas (University of Toronto), Xiaohui Yu (York University) - 190: SNAILS: Schema Naming Assessments for Improved LLM-Based SQL Inference
Kyle Luoma (University of California, San Diego), Arun Kumar (University of California, San Diego) - 191: Hydro: Adaptive Query Processing of ML Queries
Gaurav Tarlok Kakkar (Georgia Institute of Technology), Jiashen Cao (Georgia Tech), Aubhro Sengupta (Georgia Institute of Technology), Joy Arulraj (Georgia Tech), Hyesoon Kim (Georgia Tech) - 192: Mitigating the Impedance Mismatch between Prediction Query Execution and Database Engine
Chenyang Zhang (East China Normal University), Junxiong Peng (East China Normal University), Chen Xu (East China Normal University), Quanqing Xu (OceanBase, Ant Group), Chuanhui Yang (OceanBase) - 193: How Good are Learned Cost Models, Really? Insights from Query Optimization Tasks
Roman Heinrich (DFKI Darmstadt & TU Darmstadt), Manisha Luthra (TU Darmstadt and DFKI), Johannes Wehrstein (TU Darmstadt), Harald Kornmayer (DHBW Mannheim), Carsten Binnig (TU Darmstadt) - 194: Low Rank Learning for Offline Query Optimization
Zixuan Yi (University of Pennsylvania), Yao Tian (The Hong Kong University of Science and Technology), Zack Ives (University of Pennsylvania), Ryan Marcus (University of Pennsylvania) - 195: Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve
Xu Chen (University of Electronic Science and Technology of China), Shuncheng Liu (University of Electronic Science and Technology of China), Tong Yuan (Huawei Technologies Co., Ltd.), Tao Ye (Huawei), Kai Zeng (Huawei Technologies Co. Ltd.), Han Su (University of Electronic Science and Technology of China), Kai Zheng (University of Electronic Science and Technology of China) - 196: SPACE: Cardinality Estimation for Path Queries Using Cardinality-Aware Sequence-based Learning
Mehmet Aytimur (University of Konstanz), Theodoros Chondrogiannis (University of Konstanz), Michael Grossniklaus (University of Konstanz) - 197: T3: Accurate and Fast Performance Prediction for Relational Database Systems With Compiled Decision Trees
Maximilian Rieger (TUM), Thomas Neumann (TUM) - 198: Community Detection in Heterogeneous Information Networks Without Materialization
Jiaxin Jiang (National University of Singapore), Siyuan Yao (National University of Singapore), Yuhang Chen (National University of Singapore), Bingsheng He (National University of Singapore), Yudong Niu (Singapore Management University), Yuchen Li (Singapore Management University), Shixuan Sun (Shanghai Jiao Tong University), Yongchao Liu (Ant Group) - 199: Dual-Hierarchy Labelling: Scaling Up Distance Queries on Dynamic Road Networks
Muhammad Farhan (Australian National University), Henning Koehler (Massey University), Qing Wang (ANU) - 200: Cohesiveness-aware Hierarchical Compressed Index for Community Search over Attributed Graphs
Yuxiang Wang (Hangzhou Dianzi University), Zhangyang Peng (Hangzhou Dianzi University), Xiangyu Ke (Zhejiang University), Xiaoliang Xu (Hangzhou Dianzi University), Tianxing Wu (Southeast University), Yuan Gao (Hangzhou Dianzi University) - 201: Deep Overlapping Community Search via Subspace Embedding
Qing Sima (University of New South Wales), Jianke Yu (University of Technology Sydney), Xiaoyang Wang (University of New South Wales), Wenjie Zhang (University of New South Wales), Ying Zhang (University of Technology Sydney), Xuemin Lin (Shanghai Jiaotong University) - 202: A Lovász-Simonovits Theorem for Hypergraphs with Application to Local Clustering
Raj Kamal (Indian Institute of Technology Delhi), Amitabha Bagchi (IIT Delhi) - 203: CRDV: Conflict-free Replicated Data Views
Nuno Faria (INESCTEC & U. Minho), José Pereira (U. Minho & INESCTEC) - 204: Are database system researchers making correct assumptions about transaction workloads?
Cuong Nguyen (University of Maryland), Kevin Chen (8VC), Christopher DeCarolis (Independent), Daniel Abadi (UMD) - 205: Low-Latency Transaction Scheduling via Userspace Interrupts: Why Wait or Yield When You Can Preempt?
Kaisong Huang (Simon Fraser University), Jiatang Zhou (Simon Fraser University), Zhuoyue Zhao (University at Buffalo), Dong Xie (Penn State University), Tianzheng Wang (Simon Fraser University) - 206: Moving on From Group Commit: Autonomous Commit Enables High Throughput and Low Latency on NVMe SSDs
Lam-Duy Nguyen (Technische Universität München), Adnan Alhomssi (University of Erlangen-Nürnberg), Tobias Ziegler (Technische Universität München), Viktor Leis (Technische Universität München) - 207: Boosting OLTP Performance with Per-Page Logging on NVDIMM
Seongjae Moon (Sungkyunkwan University), Bohyun Lee (Technische Universität München), Jonghyeok Park (Korea University), Sang-Won Lee (Seoul National University) - 208: SuSe: Summary Selection for Regular Expression Subsequence Aggregation over Streams
Steven Purtzel (Humboldt-Universität zu Berlin), Matthias Weidlich (Humboldt-Universität zu Berlin) - 209: Pandora: An Efficient and Rapid Solution for Persistence-Based Tasks in High-Speed Data Streams
Weihe Li (University of Edinburgh) - 210: SwiftSpatial: Spatial Joins on Modern Hardware
Wenqi Jiang (ETH Zurich), Oleh-Yevhen Khavrona (ETHZ), Martin Parvanov (ETH Zurich), Gustavo Alonso (ETHZ) - 211: GOLAP: A GPU-in-Data-Path Architecture for High-Speed OLAP
Nils Boeschen (TU Darmstadt), Tobias Ziegler (Technische Universität München), Carsten Binnig (TU Darmstadt) - 212: GPH: An Efficient and Effective Perfect Hashing Scheme for GPU Architectures
Jiaping Cao (Hong Kong Polytechnic University), Le Xu (The Hong Kong Polytechnic University (PolyU)), Man Lung Yiu (Hong Kong Polytechnic University), Jianbin Qin (Shenzhen Institute of Computing Sciences, Shenzhen University), Bo Tang (Southern University of Science and Technology) - 213: DPconv: Super-Polynomially Faster Join Ordering
Mihail Stoian (UTN), Andreas Kipf (UTN) - 214: LpBound: Pessimistic Cardinality Estimation Using Lp-Norms of Degree Sequences
Haozhe Zhang (University of Zurich), Christoph Mayer (University of Zurich), Mahmoud Abo Khamis (RelationalAI), Dan Olteanu (University of Zurich), Dan Suciu (University of Washington) - 215: Towards a Converged Relational-Graph Optimization Framework
Yunkai Lou (Alibaba Group), Longbin Lai (Alibaba Group), Bingqing Lyu (Alibaba Group), Yufan Yang (Alibaba Group), XiaoLi Zhou (Alibaba), Wenyuan Yu (Alibaba Group), Ying Zhang (University of Technology Sydney), Jingren Zhou (Alibaba Group) - 216: Debunking the Myth of Join Ordering: Toward Robust SQL Analytics
Junyi Zhao (Tsinghua University), Kai Su (Tsinghua University), Yifei Yang (University of Wisconsin, Madison), Xiangyao Yu (University of Wisconsin-Madison), Paraschos Koutris (University of Wisconsin-Madison), Huanchen Zhang (Tsinghua University) - 217: Logical and Physical Optimizations for SQL Query Execution over Large Language Models
Dario Satriani (UNIBAS), Enzo Veltri (Università della Basilicata), Donatello Santoro (Università della Basilicata), Sara Rosato (EURECOM), Simone Varriale (EURECOM), Paolo Papotti (EURECOM) - 218: Privacy and Accuracy-Aware AI/ML Model Deduplication
Hong Guan (Arizona State University), Lixi Zhou (Arizona State University), Lei Yu (Rensselaer Polytechnic Institute), Li Xiong (Emory University), Kanchan Chowdhury (Arizona State University), Lulu Xie (Arizona State University), Xusheng Xiao (Arizona State University), Jia Zou (Arizona State University) - 219: PoneglyphDB: Efficient Non-interactive Zero-Knowledge Proofs for Arbitrary SQL Queries Verification
Binbin Gu (UC Irvine), Faisal Nawab (University of California at Irvine), Juncheng Fang (University of California, Irvine) - 220: Computing Inconsistency Measures Under Differential Privacy
Shubhankar Mohapatra (University of Waterloo), Amir Gilad (The Hebrew University), Xi He (University of Waterloo), Benny Kimelfeld (Technion) - 221: PrivRM: A Framework for Range Mean Estimation under Local Differential Privacy
Liantong YU (The Hong Kong Polytechnic University), Qingqing Ye (Hong Kong Polytechnic University), Rong Du (PolyU) - 222: Disclosure-compliant Query Answering
Rudi Poepsel-Lemaitre (Technische Universität Berlin), Kaustubh Beedkar (Indian Institute of Technology Delhi), Volker Markl (Technische Universität Berlin) - 223: Near-Duplicate Sequence Alignment with One Permutation Hashing
Zhencan Peng (Rutgers University), Yuheng Zhang (Rutgers University), Dong Deng (Rutgers University - New Brunswick) - 224: Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality
Xilin Tang (Renmin University of China), Feng Zhang (Renmin University of China), Shuhao Zhang (Huazhong University of Science and Technology), Yani Liu (Renmin University of China), Bingsheng He (National University of Singapore), Xiaoyong Du (Renmin University of China) - 225: Adaptive Quotient Filters
Richard Wen (University of Maryland), Hunter Mccoy (University of Utah), David Tench (Lawrence Berkeley National Labs), Guido Tagliavini (Rutgers University), Michael Bender (Stony Brook), Alex Conway (Cornell Tech), Martin Farach-Colton (Rutgers University), Rob Johnson (VMware Research), Prashant Pandey (University of Utah) - 226: FAAQP: Fast and Accurate Approximate Query Processing based on Bitmap-augmented Sum-Product Network
Hanbing Zhang (Fudan University), Yinan Jing (Fudan University), Zhenying He (Fudan University), Kai Zhang (Fudan University), X. Sean Wang (Fudan University) - 227: Approximating Opaque Top-k Queries
Jiwon Chang (University of Rochester), Fatemeh Nargesian (University of Rochester) - 228: LSMGraph: A High-Performance Dynamic Graph Storage System with Multi-level CSR
Song Yu (Northeastern University), Shufeng Gong (Northeastern University), Qian Tao (Alibaba Group), Sijie Shen (Alibaba Group), Yanfeng Zhang (Northeastern University), Wenyuan Yu (Alibaba Group), Pengxi Liu (Northeastern University), Zhixin Zhang (Northeastern University), Hongfu Li (Northeastern University), Luo Xiaojian (Alibaba Group), Ge Yu (Northeastern University), Jingren Zhou (Alibaba Group) - 229: An experimental comparison of tree-data structures for connectivity queries on fully-dynamic undirected graphs
Qing Chen (University of Zürich), Michael Böhlen (University of Zurich), Sven Helmer (University of Zurich) - 230: TGraph: A Tensor-centric Graph Processing Framework
YongLiang Zhang (Wuhan University), Yuanyuan Zhu (Wuhan University), Hao Zhang (Huawei), Congli Gao (Huawei Technologies CO LTD), Yuyang Wang (Wuhan University), Guojing Li (Wuhan University), Tianyang Xu (Wuhan University), Ming Zhong (Wuhan University), Jiawei Jiang (Wuhan University), Tieyun Qian (Wuhan University), Chenyi Zhang (Huawei Technologies), Jeffrey Xu Yu (Chinese University of Hong Kong) - 231: Efficient Index Maintenance for Effective Resistance Computation on Evolving Graphs
Meihao Liao (Beijing Institute of Technology), Cheng Li (Beijing Institute of Technology), Ronghua Li (Beijing Institute of Technology), Guoren Wang (Beijing Institute of Technology) - 232: Accelerating Core Decomposition in Billion-Scale Hypergraphs
Wenqian Zhang (The University of New South Wales), Zhengyi Yang (University of New South Wales), Dong Wen (University of New South Wales), Wentao Li (University of Leicester), Wenjie Zhang (University of New South Wales), Xuemin Lin (Shanghai Jiaotong University) - 233: GPU-Accelerated Graph Cleaning with a Single Machine
Wenchao Bai (Southeast University), Wenfei Fan (Univ. of Edinburgh), Shuhao Liu (Shenzhen Institute of Computing Sciences), Kehan Pang (Beihang University), Xiaoke Zhu (Beihang University), Jiahui Jin (Southeast University) - 234: The Best of Both Worlds: On Repairing Timestamps and Attribute Values for Multivariate Time Series
Jingyu Zhu (Nankai University), Weiwei Deng (Nankai University), Yu Sun (Nankai University), Shaoxu Song (Tsinghua University), Haiwei Zhang (Nankai University), Xiaojie Yuan (Nankai University) - 235: Table Overlap Estimation through Graph Embeddings
Francesco Pugnaloni (Hasso Plattner Institute), Luca Zecchini (University of Modena and Reggio Emilia), Matteo Paganelli (University of Modena and Reggio Emilia), Matteo Lissandrini (University of Verona), Felix Naumann (Hasso Plattner Institute, University of Potsdam), Giovanni Simonini (University of Modena and Reggio Emilia) - 236: Discovering Top-k Relevant and Diversified Rules
Wenfei Fan (Univ. of Edinburgh), Ziyan Han (Beihang University), Min Xie (Shenzhen Institute of Computing Sciences), Guangyi Zhang (Shenzhen Technology University) - 237: Provenance-Enabled Explainable AI
Jiachi Zhang (Alibaba Group), Wenchao Zhou (Alibaba Group), Benjamin Ujcich (Georgetown University) - 238: Entity/Relationship Graphs
Philipp Skavantzos (The University of Auckland), Sebastian Link (University of Auckland) - 239: Synthesizing Third Normal Form Schemata that Minimize Integrity Maintenance and Update Overheads
Zhuoxing Zhang (The University of Auckland), Sebastian Link (University of Auckland) - 240: Physical Visualization Design: Decoupling Interface and System Design
Yiru Chen (Columbia University), Xupeng Li (Columbia University), Jeffrey Tao (University of Pennsylvania), Alana Ramjit (Cornell Tech), Ravi Netravali (Princeton University), Subrata Mitra (Adobe Research), Aditya Parameswaran (University of California, Berkeley), Javad Ghaderi (Columbia University), Dan Rubenstein (Columbia University), Eugene Wu (Columbia University) - 241: Interactive Graph Search Made Simple
Shangqi Lu (Hong Kong University of Science and Technology (Guangzhou)), Ru Wang (Chinese University of Hong Kong), Yufei Tao (The Chinese University of Hong Kong) - 242: SketchQL: Video Moment Querying with a Visual Query Interface
Renzhi Wu (Georgia Institute of Technology), Pramod Chunduri (Georgia Institute of Technology), Ali Payani (Cisco Systems Inc.), Xu Chu (GATECH), Joy Arulraj (Georgia Tech), Kexin Rong (Georgia Institute of Technology) - 243: Intra-Query Runtime Elasticity for Cloud-Native Data Analysis
Xukang Zhang (Renmin University of China), Huanchen Zhang (Tsinghua University), Xiaofeng Meng (Renmin University of China) - 244: Live Patching for Distributed In-Memory Key-Value Stores
Michael Fruth (University of Passau), Stefanie Scherzinger (University of Passau) - 245: SHIELD: Encrypting Persistent Data of LSM-KVS from Monolithic to Disaggregated Datacenters
Viraj Thakkar (Arizona State University), Dongha Kim (Arizona State University), Yingchun Lai (Apache/Pegasus), Hokeun Kim (Arizona State University), Zhichao Cao (Arizona State University) - 246: λ-Tune: Harnessing Large Language Models for Automated Database System Tuning
Victor Giannakouris (Cornell University), Immanuel Trummer (Cornell University) - 247: Constant Optimization Driven Database System Testing
Chi Zhang (Nanjing University), Manuel Rigger (National University of Singapore) - 248: Capsule: an Out-of-Core Training Mechanism for Colossal GNNs
Yongan Xiang (University of Science and Technology of China), Zezhong Ding (University of Science and Technology of China), Rui Guo (University of Science and Technology of China), Shangyou Wang (University of Science and Technology of China), Xike Xie (University of Science and Technology of China), S. Kevin Zhou (USTC) - 249: CtxPipe: Context-aware Data Preparation Pipeline Construction for Machine Learning
Haotian Gao (National University of Singapore), Shaofeng Cai (National University of Singapore), Anh Dinh (Deakin University), Zhiyong Huang (NUS School of Computing), Beng Chin Ooi (NUS) - 250: Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving
Shihong Gao (The Hong Kong University of Science and Technology), Xin Zhang (Hong Kong University of Science and Technology), Yanyan Shen (Shanghai Jiao Tong University), Lei Chen (Hong Kong University of Science and Technology) - 251: LCP: Enhancing Scientific Data Management with Lossy Compression for Particles
Longtao Zhang (Florida State University), Ruoyu Li (Florida State University), Congrong Ren (The Ohio State University), Sheng Di (Argonne National Laboratory, Lemont, IL), Jinyang Liu (University of Houston), Jiajun Huang (UCR), Robert Underwood (Argonne National Laboratory), Pascal Grosset (Los Alamos National Laboratory), Dingwen Tao (Institute of Computing Technology, Chinese Academy of Sciences), Xin Liang (University of Kentucky), Hanqi Guo (The Ohio State University), Franck Cappello (Argonne National Laboratory, Lemont, IL), Kai Zhao (Florida State University) - 252: Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
Xiaozhen Liu (University of California, Irvine), Yicong Huang (UC Irvine), Xinyuan Lin (University of California Irvine), Avinash Kumar (UC Irvine), Sadeem Alsudais (King Saud University), Chen Li (UC Irvine)
Tutorial 5: Autotuning Systems: Techniques, Challenges, and Opportunities
Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Charlottenburg III
Brian Kroth (Microsoft), Sergiy Matusevych (Microsoft), Yiwen Zhu (Microsoft)
Tutorial 6: Data+AI: LLM4Data and Data4LLM
Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Bellevue
Guoliang Li (Tsinghua University), Jiayi Wang (Tsinghua University), Chenyang Zhang (Tsinghua University), Jiannan Wang (Huawei Technologies, Simon Fraser University)
SIGMOD Panel 2: Where Does Academic Database Research Go From Here?
Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Potsdam I
- Organizers
- Eugene Wu
- Raul Castro Fernandez
- Panelists:
- Dan Suciu (Microsoft Endowed Professor in Computer Science and Engineering at the University of Washington)
- Sihem Amer-Yahia (Research Director, CNRS, LIG; Vice President of VLDB Endowment)
- Yannis Ioannidis (Professor of Informatics & Telecom at the University of Athens, President of the Association for Computing Machinery)
- Anastasia Ailamaki (Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne)
- Ippokratis Pandis (VP/Distinguished Engineer at AWS)
- Jens Dittrich (Professor of Computer Science at Saarland University; 3x CIDR gong show winner)
SIGMOD Research 11: Approximate Query Processing and Sequences
Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Potsdam III
Session chair: NN
- Near-Duplicate Sequence Alignment with One Permutation Hashing - Zhencan Peng (Rutgers University)*, Yuheng Zhang (Rutgers University), Dong Deng (Rutgers University - New Brunswick)
- Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality - Xilin Tang (Renmin University of China)*, Feng Zhang (Renmin University of China), Shuhao Zhang (Huazhong University of Science and Technology), Yani Liu (Renmin University of China), Bingsheng He (National University of Singapore), Xiaoyong Du (Renmin University of China)
- Adaptive Quotient Filters - Richard Wen (University of Maryland), Hunter Mccoy (University of Utah), David Tench (Lawrence Berkeley National Labs), Guido Tagliavini (Rutgers University), Michael Bender (Stony Brook), Alex Conway (Cornell Tech), Martin Farach-Colton (Rutgers University), Rob Johnson (VMware Research), Prashant Pandey (University of Utah)*
- FAAQP: Fast and Accurate Approximate Query Processing based on Bitmap-augmented Sum-Product Network - Hanbing Zhang (Fudan University)*, Yinan Jing (Fudan University), Zhenying He (Fudan University), Kai Zhang (Fudan University), X. Sean Wang (Fudan University)
- Approximating Opaque Top-k Queries - Jiwon Chang (University of Rochester)*, Fatemeh Nargesian (University of Rochester)
SIGMOD Research 12: Graph Database Systems
Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Charlottenburg I/II
Session chair: NN
- LSMGraph: A High-Performance Dynamic Graph Storage System with Multi-level CSR - Song Yu (Northeastern University)*, Shufeng Gong (Northeastern University), Qian Tao (Alibaba Group), Sijie Shen (Alibaba Group), Yanfeng Zhang (Northeastern University), Wenyuan Yu (Alibaba Group), Pengxi Liu (Northeastern University), Zhixin Zhang (Northeastern University), Hongfu Li (Northeastern University), Luo Xiaojian (Alibaba Group), Ge Yu (Northeastern University), Jingren Zhou (Alibaba Group)
- An experimental comparison of tree-data structures for connectivity queries on fully-dynamic undirected graphs - Qing Chen (University of Zürich)*, Michael Böhlen (University of Zurich), Sven Helmer (University of Zurich)
- TGraph: A Tensor-centric Graph Processing Framework - YongLiang Zhang (Wuhan University), Yuanyuan Zhu (Wuhan University)*, Hao Zhang (Huawei), Congli Gao (Huawei Technologies CO LTD), Yuyang Wang (Wuhan University), Guojing Li (Wuhan University), Tianyang Xu (Wuhan University), Ming Zhong (Wuhan University), Jiawei Jiang (Wuhan University), Tieyun Qian (Wuhan University), Chenyi Zhang (Huawei Technologies), Jeffrey Xu Yu (Chinese University of Hong Kong)
- Efficient Index Maintenance for Effective Resistance Computation on Evolving Graphs - Meihao Liao (Beijing Institute of Technology), Cheng Li (Beijing Institute of Technology), Ronghua Li (Beijing Institute of Technology)*, Guoren Wang (Beijing Institute of Technology)
- Accelerating Core Decomposition in Billion-Scale Hypergraphs - Wenqian Zhang (The University of New South Wales), Zhengyi Yang (University of New South Wales)*, Dong Wen (University of New South Wales), Wentao Li (University of Leicester), Wenjie Zhang (University of New South Wales), Xuemin Lin (Shanghai Jiaotong University)
Tutorial 4: Privacy and Security in Distributed Data Markets (part 1)
Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Charlottenburg III
Daniel Alabi (University of Illinois Urbana-Champaign), Sainyam Galhotra (Cornell University), Shagufta Mehnaz (The Pennsylvania State University), Zeyu Song (The Pennsylvania State University), Eugene Wu (Columbia University)
Demo Session C
Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Bellevue
- Introducing RAW Hollow: An In-Memory Co-located, Compressed Object Store With Opt-in Strong Consistency [Industry]
Govind Venkatraman Krishnan (Netflix Inc), Eduardo Ramirez (Netflix Inc), Drew Koszewnik (Netflix Inc), Yujia (Cynthia) Xie (Netflix Inc), Tej Vepa (Netflix Inc), Bernardo Gomez Palacio (Netflix Inc) - Blink twice - automatic workload pinning and regression detection for Versionless Apache Spark using retries. [Industry]
Martin Grund (Databricks), Stefania Leone (Databricks), Justin Breese (Databricks), Reynold Xin (Databricks), Matei Zaharia (Databricks), Vijayan Prabhakaran (Databricks) - Demonstrating MAST: An Efficient System for Point Cloud Data Analytics
Jiangneng Li (Nanyang Technological University), Haitao Yuan (Nanyang Technological University), Jie Wang (Nanyang Technological University), Ziting Wang (Nanyang Technological University), Han Mao Kiah (Nanyang Technological University), Gao Cong (Nanyang Technological University) - VQLens: A Demonstration of Vector Query Execution Analysis
Jia Yansha (Southern University of Science and Technology), Zhengxin You (Department of Computer Science and Engineering, Southern University of Science and Technology), Yujie Wang (Department of Computer Science and Engineering, Southern University of Science and Technology), Qiaomu Shen (Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology), Bo Tang (Department of Computer Science and Engineering, Southern University of Science and Technology) - anagodb: Offering Massive Parallelism for Database Engine
Yuto Hayamizu (The University of Tokyo), Ryoji Kawamichi (The University of Tokyo), Tsuyoshi Ozawa (The University of Tokyo), Masaru Kitsuregawa (The University of Tokyo / Research Organization of Information and Systems), Kazuo Goda (The University of Tokyo) - NebulaStream: An extensible, high-performance streaming engine for multi-modal edge applications
Steffen Zeuch (TU Berlin), Adrian Michalke (TU Berlin), Aljoscha Lepping (TU Berlin), Volker Markl (TU Berlin), Ricardo Martinez (TU Berlin), Nils Schubert (TU Berlin), Lukas Schwerdtfeger (TU Berlin), Taha Tekdogan (TU Berlin), Ariane Ziehn (TU Berlin), Christoph Falkensteiner (DHZC), Kyle Krüger (TU Berlin), Tobias Röschl (DHZC), Alexander Meyer (DHZC), Svea Wilkending (DHZC) - How DuckDB is USING KEY to Unlock Recursive Query Performance
Björn Bamberg (Universität Tübingen), Denis Hirn (Universität Tübingen), Torsten Grust (Universität Tübingen) - A Query-Aware Enormous Database Generator For System Performance Evaluation
Xuhua Huang (East China Normal University), Zirui Hu (East China Normal University), Siyang Weng (East China Normal University), Rong Zhang (East China Normal University), Chengcheng Yang (East China Normal University), Xuan Zhou (East China Normal University), Weining Qian (East China Normal University), Chuanhui Yang (OceanBase, Ant Group), Quanqing Xu (OceanBase, Ant Group) - CORE+: A COmplex event Recognition Engine in C++
Kyle Bossonney (Oxford University), Nicolás Buzeta (PUC Chile), Vicente Calisto (IMFD), Juan-Eduardo López (PUC Chile), Cristian Riveros (PUC Chile), Stijn Vansummeren (UHasselt) - UDFBench: A Tool for Benchmarking UDF Queries on SQL Engines
Yannis Foufoulas (Athena Research Center), Theoni Palaiologou (University of Athens), Alkis Simitsis (Athena Research Center) - ACE-in-Action: A Smart DBMS Bufferpool for SSDs
Tarikul Islam Papon (Boston University), Teona Bagashvili (Boston University), Manos Athanassoulis (Boston University) - Alpha Demo: A Hardware-Accelerated Data Model for Ad-Hoc Manipulation of Point Clouds
Sophie Pfister (University of Fribourg), Alberto Lerner (University of Fribourg), Abishek Ramdas (University of Fribourg), Philippe Cudré-Mauroux (University of Fribourg) - Automated Database Tuning vs. Human-based Tuning in a Simulated Stressful Work Environment
Patrick Wang (Carnegie Mellon University), Wan Shen Lim (Carnegie Mellon University), William Zhang (Carnegie Mellon University), Samuel Arch (Carnegie Mellon University), Andrew Pavlo (Carnegie Mellon University) - TEQ: An Open and Developer-friendly Testbed for Edge-based Query Processing Algorithms
Yu Lei (Zhejiang University), Xinle Jiang (Southern University of Science and Technology), Hua Lu (Roskilde University), Christian Jensen (Aalborg University), Bo Tang (Southern University of Science and Technology), Huan Li (Zhejiang University) - Demonstrating PDSP-Bench: A Benchmarking System for Parallel and Distributed Stream Processing
Pratyush Agnihotri (TU Darmstadt), Carsten Binnig (DFKI and TU Darmstadt) - TD-Join: Leveraging Temporal Dependencies in Time Series Joins
Gianluca Rossi (Lyon 1 University), Riccardo Tommasini (INSA Lyon), Angela Bonifati (Lyon 1 University) - Demo of LearnedWMP: Workload Memory Prediction Using Deep Query Template Representations
Shaikh Quader (York University), Ghadeer Abuoda (York University), Yonis Abokar (York University), Marin Litoiu (York University), Manos Papagelis (York University) - LingoDB-CT: Understanding LingoDB’s Inner Workings
Michael Jungmair (Technical University of Munich)
SIGMOD Research 13: Data Cleaning and Explainability
Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Potsdam I
Session chair: NN
- GPU-Accelerated Graph Cleaning with a Single Machine - Wenchao Bai (Southeast University), Wenfei Fan (Univ. of Edinburgh), Shuhao Liu (Shenzhen Institute of Computing Sciences)*, Kehan Pang (Beihang University), Xiaoke Zhu (Beihang University), Jiahui Jin (Southeast University)
- The Best of Both Worlds: On Repairing Timestamps and Attribute Values for Multivariate Time Series - Jingyu Zhu (Nankai University), Weiwei Deng (Nankai University), Yu Sun (Nankai University)*, Shaoxu Song (Tsinghua University), Haiwei Zhang (Nankai University), Xiaojie Yuan (Nankai University)
- Table Overlap Estimation through Graph Embeddings - Francesco Pugnaloni (Hasso Plattner Institute)*, Luca Zecchini (University of Modena and Reggio Emilia), Matteo Paganelli (University of Modena and Reggio Emilia), Matteo Lissandrini (University of Verona), Felix Naumann (Hasso Plattner Institute, University of Potsdam), Giovanni Simonini (University of Modena and Reggio Emilia)
- Discovering Top-k Relevant and Diversified Rules - Wenfei Fan (Univ. of Edinburgh), Ziyan Han (Beihang University), Min Xie (Shenzhen Institute of Computing Sciences)*, Guangyi Zhang (Shenzhen Technology University)
- Provenance-Enabled Explainable AI - Jiachi Zhang (Alibaba Group)*, Wenchao Zhou (Alibaba Group), Benjamin Ujcich (Georgetown University)
SIGMOD Research 14: Data Models & Interfaces
Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Potsdam III
Session chair: NN
- Entity/Relationship Graphs - Philipp Skavantzos (The University of Auckland), Sebastian Link (University of Auckland)*
- Synthesizing Third Normal Form Schemata that Minimize Integrity Maintenance and Update Overheads - Zhuoxing Zhang (The University of Auckland), Sebastian Link (University of Auckland)*
- Physical Visualization Design: Decoupling Interface and System Design - Yiru Chen (Columbia University)*, Xupeng Li (Columbia University), Jeffrey Tao (University of Pennsylvania), Alana Ramjit (Cornell Tech), Ravi Netravali (Princeton University), Subrata Mitra (Adobe Research), Aditya Parameswaran (University of California, Berkeley), Javad Ghaderi (Columbia University), Dan Rubenstein (Columbia University), Eugene Wu (Columbia University)
- Interactive Graph Search Made Simple - Shangqi Lu (Hong Kong University of Science and Technology (Guangzhou)), Ru Wang (Chinese University of Hong Kong), Yufei Tao (The Chinese University of Hong Kong)*
- SketchQL: Video Moment Querying with a Visual Query Interface - Renzhi Wu (Georgia Institute of Technology)*, Pramod Chunduri (Georgia Institute of Technology), Ali Payani (Cisco Systems Inc.), Xu Chu (GATECH), Joy Arulraj (Georgia Tech), Kexin Rong (Georgia Institute of Technology)
SIGMOD Industry 5: Post-Relational Database Systems
Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Charlottenburg I/II
Session chair: NN
- Asynchronous Replication Strategies for a Real-Time DBMS -
Srinivasan Seshadri (Aerospike)*
- Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables -
Daniel Sotolongo (Snowflake)*; Daniel Mills (Snowflake); Tyler Akidau (Snowflake); Anirudh Santhiar (Snowflake); Attila-Peter Toth (Snowflake); Ilaria Battiston (Snowflake); Ankur Sharma (Snowflake); Botong Huang (Snowflake); Boyuan Zhang (Snowflake); Da
- Pruning in Snowflake: Working Smarter, Not Harder -
Andreas Zimmerer (University of Technology Nuremberg)*; Damien Dam (Snowflake Inc.); Jan Kossmann (Snowflake Inc.); Juliane Waack (Snowflake Inc.); Ismail Oukid (Snowflake Inc.); Andreas Kipf (University of Technology Nuremberg)
- AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes -
Anja Gruenheid (Microsoft)*; Jesús Camacho-Rodríguez (Microsoft); Carlo Curino (Microsoft); Raghu Ramakrishnan (Microsoft); Stas Pak (LinkedIn); Sumedh Sakdeo (LinkedIn); Lenisha Gandhi (LinkedIn); Sandeep Singhal (LinkedIn); Pooja Nilangekar (University
- Databricks Lakeguard: Supporting fine-grained access control and multi-user capabilities for Apache Spark workloads. -
Martin Grund (Databricks)*; Stefania Leone (Databricks); Matei Zaharia (Databricks); Reynold Xin (Databricks)
- Oceanus: Enable SLO-Aware Vertical Autoscaling for Cloud-Native Streaming Service in Tencent -
Zihao Chen (Tencent); Jiazhi Jiang (Beijing Normal University)*; Jiangang Liu (Tencent); Chao Zhang (Tencent); Yuyi Diao (Tencent); Yang Li (Tencent); Hanmei Luo (Tencent); Peng Chen (Tencent)
Tutorial 4: Privacy and Security in Distributed Data Markets (part 2)
Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Charlottenburg III
Daniel Alabi (University of Illinois Urbana-Champaign), Sainyam Galhotra (Cornell University), Shagufta Mehnaz (The Pennsylvania State University), Zeyu Song (The Pennsylvania State University), Eugene Wu (Columbia University)
Demo Session D
Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Bellevue
- Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data
Longbin Lai (Alibaba Group), Changwei Luo (Alibaba Group), Yunkai Lou (Alibaba Group), Mingchen Ju (University of New South Wales), Zhengyi Yang (University of New South Wales)
- KDSelector: A Knowledge-Enhanced and Data-Efficient Model Selector Learning Framework for Time Series Anomaly Detection
Zhiyu Liang (Harbin Institute of Technology), Dongrui Cai (Harbin Institute of Technology), Chenyuan Zhang (Harbin Institute of Technology), Zheng Liang (Harbin Institute of Technology), Chen Liang (Harbin Institute of Technology), Bo Zheng (CnosDB Inc.), Shi Qiu (Central South University), Jin Wang (Central South University), Hongzhi Wang (Harbin Institute of Technology)
- PY-SHARQ: A Holistic Python Library for Explaining Association Rules on Relational Data
Hadar Ben-Efraim (Bar-Ilan University), Susan Davidson (University of Pennsylvania), Amit Somech (Bar-Ilan University)
- CausalExplain: Causal Explanations of Black-box Models with Training Data Subsets
Arman Ashkari (University of Utah), El Kindi Rezig (University of Utah)
- SeerCuts: Explainable Attribute Discretization
Eugenie Lai (Massachusetts Institute of Technology), Inbal Croitoru (Technion), Noam Bitton (Technion), Ariel Shalem (Technion), Brit Youngmann (Technion), Sainyam Galhotra (Cornell), El Kindi Rezig (University of Utah), Michael Cafarella (Massachusetts Institute of Technology)
- Interactive Fairness Auditing: Leveraging AVOIR for Dynamic Evaluation and Mitigation
Amin Meghrazi (The Ohio State University), Pranav Maneriker (The Ohio State University), Swati Padhee (The Ohio State University), Srinivasan Parthasarathy (The Ohio State University)
- Zorro: Quantifying Uncertainty in Models & Predictions Arising from Dirty Data
Kaiyuan Hu (UCSD), Jiongli Zhu (UCSD), Boris Glavic (University of Illinois Chicago), Babak Salimi (UCSD)
- CauSumX: Summarized Causal Explanations For Group-By-Average Queries
Nativ Levy (Technion), Michael Cafarella (CSAIL), Amir Gilad (Hebrew University), Sudeepa Roy (Duke), Brit Youngmann (Technion)
- Demonstration of DPClustX: Differentially Private Explanations for Clusters
Ron Zadicario (Tel Aviv University), Amir Gilad (Hebrew University), Kathy Razmadze (Tel Aviv University), Tova Milo (Tel Aviv University)
- Authenticating Multi-Chain Queries: Verifiable Virtual Filesystem Is All You Need
Haixin Wang (Hong Kong Baptist University), Cheng Xu (Hong Kong Baptist University), Ce Zhang (Hong Kong Baptist University), Haibo Hu (Hong Kong Polytechnic University), Shikun Tian (Digital Technologies, Ant Group), Shenglong Chen (Digital Technologies, Ant Group), Ying Yan (Digital Technologies, Ant Group), Jianliang Xu (Hong Kong Baptist University)
- Demonstrating CEDAR: A System for Cost-Efficient Data-Driven Claim Verification
Tharushi Jayasekara (Cornell University), Immanuel Trummer (Cornell University)
- A Fast Line Density Visualization Plugin for Geographic Information Systems
Tsz Nam Chan (Shenzhen University), Bojian Zhu (Hong Kong Baptist University), Dingming Wu (Shenzhen University), Yun Peng (Guangzhou University), Leong Hou U (University of Macau), Wei Tu (Shenzhen University), Ruisheng Wang (Shenzhen University)
- Virtualizing Cloud Data Infrastructures with BRAD
Geoffrey Yu (Massachusetts Institute of Technology), Ziniu Wu (Massachusetts Institute of Technology), Ferdi Kossmann (Massachusetts Institute of Technology), Tianyu Li (Massachusetts Institute of Technology), Markos Markakis (Massachusetts Institute of Technology), Amadou Ngom (Massachusetts Institute of Technology), Sophie Zhang (Massachusetts Institute of Technology), Tim Kraska (Massachusetts Institute of Technology), Samuel Madden (Massachusetts Institute of Technology)
- SemExplorer: A User Interface for Semantic Approach to Customized Dataset Search
Zixin Wei (The Chinese University of Hong Kong, Shenzhen), Jun Han (The Hong Kong University of Science and Technology), Xiaolin Han (The Northwestern Polytechnical University), Chenhao Ma (The Chinese University of Hong Kong, Shenzhen)
- Mosaic: An Architecture for Linking Databases and Scalable Interactive Visualizations
Jeffrey Heer (University of Washington), Dominik Moritz (Carnegie Mellon University), Ron Pechunk (University of Washington)
- PASCAL: A Theory-Informed Visual Interface for Property Graph Schema Visualization
Kasidis Chanthatrojwong (Nanyang Technological University), Sourav S Bhowmick (Nanyang Technological University), Byron Choi (Hong Kong Baptist University)
- RelationalPatternVis: A Tool for Query Pattern Visualization
Diandre Sabale (Northeastern University), Wolfgang Gatterbauer (Northeastern University)
SIGMOD Research 15: Cloud-Scale DBMS, Testing and Tuning
Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Potsdam I
Session chair: NN
- Intra-Query Runtime Elasticity for Cloud-Native Data Analysis - Xukang Zhang (Renmin University of China), Huanchen Zhang (Tsinghua University), Xiaofeng Meng (Renmin University of China)*
- Live Patching for Distributed In-Memory Key-Value Stores - Michael Fruth (University of Passau)*, Stefanie Scherzinger (University of Passau)
- SHIELD: Encrypting Persistent Data of LSM-KVS from Monolithic to Disaggregated Datacenters - Viraj Thakkar (Arizona State University)*, Dongha Kim (Arizona State University), Yingchun Lai (Apache/Pegasus), Hokeun Kim (Arizona State University), Zhichao Cao (Arizona State University)
- λ-Tune: Harnessing Large Language Models for Automated Database System Tuning - Victor Giannakouris (Cornell University)*, Immanuel Trummer (Cornell University)
- Constant Optimization Driven Database System Testing - Chi Zhang (Nanjing University)*, Manuel Rigger (National University of Singapore)
SIGMOD Research 16: Scalable ML, Dataflow and Simulation
Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Potsdam III
Session chair: NN
- Capsule: an Out-of-Core Training Mechanism for Colossal GNNs - Yongan Xiang (University of Science and Technology of China), Zezhong Ding (University of Science and Technology of China), Rui Guo (University of Science and Technology of China), Shangyou Wang (University of Science and Technology of China), Xike Xie (University of Science and Technology of China)*, S. Kevin Zhou (USTC)
- CtxPipe: Context-aware Data Preparation Pipeline Construction for Machine Learning - Haotian Gao (National University of Singapore), Shaofeng Cai (National University of Singapore), Anh Dinh (Deakin University), Zhiyong Huang (NUS School of Computing), Beng Chin Ooi (NUS)*
- Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving - Shihong Gao (The Hong Kong University of Science and Technology)*, Xin Zhang (Hong Kong University of Science and Technology), Yanyan Shen (Shanghai Jiao Tong University), Lei Chen (Hong Kong University of Science and Technology)
- LCP: Enhancing Scientific Data Management with Lossy Compression for Particles - Longtao Zhang (Florida State University), Ruoyu Li (Florida State University), Congrong Ren (The Ohio State University), Sheng Di (Argonne National Laboratory, Lemont, IL), Jinyang Liu (University of Houston), Jiajun Huang (UCR), Robert Underwood (Argonne National Laboratory), Pascal Grosset (Los Alamos National Laboratory), Dingwen Tao (Institute of Computing Technology, Chinese Academy of Sciences), Xin Liang (University of Kentucky), Hanqi Guo (The Ohio State University), Franck Cappello (Argonne National Laboratory, Lemont, IL), Kai Zhao (Florida State University)*
- Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs - Xiaozhen Liu (University of California, Irvine)*, Yicong Huang (UC Irvine), Xinyuan Lin (University of California Irvine), Avinash Kumar (U C IRVINE), Sadeem Alsudais (King Saud University), Chen Li (UC Irvine)
SIGMOD Industry 6: Data Models, Provenance and Governance
Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Charlottenburg I/II
Session chair: NN
- Unified Lineage System: Tracking Data Provenance at Scale -
Gabriela Jacques-Silva (Meta Inc)*; Evangelia Kalyvianaki (Meta Inc); Katriel Cohn-Gordon (Meta Inc); Adham Meguid (Meta Inc); Huy Ngyuen (Meta Inc); Danny Ben-David (Meta Inc); Carl Nayak (Meta Inc); Varun Saravagi (Meta Inc); George Stasa (Meta Inc); I
- Unity Catalog: Open and Universal Governance for the Lakehouse and Beyond -
Ramesh Chandra (Databricks)*; Haogang Chen (Databricks); Tim Januschowski (Databricks); Ray Matharu (Databricks); Andrew Li (Databricks); Adrian Ionescu (Databricks); Adriana Ispas (Databricks); Ben Zhang (Databricks); Bogdan Ghita (Databricks); Bogdan Ra
- Rel: A Programming Language for Relational Data -
Molham Aref (RelationalAI); Paolo Guagliardo (University of Edinburgh); George Kastrinis (RelationalAI); Leonid Libkin (University of Edinburgh & RelationalAI)*; Victor Marsault (LIGM, Univ. Gustave Eiffel, CNRS); Wim Martens (University of Bayreuth); Mar
- JSON Relational Duality: A Revolutionary Combination of Document, Object, and Relational Models -
Shashank Gugnani (Oracle)*; Zhen Liu (Oracle); Hui Chang (Oracle); Beda Hammerschmidt (Oracle); Srinivas Kareenhalli (Oracle); Kishy Kumar (Oracle); Tirthankar Lahiri (Oracle); Ying Lu (Oracle); Douglas McMahon (Oracle); Ajit Mylavarapu (Oracle); Sukhada
- Scalable Execution of Application Logic within Everest BusinessStore -
Benjamin Hilprecht (Everest Systems)*; Nico Mürdter (Everest Systems); Arthur Paim Arnold (Everest Systems GmbH); Kristijan Ziza (School of Electrical Engineering); Franz Färber (Everest Systems); Wolfgang Lehner (TU Dresden)
- SAP HANA Cloud: Data Management for Modern Enterprise Applications -
Norman May (SAP); Alexander Böhm (SAP); Daniel Ritter (SAP)*; Frank Renkes (SAP); Mihnea Andrei (SAP); Wolfgang Lehner (TU Dresden)
DEEM - Workshop on Data Management for End-to-End Machine Learning
Time: Friday, 27.06.2025, 08:30 - 17:00
Location: Charlottenburg I/II
https://deem-workshop.github.io/
Applying Machine Learning (ML) in real-world scenarios is a challenging task. In recent years, the main focus of the data management community has been on creating systems and abstractions for the efficient training of ML models on large datasets. However, model training is only one of many steps in an end-to-end ML application, and a number of orthogonal data management problems arise from the large-scale use of ML and increased adoption of large language models (LLMs).
For example, data preprocessing and feature extraction workloads may be complicated and require simultaneous execution of relational and linear algebraic operations. Next, model selection may involve searching many combinations of model architectures, features, and hyper-parameters to find the best-performing model. After model training, the resulting model may have to be deployed and integrated into business workflows and require lifecycle management using metadata and lineage. As a further complication, the resulting system may have to take into account a heterogeneous audience, ranging from domain experts without programming skills to data engineers and statisticians who develop custom algorithms. Many such challenges are human or engineer-centered (e.g., monitoring ML pipelines, leveraging LLMs for domain-specific tasks at scale), and DEEM uniquely encourages submissions on such topics.
Additionally, the importance of incorporating ethics and legal compliance into machine-assisted decision-making is being broadly recognized. Critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. DEEM welcomes research on providing system-level support to data scientists who wish to develop and deploy responsible machine learning methods.
DEEM aims to bring together researchers and practitioners at the intersection of applied machine learning, data management and systems research, with the goal of discussing the data management issues that arise in ML application scenarios.
Workshop Chairs:
- Madelon Hulsebos, CWI, Netherlands
- Matteo Interlandi, Microsoft GSL, USA
- Shreya Shankar, UC Berkeley, USA
- Stefan Grafberger, BIFOLD & TU Berlin, Germany
DBPL - International Symposium on Database Programming Languages
Time: Friday, 27.06.2025, 08:30 - 12:30
Location: Charlottenburg III
https://sites.google.com/view/dbpl2025/home
For over 35 years, DBPL has established itself as the principal venue for publishing and discussing new ideas and problems at the intersection of data management and programming languages. Many key contributions relevant to the formal foundations, design, implementation, and evaluation of query languages (e.g., for object-oriented, nested, or semi-structured data) were first announced at DBPL.
Workshop Chairs
- Torsten Grust, University of Tübingen
- Amir Shaikhha, University of Edinburgh
DataEd - Workshop on Data Systems Education
Time: Friday, 27.06.2025, 08:30 - 12:30
Location: Bellevue
https://dataedinitiative.github.io/DataEd25/index.html
Data systems education is foundational in a variety of programs such as computer science, data science, and information systems and science. Indeed, data management concepts are both timely and timeless in our increasingly data-driven world. Since the 1970s, the database research community has maintained a continual focus on the place of data systems concepts in curricula and on best practices for teaching them. This important conversation has been particularly lively in recent years given the rise of data science. There is also a long tradition in the Computer Science Education research community of investigating how students learn data systems concepts. With the increasing focus on data in the past decade, there is renewed focus on data systems in education research.
Both the DB and CS Education communities, and adjacent communities, e.g., in Statistics Education, have complementary perspectives and experiences to share with each other. There is much to be gained by bringing the communities more closely together: to share findings, to cross-pollinate perspectives and methods, and to shed light on opportunities for mutual progress in data systems education. The DataEd workshop is a dedicated venue for these communities to come together, for presentation and discussion of data management systems education experiences and research.
Workshop Chairs
- Abdussalam Alawini, University of Illinois Urbana-Champaign, USA
- Sourav Bhowmick, Nanyang Technological University, Singapore
- Michael Liut, University of Toronto Mississauga, Canada
GRADES-NDA - Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)
Time: Friday, 27.06.2025, 08:30 - 17:00
Location: Tiergarten I/II/III
The GRADES-NDA workshop explores the challenges, application areas, and usage scenarios of managing large-scale graphs. It provides a forum for exchanging ideas on mining, querying, and learning from real-world network data, fostering interdisciplinary collaboration, and sharing datasets and benchmarks.
GRADES-NDA brings together researchers from academia, industry, and government to discuss advances in large-scale graph data management and analytics. Its scope covers domain-specific challenges, noise handling in real-world graphs, and innovations in databases, data mining, machine learning, data streaming, network science, and graph algorithms. Case studies across diverse areas are welcome, including Social Networks, Business Analytics, Healthcare, and Cybersecurity.
Workshop Organizers
- Akhil Arora, Aarhus University & Copenhagen Center for Social Data Science, Denmark
- Stefania Dumbrava, ENSIIE & Télécom SudParis, France
Bojan Karlaš (Harvard University), Babak Salimi (University of California, San Diego), Sebastian Schelter (BIFOLD & TU Berlin)
Tutorial 10: Transactional Cloud Applications: Status Quo, Challenges, and Opportunities (part 1)
Time: Friday, 27.06.2025, 09:00 - 10:30
Location: Schöneberg I/II/III
Rodrigo Laigner (University of Copenhagen), George Christodoulou (Delft University of Technology), Asterios Katsifodimos (Delft University of Technology), Yongluan Zhou (University of Copenhagen)
Tutorial 10: Transactional Cloud Applications: Status Quo, Challenges, and Opportunities (part 2)
Time: Friday, 27.06.2025, 11:00 - 12:30
Location: Schöneberg I/II/III
Rodrigo Laigner (University of Copenhagen), George Christodoulou (Delft University of Technology), Asterios Katsifodimos (Delft University of Technology), Yongluan Zhou (University of Copenhagen)
Tutorial 9: Reproducible Prototyping of Query Optimizer Components
Time: Friday, 27.06.2025, 13:30 - 15:00
Location: Köpenick I/II/III
Rico Bergmann (TUD Dresden University of Technology), Dirk Habich (TUD Dresden University of Technology)
Tutorial 8: OLTP Engines on Modern Storage Architectures (part 1)
Time: Friday, 27.06.2025, 13:30 - 15:00
Location: Schöneberg I/II/III
Daokun Hu (Ant Group), Quanqing Xu (OceanBase, Ant Group), Chuanghui Yang (OceanBase, Ant Group)
ProvenanceWeek
Time: Friday, 27.06.2025, 13:30 - 17:00
Location: Bellevue
https://ucdbg.github.io/ProvenanceWeek2025/
IPAW and TaPP build on a successful history of provenance workshops that bring together researchers from a wide range of computer science fields including workflows, semantic web, databases, high performance computing, distributed systems, operating systems, programming languages, and software engineering, as well as researchers from other fields, such as biology and physics that have urgent provenance needs.
Provenance is increasingly important in data science, workflow systems, and many other areas, particularly to support transparency, accountability and explanations. By providing a record of the data creation process and of dependencies between data, provenance information is essential for tracing errors in transformed data back to erroneous inputs, access control, auditing, repeatability and reproducibility, evaluating data quality, and establishing ownership of data.
Chairs
- Tanja Auge, University of Regensburg
- Seokki Lee, University of Cincinnati
Senior Chairs
- Adriane Chapman, University of Southampton
- Paul Groth, University of Amsterdam
Tutorial 11: Data Storage and Management for Image AI Pipelines
Time: Friday, 27.06.2025, 15:30 - 17:00
Location: Köpenick I/II/III
Utku Sirin (Harvard University), Stratos Idreos (Harvard University)
Tutorial 8: OLTP Engines on Modern Storage Architectures (part 2)
Time: Friday, 27.06.2025, 15:30 - 17:00
Location: Schöneberg I/II/III
Daokun Hu (Ant Group), Quanqing Xu (OceanBase, Ant Group), Chuanghui Yang (OceanBase, Ant Group)