SIGMOD Berlin, Germany, 2025




SIGMOD/PODS Detailed Program

HILDA - Workshop on Human-In-the-Loop Data Analytics

Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Charlottenburg I/II/III

https://hilda.io/2025/

HILDA brings together researchers and practitioners to exchange ideas and results on human-data interaction. It explores how data management and analysis can be made more effective when taking into account the people who design and build these processes as well as those who are impacted by their results.

In HILDA 2022, we implemented a mentoring program (inspired by workshops such as PLATEAU) and are continuing it this year. Our focus is on promising and early-stage research, with a core component of the program being that each paper is assigned a mentor. More details on the process are below.

The theme for this edition of the workshop is HILDA and Large Language Models (LLMs). However, the workshop is not limited to this theme, and other topics are also of interest. We encourage research on guidelines and best practices for effective human-LLM collaboration. We also encourage research that questions the role of humans in traditional data pipelines with the emergence of LLMs.

Workshop Chairs


NOVAS - Novel Optimizations for Visionary AI Systems

Time: Sunday, 22.06.2025, 08:30 - 12:30
Location: Köpenick I/II/III

https://www.novasworkshop.org/

We want to bridge the gap between "data management" and "generative AI" research. We are calling for work or early ideas which may be deemed innovative, controversial, or disruptive if considered from the perspective of more established research areas.

Workshop Organizers


MIDAS - Workshop connecting academia and industry on Modern Integrated Database and AI Systems

Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Tegel

https://sites.google.com/view/midas2025/home

This one-day workshop is designed to foster meaningful collaboration between researchers and industry practitioners by identifying and addressing complex challenges in the field of Generative AI (GenAI) and Data. These challenges often require a longer-term research perspective while simultaneously needing to remain grounded in real-world constraints and operational scenarios.

The primary goals of this workshop are twofold:

For Researchers: To inform and shape their research agendas by exposing them to the most pressing, unsolved challenges encountered by industry professionals. This engagement will help ensure that academic research remains relevant and aligned with practical needs, ultimately accelerating the path from theoretical advancements to real-world applications.

For Practitioners: To gain fresh perspectives and cutting-edge insights from the research community on emerging or unresolved topics in GenAI and Data. By engaging with researchers, industry professionals can explore novel methodologies, validate ideas, and potentially adopt innovative solutions to enhance their work.

We envision this workshop as an interactive and collaborative platform where participants from both academia and industry can share insights, challenges, and advancements in the rapidly evolving domains of GenAI and Data. Through panel discussions, presentations, and breakout sessions, attendees will have the opportunity to:

Identify Key Industry Challenges: Engage in discussions that highlight the most pressing problems faced in real-world GenAI and data-driven applications.

Explore Long-Term Research Directions: Examine areas where foundational research can contribute to addressing these challenges.

Build Cross-Sector Partnerships: Establish meaningful connections between researchers and practitioners, fostering collaborations that can lead to impactful innovations.

Exchange Practical & Theoretical Insights: Leverage the diverse expertise of participants to bridge the gap between theoretical advancements and their practical implementation.

By bringing together a diverse group of experts, this workshop aims to create a dynamic space where ideas are exchanged, research is informed by industry needs, and groundbreaking solutions can emerge at the intersection of academic rigor and real-world application.

Workshop Co-Chairs


aiDM - Workshop on Exploiting Artificial Intelligence Techniques for Data Management

Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Schöneberg I/II/III

http://www.aidm-conf.org/

Recently, the field of Artificial Intelligence (AI) has been experiencing a resurgence. AI broadly covers a wide swath of techniques, which include logic-based approaches, probabilistic graphical models, and machine learning approaches such as deep learning. Advances in specialized hardware capabilities (e.g., Graphics Processing Units (GPUs), Tensor Processing Units (TPUs), and Field-Programmable Gate Arrays (FPGAs)), the software ecosystem (e.g., programming languages such as Python, Data Science frameworks, and accelerated ML libraries), and systems infrastructure (e.g., cloud servers with AI accelerators) have led to widespread adoption of AI techniques in a variety of domains. Examples of such domains include image classification, autonomous driving, automatic speech recognition, and conversational systems (e.g., chatbots). AI solutions not only support multiple data types (e.g., images, speech, or text), but are also available in various configurations and settings, from personal devices to large-scale distributed systems.

Despite the widespread adoption of AI across diverse domains, its integration with data management systems remains in its infancy. Currently, most database management systems (DBMS) serve primarily as repositories for feeding input data to AI models and storing results. Recently, there has been increasing interest in using AI techniques within data management systems, including natural language interfaces to relational databases and machine learning techniques for query optimization and performance tuning. However, significant opportunities remain to harness the full potential of AI for enhancing data management workloads.

aiDM is a one-day workshop that will bring together people from academia and industry to explore innovative ways to integrate AI techniques into data management systems. The workshop will focus on leveraging AI to enhance various components of data management systems, including user interfaces, tooling, performance optimizations, and support for new query types and workloads. Special attention will be given to transparently exploiting AI techniques, such as Generative AI frameworks, for enterprise-class data management workloads. We aim to identify key research areas and inspire new initiatives in this emerging and transformative field.

Workshop Program Chairs


LLM-DPM - Next Gen Data and Process Management: Large Language Models and Beyond

Time: Sunday, 22.06.2025, 08:30 - 17:00
Location: Tiergarten I/II/III

https://dbpmworkshop.github.io/

The Workshop of Next Gen Data and Process Management: Large Language Models and Beyond (LLM-DPM), held in conjunction with the 2025 ACM SIGMOD Conference in Berlin, Germany, will explore the transformative role of Explainable AI (XAI), Trustworthy AI, and Large Language Models (LLMs) in revolutionizing Data and Process Management systems. Organizations across industries rely on complex processes to deliver products, services, and outcomes. Understanding these processes is critical for uncovering inefficiencies, addressing bottlenecks, ensuring compliance, and driving operational excellence.

Process mining, which leverages event logs from data systems, has emerged as a powerful approach for visualizing workflows, identifying anomalies, and optimizing processes. However, traditional methods such as surveys and interviews remain costly, error-prone, and disconnected from real operations. This workshop aims to bridge this gap by examining how cutting-edge AI techniques, particularly LLMs, can advance process mining and data management.

The workshop will focus on the emerging role of explainable AI and LLMs in addressing long-standing challenges such as query interpretation, data augmentation, user interaction, system optimization, future process prediction, and actionable insights for proactive decision-making. Particular attention will be paid to accountability and fairness to ensure these advancements lead to transparent, equitable, and resilient systems.

Additionally, the workshop will tackle critical challenges in integrating AI into process mining, including data quality, scalability of analysis techniques, and the complexity of large datasets. Discussions will fill gaps left by main track topics, delving into specific use cases, risks, and technical innovations in data-centric environments. By fostering dialogue between researchers and practitioners, the workshop will provide advancements at the intersection of AI, process mining, and database systems, driving both research and enterprise adoption.

Organizers


Tutorial 1: Advances in Designing Scalable Graph Neural Networks: The Perspective of Graph Data Management (part 1)

Time: Sunday, 22.06.2025, 09:00 - 10:30
Location: Bellevue

Ningyi Liao (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Xiaokui Xiao (National University of Singapore), Reynold Cheng (The University of Hong Kong)


Tutorial 1: Advances in Designing Scalable Graph Neural Networks: The Perspective of Graph Data Management (part 2)

Time: Sunday, 22.06.2025, 11:00 - 12:30
Location: Bellevue

Ningyi Liao (Nanyang Technological University), Siqiang Luo (Nanyang Technological University), Xiaokui Xiao (National University of Singapore), Reynold Cheng (The University of Hong Kong)


Tutorial 2: Supporting Human-Centric Data Exploration Through Semantics and Natural Language Interaction

Time: Sunday, 22.06.2025, 13:30 - 15:00
Location: Bellevue

Vidya Setlur (Tableau Research)


Qdata - Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications

Time: Sunday, 22.06.2025, 13:30 - 17:00
Location: Köpenick I/II/III

https://itrummer.github.io/qdata/

Whereas quantum computing started out as a purely theoretical concept, the last few years have seen a “Cambrian explosion” of first-generation commercial quantum hardware culminating from decades of foundational research. Players, including the likes of Google, IBM, and Intel, as well as startup companies like IQM, D-Wave, IonQ, and Rigetti, are now producing hardware devices that implement quantum computing using various technologies. At the same time, the recent advances in quantum computing have inspired a new generation of classical hardware accelerators, offered commercially by providers such as Fujitsu, Toshiba, and 1QBit, that mirror the interfaces and take inspiration from internal processes of quantum computers. These accelerators, including digital annealers, as well as GPU- and FPGA-based simulators of quantum computation, obtain approximate solutions for extremely large, combinatorial optimization problems quickly.

Using quantum computing and related technologies has become convenient and possible with standard IT interfaces. Several software frameworks have recently appeared that make solving a diverse range of problems using quantum computers easier. At the same time, multiple cloud providers nowadays offer quantum computing as a service, making the technology accessible to broad shares of the population. Taken together, these developments have recently spawned a flurry of research in various communities, ranging from operations research to machine learning, and aimed at analyzing the transformative potential of quantum computing for specific use cases.

The primary objective of the Q-Data workshop is to explore how quantum computing and related technologies can enhance data processing, data management, data analysis systems, and techniques. It also focuses on hybrid approaches that integrate both quantum and classical computing methodologies to enhance such data systems and techniques. This workshop will spur new research efforts in this emerging field and pave the way for building next-generation data-intensive systems with quantum computing support.

Workshop Chairs


Tutorial 3: Learned Indexes From the One-dimensional to the Multi-dimensional Spaces: Challenges, Techniques, and Opportunities

Time: Sunday, 22.06.2025, 15:30 - 17:00
Location: Bellevue

Abdullah Al-Mamun (Purdue University), Jianguo Wang (Purdue University), Walid G. Aref (Purdue University)


SIGMOD/PODS Warmup: Query Optimization Unleashed

Time: Sunday, 22.06.2025, 17:15 - 18:30
Location: Charlottenburg I/II/III

What happens when cutting-edge theory meets AI-driven intelligence and real-world database engineering?

Prepare for an electrifying session that will shatter conventional wisdom on query optimization and cardinality estimation. This is not just another academic discussion — this is a battle of ideas where the sharpest minds from theory, machine learning, and database systems go head-to-head to define the future of query performance.

Several domain experts will challenge each other’s methodologies in a high-intensity, cross-disciplinary debate moderated by Dan Suciu (University of Washington) and Volker Markl (BIFOLD & Technical University of Berlin).

This is the session where database theory and systems really meet. Don’t just attend—be part of the revolution.

The event is organized by Floris Geerts, Benny Kimelfeld, and Volker Markl.

See the dedicated page for more details.


PODS Opening and Keynote

Time: Monday, 23.06.2025, 08:30 - 09:50
Location: Potsdam I/III

Session chair: Benny Kimelfeld


DaMoN - Data Management on New Hardware

Time: Monday, 23.06.2025, 10:00 - 18:30
Location: Charlottenburg I/II

https://damon-db.org/

The continued evolution of computing hardware and infrastructure imposes new challenges and bottlenecks to program performance. As a result, traditional database architectures that focus solely on I/O optimization increasingly fail to utilize hardware resources efficiently. Multi-core CPUs, GPUs, FPGAs, new memory and storage technologies (such as flash and non-volatile memory), and low-power hardware impose a significant challenge to optimizing database performance. Consequently, exploiting the characteristics of modern hardware has become an essential topic of database systems research.

The goal is to make database systems adapt automatically to sophisticated hardware characteristics, thus maximizing performance transparently for applications. To achieve this goal, the data management community needs interdisciplinary collaboration with researchers from computer architecture, compilers, operating systems, and storage. This involves rethinking traditional data structures, query processing algorithms, and database software architectures to adapt to the advances in the underlying hardware infrastructure.

Chairs


PODS Research 1: Gems of PODS & Test of Time Award

Time: Monday, 23.06.2025, 10:00 - 11:00
Location: Potsdam I/III

Session chair: Nicole Schweikardt


PODS Research 2: Join Evaluation & Maintenance (including the Best Paper Awards)

Time: Monday, 23.06.2025, 11:30 - 13:00
Location: Potsdam I/III

Session chair: Andreas Pieris


PODS Research 3: Explanation and Minimization

Time: Monday, 23.06.2025, 14:30 - 16:00
Location: Potsdam I/III

Session chair: Wolfgang Gatterbauer


PODS Research 4: Database Queries, beyond Evaluation

Time: Monday, 23.06.2025, 16:30 - 18:30
Location: Potsdam I/III

Session chair: Bas Ketsman


PODS Business Meeting

Time: Monday, 23.06.2025, 20:00 - 21:00
Location: Charlottenburg III

SIGMOD Opening & Keynote 1 & Awards Talks 1

Time: Tuesday, 24.06.2025, 08:30 - 10:00
Location: Potsdam I & III

Session chair: NN


Poster Session 1

Time: Tuesday, 24.06.2025, 10:30 - 11:30


Demo Session A

Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Bellevue


SIGMOD Research 1: Indexing

Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Potsdam I
Session chair: NN

SIGMOD Research 2: Graph Algorithms

Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Potsdam III
Session chair: NN

PODS Research 5: Tutorial 1 (Albert Atserias) & Other Connections to Quantum Computing

Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Charlottenburg I/II

Session chair: Cristina Sirangelo


SIGMOD Industry 1: Cloud Database Architecture

Time: Tuesday, 24.06.2025, 11:30 - 13:00
Location: Tiergarten I/II/III

Session chair: NN


SIGMOD New Researcher Symposium

Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Charlottenburg III

Demo Session B

Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Bellevue


SIGMOD Research 4: Text-to-SQL & ML-infused Queries

Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Potsdam III
Session chair: NN

PODS Research 6: Randomized Analysis and Data Structures

Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Charlottenburg I/II

Session chair: Cristian Riveros


SIGMOD Industry 2: Distributed Systems and Hybrid Workloads

Time: Tuesday, 24.06.2025, 14:30 - 16:00
Location: Tiergarten I/II/III

Session chair: NN


SIGMOD Panel 1: AI for Future Databases: A New Beginning or a Boulevard of Broken Dreams?

Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Potsdam I

AI has opened new directions in database research, from learned components replacing traditional internals to LLMs enabling a new generation of database systems that allow querying data beyond tables. Yet, adoption in commercial databases has been incremental rather than a fundamental rethinking of modern data system stacks. In this panel, we thus bring together experts from academia and industry to discuss the tension between potential and reality in how AI shapes real-world database products. We will explore questions such as: What should an AI-ready database stack look like: incremental evolution or radical departure? What prevents AI from replacing traditional components like query optimizers, cost models, and indexes? What does it take for LLM-based innovations to move beyond impressive demos? Can we use LLMs for more than Text-to-SQL and LLM-UDFs? By tackling these questions, this panel will challenge assumptions in research, examine AI’s role in future databases, and ask: Is AI the key to overcoming core limitations and enabling a new generation of database systems, or is it just another boulevard of broken (database) dreams?


SIGMOD Business Meeting

Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Charlottenburg III

SIGMOD Research 5: Machine Learning for Database Internals

Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Potsdam III
Session chair: NN

SIGMOD Research 6: Community and Network Analysis

Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Tiergarten I/II/III
Session chair: NN

PODS Research 7: Private Data Analysis (inc. Best Newcomer Award)

Time: Tuesday, 24.06.2025, 16:30 - 18:00
Location: Charlottenburg I/II

Session chair: Dan Suciu


SIGMOD Keynote 2 & Awards Talks 2

Time: Wednesday, 25.06.2025, 08:30 - 10:30
Location: Potsdam I & III

SIGMOD Research 7: Transactions and Consistency

Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Potsdam I
Session chair: NN

SIGMOD Research 8: Streams, Spatial and Modern Hardware

Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Potsdam III
Session chair: NN

SIGMOD Industry 3: Query Optimization and Vector Databases

Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Bellevue
Session chair: NN

SIGMOD DEI Panel

Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Charlottenburg III

Organizer: Donatella Firmani
Panelists:


PODS Research 8: Tutorial 2 & Parallelization Bounds for Queries

Time: Wednesday, 25.06.2025, 11:00 - 12:30
Location: Charlottenburg I/II

Session chair: Zhewei Wei


SIGMOD Research 9: Query Optimization

Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Potsdam I
Session chair: NN

SIGMOD Research 10: Privacy in Data Management

Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Potsdam III
Session chair: NN

SIGMOD Industry 4: Graph Databases and ML

Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Bellevue
Session chair: NN

PODS Research 9: Sequence-Based Queries

Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Charlottenburg I/II

Session chair: Jef Wijsen


SIGMOD DEI Birds of a Feather

Time: Wednesday, 25.06.2025, 14:00 - 15:30
Location: Charlottenburg III

Poster Session 2

Time: Wednesday, 25.06.2025, 16:00 - 17:30


PODS Research 10: Infinity and Inconsistency

Time: Wednesday, 25.06.2025, 16:00 - 17:30
Location: Charlottenburg I/II

Session chair: Mahmoud Abo Khamis


SIGMOD Keynote 3 & Awards Talks 3

Time: Thursday, 26.06.2025, 08:30 - 10:30
Location: Potsdam I & III

Poster Session 3

Time: Thursday, 26.06.2025, 10:30 - 11:30


Tutorial 5: Autotuning Systems: Techniques, Challenges, and Opportunities

Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Charlottenburg III

Brian Kroth (Microsoft), Sergiy Matusevych (Microsoft), Yiwen Zhu (Microsoft)


Tutorial 6: Data+AI: LLM4Data and Data4LLM

Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Bellevue

Guoliang Li (Tsinghua University), Jiayi Wang (Tsinghua University), Chenyang Zhang (Tsinghua University), Jiannan Wang (Huawei Technologies, Simon Fraser University)


SIGMOD Panel 2: Where Does Academic Database Research Go From Here?

Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Potsdam I


SIGMOD Research 11: Approximate Query Processing and Sequences

Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Potsdam III
Session chair: NN

SIGMOD Research 12: Graph Database Systems

Time: Thursday, 26.06.2025, 11:30 - 13:00
Location: Charlottenburg I/II
Session chair: NN

Tutorial 4: Privacy and Security in Distributed Data Markets (part 1)

Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Charlottenburg III

Daniel Alabi (University of Illinois Urbana-Champaign), Sainyam Galhotra (Cornell University), Shagufta Mehnaz (The Pennsylvania State University), Zeyu Song (The Pennsylvania State University), Eugene Wu (Columbia University)


Demo Session C

Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Bellevue


SIGMOD Research 13: Data Cleaning and Explainability

Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Potsdam I
Session chair: NN

SIGMOD Research 14: Data Models & Interfaces

Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Potsdam III
Session chair: NN

SIGMOD Industry 5: Post-Relational Database Systems

Time: Thursday, 26.06.2025, 14:30 - 16:00
Location: Charlottenburg I/II

Session chair: NN


Tutorial 4: Privacy and Security in Distributed Data Markets (part 2)

Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Charlottenburg III

Daniel Alabi (University of Illinois Urbana-Champaign), Sainyam Galhotra (Cornell University), Shagufta Mehnaz (The Pennsylvania State University), Zeyu Song (The Pennsylvania State University), Eugene Wu (Columbia University)


Demo Session D

Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Bellevue


SIGMOD Research 15: Cloud-Scale DBMS, Testing and Tuning

Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Potsdam I
Session chair: NN

SIGMOD Research 16: Scalable ML, Dataflow and Simulation

Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Potsdam III
Session chair: NN

SIGMOD Industry 6: Data Models, Provenance and Governance

Time: Thursday, 26.06.2025, 16:30 - 18:00
Location: Charlottenburg I/II

Session chair: NN


DEEM - Workshop on Data Management for End-to-End Machine Learning

Time: Friday, 27.06.2025, 08:30 - 17:00
Location: Charlottenburg I/II

https://deem-workshop.github.io/

Applying Machine Learning (ML) in real-world scenarios is a challenging task. In recent years, the main focus of the data management community has been on creating systems and abstractions for the efficient training of ML models on large datasets. However, model training is only one of many steps in an end-to-end ML application, and a number of orthogonal data management problems arise from the large-scale use of ML and increased adoption of large language models (LLMs).

For example, data preprocessing and feature extraction workloads may be complicated and require simultaneous execution of relational and linear algebraic operations. Next, model selection may involve searching many combinations of model architectures, features, and hyper-parameters to find the best-performing model. After model training, the resulting model may have to be deployed and integrated into business workflows and require lifecycle management using metadata and lineage. As a further complication, the resulting system may have to take into account a heterogeneous audience, ranging from domain experts without programming skills to data engineers and statisticians who develop custom algorithms. Many such challenges are human- or engineer-centered (e.g., monitoring ML pipelines, leveraging LLMs for domain-specific tasks at scale), and DEEM uniquely encourages submissions on such topics.

Additionally, the importance of incorporating ethics and legal compliance into machine-assisted decision-making is being broadly recognized. Critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. DEEM welcomes research on providing system-level support to data scientists who wish to develop and deploy responsible machine learning methods.

DEEM aims to bring together researchers and practitioners at the intersection of applied machine learning, data management, and systems research, with the goal of discussing the data management issues that arise in ML application scenarios.

Workshop Chairs:


DBPL - International Symposium on Database Programming Languages

Time: Friday, 27.06.2025, 08:30 - 12:30
Location: Charlottenburg III

https://sites.google.com/view/dbpl2025/home

For over 35 years, DBPL has established itself as the principal venue for publishing and discussing new ideas and problems at the intersection of data management and programming languages. Many key contributions relevant to the formal foundations, design, implementation, and evaluation of query languages (e.g., for object-oriented, nested, or semi-structured data) were first announced at DBPL.

Workshop Chairs


DataEd - Workshop on Data Systems Education

Time: Friday, 27.06.2025, 08:30 - 12:30
Location: Bellevue

https://dataedinitiative.github.io/DataEd25/index.html

Data systems education is foundational in a variety of programs such as computer science, data science, and information systems and science. Indeed, data management concepts are both timely and timeless in our increasingly data-driven world. Since the 1970s, the database research community has continually focused on the place of data systems concepts in curricula and on best practices for teaching them. This conversation has been particularly lively in recent years given the rise of data science. There is also a long tradition in the Computer Science Education research community of investigating how students learn data systems concepts. With the increasing focus on data in the past decade, there is renewed focus on data systems in education research.

Both the DB and CS Education communities, and adjacent communities, e.g., in Statistics Education, have complementary perspectives and experiences to share with each other. There is much to be gained by bringing the communities more closely together: to share findings, to cross-pollinate perspectives and methods, and to shed light on opportunities for mutual progress in data systems education. The DataEd workshop is a dedicated venue for these communities to come together, for presentation and discussion of data management systems education experiences and research.

Workshop Chairs


GRADES-NDA - Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)

Time: Friday, 27.06.2025, 08:30 - 17:00
Location: Tiergarten I/II/III

https://gradesnda.github.io/

The GRADES-NDA workshop explores the challenges, application areas, and usage scenarios of managing large-scale graphs. It provides a forum for exchanging ideas on mining, querying, and learning from real-world network data, fostering interdisciplinary collaboration, and sharing datasets and benchmarks.

GRADES-NDA brings together researchers from academia, industry, and government to discuss advances in large-scale graph data management and analytics. Its scope covers domain-specific challenges, noise handling in real-world graphs, and innovations in databases, data mining, machine learning, data streaming, network science, and graph algorithms. Case studies across diverse areas are welcome, including Social Networks, Business Analytics, Healthcare, and Cybersecurity.

Workshop Organizers


Tutorial 7: Navigating Data Errors in Machine Learning Pipelines: Identify, Debug, and Learn (part 1)

Time: Friday, 27.06.2025, 09:00 - 10:30
Location: Köpenick I/II/III

Bojan Karlaš (Harvard University), Babak Salimi (University of California, San Diego), Sebastian Schelter (BIFOLD & TU Berlin)


Tutorial 10: Transactional Cloud Applications: Status Quo, Challenges, and Opportunities (part 1)

Time: Friday, 27.06.2025, 09:00 - 10:30
Location: Schöneberg I/II/III

Rodrigo Laigner (University of Copenhagen), George Christodoulou (Delft University of Technology), Asterios Katsifodimos (Delft University of Technology), Yongluan Zhou (University of Copenhagen)


Tutorial 7: Navigating Data Errors in Machine Learning Pipelines: Identify, Debug, and Learn (part 2)

Time: Friday, 27.06.2025, 11:00 - 12:30
Location: Köpenick I/II/III

Bojan Karlaš (Harvard University), Babak Salimi (University of California, San Diego), Sebastian Schelter (BIFOLD & TU Berlin)


Tutorial 10: Transactional Cloud Applications: Status Quo, Challenges, and Opportunities (part 2)

Time: Friday, 27.06.2025, 11:00 - 12:30
Location: Schöneberg I/II/III

Rodrigo Laigner (University of Copenhagen), George Christodoulou (Delft University of Technology), Asterios Katsifodimos (Delft University of Technology), Yongluan Zhou (University of Copenhagen)


Tutorial 9: Reproducible Prototyping of Query Optimizer Components

Time: Friday, 27.06.2025, 13:30 - 15:00
Location: Köpenick I/II/III

Rico Bergmann (TUD Dresden University of Technology), Dirk Habich (TUD Dresden University of Technology)


Tutorial 8: OLTP Engines on Modern Storage Architectures (part 1)

Time: Friday, 27.06.2025, 13:30 - 15:00
Location: Schöneberg I/II/III

Daokun Hu (Ant Group), Quanqing Xu (OceanBase, Ant Group), Chuanghui Yang (OceanBase, Ant Group)


ProvenanceWeek

Time: Friday, 27.06.2025, 13:30 - 17:00
Location: Bellevue

https://ucdbg.github.io/ProvenanceWeek2025/

IPAW and TaPP build on a successful history of provenance workshops that bring together researchers from a wide range of computer science fields, including workflows, semantic web, databases, high performance computing, distributed systems, operating systems, programming languages, and software engineering, as well as researchers from other fields, such as biology and physics, that have urgent provenance needs.

Provenance is increasingly important in data science, workflow systems, and many other areas, particularly to support transparency, accountability and explanations. By providing a record of the data creation process and of dependencies between data, provenance information is essential for tracing errors in transformed data back to erroneous inputs, access control, auditing, repeatability and reproducibility, evaluating data quality, and establishing ownership of data.

Chairs

Senior Chairs


Tutorial 11: Data Storage and Management for Image AI Pipelines

Time: Friday, 27.06.2025, 15:30 - 17:00
Location: Köpenick I/II/III

Utku Sirin (Harvard University), Stratos Idreos (Harvard University)


Tutorial 8: OLTP Engines on Modern Storage Architectures (part 2)

Time: Friday, 27.06.2025, 15:30 - 17:00
Location: Schöneberg I/II/III

Daokun Hu (Ant Group), Quanqing Xu (OceanBase, Ant Group), Chuanghui Yang (OceanBase, Ant Group)

