DCC Thesis Topics

Advisor Co-advisor External
Areas Data Science and Engineering, Programming Languages
Subareas Databases, Language Semantics
Status Available
Description

Graph databases are everywhere: from social networks and biomedical research to fraud detection and knowledge graphs like Wikidata. When you query or validate these massive graphs, mistakes in your queries (such as comparing incompatible data types or referencing impossible combinations of categories) are discovered only after the system has already spent significant time and resources trying to compute an answer. What if we could detect, before running the query, that it will inevitably fail or return empty results? This thesis topic explores exactly that: using ideas from type systems and static analysis (techniques normally applied to programming languages) to catch errors early in graph query languages such as SPARQL and validation languages such as SHACL. The work builds on a recent OOPSLA 2025 publication from our group that successfully applied these ideas to the new Graph Query Language (GQL). The goal is to extend them to the richer and more widely deployed RDF/SPARQL ecosystem, where real datasets and benchmarks are readily available for evaluation.
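
To give a flavor of what "detecting the error before running the query" can mean, here is a deliberately tiny sketch in Python. It is not the type system from the OOPSLA 2025 paper, and it does not parse SPARQL or SHACL: the schema, the `Comparison` class, and the `check` function are all hypothetical names invented for this illustration. It only shows the basic idea of comparing a query's filters against declared datatypes without touching any data.

```python
from dataclasses import dataclass

# Hypothetical schema: property name -> declared datatype of its values.
SCHEMA = {"age": int, "name": str}


@dataclass(frozen=True)
class Comparison:
    """One filter condition of a toy query (illustrative only)."""
    prop: str        # property being filtered on
    op: str          # comparison operator, e.g. "<" or "="
    literal: object  # constant the property is compared against


def check(filters, schema=SCHEMA):
    """Return a list of error messages found without reading any graph data."""
    errors = []
    for f in filters:
        declared = schema.get(f.prop)
        if declared is None:
            # The schema knows no such property, so the filter can never match.
            errors.append(f"unknown property '{f.prop}'")
        elif not isinstance(f.literal, declared):
            # The literal's type is incompatible with the declared datatype.
            errors.append(
                f"'{f.prop}' has type {declared.__name__} but is compared "
                f"with {type(f.literal).__name__}"
            )
    return errors


good = [Comparison("age", "<", 30)]
bad = [Comparison("age", "<", "thirty"), Comparison("height", "=", 180)]

print(check(good))  # no problems reported for the well-typed filter
print(check(bad))   # two problems reported, both found statically
```

A real analysis would have to handle far more than this (query patterns, optional data, partially known schemas, which is where gradual typing comes in), but the principle is the same: reject or flag the query from its structure and the schema alone.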

This is a research-oriented topic that combines programming language theory (type systems, gradual typing) with databases (graph querying, schema validation). No prior knowledge of these specific areas is required. What matters is a strong foundation in computer science fundamentals, comfort with formal reasoning, and genuine curiosity about how theoretical tools can solve practical problems. You will learn to design formal calculi, prove metatheoretic properties, and build prototype tools that work on real-world data. This thesis will be co-supervised with Wenjia Ye, a postdoctoral researcher at The University of Hong Kong and first author of the OOPSLA 2025 paper that serves as the foundation for this work.