UTN Data Systems
  • String Fingerprints

    Cloud data warehouses are text-heavy. As the amount of text to scan grows, queries slow down, so query engines need fast pre-filters to accelerate them. We present string fingerprints, a lightweight secondary index structure that approximates LIKE predicates at the cost of some false positives. Fingerprints can be optimized for specific workloads using mixed-integer optimization, and they even generalize to unseen table filters.

    5 min read   ·   March 23, 2026

  • Benchmarking Semantic Query Processing Systems

    Semantic query processing is emerging as a new layer atop relational engines, elevating LLM-backed semantic operators to first-class SQL primitives for multimodal data. We present SemBench, the first benchmark to rigorously evaluate these systems end-to-end, and outline the roadmap for our own system, Spectra, which aims to make semantic operators affordable at scale.

    13 min read   ·   February 16, 2026

  • Democratizing Data Science

    Our vision is to build an end-to-end agentic data platform, enabling domain experts to acquire, clean, analyze, and visualize data in a principled manner by combining the benefits of LLMs with decades of database research.

    5 min read   ·   January 16, 2026

  • Launching Our Blog And Wrapping Up 2025

    I'm super excited to launch our blog! We'll use this space to share what's happening in our lab, from research papers and systems to the day-to-day life of our team. To kick things off, let's look back at 2025.

    4 min read   ·   December 31, 2025

© Copyright 2026. Impressum. Last updated: March 25, 2026.