Luis Gaspar Schroeder

San Francisco, California, USA

Hey! I'm Luis. I research and build efficient infrastructure and machine learning systems.

This was not always the plan. During the COVID lockdowns, in my last year of high school, I co-founded Faktor30[0]: six people building software for whoever needed it. A bank. A food delivery company. Our first real client was a bank in Hamburg that needed a system we'd never built before. The code was rarely the bottleneck. The hard part was building a team, defining a vision, and making sure everyone moved in the same direction.

I could build software but didn't understand the theory behind it. So I did my undergrad in computer science at TU Munich[1], building the theoretical foundations you can't get from coding alone. Then I joined Snowflake's Database Connectors team[2] and later Microsoft[3] to see how systems work at enterprise scale. Building data pipelines that thousands of teams depended on showed me what breaks when you go from ten users to ten thousand. I learned to combine theory and practice at scale. But I kept noticing that the people who invented the foundational ideas I was building on seemed to see something I didn't: not just how to extend existing systems, but how to invent new ones.

I wanted to learn that. I pursued my Master's in computer science and spent two semesters at UC Berkeley[4], researching at the Sky Computing Lab[5] with Joseph E. Gonzalez[6], Matei Zaharia[7], and Deepti Raghavan[8] at Stanford[9]. I went there to learn how to deconstruct systems, challenge assumptions at every layer, focus on research problems that actually matter, and keep exploring when the path forward isn't obvious. The work started with batch data analytics[10, MLSys 2025]. Everyone treated the inefficiency there as a model problem or a hardware problem. It was neither: it lived in how data flowed through the system layers. The solution? Reorder the content of the database tables to maximize KV cache reuse at the GPU level. No model changes. No new hardware. That insight led to vCache[11, ICLR 2026], a production-ready semantic caching system with mathematically proven error-rate guarantees: users define how often the system is allowed to be wrong, and it guarantees that rate while outperforming every baseline. From there came vAttention[12, ICLR 2026] and SkyLight[13] for efficient inference, ALTO[14] for compound AI orchestration, and work on overthinking[15] in agentic systems. I learned that you can't invent real solutions by optimizing one component in isolation; you need to understand the entire system.

Now I'm a founding member of the technical staff at UniversalAGI[16]. We design and train machine learning models for physics simulations in automotive, aerospace, and beyond: predictive models that compress months of design iteration into hours for every engineer who uses them. The physics matters. Models need to conserve momentum, respect boundary conditions, and maintain numerical stability as they scale. A single training sample can be 100× larger than a typical LLM sequence, and standard architectures break at that scale. You can't just add more GPUs; model architecture and training infrastructure have to be co-designed from the ground up.

Picks and shovels.

Feel free to reach out at luis.gasparschroeder[at-symbol]gmail.com