dunkeln.github.io

Work

Observability / Tracing

trace_lm

An observability layer for LLM systems that traces model calls and execution paths so prompts, tool use, and system behavior can be inspected as structured runs rather than opaque logs.

no tags

Applied AI Systems

ragops

A retrieval-focused experimentation and benchmarking project for understanding how RAG systems fail, with tooling aimed at making retrieval quality, context assembly, and debugging workflows easier to evaluate and iterate on.

tags
  • #benchmark
  • #rag

Evaluation / Experimentation

llm-evals-lab

A small evaluation lab for testing prompts, verifier ideas, and model behavior in a more repeatable way than ad hoc notebooks, with quick experiments designed to compare failure modes across setups.

no tags

Research code for modeling stochastic cellular automata with a two-stage transformer pipeline, using entropy-aware patch selection to improve prediction over uncertain spatial dynamics.

tags
  • #embeddings
  • #research-project
  • #stochastic-processes

I've added GoatCounter for privacy-friendly traffic analytics.