Research Direction
I care most about the parts of frontier AI work that demand judgment: deciding where capability is real, how to evaluate it, and what it takes to make advanced systems dependable in practice.
My interests sit at the overlap of research, product, evaluation, and operations. I am most useful in environments where a team needs to decide what matters, separate signal from noise, and turn emerging capability into something robust.
Working Principles
Product exploration is a research instrument
I do not see product exploration as downstream polish on top of research. Done properly, it is a way of discovering where model capability creates durable leverage, where it falls apart, and what kinds of interaction actually expose the underlying science.
Evaluation is epistemic infrastructure
A serious lab needs ways to convert vague impressions into sharper judgment. Good evaluation systems are not reporting dashboards; they are the mechanism by which research, product, and strategy stay grounded in something more rigorous than taste alone.
Dependability matters as much as capability
The interesting question is rarely whether a frontier system can produce an impressive result once. The real question is whether it can do so reliably enough to support real work, real decisions, and real trust.
Public artifacts improve private thinking
I like making technical judgment legible in public. When you have to expose your method, your assumptions, and your caveats, you usually think more clearly. Public-facing artifacts can be a forcing function for better internal standards too.
Public Artifacts
Job Exposure to AI, by AI
A public occupation atlas that combines labor-market data with structured multi-model judgments about replacement, augmentation, insulation, and disagreement.
Reflection AI Open Models
A public articulation of why frontier open models matter and why open research infrastructure is strategically important.