TECA Team

Weekly Devlog #2: Publishing the Eval Baseline Before GA

Devlog
Evals
Trust

This week we focused on one thing that matters for trust: making our AI quality bar legible before GA.

Instead of waiting to publish a polished scorecard later, we published the eval methodology now and made the baseline page public.

What shipped

  • Public **/eval** page with methodology, scoring rubric, and test categories.
  • Marketing-site footer link to the eval page.
  • Cross-document copy alignment so launch comms match the verified eval scope: **23 prompts across 5 categories**.

Why this matters

Most AI products ask for trust without showing how quality is measured. We are doing the opposite: publish the method first, then publish every scored baseline run against that method.

For us, this is a product principle, not a one-time launch asset:

  • If quality regresses, the release is blocked.
  • If a claim can’t be verified, we soften or remove it.
  • If we learn something new, we update the public artifact.
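
The first rule, blocking a release when quality regresses, could be encoded as a simple gate that compares a candidate run against the published baseline. The sketch below is illustrative only. The `CategoryScore` shape, the 0.0 to 1.0 score scale, the 0.02 tolerance, and the category names are all assumptions made for the example, not details of the actual eval pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryScore:
    """Mean rubric score for one test category (hypothetical shape)."""
    category: str
    score: float  # assumed to lie on a 0.0-1.0 scale


def find_regressions(baseline, candidate, tolerance=0.02):
    """Return the categories where the candidate run regressed.

    A category regresses when its candidate score is more than
    `tolerance` below the baseline score, or when the candidate run
    has no score for it at all.
    """
    candidate_by_name = {c.category: c.score for c in candidate}
    regressed = []
    for base in baseline:
        new_score = candidate_by_name.get(base.category)
        if new_score is None or new_score < base.score - tolerance:
            regressed.append(base.category)
    return regressed


def release_allowed(baseline, candidate, tolerance=0.02):
    """The release is blocked as soon as any category regresses."""
    return not find_regressions(baseline, candidate, tolerance)


# Made-up numbers and category names, purely for illustration.
baseline_run = [
    CategoryScore("summaries", 0.90),
    CategoryScore("recall", 0.80),
]
candidate_run = [
    CategoryScore("summaries", 0.85),  # dropped well past the tolerance
    CategoryScore("recall", 0.81),     # slightly improved
]

blocked_on = find_regressions(baseline_run, candidate_run)
```

Comparing per category rather than on a single overall average keeps a gain in one category from hiding a drop in another. A small tolerance avoids blocking releases on run-to-run noise.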

What’s next

  • Populate the first scored baseline run and publish the per-run numbers on /eval.
  • Link the final eval URL in the GA blog post before CEO final read.
  • Continue weekly devlogs with shipped artifacts + lessons, not just announcements.

---

If you’re building AI products, our advice from this week is simple: publish your evaluation standard before you publish your metrics.
