Playbook
LLM Observability Baseline
How to instrument token usage, latency, and output quality with reproducible diagnostics.
Execution Checklist
1. Track token volumes by workflow step
2. Estimate cost per request class
3. Capture traces for regressions
4. Diff outputs between model versions
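Steps 1 and 2 can be combined into a small accumulator that records token counts per workflow step and converts them to an estimated cost. A minimal sketch, assuming hypothetical per-1K-token prices and an illustrative `example-model` name (substitute your provider's real rates and model IDs):

```python
from collections import defaultdict

# Hypothetical per-1K-token prices in USD -- placeholders, not real rates.
PRICES = {
    "example-model": {"input": 0.0025, "output": 0.0100},
}

class UsageTracker:
    """Accumulate token counts per (workflow step, model) and estimate cost."""

    def __init__(self):
        self.usage = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, step: str, model: str, tokens_in: int, tokens_out: int) -> None:
        entry = self.usage[(step, model)]
        entry["input"] += tokens_in
        entry["output"] += tokens_out

    def cost(self, step: str, model: str) -> float:
        entry = self.usage[(step, model)]
        rates = PRICES[model]
        return (entry["input"] / 1000) * rates["input"] + (
            entry["output"] / 1000
        ) * rates["output"]

tracker = UsageTracker()
tracker.record("summarize", "example-model", tokens_in=1200, tokens_out=300)
tracker.record("summarize", "example-model", tokens_in=800, tokens_out=200)
print(round(tracker.cost("summarize", "example-model"), 4))  # → 0.01
```

Keying usage by (step, model) rather than by request lets the same tracker answer both "which workflow step is expensive" and "what did the model migration cost".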
Recommended Tools
LLM Token Counter
Count tokens and estimate model costs across GPT, Claude, Gemini, Llama, and more, with optional free API access for apps and agents
AI Cost Estimator
Estimate total AI API costs for real-world workloads across all major providers
Agent Trace Viewer
Visualize AI agent execution traces with timeline, table, and detail views for debugging LangChain and OpenAI agents
LLM Output Diff Tool
Compare outputs from different AI models side-by-side with diff highlighting
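For checklist step 4, side-by-side comparison of model outputs can also be scripted for CI pipelines. A minimal sketch using Python's standard-library `difflib`, with two hypothetical outputs standing in for responses from different model versions:

```python
import difflib

# Hypothetical outputs from two model versions for the same prompt.
old = "The cache is invalidated on write.\nTTL defaults to 60 seconds.\n"
new = "The cache is invalidated on every write.\nTTL defaults to 300 seconds.\n"

# Unified diff marks removed lines with "-" and added lines with "+".
diff = difflib.unified_diff(
    old.splitlines(keepends=True),
    new.splitlines(keepends=True),
    fromfile="model-v1",
    tofile="model-v2",
)
print("".join(diff))
```

Running the diff on stored baseline outputs before each model upgrade turns step 4 into a regression gate: an empty diff means behavior is unchanged for that prompt.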