LLM Judge Rubric Builder

Compose weighted criteria, scoring scales, and a self-validating judge prompt, exportable as YAML or JSON.

Criteria (5)

Total weight: 1.000
Rubric is valid: weights sum to 1.0, every criterion has an id, name, and weight.
name: Customer Support Response Rubric
task: Score how well the assistant resolves a customer support ticket using only
  the provided knowledge base context.
judge_model: claude-sonnet-4.6
output_format: json
criteria:
  - id: factuality
    name: Factual accuracy
    description: All claims are correct and verifiable from the source material; no
      hallucinated facts.
    weight: 0.35
    scale: likert5
    pass_threshold: 4
  - id: completeness
    name: Completeness
    description: Response addresses every part of the user request without omission.
    weight: 0.25
    scale: likert5
    pass_threshold: 4
  - id: groundedness
    name: Groundedness
    description: Response cites or references provided context where relevant; no
      extrapolation beyond context.
    weight: 0.2
    scale: likert5
    pass_threshold: 3
  - id: format
    name: Output format compliance
    description: Response strictly matches the requested output format (JSON,
      markdown headings, length).
    weight: 0.1
    scale: pass_fail
  - id: tone
    name: Tone and clarity
    description: Tone matches brief; language is clear and free of jargon when not
      warranted.
    weight: 0.1
    scale: likert5
    pass_threshold: 3
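
To make the weights and thresholds above concrete, here is a minimal sketch of how a weighted total could be computed from per-criterion judge scores. It is not the tool's own scoring code. The rescaling of each score to 0..1, the 0/1 encoding of `pass_fail`, the helper names, and the sample scores are all assumptions made for illustration.

```python
# Assumed score ranges per scale; the 0/1 encoding of pass_fail is an assumption.
SCALE_RANGES = {"likert5": (1, 5), "pass_fail": (0, 1)}

# Ids, weights, scales, and thresholds copied from the rubric above.
CRITERIA = [
    {"id": "factuality", "weight": 0.35, "scale": "likert5", "pass_threshold": 4},
    {"id": "completeness", "weight": 0.25, "scale": "likert5", "pass_threshold": 4},
    {"id": "groundedness", "weight": 0.2, "scale": "likert5", "pass_threshold": 3},
    {"id": "format", "weight": 0.1, "scale": "pass_fail"},
    {"id": "tone", "weight": 0.1, "scale": "likert5", "pass_threshold": 3},
]


def weighted_total(criteria, scores):
    """Return the weighted sum of scores after rescaling each one to 0..1."""
    total = 0.0
    for c in criteria:
        low, high = SCALE_RANGES[c["scale"]]
        normalized = (scores[c["id"]] - low) / (high - low)
        total += c["weight"] * normalized
    return total


def failed_criteria(criteria, scores):
    """Return the ids of criteria whose score is below their pass_threshold."""
    return [
        c["id"]
        for c in criteria
        if "pass_threshold" in c and scores[c["id"]] < c["pass_threshold"]
    ]


# Hypothetical judge scores for a single response.
scores = {"factuality": 5, "completeness": 4, "groundedness": 3, "format": 1, "tone": 4}
print(round(weighted_total(CRITERIA, scores), 4))
print(failed_criteria(CRITERIA, scores))
```

Rescaling before weighting keeps a pass/fail criterion comparable to a Likert one; a different convention (for example, weighting raw scores) would give different totals.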

What This Tool Does

LLM Judge Rubric Builder is built for deterministic developer and agent workflows.

Build LLM-as-judge rubric YAML or JSON with weighted criteria, scoring scales, judge-prompt scaffolding, and rubric self-validation.

See How to Use for step-by-step instructions and the FAQ for constraints, policies, and edge cases.

This tool is provided as-is for convenience. Output should be verified before use in any production or critical context.

Agent Invocation

Best Path For Builders

Browser Workflow

This tool runs instantly in the browser and handles data locally. Run it here and copy or export the output directly.

/llm-judge-rubric-builder/

For automation planning, fetch the canonical contract at /api/tool/llm-judge-rubric-builder.json.

How to Use LLM Judge Rubric Builder

  1. Set the rubric metadata

     Name the rubric, write a short task description so the judge model knows what it is grading, choose your judge model, and select an output format (JSON, XML, or structured text).

  2. Add and weight criteria

     Each criterion needs a name, an id, a description, a weight, and a scale (Likert 1-5, Likert 1-7, pass/fail, percentage, or 0/1). Use Equal weights or Normalize to 1.0 to keep weights tidy.

  3. Validate the rubric

     The validation panel flags missing fields, duplicate ids, weights that do not sum to 1.0, mixed scales, and pass thresholds out of range. Resolve every error before exporting.

  4. Export YAML, JSON, or judge prompt

     Switch the right pane between YAML, JSON, and a generated judge prompt. The prompt renders criteria, score ranges, the procedure, and an exact output template that tracks your selected format.
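
The checks in step 3 and the Normalize to 1.0 action in step 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the builder's actual validator: the scale identifiers other than `likert5` and `pass_fail`, the numeric ranges, the function names, and the choice to treat mixed scales as a warning (with a cutoff of two) are all hypothetical.

```python
import math

# Assumed scale identifiers and ranges. Only "likert5" and "pass_fail" appear in
# the sample rubric; the other keys are hypothetical names for the listed scales.
SCALE_RANGES = {
    "likert5": (1, 5),
    "likert7": (1, 7),
    "pass_fail": (0, 1),
    "percentage": (0, 100),
    "binary": (0, 1),
}
REQUIRED_FIELDS = ("id", "name", "description", "weight", "scale")


def validate_rubric(criteria):
    """Return (errors, warnings) as lists of human-readable strings."""
    errors, warnings = [], []
    seen_ids = set()
    for index, c in enumerate(criteria):
        cid = c.get("id")
        label = cid or f"criterion #{index + 1}"
        for field in REQUIRED_FIELDS:
            if c.get(field) in (None, ""):
                errors.append(f"{label}: missing {field}")
        if cid and cid in seen_ids:
            errors.append(f"{label}: duplicate id")
        if cid:
            seen_ids.add(cid)
        bounds = SCALE_RANGES.get(c.get("scale"))
        threshold = c.get("pass_threshold")
        if bounds and threshold is not None and not bounds[0] <= threshold <= bounds[1]:
            errors.append(f"{label}: pass_threshold {threshold} outside {bounds}")
    total = sum(c.get("weight") or 0 for c in criteria)
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        errors.append(f"weights sum to {total:.3f}, expected 1.000")
    # Assumption: mixed scales produce a warning, with an arbitrary cutoff of 2.
    if len({c.get("scale") for c in criteria}) > 2:
        warnings.append("more than two different scales in one rubric")
    return errors, warnings


def normalize_weights(criteria):
    """Return copies of the criteria with weights rescaled to sum to 1.0."""
    total = sum(c["weight"] for c in criteria)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [{**c, "weight": c["weight"] / total} for c in criteria]


# Hypothetical draft with a duplicate id, a bad threshold, and unnormalized weights.
draft = [
    {"id": "a", "name": "A", "description": "first", "weight": 2,
     "scale": "likert5", "pass_threshold": 6},
    {"id": "a", "name": "B", "description": "second", "weight": 2,
     "scale": "likert5"},
]
errors, warnings = validate_rubric(draft)
fixed = normalize_weights(draft)
print(errors)
print([c["weight"] for c in fixed])
```

Comparing the weight sum with a tolerance rather than `==` avoids false errors from floating-point rounding when weights such as 0.35 and 0.2 are added.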

Frequently Asked Questions

Which scales are supported?
Likert 1-5, Likert 1-7, pass/fail, percentage 0-100, and binary 0/1. Each criterion gets its own scale and an optional pass threshold for graded scales.
How does rubric self-validation work?
The builder checks that weights sum to 1.0, that every criterion has a unique id and a name, that the rubric does not mix too many different scales, and that pass thresholds stay inside their scale range. Errors block export; warnings highlight risk without blocking.
What is in the generated judge prompt?
Task statement, every criterion with description and score range, a four-step procedure (read, score with justification, weighted total, pass/fail), and an exact output template in JSON, XML, or structured text.
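
As a rough illustration of that structure, the sketch below assembles a prompt with a task statement, criteria, a four-step procedure, and a JSON output template. The exact wording, the template field names (`scores`, `justification`, `weighted_total`, `pass`), and the function name are assumptions; the prompt the tool actually generates may differ.

```python
import json

# Assumed score ranges; the 0/1 encoding of pass_fail is an assumption.
SCALE_RANGES = {"likert5": (1, 5), "pass_fail": (0, 1)}


def render_judge_prompt(task, criteria):
    """Build a judge prompt: task, criteria, procedure, JSON output template."""
    lines = [f"Task: {task}", "", "Criteria:"]
    for c in criteria:
        low, high = SCALE_RANGES[c["scale"]]
        lines.append(
            f"- {c['id']} (weight {c['weight']}, score {low}-{high}): {c['description']}"
        )
    lines += [
        "",
        "Procedure:",
        "1. Read the task, the context, and the response.",
        "2. Score each criterion and give a short justification.",
        "3. Compute the weighted total.",
        "4. Report pass or fail.",
        "",
        "Respond with JSON matching this template exactly:",
    ]
    # Hypothetical template shape; None serializes to JSON null as a placeholder.
    template = {
        "scores": {c["id"]: {"score": None, "justification": ""} for c in criteria},
        "weighted_total": None,
        "pass": None,
    }
    lines.append(json.dumps(template, indent=2))
    return "\n".join(lines)


# Two criteria taken from the sample rubric, with shortened descriptions.
prompt = render_judge_prompt(
    "Score how well the assistant resolves a customer support ticket.",
    [
        {"id": "factuality", "weight": 0.35, "scale": "likert5",
         "description": "All claims are correct and verifiable."},
        {"id": "format", "weight": 0.1, "scale": "pass_fail",
         "description": "Response matches the requested output format."},
    ],
)
print(prompt)
```
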
Does it send my data to a server?
No. The rubric, validation, YAML/JSON serialization, and judge prompt are all generated locally in your browser. Nothing is uploaded.
Can I import an existing rubric?
Not via file import in this version. You can paste rubric JSON into the structure manually or call the registered Web MCP tool from an agent to validate a programmatically built rubric.