April 8, 2026

Boolean LLM-as-a-Judge Scores

Hassieb Pakzad
LLM-as-a-Judge evaluators can now return boolean scores for `true` / `false` decisions.

LLM-as-a-Judge evaluators in Langfuse can now return boolean scores in addition to numeric and categorical ones. This makes it easier to model simple decisions directly as native true or false scores and analyze them across your existing score tooling.

This is especially useful when the right answer is a binary judgment:

  • Detect User Disagreement as true or false
  • Detect Out-of-Scope Request as true or false
  • Detect Insufficient Answer as true or false

Numeric scores are still the right fit for continuous dimensions like helpfulness or faithfulness. Categorical scores remain best when you need more than two explicit labels. Boolean scores are the simplest option when the evaluator should return true or false. For concrete prompt examples, see LLM-as-a-Judge for Production Monitoring.
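
As a rough illustration of the distinction above, the three score shapes can be modeled as plain Python types. This is only a sketch: `NumericScore`, `CategoricalScore`, `BooleanScore`, and `describe` are hypothetical names made up for this example and are not part of the Langfuse SDK.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class NumericScore:
    name: str
    value: float  # continuous dimension, e.g. helpfulness or faithfulness


@dataclass(frozen=True)
class CategoricalScore:
    name: str
    value: str  # one of more than two explicit labels


@dataclass(frozen=True)
class BooleanScore:
    name: str
    value: bool  # a single true / false decision


Score = Union[NumericScore, CategoricalScore, BooleanScore]


def describe(score: Score) -> str:
    """Render a score as text, formatting the value according to its type."""
    if isinstance(score, BooleanScore):
        return f"{score.name}: {'true' if score.value else 'false'}"
    if isinstance(score, NumericScore):
        return f"{score.name}: {score.value:.2f}"
    return f"{score.name}: {score.value}"
```

For example, `describe(BooleanScore("out_of_scope_request", True))` renders the decision as a plain `true`, with no threshold or label mapping needed.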

What's New

  • Choose Boolean when creating a custom LLM-as-a-Judge evaluator
  • Store true / false outcomes as native boolean scores
  • Analyze boolean evaluator outputs in dashboards, filters, and score analytics alongside your existing scores
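
To give a feel for the kind of analysis native boolean scores allow, the sketch below computes the share of `true` outcomes per evaluator. It assumes the scores have already been exported as `(evaluator_name, value)` pairs; the function name `true_rate_by_evaluator` and that input shape are assumptions for this example, not a Langfuse API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def true_rate_by_evaluator(scores: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of true outcomes for each evaluator name."""
    totals: Dict[str, int] = defaultdict(int)
    trues: Dict[str, int] = defaultdict(int)
    for name, value in scores:
        totals[name] += 1
        if value:
            trues[name] += 1
    # Every name in totals has at least one score, so division is safe.
    return {name: trues[name] / totals[name] for name in totals}


# Hypothetical exported scores, for illustration only.
sample = [
    ("user_disagreement", True),
    ("user_disagreement", False),
    ("insufficient_answer", True),
]
rates = true_rate_by_evaluator(sample)
```

Because each value is already a boolean, no parsing of labels or numeric thresholds is needed before aggregating.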


