Hassieb Pakzad
April 8, 2026
Boolean LLM-as-a-Judge Scores

LLM-as-a-Judge evaluators can now return boolean scores for `true` / `false` decisions.
LLM-as-a-Judge evaluators in Langfuse can now return boolean scores in addition to numeric and categorical ones. This makes it easier to model simple decisions directly as native true or false scores and analyze them across your existing score tooling.
This is especially useful when the right answer is a binary judgment:
- Detect User Disagreement as `true` or `false`
- Detect Out-of-Scope Request as `true` or `false`
- Detect Insufficient Answer as `true` or `false`
Numeric scores are still the right fit for continuous dimensions like helpfulness or faithfulness. Categorical scores remain best when you need more than two explicit labels. Boolean scores are the simplest option when the evaluator should return true or false. For concrete prompt examples, see LLM-as-a-Judge for Production Monitoring.
What's New
- Choose `Boolean` when creating a custom LLM-as-a-Judge evaluator
- Store `true` / `false` outcomes as native boolean scores
- Analyze boolean evaluator outputs in dashboards, filters, and score analytics alongside your existing scores
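To illustrate the kind of analysis boolean scores make straightforward, here is a minimal sketch that computes the share of `true` outcomes per evaluator. It uses plain Python only and does not call the Langfuse SDK; the record layout, the sample values, and the `true_rate_by_evaluator` helper are assumptions made up for illustration, not the Langfuse score schema.

```python
from collections import defaultdict

# Hypothetical records: (evaluator name, boolean outcome).
# This layout is an assumption for illustration, not the Langfuse score schema.
scores = [
    ("User Disagreement", True),
    ("User Disagreement", False),
    ("Out-of-Scope Request", False),
    ("Insufficient Answer", True),
    ("Insufficient Answer", True),
]


def true_rate_by_evaluator(records):
    """Return the fraction of True outcomes for each evaluator name."""
    counts = defaultdict(lambda: [0, 0])  # name -> [true_count, total_count]
    for name, value in records:
        # Reject anything that is not a native boolean (e.g. 1, "true").
        if not isinstance(value, bool):
            raise TypeError(f"expected bool for {name!r}, got {type(value).__name__}")
        counts[name][0] += int(value)
        counts[name][1] += 1
    return {name: true_count / total for name, (true_count, total) in counts.items()}


rates = true_rate_by_evaluator(scores)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} true")
```

Because the values are native booleans rather than labels or numbers, the aggregate is a simple rate with no label mapping or threshold to maintain.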