class Riffer::Evals::Judge

Executes LLM-as-judge evaluations, using tool calling internally to get structured output from the judge model.

Attributes

model

The model string (provider/model format).

Public Class Methods

new (model:, provider_options: {})

Source

# File lib/riffer/evals/judge.rb, line 37
def initialize(model:, provider_options: {})
  provider_name, model_name = model.split("/", 2)
  unless [provider_name, model_name].all? { |part| part.is_a?(String) && !part.strip.empty? }
    raise Riffer::ArgumentError, "Invalid model string: #{model}"
  end

  @model = model
  @provider_options = provider_options
end

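Raises Riffer::ArgumentError unless model is in “provider/model” format, with a non-blank provider and a non-blank model name. Only the first “/” separates the two parts, so the model name may itself contain slashes.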
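
The validation rule can be tried outside the library. The following is a minimal sketch, not the library's code: `split_model_string` and `SketchArgumentError` are hypothetical names introduced here so the snippet runs without the gem, and the model strings are illustrative only. The check itself is copied from the source shown above.

```ruby
# Stand-in for Riffer::ArgumentError so the sketch runs without the gem (hypothetical name).
class SketchArgumentError < ArgumentError; end

# Hypothetical helper that mirrors the check performed in Judge#initialize.
def split_model_string(model)
  # Limit of 2: only the first "/" splits provider from model name.
  provider_name, model_name = model.split("/", 2)
  unless [provider_name, model_name].all? { |part| part.is_a?(String) && !part.strip.empty? }
    raise SketchArgumentError, "Invalid model string: #{model}"
  end

  [provider_name, model_name]
end

split_model_string("openai/gpt-4o")         # => ["openai", "gpt-4o"]
split_model_string("vendor/family/variant") # => ["vendor", "family/variant"]

# A missing or blank part is rejected.
begin
  split_model_string("openai")
rescue SketchArgumentError => e
  puts e.message # prints "Invalid model string: openai"
end
```

Strings such as "/gpt-4o", "openai/", and "" fail the same way, because one of the two parts is empty or nil after the split.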

Public Instance Methods

evaluate (instructions:, input:, output:, ground_truth: nil)

Source

# File lib/riffer/evals/judge.rb, line 50
def evaluate(instructions:, input:, output:, ground_truth: nil)
  system_message = build_system_message(instructions)
  user_message = build_user_message(input: input, output: output, ground_truth: ground_truth)

  response = provider_instance.generate_text(
    system: system_message,
    prompt: user_message,
    model: model_name,
    tools: [EvaluationTool]
  )

  parse_tool_response(response)
end

Evaluates an input/output pair against the given instructions using the configured LLM. ground_truth is optional and defaults to nil.
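
The build, generate, parse flow of evaluate can be illustrated without a real provider. This is a rough sketch under stated assumptions, not Riffer's implementation: `FakeProvider`, `SketchJudge`, the message wording, the `:evaluation_tool` placeholder, and the response shape with `score` and `reason` fields are all hypothetical. The source above does not show what `build_system_message`, `build_user_message`, or `parse_tool_response` actually produce.

```ruby
# Hypothetical provider stand-in: records the request and returns a canned
# tool call instead of contacting an LLM.
class FakeProvider
  attr_reader :last_request

  def generate_text(system:, prompt:, model:, tools:)
    @last_request = { system: system, prompt: prompt, model: model, tools: tools }
    # Assumed response shape; the real provider response may differ.
    { tool_call: { arguments: { score: 0.9, reason: "Matches the ground truth." } } }
  end
end

# Hypothetical judge that mirrors the three steps of evaluate.
class SketchJudge
  def initialize(provider:, model_name:)
    @provider = provider
    @model_name = model_name
  end

  def evaluate(instructions:, input:, output:, ground_truth: nil)
    response = @provider.generate_text(
      system: "You are an evaluator. #{instructions}",
      prompt: build_user_message(input: input, output: output, ground_truth: ground_truth),
      model: @model_name,
      tools: [:evaluation_tool] # placeholder for a tool definition
    )
    # Structured output comes from the tool-call arguments, not free text.
    response.fetch(:tool_call).fetch(:arguments)
  end

  private

  # Assumed message layout; ground truth is included only when given.
  def build_user_message(input:, output:, ground_truth:)
    lines = ["Input: #{input}", "Output: #{output}"]
    lines << "Ground truth: #{ground_truth}" unless ground_truth.nil?
    lines.join("\n")
  end
end

provider = FakeProvider.new
judge = SketchJudge.new(provider: provider, model_name: "some-model")
result = judge.evaluate(
  instructions: "Score factual accuracy from 0 to 1.",
  input: "What is the capital of France?",
  output: "Paris",
  ground_truth: "Paris"
)
puts result[:score] # prints 0.9 (the canned value from FakeProvider)
```

Passing the tool to the model and reading its arguments back is what yields a structured result rather than prose that would need to be parsed.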