class Riffer::Evals::ScenarioResult

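Represents the result of evaluating a single scenario: the input, the agent output, the ground truth, the individual evaluation results, and the agent's message history.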

Attributes

ground_truth [R]

The ground truth used during evaluation.

input [R]

The input that was evaluated.

messages [R]

The full message history from the agent conversation.

output [R]

The agent output for this scenario.

results [R]

Individual evaluation results. Each result is expected to respond to evaluator, score, and to_h.

Public Class Methods

new (input:, output:, ground_truth:, results:, messages: [])

Source

# File lib/riffer/evals/scenario_result.rb, line 23
def initialize(input:, output:, ground_truth:, results:, messages: [])
  @input = input
  @output = output
  @ground_truth = ground_truth
  @results = results
  @messages = messages
end

Creates a scenario result from the evaluated input, the agent output, the ground truth, and the individual evaluation results. messages is optional and defaults to an empty array.

Public Instance Methods

scores ()

Source

# File lib/riffer/evals/scenario_result.rb, line 35
def scores
  acc = {} #: Hash[singleton(Riffer::Evals::Evaluator), Float]
  results.each_with_object(acc) do |result, hash|
    hash[result.evaluator] = result.score
  end
end

Returns a hash of scores keyed by evaluator class. If several results share an evaluator, the score from the last one is kept.
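
As a rough illustration of how scores are keyed, the sketch below reproduces the same each_with_object logic outside the class. FakeAccuracyEvaluator, FakeToneEvaluator, FakeResult, and scores_for are hypothetical stand-ins invented for this example; they are not part of Riffer.

```ruby
# Stand-ins for illustration only; the real evaluator and result classes
# are not shown in this section.
class FakeAccuracyEvaluator; end
class FakeToneEvaluator; end
FakeResult = Struct.new(:evaluator, :score)

# Same logic as the `scores` source above, extracted into a helper.
def scores_for(results)
  results.each_with_object({}) do |result, hash|
    hash[result.evaluator] = result.score
  end
end

results = [
  FakeResult.new(FakeAccuracyEvaluator, 0.9),
  FakeResult.new(FakeToneEvaluator, 0.5),
  FakeResult.new(FakeAccuracyEvaluator, 0.7) # same evaluator a second time
]

scores = scores_for(results)
# The keys are the classes themselves, not their names.
# The second FakeAccuracyEvaluator result overwrites the first.
```

Because the hash is keyed by class, callers look up a score with the evaluator constant itself, for example scores[FakeAccuracyEvaluator].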

to_h ()

Source

# File lib/riffer/evals/scenario_result.rb, line 46
def to_h
  {
    input: input,
    output: output,
    ground_truth: ground_truth,
    scores: scores.transform_keys(&:name),
    results: results.map(&:to_h),
    messages: messages.map(&:to_h)
  }
end

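Returns a hash representation of the scenario result. The keys of scores are converted from evaluator classes to their class names, and each entry in results and messages is converted with to_h.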
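
The following is a minimal sketch of what to_h produces and why the class keys are turned into names, assuming the hash is meant to be serialized. SketchScenarioResult mirrors the source listings above, while SketchEvaluator and SketchResult are hypothetical stand-ins, since the real evaluator and result classes are not shown here.

```ruby
require "json"

# Hypothetical stand-ins, not part of Riffer.
class SketchEvaluator; end

SketchResult = Struct.new(:evaluator, :score) do
  # Assumed shape for a result hash; the real result class may differ.
  def to_h
    { evaluator: evaluator.name, score: score }
  end
end

# Local copy of the documented behavior so the example runs on its own.
class SketchScenarioResult
  attr_reader :input, :output, :ground_truth, :results, :messages

  def initialize(input:, output:, ground_truth:, results:, messages: [])
    @input = input
    @output = output
    @ground_truth = ground_truth
    @results = results
    @messages = messages
  end

  def scores
    results.each_with_object({}) { |result, hash| hash[result.evaluator] = result.score }
  end

  def to_h
    {
      input: input,
      output: output,
      ground_truth: ground_truth,
      scores: scores.transform_keys(&:name),
      results: results.map(&:to_h),
      messages: messages.map(&:to_h)
    }
  end
end

result = SketchScenarioResult.new(
  input: "What is 2 + 2?",
  output: "4",
  ground_truth: "4",
  results: [SketchResult.new(SketchEvaluator, 1.0)]
) # messages is omitted, so it defaults to []

hash = result.to_h
json = JSON.generate(hash) # string keys in scores serialize cleanly
```

With class names as keys, the scores entry survives a JSON round trip as plain strings, which would not hold for the class objects returned by scores.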