class Riffer::Evals::ScenarioResult

Represents the result of evaluating a single scenario.

Holds the scenario input, the agent output, the ground truth, the individual evaluator results, and (optionally) the message history from the agent conversation.

scenario_result = Riffer::Evals::ScenarioResult.new(
  input: "What is Ruby?",
  output: "A programming language.",
  ground_truth: "A programming language",
  results: [result1, result2]
)

scenario_result.scores  # => { MyEvaluator => 0.85 }

Attributes

ground_truth [R]

The ground truth used during evaluation.

input [R]

The input that was evaluated.

messages [R]

The full message history from the agent conversation.

output [R]

The agent output for this scenario.

results [R]

Individual evaluation results.

Public Class Methods

new (input:, output:, ground_truth:, results:, messages: [])

Source

# File lib/riffer/evals/scenario_result.rb, line 37
def initialize(input:, output:, ground_truth:, results:, messages: [])
  @input = input
  @output = output
  @ground_truth = ground_truth
  @results = results
  @messages = messages
end

Initializes a new scenario result. All keyword arguments are required except messages, which defaults to an empty array.

Public Instance Methods

scores ()

Source

# File lib/riffer/evals/scenario_result.rb, line 49
def scores
  results.each_with_object({}) do |result, hash|
    hash[result.evaluator] = result.score
  end
end

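Returns a hash of scores keyed by evaluator class. If several results share the same evaluator, the score of the last one in results is kept.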
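The keying behavior can be sketched without loading Riffer. In the snippet below, StubResult, RelevanceEvaluator, ToneEvaluator, and scores_for are hypothetical stand-ins, not part of the library; scores_for simply repeats the body of scores shown in the source above, assuming each result responds to evaluator and score.

```ruby
# Hypothetical stand-ins; the real evaluator and result classes live in Riffer.
class RelevanceEvaluator; end
class ToneEvaluator; end

StubResult = Struct.new(:evaluator, :score, keyword_init: true)

# Same logic as ScenarioResult#scores, lifted into a free-standing method
# so the sketch runs on its own.
def scores_for(results)
  results.each_with_object({}) do |result, hash|
    hash[result.evaluator] = result.score
  end
end

results = [
  StubResult.new(evaluator: RelevanceEvaluator, score: 0.85),
  StubResult.new(evaluator: ToneEvaluator, score: 0.6),
  # A second result from the same evaluator overwrites the first score.
  StubResult.new(evaluator: RelevanceEvaluator, score: 0.4)
]

scores = scores_for(results)
# => { RelevanceEvaluator => 0.4, ToneEvaluator => 0.6 }
```

Because the hash is keyed by evaluator class, it holds one score per evaluator. Code that needs every score from a repeated evaluator would have to read results directly.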

to_h ()

Source

# File lib/riffer/evals/scenario_result.rb, line 59
def to_h
  {
    input: input,
    output: output,
    ground_truth: ground_truth,
    scores: scores.transform_keys(&:name),
    results: results.map(&:to_h),
    messages: messages.map(&:to_h)
  }
end

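Returns a hash representation of the scenario result. The keys of scores are converted from evaluator classes to their class names, and each entry in results and messages is converted with to_h.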
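A minimal sketch of how the to_h output might be serialized, assuming nothing beyond the source shown above. SketchScenarioResult is a hypothetical local copy of the class so the snippet runs without the gem, and StubResult and RelevanceEvaluator are hypothetical stand-ins; the shape of a real result's to_h is not documented here and is assumed for illustration.

```ruby
require "json"

# Hypothetical stand-ins for Riffer's evaluator and result classes.
class RelevanceEvaluator; end

StubResult = Struct.new(:evaluator, :score, keyword_init: true) do
  # Assumed shape; the real result's to_h may differ.
  def to_h
    { evaluator: evaluator.name, score: score }
  end
end

# Local copy of the documented class so the sketch is self-contained.
class SketchScenarioResult
  attr_reader :input, :output, :ground_truth, :results, :messages

  def initialize(input:, output:, ground_truth:, results:, messages: [])
    @input = input
    @output = output
    @ground_truth = ground_truth
    @results = results
    @messages = messages
  end

  def scores
    results.each_with_object({}) { |result, hash| hash[result.evaluator] = result.score }
  end

  def to_h
    {
      input: input,
      output: output,
      ground_truth: ground_truth,
      scores: scores.transform_keys(&:name),
      results: results.map(&:to_h),
      messages: messages.map(&:to_h)
    }
  end
end

scenario_result = SketchScenarioResult.new(
  input: "What is Ruby?",
  output: "A programming language.",
  ground_truth: "A programming language",
  results: [StubResult.new(evaluator: RelevanceEvaluator, score: 0.85)]
)

hash = scenario_result.to_h
hash[:scores]   # => { "RelevanceEvaluator" => 0.85 }
hash[:messages] # => []

# String keys in scores make the hash straightforward to emit as JSON.
json = JSON.generate(hash)
```

Converting the evaluator classes to their names is what keeps the scores entry meaningful after serialization, since a class object itself has no natural JSON form.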