class Riffer::Evals::RunResult
Represents the complete result of an evaluation run across multiple scenarios.
Contains per-scenario results and provides aggregate scores.
run_result = Riffer::Evals::RunResult.new(
  scenario_results: [scenario_result1, scenario_result2]
)
run_result.scores # => { MyEvaluator => 0.85 }
Attributes
scenario_results: Per-scenario evaluation results.
Public Class Methods
Source
# File lib/riffer/evals/run_result.rb, line 21
def initialize(scenario_results:)
  @scenario_results = scenario_results
end
Initializes a new run result.
: (scenario_results: Array) -> void
Public Instance Methods
Source
# File lib/riffer/evals/run_result.rb, line 28
def scores
  return {} if scenario_results.empty?

  totals = Hash.new(0.0)
  counts = Hash.new(0)

  scenario_results.each do |scenario|
    scenario.scores.each do |evaluator, score|
      totals[evaluator] += score
      counts[evaluator] += 1
    end
  end

  totals.each_with_object({}) do |(evaluator, total), hash|
    hash[evaluator] = total / counts[evaluator]
  end
end
Returns the average score for each evaluator class. Each average is taken over the scenarios that report a score for that evaluator, so an evaluator missing from some scenarios is not penalized for them. Returns an empty hash when there are no scenario results.
: () -> Hash[singleton(Riffer::Evals::Evaluator), Float]
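The averaging can be checked with a small standalone sketch. SketchRunResult copies the method bodies from the source listing so the example runs without the gem; FakeScenario, AccuracyEvaluator and ToneEvaluator are hypothetical stand-ins and are not part of Riffer. It assumes only that each scenario result responds to scores with a hash of evaluator class to Float.

```ruby
# Stand-in that mirrors the RunResult source listing, so this runs without the gem.
class SketchRunResult
  attr_reader :scenario_results

  def initialize(scenario_results:)
    @scenario_results = scenario_results
  end

  def scores
    return {} if scenario_results.empty?

    totals = Hash.new(0.0)
    counts = Hash.new(0)

    scenario_results.each do |scenario|
      scenario.scores.each do |evaluator, score|
        totals[evaluator] += score
        counts[evaluator] += 1
      end
    end

    totals.each_with_object({}) do |(evaluator, total), hash|
      hash[evaluator] = total / counts[evaluator]
    end
  end
end

# Hypothetical stand-ins for a scenario result and two evaluator classes.
FakeScenario = Struct.new(:scores)
class AccuracyEvaluator; end
class ToneEvaluator; end

first  = FakeScenario.new({ AccuracyEvaluator => 1.0, ToneEvaluator => 0.5 })
second = FakeScenario.new({ AccuracyEvaluator => 0.5 }) # no ToneEvaluator score

averages = SketchRunResult.new(scenario_results: [first, second]).scores
# AccuracyEvaluator is averaged over two scenarios: (1.0 + 0.5) / 2
# ToneEvaluator is averaged over the one scenario that scored it: 0.5 / 1

empty = SketchRunResult.new(scenario_results: []).scores
```

Because the divisor is the per-evaluator count rather than the number of scenarios, ToneEvaluator keeps its single score of 0.5 instead of being halved.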
Source
# File lib/riffer/evals/run_result.rb, line 49
def to_h
  {
    scores: scores.transform_keys(&:name),
    scenario_results: scenario_results.map(&:to_h)
  }
end
Returns a hash representation of the run result.
: () -> Hash[Symbol, untyped]
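A short sketch of what the key transformation in to_h does: evaluator classes are replaced by their names as strings, which presumably makes the hash easier to serialize. AccuracyEvaluator is a hypothetical class, and the empty scenario_results array is a placeholder rather than real output.

```ruby
require "json"

# Hypothetical evaluator class, used only as a hash key.
class AccuracyEvaluator; end

scores = { AccuracyEvaluator => 0.75 }

# Same shape as the hash built in to_h: class keys become String names.
serializable = {
  scores: scores.transform_keys(&:name),
  scenario_results: [] # placeholder; to_h maps each scenario result to a hash
}

json = JSON.generate(serializable)
round_trip = JSON.parse(json)
```

After the round trip through JSON, the score is still reachable under the string key "AccuracyEvaluator".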