module Riffer::Evals::EvaluatorRunner
Orchestrates running evaluators against an agent across multiple scenarios.
  result = Riffer::Evals::EvaluatorRunner.run(
    agent: MyAgent,
    scenarios: [
      { input: "What is Ruby?", ground_truth: "A programming language" },
      { input: "What is Python?" }
    ],
    evaluators: [AnswerRelevancyEvaluator]
  )

  result.scores # => { AnswerRelevancyEvaluator => 0.85 }
Public Instance Methods
Source
  # File lib/riffer/evals/evaluator_runner.rb, line 24
  def run(agent:, scenarios:, evaluators:, context: nil)
    validate_agent!(agent)
    validate_evaluators!(evaluators)

    scenario_results = scenarios.map do |scenario|
      run_scenario(agent: agent, scenario: scenario, evaluators: evaluators, context: context)
    end

    Riffer::Evals::RunResult.new(scenario_results: scenario_results)
  end
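Runs every evaluator against the agent for each of the given scenarios and returns a Riffer::Evals::RunResult built from the per-scenario results. The optional context is passed through to each scenario run. Raises Riffer::ArgumentError if the agent or any evaluator is invalid.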
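The validate, map, aggregate flow described above can be illustrated with a stand-alone sketch. Nothing in it is part of Riffer: SketchRunner, UpcaseAgent, ExactMatchEvaluator, the #call and #evaluate interfaces, and the mean-based aggregation in scores are all assumptions made for illustration. It also raises Ruby's built-in ArgumentError rather than Riffer::ArgumentError.

```ruby
# Stand-alone sketch of the validate -> map -> aggregate flow.
# Every name here is hypothetical; this is not Riffer's implementation.
module SketchRunner
  ScenarioResult = Struct.new(:input, :output, :scores, keyword_init: true)

  class RunResult
    attr_reader :scenario_results

    def initialize(scenario_results:)
      @scenario_results = scenario_results
    end

    # Assumption: an evaluator's overall score is its mean across scenarios.
    def scores
      collected = Hash.new { |hash, key| hash[key] = [] }
      scenario_results.each do |result|
        result.scores.each { |evaluator, score| collected[evaluator] << score }
      end
      collected.transform_values { |values| values.sum / values.size.to_f }
    end
  end

  def self.run(agent:, scenarios:, evaluators:, context: nil)
    # Validate up front so no scenario runs with a bad configuration.
    raise ArgumentError, "agent must respond to #call" unless agent.respond_to?(:call)
    evaluators.each do |evaluator|
      next if evaluator.respond_to?(:evaluate)
      raise ArgumentError, "#{evaluator.inspect} must respond to #evaluate"
    end

    # One result per scenario, each holding one score per evaluator.
    scenario_results = scenarios.map do |scenario|
      output = agent.call(scenario[:input], context: context)
      scores = evaluators.to_h do |evaluator|
        [evaluator, evaluator.evaluate(output: output, ground_truth: scenario[:ground_truth])]
      end
      ScenarioResult.new(input: scenario[:input], output: output, scores: scores)
    end

    RunResult.new(scenario_results: scenario_results)
  end
end

# Hypothetical agent: upcases its input and ignores the context.
UpcaseAgent = ->(input, context: nil) { input.upcase }

# Hypothetical evaluator: 1.0 on an exact match with the ground truth, else 0.0.
module ExactMatchEvaluator
  def self.evaluate(output:, ground_truth:)
    output == ground_truth ? 1.0 : 0.0
  end
end

result = SketchRunner.run(
  agent: UpcaseAgent,
  scenarios: [
    { input: "ruby", ground_truth: "RUBY" },
    { input: "python", ground_truth: "Python" }
  ],
  evaluators: [ExactMatchEvaluator]
)

result.scores # => { ExactMatchEvaluator => 0.5 }
```

Validating before the map means a misconfigured agent or evaluator fails the whole run immediately instead of partway through the scenario list.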