Overview
ReliabilityEval asserts that agents call expected tools, handle errors correctly, and produce non-empty responses.
Quick Start
Case Options
| Field | Type | Description |
|---|---|---|
| `expectedTools` | `string[]` | Tool names that should be called |
| `shouldError` | `boolean` | Whether the case should throw an error |
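
To make the two fields concrete, here is a minimal sketch of how a case could be checked against the outcome of a single agent run. It is not the library's implementation. The `ReliabilityCase` and `RunOutcome` types and the `checkCase` function are hypothetical names introduced for illustration, and treating a missing `shouldError` as `false` is an assumption.

```typescript
// Hypothetical shapes for illustration only; not the library's actual types.
interface ReliabilityCase {
  expectedTools?: string[]; // tool names that should be called
  shouldError?: boolean;    // whether the case should throw an error
}

interface RunOutcome {
  calledTools: string[]; // tool names the agent actually called
  threw: boolean;        // whether the run threw an error
  response: string;      // final text response ("" if there was none)
}

// Returns a list of failure messages; an empty list means the case passed.
function checkCase(c: ReliabilityCase, run: RunOutcome): string[] {
  const failures: string[] = [];

  // Every expected tool must appear among the tools that were called.
  const called = new Set(run.calledTools);
  for (const tool of c.expectedTools ?? []) {
    if (!called.has(tool)) {
      failures.push(`expected tool not called: ${tool}`);
    }
  }

  // Assumption: a missing shouldError means the run is expected to succeed.
  const shouldError = c.shouldError ?? false;
  if (shouldError !== run.threw) {
    failures.push(
      shouldError
        ? "expected an error, but none was thrown"
        : "unexpected error thrown"
    );
  }

  // Only successful, non-error cases are required to produce a response.
  if (!shouldError && !run.threw && run.response.trim() === "") {
    failures.push("response is empty");
  }

  return failures;
}
```

In this sketch, extra tool calls beyond `expectedTools` are not treated as failures; a stricter check could compare the two lists exactly.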
Tool Call Match Scorer
Use `toolCallMatch` as a standalone scorer: