When creating Test Cases, capture expected results (in addition to capturing the name and test conditions, as Lamatic does currently). Consider also:
- Using code checks and AI to find deviations and grade result quality.
- Using that quality score as an input for deciding whether to deploy an update.
- Recommending the best option when A/B testing prompts, agents, workflows, models, and other settings such as temperature.
- Using these results to create benchmarks.
- Measuring and alerting on "model drift" over time.
- Integrating "human-in-the-loop" strategies to expand Test Cases and quality checking.
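To make the first two items concrete, here is a minimal sketch of what a code-based quality check feeding a deploy gate could look like. This is purely illustrative, not Lamatic's implementation: the function names (`score_case`, `gate_deployment`) and the threshold are hypothetical, and a real grader would likely use an LLM judge rather than the simple string similarity (stdlib `difflib`) used here as a stand-in.

```python
from difflib import SequenceMatcher


def score_case(expected: str, actual: str) -> float:
    """Quality score in [0, 1]: similarity of actual output to expected result.

    Stand-in for a richer grader (e.g. an LLM judge); hypothetical helper.
    """
    return SequenceMatcher(None, expected, actual).ratio()


def gate_deployment(cases: list[tuple[str, str]], threshold: float = 0.8) -> bool:
    """Allow deployment only if the mean quality score clears the threshold.

    `cases` pairs each Test Case's expected result with the actual output.
    """
    scores = [score_case(expected, actual) for expected, actual in cases]
    return sum(scores) / len(scores) >= threshold


# Example: identical outputs score 1.0 and pass the gate.
cases = [("2 + 2 = 4", "2 + 2 = 4")]
print(gate_deployment(cases))
```

The same per-case scores could then drive the later items on the list: stored over time they become benchmarks, and a downward trend in the mean is exactly the "model drift" signal worth alerting on.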
Added Later (based on feedback from another user):
Make it easier to see the history of Test Case executions without leaving Experiments. Marc Greenberg didn't realize the results were available in Logs, and even once he knew, he didn't like having to dig through Logs to find the execution details for a given test.
Planned
💡 Feature Requests
4 months ago
cwhiteman