🎉 Experiments is here!

December 10, 2024

We are thrilled to announce that Experiments is out of beta.

Experiments is designed to help you tune your LLM prompt, test it on production data, and verify your iterations with quantifiable data.

Main use cases

1. Continuous Improvement

Analyze production edge cases to refine your application's performance.

2. Pre-deployment Testing

Benchmark new releases rigorously before rolling out to production environments.

3. Structured Testing

Implement LLM-as-a-judge or custom evaluation metrics, then compare prompt variations side by side with a quick, actionable feedback loop.

4. Prompt Optimization

Run evaluators to determine the best prompt for production and prevent performance regressions.
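
To make use cases 3 and 4 concrete, here is a minimal sketch of the underlying idea in plain Python: score two prompt variants with a custom evaluator on a small dataset, then flag the candidate if it scores below the baseline. This is not the Experiments API. Every name here (run_baseline, run_candidate, exact_match, is_regression, DATASET) is hypothetical, and the two "variants" are simple string functions standing in for real LLM calls.

```python
from statistics import mean
from typing import Callable

# Hypothetical stand-ins for "prompt variant + LLM call".
# In a real setup each would send a prompt to a model and return its reply.
def run_baseline(text: str) -> str:
    return text.strip().lower()

def run_candidate(text: str) -> str:
    return text.lower()  # deliberately skips strip() so it scores worse

# Custom evaluation metric (assumption: exact match is good enough here).
def exact_match(output: str, expected: str) -> float:
    return 1.0 if output == expected else 0.0

# Tiny illustrative dataset of (input, expected output) pairs.
DATASET = [
    ("  Hello ", "hello"),
    ("WORLD", "world"),
    ("Mixed Case  ", "mixed case"),
    ("ok", "ok"),
]

def score_variant(
    run: Callable[[str], str],
    dataset: list[tuple[str, str]],
    evaluator: Callable[[str, str], float],
) -> float:
    """Average evaluator score of one variant over the whole dataset."""
    return mean([evaluator(run(inp), expected) for inp, expected in dataset])

def is_regression(baseline: float, candidate: float, tolerance: float = 0.0) -> bool:
    """True when the candidate scores below the baseline by more than tolerance."""
    return candidate < baseline - tolerance

# Side-by-side comparison of the two variants.
scores = {
    "baseline": score_variant(run_baseline, DATASET, exact_match),
    "candidate": score_variant(run_candidate, DATASET, exact_match),
}
regressed = is_regression(scores["baseline"], scores["candidate"])

for name, value in scores.items():
    print(f"{name}: {value:.2f}")
print("regression detected" if regressed else "no regression")
```

Swapping exact_match for an LLM-as-a-judge function with the same (output, expected) signature would leave the comparison and regression check unchanged.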

For detailed documentation, refer to our updated docs.