Creating Automated Training Data
Create synthetic training examples automatically using the Alkemi Agent
The Automated tab uses AI to generate synthetic training examples, which speeds up training while review mechanisms maintain quality.
Ways to generate training data
1. Quick Generation (Synchronous)
Instantly creates up to 10 prompt/query pairs. Navigate to the Text to SQL Training tab in your Data Product and click the "Generate" button.

2. Configured Generation (Asynchronous) - Recommended
Configures ongoing generation of synthetic queries.
Quantity: Specify how many synthetic queries to generate in total
This number includes existing rows. If you have 5 rows already and set the
Auto-approval: Choose whether queries should be:
Automatically approved and used immediately
Held in "pending" status for manual review
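Because the Quantity setting is a total that includes existing rows, the number of new queries a run adds is the difference between that total and what you already have. The following is a minimal sketch of that arithmetic only. The function name is hypothetical and not part of any Alkemi API, the target of 20 is an illustrative value, and the floor at zero (nothing is generated once the target is met) is an assumption.

```python
def rows_to_generate(target_total: int, existing_rows: int) -> int:
    """Return how many new synthetic queries a run would need to add.

    Assumes the Quantity setting is a target total that already counts
    existing rows, and that nothing is generated once the target is met.
    """
    if target_total < 0 or existing_rows < 0:
        raise ValueError("counts must be non-negative")
    return max(0, target_total - existing_rows)


# Illustrative values: 5 existing rows and a target total of 20.
print(rows_to_generate(target_total=20, existing_rows=5))  # 15
```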
Configuration Options
When setting up automated generation, consider:
Volume: Start with smaller batches (between 5 and 20) to assess quality
Review requirements: Initially, either enable manual review or set a high minimum certainty (between 80% and 95%) to ensure the quality of generated examples
Iteration: Adjust configuration based on the quality of generated examples
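The considerations above can be sketched as a small configuration check. This is an illustration under stated assumptions, not the product's actual settings schema: the class name, the field names, and the rule that an example is auto-approved only when its certainty meets the minimum are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationConfig:
    """Hypothetical container for the options described above."""

    volume: int = 10             # number of queries requested
    auto_approve: bool = False   # False keeps every example in "pending"
    min_certainty: float = 0.90  # only consulted when auto_approve is True

    def warnings(self) -> list[str]:
        """List ways this config departs from the suggested starting ranges."""
        notes = []
        if not 5 <= self.volume <= 20:
            notes.append("volume is outside the suggested 5-20 starting range")
        if self.auto_approve and not 0.80 <= self.min_certainty <= 0.95:
            notes.append("min_certainty is outside the suggested 80%-95% range")
        return notes

    def status_for(self, certainty: float) -> str:
        """Assumed rule: auto-approve only at or above the minimum certainty."""
        if self.auto_approve and certainty >= self.min_certainty:
            return "approved"
        return "pending"


cfg = GenerationConfig(volume=10, auto_approve=True, min_certainty=0.90)
print(cfg.warnings())        # []
print(cfg.status_for(0.93))  # approved
print(cfg.status_for(0.72))  # pending
```

Starting from a config that produces no warnings, then loosening it only after reviewing a few batches, mirrors the iteration step: the quality of what was generated drives the next change to volume or certainty.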
Why asynchronous generation is preferred
Uses higher-quality AI models
Includes additional validation steps
Produces more accurate and relevant training examples
Runs in the background without interrupting your workflow