# Creating Automated Training Data

The Automated tab leverages AI to generate synthetic training examples, accelerating the training process while maintaining quality through review mechanisms.

## Ways to generate training data

### 1. Quick Generation (Synchronous)

Creates up to 10 prompt/query pairs immediately. To use it, navigate to the Text to SQL Training tab in your Data Product and click the **"Generate"** button.

{% hint style="success" %}

#### **What synchronous generation is best for**

Quick testing or when you need a few examples immediately.
{% endhint %}

{% hint style="warning" %}

#### **Trade-offs**

Uses faster but lower-quality models to meet speed requirements.
{% endhint %}

<figure><img src="https://1184501161-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FyLLMBTNtAM53TC60IKAS%2Fuploads%2FKMdY0dRtzlhZwzsUNsME%2Fgenerate.png?alt=media&#x26;token=6037bc0a-5b32-4e0c-a67b-c43ee4d2f164" alt=""><figcaption></figcaption></figure>

{% embed url="<https://www.loom.com/share/02394732fbc14c6783e9283dd534ebac?sid=e3d5d25a-7dd4-4258-a06e-2f8f89710ed8>" %}

### 2. Configured Generation (Asynchronous) - Recommended

Configures ongoing generation of synthetic queries.

* **Quantity**: Specify how many synthetic queries to generate in total
  * This number includes existing rows. If you have 5 rows already and set the
* **Auto-approval**: Choose whether queries should be:

  * Automatically approved and used immediately
  * Held in "pending" status for manual review

  <figure><img src="https://1184501161-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FyLLMBTNtAM53TC60IKAS%2Fuploads%2FtJv1NhcA42QDC8K6ETvU%2Fconfigure.png?alt=media&#x26;token=ea0f76c1-3d54-4b0f-9ce8-37f9598df767" alt=""><figcaption></figcaption></figure>
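Because the configured quantity counts rows that already exist, the number of newly generated queries is the shortfall between the target and the current row count. A minimal sketch of that arithmetic follows; the function name `rows_to_generate`, its parameters, and the sample numbers are hypothetical, and it assumes that a target at or below the existing row count results in no new rows.

```python
def rows_to_generate(target_total: int, existing_rows: int) -> int:
    """Return how many new synthetic rows are needed to reach the target total.

    Hypothetical helper for illustration only; not part of the product.
    """
    if target_total < 0 or existing_rows < 0:
        raise ValueError("counts must be non-negative")
    # The configured quantity includes existing rows, so only the shortfall is new.
    # Assumption: a target at or below the existing count yields no new rows.
    return max(0, target_total - existing_rows)


# Illustrative numbers: 12 rows exist and the quantity is set to 30.
print(rows_to_generate(target_total=30, existing_rows=12))  # 18
```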

#### **Configuration Options**

When setting up automated generation, consider:

* **Volume**: Start with smaller batches (between 5 and 20) to assess quality
* **Review requirements**: Initially, either enable manual review or set a high minimum certainty (between 80% and 95%) to ensure the quality of generated examples
* **Iteration**: Adjust configuration based on the quality of generated examples
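The review options above can be pictured as a simple routing rule. The sketch below is illustrative only: `GeneratedExample`, `review_status`, and the 0.0 to 1.0 certainty scale are assumptions, not the product's API, and it assumes that examples below the minimum certainty are held as "pending" rather than discarded.

```python
from dataclasses import dataclass


@dataclass
class GeneratedExample:
    """Hypothetical container for one synthetic prompt/query pair."""
    prompt: str
    sql: str
    certainty: float  # assumed scale: 0.0 (no confidence) to 1.0 (full confidence)


def review_status(
    example: GeneratedExample,
    min_certainty: float = 0.9,
    manual_review: bool = False,
) -> str:
    """Return "approved" or "pending" for a generated example."""
    if not 0.0 <= min_certainty <= 1.0:
        raise ValueError("min_certainty must be between 0.0 and 1.0")
    # With manual review enabled, every example waits for a human decision.
    if manual_review:
        return "pending"
    # Assumption: examples below the threshold are held, not dropped.
    return "approved" if example.certainty >= min_certainty else "pending"


example = GeneratedExample(
    prompt="How many orders were placed last month?",
    sql="SELECT COUNT(*) FROM orders",  # placeholder query for illustration
    certainty=0.86,
)
print(review_status(example, min_certainty=0.8))       # approved
print(review_status(example, min_certainty=0.95))      # pending
print(review_status(example, manual_review=True))      # pending
```

Starting with a threshold near the top of the 80% to 95% range and lowering it only after reviewing a few batches keeps early mistakes out of the approved set.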

{% hint style="success" %}

#### **Why asynchronous generation is preferred**

* Uses higher-quality AI models
* Includes additional validation steps
* Produces more accurate and relevant training examples
* Runs in the background without interrupting your workflow
  {% endhint %}

{% embed url="<https://www.loom.com/share/59509bd42ce5434eb8a6f999335ecfb6?sid=d91e8da0-5ad6-48d1-9dab-f14c3d1cd192>" %}
