# OpenAI

### 1. The "Ready-to-Use" AI (Overview)

Zero setup, zero API keys. Just drag, drop, and start prompting using Diaflow's built-in AI credits.

OpenAI’s superpower is range. In one node family, you can write text, understand images and PDFs, generate images, transcribe audio, and turn text into spoken audio.

### 2. How to Add the Node

On your canvas, open the node picker, search for **OpenAI**, and click it to add the node to your workflow.

<figure><img src="/files/lHrQowCCuNmNCb3rxW8V" alt=""><figcaption></figcaption></figure>

### 3. What can you automate? (The "Wow" Use Cases)

#### 1. **The Multimodal Deal Desk**

Send sales notes, proposal screenshots, or PDF materials into OpenAI and get fast summaries, key risks, and next-step recommendations for your team.

#### 2. **The Instant Creative Studio**

Use **DALL-E 3** to turn campaign ideas into ready-to-review visuals for ads, social posts, landing pages, or product launches.

#### 3. **The Voice Workflow Assistant**

Use **Whisper** to turn calls, voice notes, or recorded updates into text, then use **TTS** to turn approved messaging back into spoken audio.

### 4. Step-by-Step: Configuration & Smart Data Injection

#### 1. **Choose your Brain (Model):**

Select your preferred model from the dropdown.

Use **mini** or **nano** models for fast, lower-cost tasks. Use **GPT-4.1** or **GPT-5** models for deeper reasoning and higher-quality output.

Choose **DALL-E 3** for image creation, **Whisper** for audio transcription, and **TTS** or **TTS-HD** for spoken audio.

#### 2. **Write the Prompt & Inject Data:**

The easiest way to add live workflow data is to type `@` on your keyboard. Diaflow opens a visual dropdown so you can pick dynamic data from earlier steps.

Typing `{{` also works if you prefer the classic double-brace syntax.

Put your task, rules, tone, and desired format directly into the main Prompt box.
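If it helps to picture what a `{{ }}` reference does when the workflow runs, here is a minimal Python sketch of placeholder substitution. It is an illustration only, not Diaflow's actual implementation: the `render` function, the `trigger` node name, and the sample data are all hypothetical.

```python
import re

# Hypothetical outputs from earlier steps, keyed by node name.
context = {
    "trigger": {"data": "Acme Corp wants a 12-month contract."},
}

# Matches references of the form {{node_name.field}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(template, context):
    """Replace each {{node.field}} reference with the value from context."""

    def replace(match):
        node, field = match.group(1), match.group(2)
        return str(context[node][field])

    return PLACEHOLDER.sub(replace, template)


# Task and format live in the prompt; live data arrives via the reference.
prompt = render(
    "Summarize these sales notes in 3 bullets:\n{{trigger.data}}",
    context,
)
print(prompt)
```

The point to take away is that the reference is swapped for the earlier step's value before the model sees the prompt, so write the surrounding instructions as if the data were already there.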

If you want the node to remember earlier turns, turn on **Memory**. If you expect the same request to repeat, turn on **Caching**.

#### 3. **Attach Media (Images, PDFs, or audio):**

For vision tasks, pass image or PDF links from earlier steps into the media field.

For transcription, pass the audio file from an earlier step into the audio input.

For image generation and text-to-speech, you usually only need your written instruction and the right model choice.

<figure><img src="/files/JN3HqDiUIxyCm6LW5a16" alt=""><figcaption></figcaption></figure>

### 5. The Output: Seeing Your Results

This node returns the generated result for the model you chose.

In a later step, such as Gmail or Google Sheets, reference `{{node_name.data}}` (where `node_name` is the name of this node) to pass the generated text or media forward.

You may also see the model used and the final prompt that ran, so you can review exactly what the node worked with.

You will also see `input_token` and `output_token`. Because Diaflow handles the billing, these counts simply act as a meter so you can track how many credits your workflow is consuming.
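If you log these values across several runs, a small tally can show where usage is going. The sketch below assumes each run is stored as a dictionary carrying the `input_token` and `output_token` fields; the `runs` list and the `total_tokens` helper are hypothetical, and the result is a token count, not a credit amount.

```python
# Hypothetical usage records copied from a few runs of the node.
runs = [
    {"input_token": 120, "output_token": 340},
    {"input_token": 95, "output_token": 210},
]


def total_tokens(runs):
    """Return (input total, output total, combined total) across runs."""
    total_in = sum(run.get("input_token", 0) for run in runs)
    total_out = sum(run.get("output_token", 0) for run in runs)
    return total_in, total_out, total_in + total_out


print(total_tokens(runs))  # (215, 550, 765)
```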

Some runs may also show extra usage details, such as image count, audio length, or speech input size.

<figure><img src="/files/IwL5v9JHP91HHYesxmWb" alt=""><figcaption></figcaption></figure>

### 6. Golden Rules & Guardrails (CRITICAL PRO-TIPS)

#### **File/Image Limits:**

This node supports image and PDF input for vision tasks.

Supported visual file types include `png`, `jpg`, `jpeg`, `webp`, and `pdf`.

If you send an image or PDF to a model that does not support vision, the run fails.

For transcription, use an audio file. For text-to-speech, use text input.
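If your files come from an earlier step, it can be worth checking the extension before they reach the node. The following is a minimal sketch, assuming the file arrives as a URL whose path ends in the file name; the `is_supported_visual` helper is hypothetical, and it checks only the extension, not whether the selected model supports vision.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# The visual file types listed above.
SUPPORTED_VISUAL_TYPES = {"png", "jpg", "jpeg", "webp", "pdf"}


def is_supported_visual(url):
    """Return True if the URL's path ends in a supported extension."""
    path = urlparse(url).path  # drops any query string or fragment
    extension = PurePosixPath(path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_VISUAL_TYPES


print(is_supported_visual("https://example.com/files/proposal.PDF?token=abc"))  # True
print(is_supported_visual("https://example.com/files/call.mp3"))  # False
```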

#### **The System Prompt Trap:**

Some reasoning models do not reliably use a separate system instruction field.

For the most reliable results with advanced models, put your persona or system instructions directly into the main Prompt box.
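One way to picture this is as a single block of text with the persona first, then the task, then the format. The sketch below is only an illustration of that layout; the `build_prompt` helper and the `Task:` and `Format:` labels are hypothetical conventions, not something Diaflow requires.

```python
def build_prompt(persona, task, output_format=""):
    """Combine persona, task, and optional format into one prompt string."""
    parts = [persona.strip(), f"Task: {task.strip()}"]
    if output_format.strip():
        parts.append(f"Format: {output_format.strip()}")
    # Skip empty sections and separate the rest with blank lines.
    return "\n\n".join(part for part in parts if part)


print(
    build_prompt(
        "You are a senior sales analyst.",
        "Summarize the notes.",
        "Three bullet points.",
    )
)
```

The resulting text is what you would paste or assemble in the main Prompt box, with any `@` or `{{ }}` references placed inside the task section.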

#### **Auto-Streaming:**

Diaflow automatically streams longer answers to make the AI feel faster.

Shorter requests may return as one complete answer instead.

`gpt-5-mini` and `gpt-5-nano` return the full answer at once.

#### **Model behavior can vary:**

Some reasoning models ignore creative tuning controls, so focus on writing a clearer prompt instead of fine-tuning settings.

If one model feels too slow or too expensive, switch to a **mini** or **nano** option first.

If **GPT-5 Pro** is not available yet in your workspace, choose another OpenAI model for now.

#### **If demand is high:**

You may occasionally see a temporary high-demand message.

If that happens, wait a moment and run the step again.
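If you trigger workflows from your own script, the same "wait and run again" advice can be automated. This is a minimal sketch under stated assumptions: `run_step` stands for whatever function of yours runs the step, and `HighDemandError` is a hypothetical exception you would raise when you detect the high-demand message. Neither is part of Diaflow.

```python
import time


class HighDemandError(Exception):
    """Hypothetical stand-in for a temporary high-demand message."""


def run_with_retry(run_step, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call run_step, waiting longer after each high-demand failure."""
    for attempt in range(attempts):
        try:
            return run_step()
        except HighDemandError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            # Wait 2s, then 4s, then 8s, ... before trying again.
            sleep(base_delay * (2 ** attempt))
```

Doubling the wait after each failure gives a busy service room to recover instead of hitting it again immediately.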

### 7. Need Help?

* [**Understanding Diaflow Variables (`@` vs `{{ }}`)**](/agent-builder/agent-builder/cross-features-control/mention-tool.md)
* [**Contact Diaflow Support**](mailto:support@diaflow.io)