# ElevenLabs Cloud

<figure><img src="/files/hR4rtmRDgJeJ7m0eUg8N" alt=""><figcaption></figcaption></figure>

## Description

The **ElevenLabs** node converts text into lifelike speech using the ElevenLabs Text-to-Speech API, and can also transcribe audio into text. You can select from the models and voices offered by ElevenLabs to generate high-quality, natural-sounding audio from input text.

This node is ideal for use cases such as:

* Creating voiceovers for videos or presentations
* Generating spoken feedback or summaries
* Building voice-enabled assistants
* Enhancing accessibility features with audio output

To use this node, you need a valid ElevenLabs API credential. Once it is configured, choose the action (currently "Text to Speech" or "Speech to Text") and select the model. For Text to Speech, also choose a voice and enter the text you want to convert; for Speech to Text, provide the audio file you want to transcribe.

The resulting audio or transcript can be used in subsequent steps of your workflow or delivered to users.

## Inputs

The ElevenLabs component has the following input connections.

| Input Name                               | Description                                                                       | Constraints                                                                                                      |
| ---------------------------------------- | --------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| From Data Loaders/Data Source/Vector DB  | This input connection represents the context information for the ElevenLabs model. | Must originate from a Data Loader, Data Source, or Vector DB component.                                          |
| From Input                               | This input connection represents the user query for the ElevenLabs model.         | Must originate from a component that generates a text string as output such as a Python or Text Input component. |

## Component settings

| Parameter Name                       | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Credentials                          | The ElevenLabs credentials the node uses. You can supply your own. |
| Action                               | <p>Choose the action:<br>- Text to Speech<br>- Speech to Text</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| Model                                | The list of models displayed depends on the customer's credentials.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| Voice (With action = Text to Speech) | The list of voices displayed depends on the customer's credentials.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| Text (With action = Text to Speech)  | <p>The content to be converted into speech.</p><p>You can either type plain text directly or <strong>refer to the output of previous nodes</strong> by using the <code>@</code> symbol to select dynamic data from earlier steps in your workflow. This lets you generate speech from content that was extracted, generated, or transformed in previous nodes, such as scraped text, AI-generated summaries, or user input.</p><p><strong>Examples:</strong></p><ul><li>Static text: <code>Hello, welcome to our service.</code></li><li>Dynamic text: <code>@web-scraper.output.content</code> (refers to content scraped from a website)</li></ul><p>Make sure the referenced data is plain text for the best speech synthesis quality.</p> |
| Audio file (With action = Speech to Text) | <p>The <strong>input audio</strong> that you want to transcribe into text.</p><p>You can either:</p><ul><li>Upload an audio file directly (in a supported format such as <code>.mp3</code> or <code>.wav</code>), or</li><li><strong>Dynamically reference audio output</strong> from previous nodes in the workflow by using the <code>@</code> symbol, for example an audio file URL returned by a recording tool, voice assistant, or web scraper.</li></ul><p><strong>Example usages:</strong></p><ul><li>Static file reference: <code>https://example.com/audio/sample.mp3</code></li><li>Dynamic reference: <code>@recorder.output.audioFileUrl</code></li></ul><p>Make sure the referenced file is accessible via a valid URL or is passed from a previous node that provides audio in a compatible format.</p> |
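
The node makes the API call for you, so none of the following is required to use it. For orientation only, here is a minimal sketch of how the Credentials, Model, Voice, and Text settings could map onto a request to the public ElevenLabs Text-to-Speech REST endpoint. The helper name `build_tts_request`, the placeholder key and voice ID, and the example model ID are assumptions made for illustration; this is not a description of how Diaflow implements the node, and the sketch only assembles the request without sending it.

```python
import json

# Assumption: the public ElevenLabs REST endpoint for text-to-speech.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(api_key: str, voice_id: str, model_id: str, text: str) -> dict:
    """Assemble (but do not send) a description of a text-to-speech request.

    Hypothetical helper for illustration; not part of Diaflow or ElevenLabs.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "method": "POST",
        # The Voice setting selects the voice ID in the URL path.
        "url": ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        "headers": {
            "xi-api-key": api_key,  # Credentials setting
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "text": text,          # Text setting
            "model_id": model_id,  # Model setting
        }),
    }


if __name__ == "__main__":
    request = build_tts_request(
        api_key="YOUR_API_KEY",             # placeholder
        voice_id="YOUR_VOICE_ID",           # placeholder
        model_id="eleven_multilingual_v2",  # example model ID; yours may differ
        text="Hello, welcome to our service.",
    )
    print(request["url"])
    print(request["body"])
```

Which models and voices are actually available depends on the credentials in use, so treat the IDs above as stand-ins.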

## Advanced configurations

| Options                                         | Description                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Stability (With action = Text to Speech)        | <p>This setting controls how consistent the voice sounds between different runs.</p><ul><li><strong>Higher values</strong> (closer to 1) produce more stable and predictable speech output.</li><li><strong>Lower values</strong> introduce more variation and spontaneity in the generated voice.</li></ul>                                                                                                                                     |
| Similarity boost (With action = Text to Speech) | <p>This adjusts how strongly the generated voice should try to match the original voice profile.</p><ul><li><strong>Higher values</strong> make the voice more closely resemble the reference voice, which may reduce expressiveness.</li><li><strong>Lower values</strong> allow for more natural variation but may sound less like the reference voice.</li></ul>                                                                              |
| Enable caching                                  | <p>This enables local caching of generated audio to improve performance and reduce repeated API calls for identical input.</p><ul><li>When turned on, previously generated audio will be reused for the same input.</li><li>You can also configure the <strong>Caching time</strong>, which determines how long the result stays in cache before being refreshed.</li><li>Includes an option to <strong>Clear cache</strong> manually.</li></ul> |
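
To make the caching behavior described above more concrete, the following is a conceptual sketch of a time-limited cache keyed on the node's inputs. It is an illustration under stated assumptions, not Diaflow's implementation: the class `AudioCache`, its method names, and the choice to treat text, model, voice, stability, and similarity boost together as the "identical input" are all hypothetical.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Optional, Tuple


class AudioCache:
    """Toy in-memory cache with a time-to-live (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds  # plays the role of "Caching time"
        self._clock = clock             # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    @staticmethod
    def make_key(text: str, model: str, voice: str,
                 stability: float, similarity_boost: float) -> str:
        # Assumption: two runs count as identical when all of these match.
        payload = json.dumps([text, model, voice, stability, similarity_boost])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, audio = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            # Entry is older than the caching time: drop it and report a miss.
            del self._entries[key]
            return None
        return audio

    def put(self, key: str, audio: bytes) -> None:
        self._entries[key] = (self._clock(), audio)

    def clear(self) -> None:
        # Plays the role of the manual "Clear cache" option.
        self._entries.clear()


if __name__ == "__main__":
    cache = AudioCache(ttl_seconds=3600)
    key = AudioCache.make_key("Hello, welcome to our service.",
                              "example-model", "example-voice", 0.5, 0.75)
    if cache.get(key) is None:
        cache.put(key, b"...audio bytes from the API...")  # placeholder bytes
    print(cache.get(key) is not None)
```

In this sketch, changing any setting that feeds the key (for example, the stability value) produces a different key and therefore a fresh API call, while repeating the same inputs within the caching time reuses the stored audio.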

## Outputs

The ElevenLabs component has the following output connections.

| Output Name | Description                                                            | Constraints                                                    |
| ----------- | ---------------------------------------------------------------------- | -------------------------------------------------------------- |
| To Output   | This output connection contains the result of the ElevenLabs component. | Can be connected to any component that accepts a string input. |