OpenAI
Process large amounts of data based on context and user input using the OpenAI provider.
The OpenAI component allows you to integrate OpenAI into your flows. You can customize the parameters the component uses, specify the knowledge context it operates on, and provide the input query.
The OpenAI component's UI changes depending on the selected model, as each model has different available options. You can choose the exact model to run from the "Model" dropdown menu. The available models range from text-to-image, GPT chat, and GPT vision to text-to-speech models and more. See the Parameters table below for the full list of available models.
The OpenAI component has the identifier opa-X, where X represents the instance number of the OpenAI component.
Each of these AI models serves a different purpose, ranging from natural language understanding and generation (GPT) and image generation (DALL-E) to text-to-speech synthesis (TTS) and speech recognition and transcription (Whisper). They vary in capabilities, modalities, and target applications.
Each of the available models is summarized below:
GPT 3.5 Turbo:
This is an enhanced version of the GPT-3 model, optimized for better performance, accuracy, or efficiency compared to the original GPT-3.
GPT 3.5 Turbo 16K:
Similar to GPT 3.5 Turbo, but with a 16,000-token (16K) context window. The larger context window lets the model work with longer prompts and conversations in a single request.
GPT 3.5 Turbo Instruct:
A variant of GPT 3.5 Turbo optimized for instruction-based learning or fine-tuning on specific tasks. It excels in scenarios where the model receives guidance or instructions during the generation process.
GPT 3.5 Turbo 1106:
A variant of GPT 3.5 Turbo optimized for faster performance and lower latency than previous versions, giving it better efficiency, accuracy, and scalability.
GPT-4:
Represents the next iteration of the GPT series after GPT-3, with improvements in model capacity, performance, and capabilities.
GPT-4 Vision:
A version of GPT-4 tailored specifically for understanding visual content, such as images. It extends the capabilities of traditional GPT models to accept vision-based inputs alongside text.
GPT-4o:
"o" for omni is a step towards much more natural human-computer interaction. It accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, which is similar to human response.
DALL-E 2:
A version of OpenAI's DALL-E model, which generates images from textual descriptions. DALL-E 2 includes improvements over the original DALL-E model in terms of image generation quality and efficiency.
DALL-E 3:
Another iteration of the DALL-E model, with further enhancements compared to DALL-E 2.
TTS-1:
Stands for Text-to-Speech 1, a model designed to convert written text into spoken audio. It provides high-quality and natural-sounding speech synthesis.
TTS-1 HD:
A variant of TTS-1 optimized for high-definition audio synthesis, offering even higher fidelity and clarity in the generated speech.
Whisper-1:
A speech-to-text (automatic speech recognition) model that transcribes spoken audio into written text and can translate speech from other languages into English. Whisper-1 specializes in robust, accurate transcription of audio content.
For more information regarding the various OpenAI model versions, please refer to the following subsections.
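Behind the scenes, these model families correspond to different OpenAI API endpoints. The sketch below is a rough illustration of those differences using the official `openai` Python SDK, not Diaflow's internal code; the prompts, file names, and the assumption that an API key is available in the `OPENAI_API_KEY` environment variable are placeholders.

```python
# Rough sketch only: how each model family corresponds to an OpenAI API call.
# Assumes the official `openai` Python SDK (v1.x) and OPENAI_API_KEY set in the
# environment; prompts and file names are placeholders.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# GPT chat models (GPT 3.5 Turbo, GPT-4, GPT-4o): text in, text out.
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize the key points of this report."}],
)
print(chat.choices[0].message.content)

# DALL-E models: a text prompt in, a generated image out.
image = client.images.generate(model="dall-e-3", prompt="A watercolor fox", n=1)
print(image.data[0].url)

# TTS models: written text in, spoken audio out.
speech = client.audio.speech.create(
    model="tts-1", voice="alloy", input="Hello from the flow."
)
speech.write_to_file("speech.mp3")

# Whisper-1: spoken audio in, a text transcript out.
with open("speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
print(transcript.text)
```

Within a flow you do not call these endpoints yourself; the component presumably issues the appropriate request based on the selected Model, so the sketch only shows what the different model families do with their inputs and outputs.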
| Parameter Name | Description |
|---|---|
| Credential | Specifies whether to use your own OpenAI credentials or Diaflow's default credentials. |
| Models | Specifies the OpenAI model the component should use. Available values: GPT 3.5 Turbo, GPT 3.5 Turbo 16K, GPT 3.5 Turbo Instruct, GPT 3.5 Turbo 1106, GPT-4, GPT-4 Vision, GPT-4o, DALL-E 2, DALL-E 3, TTS-1, TTS-1 HD, Whisper-1 |
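As a rough illustration of the Credential setting (an assumption about what it amounts to, not Diaflow's actual implementation): using your own credential is analogous to constructing the OpenAI client with your own API key, while the default credential corresponds to a key that Diaflow manages for you.

```python
# Minimal sketch, assuming the Credential choice boils down to which API key the
# underlying OpenAI client is constructed with (not Diaflow's actual code).
from openai import OpenAI

own_client = OpenAI(api_key="sk-...your-own-key...")  # "own credentials": a key you provide
default_client = OpenAI()  # "default credentials": a key managed elsewhere (here, OPENAI_API_KEY)
```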