GPT Vision
Combine natural language understanding with image comprehension, all within your flows.

Description
The OpenAI component allows you to integrate OpenAI GPT-4 Vision into your flows. In particular, the following versions are supported:
- GPT 4o 
The OpenAI component has the identifier of opa-X, where X represents the instance number of the OpenAI component.

Inputs
The OpenAI component has the following input connections.
From Data Loaders
This input connection represents the context information for the OpenAI model.
Must originate from a Data Loader component.
From Input/LLM
This input connection represents the user query for the OpenAI model.
Must originate from a component that generates a text string as output such as a Python or Text Input component.
Component settings
Credentials
You can specify to use your own OpenAI credentials or alternatively you can use Diaflow's default credentials.
Model
This parameter specifies the version of OpenAI that the component should use. Available values: - GPT 4o
Prompt
Describes how you want the OpenAI model to respond. For example, you can specify the role, manner and rules that OpenAI should adhere to. Also mention the component ID to connect the components.
Image source
Adding an image to your prompt by identify a trigger file in this configuration.
Advanced configurations
Enable caching
This option determines whether the results of the component are cached. This means that on the next run of the Flow, Diaflow will utilize the previous computed component output, as long as the inputs have not changed.
Caching time
Only applicable if the "Enable Caching" option has been enabled. This parameter controls how long Diaflow will wait before automatically clearing the cache.
Clear cache
Only applicable if the "Enable Caching" option has been enabled. Clicking this button will clear the cache.
Memory
The ability of the model to remember and utilize context within a single session. The context window represent the maximum amount of text the model can consider.
Window size
Only applicable if the "Memory" option has been enabled. The Window Size option refers to the number of previous conversation turns that the model can remember. Valid range for this parameter is 0 to 1000.
View test memory
Only applicable if the "Memory" option has been enabled. Opens a window to display the history of prompts and completions.
Clear test memory
Only applicable if the "Memory" option has been enabled. Clicking this button will clear the history of prompts and completions.
Temperature
The temperature is used to control the randomness of the output. When you set it higher, you'll get more random outputs. When you set it lower, towards 0, the values are more deterministic. Valid range for this parameter is 0 to 1.
Max lenght
The Max Length parameter in OpenAI refers to the maximum number of tokens allowed in the input text. Tokens can be individual words or characters. By setting the max length, you can control the length of the response generated by the model. It's important to note that longer texts may result in higher costs and longer response times. Valid range for this parameter is 0 to 3097.
Top P
Top-p sampling, involves selecting the next word from the smallest possible set of words whose cumulative probability is greater than or equal to the specified probability p, typically between 0 and 1.
Presence penalty
The Presence Penalty parameter in OpenAI refers to a parameter that can be used to control the level of repetition in the generated text. By increasing the presence penalty, the model is encouraged to generate more diverse and varied responses, reducing the likelihood of repetitive or redundant answers. It helps to make the generated text more coherent and interesting. Valid range for this parameter is -2 to +2.
Frequency penalty
This parameter helps control the repetitiveness of the generated text. A higher value for the Frequency Penalty encourages the model to generate more diverse and varied responses by penalizing the repetition of similar phrases or words. Conversely, a lower value allows for more repetitive output. It's a useful tool for fine-tuning the balance between coherence and diversity in the generated text. Valid range for this parameter is -2 to +2.
Outputs
The OpenAI component has the following output connections.
Description
This is a user supplied textual description of the OpenAI component.
Use case
Here is a simple use case of the OpenAI component, where the OpenAI component is being used with the GPT 4o model to analyse an image.

Last updated
Was this helpful?
