GPT Vision

Combine natural language understanding with image comprehension, all within your flows.

Description

The OpenAI component allows you to integrate OpenAI GPT-4 Vision into your flows. In particular, the following versions are supported:

  • GPT 4 Vision

  • GPT 4o

The OpenAI component has the identifier opa-X, where X is the instance number of the OpenAI component.

Inputs

The OpenAI component has the following input connections.

Input Name | Description | Constraints

From Data Loaders

This input connection represents the context information for the OpenAI model.

Must originate from a Data Loader component.

From Input/LLM

This input connection represents the user query for the OpenAI model.

Must originate from a component that generates a text string as output, such as a Python or Text Input component.

Component settings

Parameter Name | Description

Credentials

Specifies whether to use your own OpenAI credentials or Diaflow's default credentials.

Model

This parameter specifies the version of the OpenAI model that the component should use. Available values:

  • GPT 4 Vision

  • GPT 4o

Prompt

Describes how you want the OpenAI model to respond. For example, you can specify the role, manner, and rules that the model should adhere to. Reference a connected component's ID in the prompt to link the components.

Image source

Adds an image to your prompt by identifying a trigger file in this configuration.

Advanced configurations

Option | Description

Enable caching

This option determines whether the results of the component are cached. This means that on the next run of the Flow, Diaflow will reuse the previously computed component output, as long as the inputs have not changed.

Caching time

Only applicable if the "Enable Caching" option has been enabled. This parameter controls how long Diaflow will wait before automatically clearing the cache.

Clear cache

Only applicable if the "Enable Caching" option has been enabled. Clicking this button will clear the cache.
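The behaviour of the three caching options above can be sketched as a simple time-to-live cache. The class and method names below are illustrative assumptions, not Diaflow internals:

```python
import time

class TTLCache:
    """Sketch of cache-with-expiry, analogous to Enable caching / Caching time."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # inputs -> (output, timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stamp = entry
        if time.time() - stamp > self.ttl:  # cache expired: clear automatically
            del self._store[key]
            return None
        return value  # inputs unchanged: reuse previous output

    def set(self, key, value):
        self._store[key] = (value, time.time())

    def clear(self):  # analogous to the "Clear cache" button
        self._store.clear()
```

On a cache hit the previously computed output is returned without recomputation; after the TTL elapses (or after a manual clear), the component runs again.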

Memory

Determines whether the model remembers and utilizes context within a single session. The context window represents the maximum amount of text the model can consider.

Window size

Only applicable if the "Memory" option has been enabled. The Window Size option refers to the number of previous conversation turns that the model can remember. Valid range for this parameter is 0 to 1000.

View test memory

Only applicable if the "Memory" option has been enabled. Opens a window to display the history of prompts and completions.

Clear test memory

Only applicable if the "Memory" option has been enabled. Clicking this button will clear the history of prompts and completions.
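The Memory and Window Size options above behave like a sliding window over conversation turns: once the window is full, the oldest turn is dropped. A minimal sketch, with hypothetical names rather than Diaflow's actual implementation:

```python
from collections import deque

class ConversationMemory:
    """Sketch: remember only the last `window_size` prompt/completion turns."""

    def __init__(self, window_size):
        # Valid range for Window Size in the docs is 0 to 1000.
        assert 0 <= window_size <= 1000
        self.turns = deque(maxlen=window_size)  # oldest turns fall off the end

    def add_turn(self, prompt, completion):
        self.turns.append((prompt, completion))

    def context(self):  # analogous to "View test memory"
        return list(self.turns)

    def clear(self):  # analogous to "Clear test memory"
        self.turns.clear()
```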

Temperature

The temperature controls the randomness of the output. Higher values produce more random outputs; lower values, toward 0, produce more deterministic outputs. Valid range for this parameter is 0 to 1.
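Temperature works by rescaling the model's next-token scores (logits) before sampling. The sketch below shows why lower values make the output more deterministic: dividing by a small temperature sharpens the probability distribution toward the top-scoring token.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the result."""
    t = max(temperature, 1e-6)  # guard against division by zero at temperature 0
    scaled = [x / t for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At temperature 0.1 almost all probability mass lands on the highest logit; at temperature 1.0 the distribution is flatter and sampling is more varied.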

Max length

The Max Length parameter refers to the maximum number of tokens allowed in the model's response. Tokens can be whole words or parts of words. By setting the max length, you can control the length of the response generated by the model. Note that longer responses may result in higher costs and longer response times. Valid range for this parameter is 0 to 3097.

Top P

Top-p (nucleus) sampling selects the next token from the smallest possible set of tokens whose cumulative probability is greater than or equal to the specified probability p, a value between 0 and 1.
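The "smallest set whose cumulative probability reaches p" rule can be sketched directly. The function below is an illustrative implementation of the selection step only (the model would then sample from the returned set):

```python
def nucleus(probs, p):
    """Return the smallest set of tokens whose cumulative probability >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for token, prob in ranked:
        chosen.append(token)
        cumulative += prob
        if cumulative >= p:  # stop as soon as the threshold is reached
            break
    return chosen
```

A small p restricts sampling to only the most likely tokens; p close to 1 admits nearly the whole vocabulary.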

Presence penalty

The Presence Penalty parameter controls the level of repetition in the generated text. Increasing the presence penalty encourages the model to generate more diverse and varied responses, reducing the likelihood of repetitive or redundant answers. It helps to make the generated text more coherent and interesting. Valid range for this parameter is -2 to +2.

Frequency penalty

This parameter helps control the repetitiveness of the generated text. A higher value for the Frequency Penalty encourages the model to generate more diverse and varied responses by penalizing the repetition of similar phrases or words. Conversely, a lower value allows for more repetitive output. It's a useful tool for fine-tuning the balance between coherence and diversity in the generated text. Valid range for this parameter is -2 to +2.
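If you were calling the OpenAI API directly, the advanced settings above would correspond roughly to the request parameters sketched below. The mapping and default values here are illustrative assumptions, not Diaflow's actual request code:

```python
def build_request_params(settings):
    """Map the component's advanced settings onto OpenAI chat API parameter names."""
    return {
        "model": settings.get("model", "gpt-4o"),
        "temperature": settings.get("temperature", 1.0),              # 0 to 1 here
        "max_tokens": settings.get("max_length", 256),                # "Max length"
        "top_p": settings.get("top_p", 1.0),                          # "Top P"
        "presence_penalty": settings.get("presence_penalty", 0.0),    # -2 to +2
        "frequency_penalty": settings.get("frequency_penalty", 0.0),  # -2 to +2
    }
```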

Outputs

The OpenAI component has the following output connections.

Configuration Option Name | Description

Description

This is a user-supplied textual description of the OpenAI component.

Use case

Here is a simple use case of the OpenAI component, where it is used with the GPT 4 Vision model to analyse an image.
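Under the hood, a vision request combines text and an image in a single user message. The sketch below shows the content-part message structure used by the OpenAI chat API; Diaflow constructs this for you, so the helper name and URL here are purely illustrative:

```python
def build_vision_messages(prompt, image_url):
    """Build a chat message that pairs a text prompt with an image URL."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},                       # the user query
                {"type": "image_url", "image_url": {"url": image_url}}, # the image source
            ],
        }
    ]
```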
