GPT Vision

Combine natural language understanding with image comprehension, all within your Flows.

The OpenAI component allows you to integrate OpenAI GPT-4 Vision into your Flows. In particular, the following versions are supported:

  • GPT-4 Vision

The OpenAI component has an identifier of the form opa-X, where X is the instance number of the OpenAI component.

Selecting the "Show advanced configurations" button reveals additional parameters on the OpenAI component UI.

Each of these parameters is discussed in the section below.

The OpenAI component has the following parameters, which can be specified directly in the component UI.

Prompt

Describes the context and user query in terms of the component IDs of other Diaflow components. You can specify the context and the user query here. For example, to specify the user query, you can reference the first instance of a Text Input component by using the double curly bracket syntax {{in-0}}.
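As a sketch of how this placeholder substitution might work (the rendering logic below is illustrative, not Diaflow's actual implementation):

```python
import re

def render_prompt(template: str, outputs: dict) -> str:
    """Substitute {{component-id}} placeholders with component outputs.
    Unknown placeholders are left untouched."""
    return re.sub(
        r"\{\{(.*?)\}\}",
        lambda m: str(outputs.get(m.group(1).strip(), m.group(0))),
        template,
    )

# Hypothetical component outputs keyed by component ID
outputs = {"in-0": "What does this chart show?"}
prompt = render_prompt("Answer the user's question: {{in-0}}", outputs)
# prompt == "Answer the user's question: What does this chart show?"
```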

Image Source

Specifies the image source to use. Note: options will only appear in the dropdown menu after a connection has been made to the OpenAI component from, for example, a Files component.

Caching

This option determines whether the results of the component are cached. If enabled, on the next run of the Flow, Diaflow will reuse the previously computed component output, as long as the inputs have not changed.

Caching time

Only applicable if the "Enable Caching" option has been enabled. This parameter controls how long Diaflow will wait before automatically clearing the cache.
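The caching behaviour described above can be sketched as input-keyed storage with a time-to-live (the class and method names here are hypothetical, not Diaflow's internals):

```python
import hashlib
import json
import time

class ComponentCache:
    """Sketch of input-keyed caching with a time-to-live."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (timestamp, output)

    def _key(self, inputs: dict) -> str:
        # Identical inputs hash to the same key, so the cached output is reused.
        return hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()

    def get(self, inputs: dict):
        entry = self.store.get(self._key(inputs))
        if entry is None:
            return None
        ts, output = entry
        if time.time() - ts > self.ttl:  # caching time elapsed: treat as a miss
            return None
        return output

    def put(self, inputs: dict, output):
        self.store[self._key(inputs)] = (time.time(), output)

cache = ComponentCache(ttl_seconds=3600)
cache.put({"prompt": "hello"}, "cached result")
assert cache.get({"prompt": "hello"}) == "cached result"  # hit: inputs unchanged
assert cache.get({"prompt": "changed"}) is None           # miss: inputs differ
```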

Clear cache

Only applicable if the "Enable Caching" option has been enabled. Clears the cache.

Memory

Tracks previous interactions with the LLM.

Window Size

Only applicable if the "Memory" option has been enabled. The Window Size option refers to the number of previous conversation turns that the model can remember. Valid range for this parameter is 0 to 1000.
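Conceptually, a fixed window of conversation turns can be modelled as a bounded queue (a sketch under that assumption, not Diaflow's implementation):

```python
from collections import deque

class ConversationMemory:
    """Keeps only the last `window_size` (prompt, completion) turns."""

    def __init__(self, window_size: int):
        assert 0 <= window_size <= 1000  # valid range from the docs above
        self.turns = deque(maxlen=window_size)

    def add_turn(self, prompt: str, completion: str):
        # Appending past the window silently drops the oldest turn
        self.turns.append((prompt, completion))

    def context(self):
        return list(self.turns)

memory = ConversationMemory(window_size=2)
memory.add_turn("Q1", "A1")
memory.add_turn("Q2", "A2")
memory.add_turn("Q3", "A3")
# Only the two most recent turns survive
assert memory.context() == [("Q2", "A2"), ("Q3", "A3")]
```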

View test memory

Only applicable if the "Memory" option has been enabled. Opens a window to display the history of prompts and completions.

Clear test memory

Only applicable if the "Memory" option has been enabled. Clicking this button will clear the history of prompts and completions.

Pre-Trained Data

Temperature

The temperature is used to control the randomness of the output. When you set it higher, you'll get more random outputs. When you set it lower, towards 0, the output becomes more deterministic. Valid range for this parameter is 0 to 1.

Max Length

The Max Length parameter refers to the maximum number of tokens allowed in the model's response. Tokens can be whole words or parts of words. By setting the max length, you can control the length of the response generated by the model. It's important to note that longer responses may result in higher costs and longer response times. Valid range for this parameter is 0 to 3097.

Top P

The "Top P" parameter, also known as nucleus sampling, is a technique used in OpenAI's language models. It determines which portion of the probability distribution over tokens is considered during text generation. By setting a value for "P", the model only considers the most likely tokens whose cumulative probability reaches "P". This helps in controlling the randomness and diversity of generated text. Valid range for this parameter is 0 to 1.
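Nucleus sampling can be sketched as follows (an illustrative filter over a toy distribution, not the model's actual sampler):

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    # Sort token indices by probability, highest first
    indexed = sorted(enumerate(probs), key=lambda x: x[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in indexed:
        kept.append(idx)
        cumulative += p
        if cumulative >= top_p:  # the "nucleus" is complete
            break
    return sorted(kept)

# Hypothetical token probabilities: only the high-probability nucleus survives
probs = [0.5, 0.3, 0.15, 0.05]
assert nucleus_filter(probs, 0.7) == [0, 1]         # 0.5 + 0.3 covers 0.7
assert nucleus_filter(probs, 1.0) == [0, 1, 2, 3]   # everything kept
```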

Presence Penalty

The Presence Penalty parameter controls the level of repetition in the generated text. By increasing the presence penalty, the model is encouraged to generate more diverse and varied responses, reducing the likelihood of repetitive or redundant answers. It helps to make the generated text more coherent and interesting. Valid range for this parameter is -2 to +2.

Frequency Penalty

This parameter helps control the repetitiveness of the generated text. A higher value for the Frequency Penalty encourages the model to generate more diverse and varied responses by penalizing the repetition of similar phrases or words. Conversely, a lower value allows for more repetitive output. It's a useful tool for fine-tuning the balance between coherence and diversity in the generated text. Valid range for this parameter is -2 to +2.
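Both penalties can be sketched as per-token logit adjustments based on how often each token has already appeared (a common formulation; the values and function below are illustrative, not the model's exact internals):

```python
def apply_penalties(logits, token_counts, presence_penalty, frequency_penalty):
    """Adjust each token's logit by how often it has already appeared."""
    adjusted = []
    for token, logit in enumerate(logits):
        count = token_counts.get(token, 0)
        # Presence penalty applies once if the token appeared at all;
        # frequency penalty scales with how many times it appeared.
        logit -= presence_penalty * (1 if count > 0 else 0)
        logit -= frequency_penalty * count
        adjusted.append(logit)
    return adjusted

logits = [1.0, 1.0, 1.0]
counts = {0: 3, 1: 1}  # token 0 used three times, token 1 once, token 2 never
result = apply_penalties(logits, counts, presence_penalty=0.5, frequency_penalty=0.25)
assert result == [-0.25, 0.25, 1.0]  # heavily reused tokens are penalized most
```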

Component Inputs

The OpenAI component has the following input connections.

From Data Loaders

This input connection represents the context information for the OpenAI model.

Must originate from a Data Loader component.

From Input/LLM

This input connection represents the user query for the OpenAI model.

Must originate from a component that generates a text string as output, such as a Python or Text Input component.

Component Outputs

The OpenAI component has the following output connections.

To Output

This output connection contains the text result of the OpenAI component.

Can be connected to any component that accepts a string input.

The OpenAI component has the following configuration options.

Description

This is a user-supplied textual description of the OpenAI component.

Use Cases

The following is a simple use case of the OpenAI component, where it is used to analyse a medical result sheet and convert it to textual format, via GPT-4 Vision.
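As an illustration of the kind of request such a Flow ultimately issues, the sketch below builds an OpenAI chat-completions payload that pairs a text prompt with an image URL; the model name, image URL, and max_tokens value are placeholders, and no network call is made here:

```python
def build_vision_request(prompt: str, image_url: str,
                         model: str = "gpt-4-vision-preview") -> dict:
    """Build a chat-completions payload pairing text with an image,
    following OpenAI's vision message format."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 1024,  # cap on the length of the generated transcription
    }

request = build_vision_request(
    "Transcribe the values on this medical result sheet into plain text.",
    "https://example.com/result-sheet.png",  # placeholder image URL
)
assert request["messages"][0]["content"][1]["type"] == "image_url"
```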
