Vectors
Last updated
Vector databases are essential to many large language model (LLM) systems, as they help extract key information from documents for LLM use. These specialized databases store vectorized data, including documents and their embeddings. They quickly retrieve the most relevant documents by identifying embeddings that closely match the query.
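The retrieval step described above can be illustrated with a minimal sketch. It assumes toy three-dimensional embeddings and cosine similarity as the matching measure; the file names, the `store` dictionary, and the `top_k` helper are hypothetical and do not reflect how Diaflow implements retrieval internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k document names whose embeddings best match the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query, store[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (assumption): real embeddings come from a model
# and have hundreds or thousands of dimensions.
store = {
    "pricing.pdf": [0.9, 0.1, 0.0],
    "onboarding.docx": [0.1, 0.9, 0.1],
    "faq.txt": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

print(top_k(query, store))  # ['pricing.pdf', 'faq.txt']
```

A production vector database typically replaces the full sort with an approximate nearest-neighbor index, but the ranking idea is the same.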
Diaflow supports storing and managing your files in vector format in one central location, where all your flows can access them.
Files can be stored in the root Vectors area, or you can assign a file to one or multiple groups. To switch the display, click either the "Files" or the "Groups" button.
You can upload documents, create articles within Diaflow for upload, or upload URLs.
The Vectors page has two modes, allowing you to view files in Groups or all Files.
When the page is in "Files" display mode, the list shows files from both the root Vectors area and all groups.
Diaflow also shows the status of the vectorization process in real time, so you know when your Vectors are ready for use.
You can add files to the root Vectors area by clicking the "Add" button. This displays a menu that lets you upload Documents, Articles, or URLs.
You can also arrange your files into groups to better organize them by clicking the "Groups" button.
Your Vectors can be arranged into different groups to organize them. Documents, articles and URLs can be individually stored in groups.
Groups can be displayed by selecting the "Groups" button from the Vectors page.
From here, you can view your existing groups and create a new group by clicking the "Add a group" button.
Enter a name and description for your group.
Clicking on the "Create" button will create the new group and open the "Group" page where you can specify the content of the group.
The triple-dot icon at the right-hand side of each row displays a menu that lets you view your data or delete the file from the group.
To view a textual version of your data, select the "View data" menu item, which opens a dialog box showing the data.
Supported File Types:
Documents: PDF, DOCX, TXT
Note: Vectors cannot properly embed images, tables, or graphs within these documents.
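If you prepare files in bulk, a quick local check against the supported document types can catch unsupported files before upload. This is a minimal sketch; the `is_supported_document` helper is hypothetical and not part of Diaflow, and it only inspects the file extension, not the file contents.

```python
from pathlib import Path

# Extensions taken from the supported document types listed above.
SUPPORTED_DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".txt"}


def is_supported_document(filename):
    """Return True if the file extension is PDF, DOCX, or TXT (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_DOCUMENT_EXTENSIONS


for name in ["Report.PDF", "notes.txt", "chart.png"]:
    status = "ok" if is_supported_document(name) else "unsupported"
    print(f"{name}: {status}")
```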
Content Types:
Articles: Text-based content.
URLs: Data from web pages with static content.
Note: Some pages block crawling, in which case the URL may return a status of “rejected.”
Sitemaps:
Websites must have a robots.txt file.
The system will track the sitemap using the following exact endpoints:
sitemap.xml
sitemap-index.xml.gz
sitemap
sitemap-index.xml
sitemap.xml.gz
sitemap_index.xml.gz
sitemap_index.xml
xmlsitemap
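To check in advance whether a site is likely to be picked up, you can build the candidate sitemap URLs yourself and inspect the site's robots.txt. The sketch below makes two assumptions: that the distinct endpoint names listed above are resolved relative to the site root, and that a `Sitemap:` line in robots.txt is a useful signal. Neither is confirmed as Diaflow's actual behavior, and both helper functions are hypothetical. It uses `RobotFileParser.site_maps()`, which requires Python 3.8 or later.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Distinct endpoint names from the list above.
SITEMAP_ENDPOINTS = [
    "sitemap.xml",
    "sitemap-index.xml.gz",
    "sitemap",
    "sitemap-index.xml",
    "sitemap.xml.gz",
    "sitemap_index.xml.gz",
    "sitemap_index.xml",
    "xmlsitemap",
]


def candidate_sitemap_urls(site_root):
    """Build one candidate URL per endpoint (assumes endpoints sit at the site root)."""
    root = site_root if site_root.endswith("/") else site_root + "/"
    return [urljoin(root, endpoint) for endpoint in SITEMAP_ENDPOINTS]


def sitemaps_declared_in_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when nothing is declared


# example.com is a placeholder domain.
for url in candidate_sitemap_urls("https://example.com"):
    print(url)

sample_robots = "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml\n"
print(sitemaps_declared_in_robots(sample_robots))
```

The example parses robots.txt from a string so that it runs without network access; in practice you would fetch the file from the site first.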