# Managing Vector Content

## **How to Upload Documents**

Prerequisites: You have documents in supported formats (PDF, DOCX, TXT, MD, CSV, XLSX, XLS, or JSON).

Steps:

1. Click Add, choose Documents

<figure><img src="/files/bO2U88P58HDuFodD9qcu" alt=""><figcaption></figcaption></figure>

2. Select your file(s) using the file picker

<figure><img src="/files/MjobZbe6HYtCTvGDXeq7" alt=""><figcaption></figcaption></figure>

3. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
| --------------- | ------------------------- | ------- | ------------------------------- |
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

4. Choose destination (Save to groups): Select an existing group

<figure><img src="/files/d1Am28WEjuDQhygeJcV9" alt=""><figcaption></figcaption></figure>

5. Click Create

Expected Result: The document uploads and begins processing. Its status shows “Processing”, then “Completed”.

Tip: Use larger chunk sizes (400-500) for technical documents, smaller sizes (100-200) for conversational content.
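To make the Chunk Size and Chunk Overlap settings more concrete, the following is a minimal Python sketch of word-based chunking. It is an illustration only and not the platform's actual implementation: the function name `chunk_words` is hypothetical, and it assumes that both settings are counted in words when the Word method is selected and that the overlap is smaller than the chunk size.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 3) -> list[str]:
    """Split text into chunks of `chunk_size` words.

    Neighboring chunks share `overlap` words. Hypothetical helper, shown
    only to illustrate how size and overlap interact.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

With a chunk size of 4 and an overlap of 1, each chunk starts with the last word of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of more chunks.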

Error Handling:

* File type not supported: Ensure your file is in PDF, DOCX, TXT, MD, CSV, XLSX, XLS, or JSON format.
* File is larger than 50MB: Compress your file or split it into smaller sections.
* Processing failed: Check file integrity and try re-uploading. A “Rejected” status means the file exceeds the allowed limits.
* The desired group is not visible: Refresh the page to load the most current list.
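The first two errors above can be caught before uploading. The sketch below is a hypothetical local pre-check in Python, not part of the product: the name `check_upload` is an assumption, and it assumes that "50MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Formats listed in the prerequisites of this guide.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".xlsx", ".xls", ".json"}
# Assumption: the 50MB limit is interpreted as 50 MiB.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"File type not supported: {ext or '(no extension)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("File is larger than 50MB")
    return problems


if __name__ == "__main__":
    print(check_upload("report.PDF", 1024))
    print(check_upload("photo.png", 10))
```

The check only looks at the file extension and size, so a file that passes can still fail processing if its content is corrupted.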

## **How to Create Articles**

Prerequisites: You want to add custom text content to your knowledge base.

Steps:

1. Click Add, choose Articles

<figure><img src="/files/OmEYGsR4BdOq86gNmrTR" alt=""><figcaption></figcaption></figure>

2. Enter article details:

* Name: Required field, must be unique
* Content: Use the rich text editor to format your content

<figure><img src="/files/YRC3zXYXwsc3lqN0dN2s" alt=""><figcaption></figcaption></figure>

3. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
| --------------- | ------------------------- | ------- | ------------------------------- |
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

4. Choose destination (Save to groups): Select an existing group

<figure><img src="/files/20uFgVLC71PeqWyLSVx9" alt=""><figcaption></figcaption></figure>

5. Click Create

Expected Result: Article is created and processed for vectorization.

Content Editor Features:

* Text formatting (bold, italic, strikethrough, underline, square code, headers, and similar formatting options)
* Bullet points and numbered lists
* Image/video insertion
* Link embedding

Error Handling:

* Limitation: Articles should not exceed 300,000 words.
* The desired group is not visible: Refresh the page to load the most current list.
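If you prepare article text outside the editor, a quick word count can confirm it stays under the limit. This is a rough Python sketch with hypothetical helper names; it counts whitespace-separated words in plain text, which may differ from how the platform counts words in formatted content.

```python
MAX_ARTICLE_WORDS = 300_000  # limit stated in this guide


def article_word_count(text: str) -> int:
    """Count whitespace-separated words in plain text."""
    return len(text.split())


def within_article_limit(text: str) -> bool:
    """True when the text does not exceed the article word limit."""
    return article_word_count(text) <= MAX_ARTICLE_WORDS


if __name__ == "__main__":
    draft = "Hello vector world"
    print(article_word_count(draft), within_article_limit(draft))
```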

## **How to Add URLs and Crawl Websites**

### Option 1: Individual URLs

Prerequisites: You have specific web pages to add to your knowledge base.

Steps:

1. Click Add, choose URLs

<figure><img src="/files/BD1NvSPwrSDXE5Ww057G" alt=""><figcaption></figcaption></figure>

2. Select Add individual links

<figure><img src="/files/ZM3dGN1HK6cSEdsIeCc9" alt=""><figcaption></figcaption></figure>

3. Enter the URL (must include `http://` or `https://`)

<figure><img src="/files/wUQTfxuPnlUnk0mDadyk" alt=""><figcaption></figcaption></figure>

4. To add more URLs, click Add another link
5. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
| --------------- | ------------------------- | ------- | ------------------------------- |
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

6. Choose destination (Save to groups): Select an existing group

<figure><img src="/files/Si8vJ265Ppf8qEmM7rT5" alt=""><figcaption></figcaption></figure>

7. Click Create
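Because each link must include `http://` or `https://`, it can help to check a list of URLs before pasting them in. The following Python sketch uses the standard library's `urllib.parse`; the function name `is_valid_link` is hypothetical, and the platform may apply additional rules that are not shown here.

```python
from urllib.parse import urlparse


def is_valid_link(url: str) -> bool:
    """True when the URL has an http or https scheme and a host name."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ["https://example.com/page", "example.com/page", "ftp://example.com"]:
        print(candidate, is_valid_link(candidate))
```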

### Option 2: Crawl Website Sitemap

Prerequisites: The target website has an accessible sitemap.

Steps:

1. Click Add, choose URLs

<figure><img src="/files/JquV786PSq9TiUtOR5GY" alt=""><figcaption></figcaption></figure>

2. Select Crawl website sitemap

<figure><img src="/files/sqyn2OPzIKLghnIqwhLx" alt=""><figcaption></figcaption></figure>

3. Enter the website URL (e.g., <https://example.com>)

<figure><img src="/files/wtVoIqXaK7QNSvUqymOP" alt=""><figcaption></figcaption></figure>

4. Click Fetch
5. Wait for sitemap retrieval (loading screen appears)
6. Review the URL list:

* Deselect unwanted URLs by unchecking boxes
* URLs may display as relative paths (e.g., /products/item1 instead of full URL)

<figure><img src="/files/oX8R7AmtAG31WC7VYADk" alt=""><figcaption></figcaption></figure>

7. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
| --------------- | ------------------------- | ------- | ------------------------------- |
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

8. Choose destination (Save to groups): Select an existing group

<figure><img src="/files/jRmI5oQLPUc2T1M7bnfY" alt=""><figcaption></figcaption></figure>

9. Click Create

Expected Result: Selected URLs are queued for processing. Each appears with a “Processing” status, which changes to “Completed” when ready or to “Rejected” if the URL exceeds the allowed limits.

Domain Parsing Example:

* Input: <https://dev.pp1.fr/tnc>
* Crawled URL: <https://dev.pp1.fr/tnc/vehicule/porsche-911>
* Display: vehicule/porsche-911
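The example above can be reproduced with a simple prefix strip. This Python sketch is an assumption about how the display value is derived, not the platform's documented logic, and `display_path` is a hypothetical name.

```python
def display_path(base_url: str, crawled_url: str) -> str:
    """Strip the entered base URL so only the relative part remains."""
    base = base_url.rstrip("/") + "/"
    if crawled_url.startswith(base):
        return crawled_url[len(base):]
    # The crawled URL is not under the base URL: show it unchanged.
    return crawled_url


if __name__ == "__main__":
    print(display_path("https://dev.pp1.fr/tnc",
                       "https://dev.pp1.fr/tnc/vehicule/porsche-911"))
```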

Troubleshooting URL Issues:

* Failed to fetch sitemap: Verify that the URL is correct and the sitemap is accessible.
* No valid URLs found: The website may not use a standard sitemap format.
* Processing failed: The URL content may be unavailable, not supported, or beyond the allowed crawling limit.
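To check whether a site's sitemap follows the standard format, you can inspect it locally. The sketch below parses sitemap XML with Python's standard library and collects the `<loc>` entries defined by the sitemaps.org protocol. It is a diagnostic aid under stated assumptions, not a description of how the crawler works, and `extract_sitemap_urls` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a standard sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/products/item1</loc></url>"
        "</urlset>"
    )
    print(extract_sitemap_urls(sample))
```

An empty result for a sitemap that clearly lists pages suggests the file does not declare the standard namespace, which is one possible cause of the “No valid URLs found” message.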


