Managing Vector Content

How to Upload Documents

Prerequisites: You have documents in supported formats (PDF, DOCX, TXT, MD, CSV, XLSX, XLS, or JSON).

Steps:

  1. Click Add, choose Documents

  2. Select your file(s) using the file picker

  3. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
|-----------------|---------------------------|---------|---------------------------------|
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

  4. Choose destination (Save to groups): Select an existing group

  5. Click Create

Expected Result: Document uploads and begins processing. Status shows “Processing” then “Completed”.

Tip: Use larger chunk sizes (400-500) for technical documents, smaller sizes (100-200) for conversational content.
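To make the Chunk Size and Chunk Overlap settings more concrete, here is a minimal sketch of word-based chunking with a sliding window. It is an illustration only: the function name `chunk_words` and its parameters are hypothetical, and the product's actual chunking algorithm (including how it treats a size of 0) is not described in this guide.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 3) -> list[str]:
    """Split text into chunks of `chunk_size` words; neighbours share `overlap` words."""
    # Assumption: a size of 0 is rejected here, although the UI range starts at 0.
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Small values make the overlap easy to see: the word "c" appears in both chunks.
print(chunk_words("a b c d e", chunk_size=3, overlap=1))
```

With the defaults (200 words per chunk, 3 words of overlap), each new chunk in this sketch would start 197 words after the previous one.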

Error Handling:

  • File type not supported: Ensure your file is PDF, DOCX, TXT, MD, CSV, XLSX, XLS, or JSON format

  • File is larger than 50MB: Compress the file or split it into smaller sections

  • Processing failed: Check file integrity and try re-uploading. The status shows “Rejected” if the file exceeds the allowed limits.

  • The desired group is not visible: Refresh the page to load the most current list of groups.
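The first two errors above can often be caught before uploading. The following is a rough local pre-check, not part of the product: `check_upload` is a hypothetical helper, and it assumes that "50MB" means 50 × 1024 × 1024 bytes and that the file type is judged by its extension.

```python
from pathlib import Path

# Formats listed in the prerequisites of this guide.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".xlsx", ".xls", ".json"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" is interpreted as 50 MiB


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return the problems found for one file; an empty list means it looks OK."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("File type not supported")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("File is larger than 50MB")
    return problems


print(check_upload("notes.md", 2_000))
print(check_upload("archive.zip", 2_000))
```

For a file on disk, `Path(path).stat().st_size` gives the size in bytes to pass as `size_bytes`.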

How to Create Articles

Prerequisites: You want to add custom text content to your knowledge base.

Steps:

  1. Click Add, choose Articles

  2. Enter article details:

     • Name: Required field, must be unique

     • Content: Use the rich text editor to format your content

  3. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
|-----------------|---------------------------|---------|---------------------------------|
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

  4. Choose destination (Save to groups): Select an existing group

  5. Click Create

Expected Result: Article is created and processed for vectorization.

Content Editor Features:

  • Text formatting (bold, italic, strikethrough, underline, square code, headers, and similar formatting options)

  • Bullet points and numbered lists

  • Image/video insertion

  • Link embedding

Error Handling:

  • Limitation: Articles should not exceed 300,000 words.

  • The desired group is not visible: Refresh the page to load the most current list of groups.
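If articles are drafted outside the editor, the name and length rules can be checked in advance. This is a sketch under stated assumptions: `check_article` is a hypothetical helper, words are counted by splitting plain text on whitespace (the editor may count rich text differently), and name uniqueness is treated as an exact, case-sensitive match.

```python
MAX_ARTICLE_WORDS = 300_000  # limit stated in this guide


def check_article(name: str, content: str, existing_names: set[str]) -> list[str]:
    """Return the problems found for an article draft; an empty list means none."""
    problems = []
    cleaned = name.strip()
    if not cleaned:
        problems.append("Name is required")
    elif cleaned in existing_names:
        problems.append("Name must be unique")
    # Assumption: a "word" is any whitespace-separated token of plain text.
    if len(content.split()) > MAX_ARTICLE_WORDS:
        problems.append("Article exceeds 300,000 words")
    return problems


print(check_article("Returns policy", "Items can be returned within 30 days.", {"FAQ"}))
```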

How to Add URLs and Crawl Websites

Option 1: Individual URLs

Prerequisites: You have specific web pages to add to your knowledge base.

Steps:

  1. Click Add, choose URLs

  2. Select Add individual links

  3. Enter the URL (must include http:// or https://)

  4. Click Add another link for multiple URLs

  5. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
|-----------------|---------------------------|---------|---------------------------------|
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

  6. Choose destination (Save to groups): Select an existing group

  7. Click Create
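When preparing a list of links in advance, the "must include http:// or https://" requirement can be checked with Python's standard library. This is a sketch only: `is_valid_link` is a hypothetical helper, and the form's own validation rules may be stricter or looser.

```python
from urllib.parse import urlparse


def is_valid_link(url: str) -> bool:
    """True if the URL has an http or https scheme and a host part."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


links = ["https://example.com/pricing", "example.com/pricing", "ftp://example.com/file"]
for link in links:
    print(link, is_valid_link(link))
```

In this example only the first link passes; the second lacks a scheme and the third uses a different one.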

Option 2: Crawl Website Sitemap

Prerequisites: The target website has an accessible sitemap.

Steps:

  1. Click Add, choose URLs

  2. Select Crawl website sitemap

  3. Enter the website URL (e.g., https://example.com)

  4. Click Fetch

  5. Wait for sitemap retrieval (loading screen appears)

  6. Review the URL list:

     • Deselect unwanted URLs by unchecking boxes

     • URLs may display as relative paths (e.g., /products/item1 instead of full URL)

  7. Configure chunking settings:

| Setting         | Description               | Default | Range                           |
|-----------------|---------------------------|---------|---------------------------------|
| Chunking Method | How text is divided       | Word    | Letter, Word, Sentence, Passage |
| Chunk Size      | Size of each text segment | 200     | 0-4500                          |
| Chunk Overlap   | Overlap between chunks    | 3       | 0-4500                          |

  8. Choose destination (Save to groups): Select an existing group

  9. Click Create

Expected Result: Selected URLs are queued for processing and appear with “Processing” status, then “Completed” when ready, or “Rejected” if they exceed the allowed limits.

Domain Parsing Example:

  • Input: https://dev.pp1.fr/tnc

  • Crawled URL: https://dev.pp1.fr/tnc/vehicule/porsche-911

  • Display: vehicule/porsche-911
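The display rule in this example can be reproduced by removing the input URL from the front of each crawled URL. The sketch below is one plausible reading of that rule, not the product's actual code; `display_path` is a hypothetical name, and the fallback for URLs outside the input path is an assumption.

```python
def display_path(input_url: str, crawled_url: str) -> str:
    """Return the part of crawled_url that follows input_url."""
    base = input_url.rstrip("/") + "/"
    if crawled_url.startswith(base):
        return crawled_url[len(base):]
    # Assumption: URLs that are not under the input URL are shown unchanged.
    return crawled_url


print(display_path("https://dev.pp1.fr/tnc", "https://dev.pp1.fr/tnc/vehicule/porsche-911"))
# prints: vehicule/porsche-911
```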

Troubleshooting URL Issues:

  • Failed to fetch sitemap: Verify URL is correct and sitemap is accessible

  • No valid URLs found: The website may not have a standard sitemap format

  • Processing failed: The URL content may be unavailable, not supported, or beyond the allowed crawling limit.
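When "No valid URLs found" appears, it can help to check the sitemap by hand. A standard sitemap is an XML file whose `<loc>` elements use the sitemaps.org namespace. The sketch below extracts them from a sitemap that has already been downloaded as text; `urls_from_sitemap` is a hypothetical helper, and the product's own parser may accept other formats.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products/item1</loc></url>
</urlset>"""
print(urls_from_sitemap(sample))
```

An empty result from this sketch suggests the file does not use the standard namespace or contains no `<loc>` entries.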
