Overview
Tool Name
Purpose
The document_index_tools let you create, manage, and query document indices for high-quality search, question answering, and document organization. They are suited to building knowledge hubs: ingest files from Git, local storage, or other repositories, then run keyword and semantic retrieval at scale.
Functions Available
_document_index: Single multifunction entry point, controlled by the action parameter, that performs index and document operations.
- CREATE_INDEX: Create a new index for future ingestion and search.
- RENAME_INDEX: Rename an existing index without reingesting content.
- DELETE_INDEX: Delete an index and all of its files.
- LIST_INDICES: List indices with basic metadata.
- ADD_DOCUMENTS: Add one file or an entire directory to an index. Supports recursive ingest.
- LIST_DOCUMENTS: List files inside an index with optional path and text filters.
- GET_DOCUMENT: Read file content with pagination by lines, characters, or chunks.
- DELETE_DOCUMENT: Remove a specific file from an index.
- RENAME_DOCUMENT: Rename a file within an index.
- SEARCH: Run keyword or semantic search across one or many indices.
- ASK: Answer natural language questions using composable retrieval across indices.
- GET_INDICES_STATS: Return per-index statistics such as counts and timestamps.
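As a rough illustration of the single entry point, the sketch below models the action parameter as an enum and assembles a request payload. The `Action` enum and the `build_request` helper are hypothetical names introduced here; this page does not specify how _document_index is actually invoked, so treat the dict shape as an assumption.

```python
from enum import Enum
from typing import Any


class Action(str, Enum):
    """The twelve actions listed above for _document_index."""

    CREATE_INDEX = "CREATE_INDEX"
    RENAME_INDEX = "RENAME_INDEX"
    DELETE_INDEX = "DELETE_INDEX"
    LIST_INDICES = "LIST_INDICES"
    ADD_DOCUMENTS = "ADD_DOCUMENTS"
    LIST_DOCUMENTS = "LIST_DOCUMENTS"
    GET_DOCUMENT = "GET_DOCUMENT"
    DELETE_DOCUMENT = "DELETE_DOCUMENT"
    RENAME_DOCUMENT = "RENAME_DOCUMENT"
    SEARCH = "SEARCH"
    ASK = "ASK"
    GET_INDICES_STATS = "GET_INDICES_STATS"


def build_request(action: str, **params: Any) -> dict[str, Any]:
    """Build a payload dict for a hypothetical _document_index call.

    Raises ValueError if the action is not one of the documented names.
    """
    try:
        validated = Action(action)
    except ValueError:
        raise ValueError(f"Unsupported action: {action!r}") from None
    payload: dict[str, Any] = {"action": validated.value}
    # Parameters left as None are omitted so optional fields stay optional.
    payload.update({key: value for key, value in params.items() if value is not None})
    return payload


create = build_request("CREATE_INDEX", index_name="runbooks")
search_all = build_request("SEARCH", query="rate limit", index_name=None)
```

Validating the action name before sending anything keeps typos such as a misspelled action from reaching the tool at all.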
Key Features
Build and Maintain Indices
Create, rename, list, and delete indices to keep knowledge bases organized.
Ingest Files and Folders
Add single files or whole directories recursively, then manage documents over time.
Full Text Search
Search across one or multiple indices and receive ranked results with context.
Question Answering
Ask natural language questions and retrieve sourced answers from indexed content.
Preview at Scale
Read documents with line or character pagination or navigate chunk based previews.
Govern and Clean Up
Delete or rename documents and indices to maintain hygiene without full reindexing.
Input Parameters for Each Function
Common to _document_index
Name | Definition | Format |
---|---|---|
action | One of the supported actions. See list above. | String (enum) |
index_name | Target index. Required for most actions except global SEARCH or ASK and some listing actions. | String |
top_n | Number of results to return where applicable. Default 10 for searches. | Integer |
CREATE_INDEX
Name | Definition | Format |
---|---|---|
index_name | Name of the new index to create. | String |
RENAME_INDEX
Name | Definition | Format |
---|---|---|
index_name | Current index name. | String |
new_index_name | New name for the index. | String |
DELETE_INDEX
Name | Definition | Format |
---|---|---|
index_name | Name of the index to delete. | String |
LIST_INDICES
No additional parameters. Returns the names and metadata for all indices.
ADD_DOCUMENTS
Name | Definition | Format |
---|---|---|
index_name | Target index. | String |
filepath | Path to a file or directory. Supports BOT_GIT: prefix for files in the bot repository. Directories ingest recursively. | String |
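An ADD_DOCUMENTS request for a repository directory might be assembled as in the sketch below. The helper names are hypothetical, and the payload shape is an assumption rather than the tool's documented wire format. The helper does not add or remove a leading slash, because this page shows the prefix both with and without one.

```python
BOT_GIT_PREFIX = "BOT_GIT:"


def bot_git_path(repo_path: str) -> str:
    """Prefix a repository path with BOT_GIT: unless it is already prefixed."""
    if repo_path.startswith(BOT_GIT_PREFIX):
        return repo_path
    return BOT_GIT_PREFIX + repo_path


def add_documents_request(index_name: str, filepath: str) -> dict:
    """Payload for ADD_DOCUMENTS; filepath may name a file or a directory."""
    return {
        "action": "ADD_DOCUMENTS",
        "index_name": index_name,
        "filepath": filepath,
    }


# Ingest a whole directory; directories are ingested recursively by the tool.
request = add_documents_request("runbooks", bot_git_path("runbooks"))
```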
Use BOT_GIT:/path/to/file_or_dir for reliable, versioned ingestion from your mission repository. This keeps paths stable across runs.
LIST_DOCUMENTS
Name | Definition | Format |
---|---|---|
index_name | Target index. | String |
path_filter | Optional substring to match file paths. | String |
query | Optional text filter to match file content. | String |
show_files_only | If true, list files only at the current level. | Boolean |
GET_DOCUMENT
Name | Definition | Format |
---|---|---|
index_name | Target index. | String |
file_id | Identifier of the file if known. | String |
filepath | Path to the file. Provide either file_id or filepath. | String |
pagination_mode | One of lines, chars, or None for chunk-based reading. | String or null |
start | Zero-based starting position for the selected mode. | Integer |
count | Number of lines, characters, or chunks to return. | Integer |
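The GET_DOCUMENT parameters can be checked before a request is sent, as in the sketch below. `get_document_request` is a hypothetical helper, and the index name and file path in the example are illustrative only. The rule that file_id takes precedence when both identifiers are given is a choice made for this sketch, not documented behavior.

```python
from typing import Any, Optional

VALID_MODES = ("lines", "chars", None)  # None selects chunk-based reading


def get_document_request(
    index_name: str,
    *,
    pagination_mode: Optional[str],
    start: int,
    count: int,
    file_id: Optional[str] = None,
    filepath: Optional[str] = None,
) -> dict[str, Any]:
    """Build a GET_DOCUMENT payload (hypothetical helper, not part of the tool)."""
    if file_id is None and filepath is None:
        raise ValueError("Provide file_id or filepath")
    if pagination_mode not in VALID_MODES:
        raise ValueError(f"Unknown pagination_mode: {pagination_mode!r}")
    if start < 0 or count < 1:
        raise ValueError("start must be >= 0 and count must be >= 1")
    payload: dict[str, Any] = {
        "action": "GET_DOCUMENT",
        "index_name": index_name,
        "pagination_mode": pagination_mode,
        "start": start,
        "count": count,
    }
    # Send whichever identifier was supplied; file_id is used if both are given.
    if file_id is not None:
        payload["file_id"] = file_id
    else:
        payload["filepath"] = filepath
    return payload


# 100 lines of a long file; start is zero-based, so start=300 skips 300 lines.
request = get_document_request(
    "releases", filepath="CHANGELOG.md", pagination_mode="lines", start=300, count=100
)
```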
For code and structured text, prefer pagination_mode="lines". For prose, prefer pagination_mode="chars". Use chunk mode when you want the tool's native segmentation.
DELETE_DOCUMENT
Name | Definition | Format |
---|---|---|
index_name | Target index. | String |
filepath | Path of the file to remove from the index. | String |
RENAME_DOCUMENT
Name | Definition | Format |
---|---|---|
index_name | Target index. | String |
filepath | Current path of the file. | String |
new_filename | New filename to assign. | String |
SEARCH
Name | Definition | Format |
---|---|---|
index_name | Optional. If omitted, search across all indices. | String |
query | Search text. Keyword or semantic query depending on implementation. | String |
top_n | Number of results to return. Default 10. | Integer |
ASK
Name | Definition | Format |
---|---|---|
query | Natural language question. Retrieval runs across indices using a composable graph. | String |
top_n | Optional fan out control for underlying retrieval. | Integer |
GET_INDICES_STATS
No additional parameters. Returns per index statistics such as file counts and timestamps.
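The parameter tables above can be condensed into a small lookup for pre-flight checks. The mapping below is one reading of those tables: any parameter not marked optional is treated as required, and the GET_DOCUMENT pagination fields are treated as optional because the tables do not say otherwise. `REQUIRED_PARAMS` and `missing_params` are hypothetical names.

```python
from typing import Any, Mapping

# Required parameters per action, as read from the tables above (an assumption).
REQUIRED_PARAMS: dict[str, tuple[str, ...]] = {
    "CREATE_INDEX": ("index_name",),
    "RENAME_INDEX": ("index_name", "new_index_name"),
    "DELETE_INDEX": ("index_name",),
    "LIST_INDICES": (),
    "ADD_DOCUMENTS": ("index_name", "filepath"),
    "LIST_DOCUMENTS": ("index_name",),
    "GET_DOCUMENT": ("index_name",),  # plus file_id or filepath, checked below
    "DELETE_DOCUMENT": ("index_name", "filepath"),
    "RENAME_DOCUMENT": ("index_name", "filepath", "new_filename"),
    "SEARCH": ("query",),
    "ASK": ("query",),
    "GET_INDICES_STATS": (),
}


def missing_params(action: str, params: Mapping[str, Any]) -> list[str]:
    """Return the names of required parameters that are absent from params."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"Unsupported action: {action!r}")
    missing = [name for name in REQUIRED_PARAMS[action] if params.get(name) is None]
    # GET_DOCUMENT accepts either identifier, so it needs a separate check.
    if action == "GET_DOCUMENT" and not (params.get("file_id") or params.get("filepath")):
        missing.append("file_id or filepath")
    return missing
```

For example, `missing_params("RENAME_INDEX", {"index_name": "docs"})` reports that `new_index_name` is still needed.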
Use Cases
- Mission Knowledge Base: Build an internal knowledge hub for requirements, SOPs, and architecture notes. Example: Index a runbooks folder and enable search plus Q&A for on-call engineers.
- RAG for Onboarding: Provide new team members with a question-answering interface over onboarding materials. Example: Ask, “How do I rotate credentials?” and get a sourced answer.
- Git Docs Discovery: Ingest a docs directory from Git and search by keyword or path. Example: Find all pages mentioning “rate limit” in API docs.
- Analyst Document Access: Search PDFs and specs, then preview with pagination to pull the right excerpt. Example: Return 100 lines starting from line 300 in a long changelog.
- Governance Library: Maintain data policy indices and generate summaries on demand. Example: Ask for retention rules and cite the policy document.
Deleting an index or document is permanent. Confirm that no downstream workflows rely on the content before removal.
Workflow/How It Works
- Step 1: Create or Select an Index
  Use CREATE_INDEX or locate an existing one via LIST_INDICES.
- Step 2: Ingest Content
  Use ADD_DOCUMENTS to add single files or entire directories. Prefer BOT_GIT: paths for repository sources.
- Step 3: Explore Inventory
  Use LIST_DOCUMENTS with path_filter or query to scope the file set.
- Step 4: Retrieve Content Snippets
  Use GET_DOCUMENT with pagination to preview without loading entire files.
- Step 5: Discover or Ask
  Use SEARCH for precise discovery with ranked results. Use ASK for natural language answers with sources.
- Step 6: Maintain Hygiene
  Use RENAME_DOCUMENT or DELETE_DOCUMENT for files and RENAME_INDEX or DELETE_INDEX for indices.
Integration Relevance
- file_manager_tools and git_action: Register or modify files, then ingest using ADD_DOCUMENTS with BOT_GIT: paths.
- project_manager_tools: Link indices to project or mission tasks for traceability.
- web_access_tools: Scrape pages, save content, then index for unified search.
- google_drive_tools: Export summaries to Docs or Sheets. Ingest Drive downloads after staging.
- delegate_work: Automate nightly indexing, refreshes, or QA sweeps.
Configuration Details
- Directory strategy: Organize repo paths such as BOT_GIT:docs/specs and BOT_GIT:runbooks for predictable listings.
- File types: Prefer text-friendly formats like Markdown and plain text for better chunking and recall.
- Chunking settings: Defaults are managed internally. If you need custom chunk sizes, document conventions and keep files under practical size limits.
- Access controls: Align indices with team access. Do not mix confidential and public content in the same index.
Use LIST_INDICES and GET_INDICES_STATS together to monitor growth and spot unusually large or stale indices that may need cleanup.
Limitations or Notes
- Large files and PDFs: Text extraction quality depends on parsers. Consider OCR or cleanup for scanned content.
- Binary formats: Non-text binaries are not semantically indexed. Store summaries separately if needed.
- Path stability: Renaming or moving files outside the tool requires re-adding them or a RENAME_DOCUMENT call to maintain searchability.
- ASK scope: ASK typically runs across all indices and ignores index_name. Use SEARCH to scope to a specific index.
- Pagination fidelity: Line and character counts can vary due to encoding. Rely on has_more and actual_count for navigation.
Output
- Index Management: Success flags and messages for create, rename, and delete. LIST_INDICES returns names plus metadata. GET_INDICES_STATS returns counts and timestamps.
- Ingestion and Inventory: ADD_DOCUMENTS returns a per-file status report with added and skipped items. LIST_DOCUMENTS returns paths, sizes, timestamps, and optional metadata.
- Reading and Retrieval: GET_DOCUMENT returns a content slice with start, count, actual_count, and has_more. SEARCH returns ranked results with score, snippet, index name, and path. ASK returns a best answer plus sources and footnotes.
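Post-processing SEARCH output might look like the sketch below, which groups hit paths by index after applying a score cutoff. The key names (`score`, `snippet`, `index_name`, `path`), the assumption that a higher score means a better match, and the sample records are all illustrative; the actual response schema is not specified on this page.

```python
from collections import defaultdict


def group_hits_by_index(results: list[dict], min_score: float = 0.0) -> dict[str, list[str]]:
    """Group result paths by index name, best score first, dropping weak hits."""
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for hit in sorted(results, key=lambda item: item["score"], reverse=True):
        if hit["score"] >= min_score:
            grouped[hit["index_name"]].append(hit["path"])
    return dict(grouped)


# Made-up records in the assumed shape, for demonstration only.
sample_results = [
    {"score": 0.42, "snippet": "error codes", "index_name": "api-docs", "path": "errors.md"},
    {"score": 0.91, "snippet": "rate limit headers", "index_name": "api-docs", "path": "limits.md"},
    {"score": 0.77, "snippet": "paging policy", "index_name": "runbooks", "path": "oncall.md"},
    {"score": 0.10, "snippet": "misc notes", "index_name": "runbooks", "path": "misc.md"},
]

strong_hits = group_hits_by_index(sample_results, min_score=0.4)
```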
Prefer SEARCH when you need multiple hits with scores. Prefer ASK when you want a concise answer with citations.