Overview

Tool Name

document_index_tools

Purpose

The document_index_tools enable you to create, manage, and query document indices for high-quality search, question answering, and document organization. They are well suited to building knowledge hubs: ingest files from Git, local storage, or other repositories, then run keyword and semantic retrieval at scale.

Functions Available

  1. _document_index: Single multifunction entry point. Its action parameter selects one of the index and document operations listed below.
  2. CREATE_INDEX: Create a new index for future ingestion and search.
  3. RENAME_INDEX: Rename an existing index without reingesting content.
  4. DELETE_INDEX: Delete an index and all of its files.
  5. LIST_INDICES: List indices with basic metadata.
  6. ADD_DOCUMENTS: Add one file or an entire directory to an index. Supports recursive ingest.
  7. LIST_DOCUMENTS: List files inside an index with optional path and text filters.
  8. GET_DOCUMENT: Read file content with pagination by lines, characters, or chunks.
  9. DELETE_DOCUMENT: Remove a specific file from an index.
  10. RENAME_DOCUMENT: Rename a file within an index.
  11. SEARCH: Run keyword or semantic search across one or many indices.
  12. ASK: Answer natural language questions using composable retrieval across indices.
  13. GET_INDICES_STATS: Return per index statistics such as counts and timestamps.

Key Features

Build and Maintain Indices

Create, rename, list, and delete indices to keep knowledge bases organized.

Ingest Files and Folders

Add single files or whole directories recursively, then manage documents over time.

Full Text Search

Search across one or multiple indices and receive ranked results with context.

Question Answering

Ask natural language questions and retrieve sourced answers from indexed content.

Preview at Scale

Read documents with line or character pagination, or navigate chunk-based previews.

Govern and Clean Up

Delete or rename documents and indices to maintain hygiene without full reindexing.

Input Parameters for Each Function

Common to _document_index

Name | Definition | Format
action | One of the supported actions. See list above. | String (enum)
index_name | Target index. Required for most actions except global SEARCH or ASK and some listing actions. | String
top_n | Number of results to return where applicable. Default 10 for searches. | Integer
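
As a minimal sketch, the common parameters could be assembled into a request like this. The build_request helper and the dict payload shape are assumptions made for illustration; only the action names and the action, index_name, and top_n parameters come from this page.

```python
# Hypothetical helper: the dict payload shape is an assumption,
# not the tool's documented wire format.
ACTIONS = {
    "CREATE_INDEX", "RENAME_INDEX", "DELETE_INDEX", "LIST_INDICES",
    "ADD_DOCUMENTS", "LIST_DOCUMENTS", "GET_DOCUMENT", "DELETE_DOCUMENT",
    "RENAME_DOCUMENT", "SEARCH", "ASK", "GET_INDICES_STATS",
}


def build_request(action, index_name=None, top_n=None, **params):
    """Assemble keyword arguments for a _document_index call."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    request = {"action": action}
    if index_name is not None:
        request["index_name"] = index_name
    if top_n is not None:
        request["top_n"] = top_n
    # Keep only action-specific parameters that were explicitly set.
    request.update({k: v for k, v in params.items() if v is not None})
    return request


print(build_request("SEARCH", index_name="runbooks", query="rate limit", top_n=5))
```

Rejecting unknown action names before the call is a design choice of this sketch; the tool itself may report such errors differently.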

CREATE_INDEX

Name | Definition | Format
index_name | Name of the new index to create. | String

RENAME_INDEX

Name | Definition | Format
index_name | Current index name. | String
new_index_name | New name for the index. | String

DELETE_INDEX

Name | Definition | Format
index_name | Name of the index to delete. | String

LIST_INDICES

No additional parameters. Returns the names and metadata for all indices.

ADD_DOCUMENTS

Name | Definition | Format
index_name | Target index. | String
filepath | Path to a file or directory. Supports BOT_GIT: prefix for files in the bot repository. Directories ingest recursively. | String

Use BOT_GIT:/path/to/file_or_dir for reliable, versioned ingestion from your mission repository. This keeps paths stable across runs.
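
A short sketch of preparing ADD_DOCUMENTS requests for several repository paths follows. The add_documents_requests helper and the dict payload shape are hypothetical, and the example paths are placeholders; the BOT_GIT: prefix and the index_name and filepath parameters are taken from the table above.

```python
# Hypothetical helper: builds request dicts only, it does not call the tool.
def add_documents_requests(index_name, repo_paths):
    """Build one ADD_DOCUMENTS request per repository path."""
    requests = []
    for path in repo_paths:
        # Add the BOT_GIT: prefix unless the caller already supplied it.
        filepath = path if path.startswith("BOT_GIT:") else f"BOT_GIT:{path}"
        requests.append(
            {"action": "ADD_DOCUMENTS", "index_name": index_name, "filepath": filepath}
        )
    return requests


for request in add_documents_requests("runbooks", ["docs/specs", "BOT_GIT:runbooks"]):
    print(request["filepath"])
```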

LIST_DOCUMENTS

Name | Definition | Format
index_name | Target index. | String
path_filter | Optional substring to match file paths. | String
query | Optional text filter to match file content. | String
show_files_only | If true, list files only at the current level. | Boolean

GET_DOCUMENT

Name | Definition | Format
index_name | Target index. | String
file_id | Identifier of the file if known. | String
filepath | Path to the file. Provide file_id or filepath. | String
pagination_mode | One of lines, chars, or None for chunk-based reading. | String or null
start | Zero-based starting position for the selected mode. | Integer
count | Number of lines, characters, or chunks to return. | Integer

For code and structured text prefer pagination_mode="lines". For prose prefer pagination_mode="chars". Use chunk mode when you want the tool’s native segmentation.
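
The pagination parameters suggest a simple read loop, sketched below under stated assumptions. The response is assumed to be a dict with content, actual_count, and has_more keys (the field names come from the Output section, the dict shape does not), and fake_get_document is an in-memory stand-in rather than the real tool.

```python
def read_all_lines(get_document, index_name, filepath, page_size=100):
    """Collect a document page by page using line pagination."""
    pages = []
    start = 0
    while True:
        resp = get_document(
            action="GET_DOCUMENT", index_name=index_name, filepath=filepath,
            pagination_mode="lines", start=start, count=page_size,
        )
        pages.append(resp["content"])
        # Advance by what was actually returned, not by what was requested.
        start += resp["actual_count"]
        if not resp["has_more"] or resp["actual_count"] == 0:
            break
    return "\n".join(pages)


# In-memory stand-in for the tool, used only to exercise the loop.
def fake_get_document(*, action, index_name, filepath, pagination_mode, start, count):
    lines = [f"line {i}" for i in range(250)]
    chunk = lines[start:start + count]
    return {
        "content": "\n".join(chunk),
        "start": start,
        "count": count,
        "actual_count": len(chunk),
        "has_more": start + len(chunk) < len(lines),
    }


text = read_all_lines(fake_get_document, "runbooks", "BOT_GIT:runbooks/oncall.md")
print(len(text.splitlines()))
```

Stopping when actual_count is zero guards against an endless loop if a response reports has_more without returning any content.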

DELETE_DOCUMENT

Name | Definition | Format
index_name | Target index. | String
filepath | Path of the file to remove from the index. | String

RENAME_DOCUMENT

Name | Definition | Format
index_name | Target index. | String
filepath | Current path of the file. | String
new_filename | New filename to assign. | String

SEARCH

Name | Definition | Format
index_name | Optional. If omitted, search across all indices. | String
query | Search text. Keyword or semantic query depending on implementation. | String
top_n | Number of results to return. Default 10. | Integer

ASK

Name | Definition | Format
query | Natural language question. Retrieval runs across indices using a composable graph. | String
top_n | Optional fan out control for underlying retrieval. | Integer

GET_INDICES_STATS

No additional parameters. Returns per index statistics such as file counts and timestamps.

Use Cases

  1. Mission Knowledge Base: Build an internal knowledge hub for requirements, SOPs, and architecture notes. Example: Index a runbooks folder and enable search plus Q&A for on-call engineers.
  2. RAG for Onboarding: Provide new team members with a question answering interface over onboarding materials. Example: Ask, “How do I rotate credentials?” and get a sourced answer.
  3. Git Docs Discovery: Ingest a docs directory from Git and search by keyword or path. Example: Find all pages mentioning “rate limit” in API docs.
  4. Analyst Document Access: Search PDFs and specs, then preview with pagination to pull the right excerpt. Example: Return 100 lines starting from line 300 in a long changelog.
  5. Governance Library: Maintain data policy indices and generate summaries on demand. Example: Ask for retention rules and cite the policy document.

Deleting an index or document is permanent. Confirm that no downstream workflows rely on the content before removal.

Workflow/How It Works

  1. Create or Select an Index: Use CREATE_INDEX or locate an existing one via LIST_INDICES.
  2. Ingest Content: Use ADD_DOCUMENTS to add single files or entire directories. Prefer BOT_GIT: paths for repository sources.
  3. Explore Inventory: Use LIST_DOCUMENTS with path_filter or query to scope the file set.
  4. Retrieve Content Snippets: Use GET_DOCUMENT with pagination to preview without loading entire files.
  5. Discover or Ask: Use SEARCH for precise discovery with ranked results. Use ASK for natural language answers with sources.
  6. Maintain Hygiene: Use RENAME_DOCUMENT or DELETE_DOCUMENT for files and RENAME_INDEX or DELETE_INDEX for indices.
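
The first five steps above can be sketched as an ordered list of requests. The workflow_plan helper, the dict payload shape, and the example index name and paths are all hypothetical; the action and parameter names are the ones documented on this page. The destructive hygiene actions of step 6 are left out on purpose.

```python
# Hypothetical plan builder: returns request dicts and does not call the tool.
def workflow_plan(index_name, source, question, preview_path):
    """Return the requests for steps 1 to 5 in order."""
    return [
        {"action": "CREATE_INDEX", "index_name": index_name},
        {"action": "ADD_DOCUMENTS", "index_name": index_name, "filepath": source},
        {"action": "LIST_DOCUMENTS", "index_name": index_name},
        {"action": "GET_DOCUMENT", "index_name": index_name, "filepath": preview_path,
         "pagination_mode": "lines", "start": 0, "count": 50},
        {"action": "SEARCH", "index_name": index_name, "query": question, "top_n": 10},
        # ASK carries no index_name: it runs across indices.
        {"action": "ASK", "query": question},
    ]


plan = workflow_plan(
    "runbooks",
    "BOT_GIT:runbooks",
    "How do I rotate credentials?",
    "BOT_GIT:runbooks/credentials.md",  # placeholder path
)
for request in plan:
    print(request["action"])
```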

Integration Relevance

  • file_manager_tools and git_action: Register or modify files, then ingest using ADD_DOCUMENTS with BOT_GIT: paths.
  • project_manager_tools: Link indices to project or mission tasks for traceability.
  • web_access_tools: Scrape pages, save content, then index for unified search.
  • google_drive_tools: Export summaries to Docs or Sheets. Ingest Drive downloads after staging.
  • delegate_work: Automate nightly indexing, refreshes, or QA sweeps.

Configuration Details

  • Directory strategy: Organize repo paths such as BOT_GIT:docs/specs and BOT_GIT:runbooks for predictable listings.
  • File types: Prefer text friendly formats like Markdown and plain text for better chunking and recall.
  • Chunking settings: Defaults are managed internally. If you need custom chunk sizes, document your conventions and keep files under practical size limits.
  • Access controls: Align indices with team access. Do not mix confidential and public content in the same index.
Use LIST_INDICES and GET_INDICES_STATS together to monitor growth and spot unusually large or stale indices that may need cleanup.
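
A rough sketch of such a monitoring check is shown below. The statistics format is not specified on this page, so the file_count and last_updated field names, the ISO 8601 timestamp format, and the thresholds are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical check: field names and thresholds are assumptions.
def flag_indices(stats, now, max_files=5000, max_age_days=90):
    """Return {index_name: [reasons]} for large or stale indices."""
    cutoff = now - timedelta(days=max_age_days)
    flagged = {}
    for name, info in stats.items():
        reasons = []
        if info["file_count"] > max_files:
            reasons.append("large")
        if datetime.fromisoformat(info["last_updated"]) < cutoff:
            reasons.append("stale")
        if reasons:
            flagged[name] = reasons
    return flagged


# Sample data standing in for a GET_INDICES_STATS response.
sample_stats = {
    "runbooks": {"file_count": 120, "last_updated": "2024-05-01T00:00:00+00:00"},
    "archive": {"file_count": 9000, "last_updated": "2023-01-01T00:00:00+00:00"},
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(flag_indices(sample_stats, now))
```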

Limitations or Notes

  1. Large files and PDFs: Text extraction quality depends on parsers. Consider OCR or cleanup for scanned content.
  2. Binary formats: Non-text binaries are not semantically indexed. Store summaries separately if needed.
  3. Path stability: Renaming or moving files outside the tool requires re-adding them or a RENAME_DOCUMENT call to maintain searchability.
  4. ASK scope: ASK typically runs across all indices and ignores index_name. Use SEARCH to scope to a specific index.
  5. Pagination fidelity: Line and character counts can vary due to encoding. Rely on has_more and actual_count for navigation.

Output

  • Index Management: Success flags and messages for create, rename, and delete. LIST_INDICES returns names plus metadata. GET_INDICES_STATS returns counts and timestamps.
  • Ingestion and Inventory: ADD_DOCUMENTS returns a per-file status report with added and skipped items. LIST_DOCUMENTS returns paths, sizes, timestamps, and optional metadata.
  • Reading and Retrieval: GET_DOCUMENT returns a content slice with start, count, actual_count, and has_more. SEARCH returns ranked results with score, snippet, index name, and path. ASK returns a best answer plus sources and footnotes.

Prefer SEARCH when you need multiple hits with scores. Prefer ASK when you want a concise answer with citations.
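
When several scored hits come back from SEARCH, they could be grouped per index as in the sketch below. The result shape is an assumption: each hit is taken to be a dict with score, index_name, and path keys, and a higher score is assumed to mean a better match. The sample hits are made up for the demonstration.

```python
# Hypothetical post-processing of SEARCH results; the hit shape is assumed.
def top_hits_by_index(results, min_score=0.0):
    """Group hits by index, best score first within each index."""
    grouped = {}
    for hit in sorted(results, key=lambda h: h["score"], reverse=True):
        if hit["score"] < min_score:
            continue
        grouped.setdefault(hit["index_name"], []).append((hit["path"], hit["score"]))
    return grouped


# Made-up hits standing in for a SEARCH response.
sample_hits = [
    {"index_name": "api_docs", "path": "docs/limits.md", "score": 0.91},
    {"index_name": "runbooks", "path": "runbooks/throttle.md", "score": 0.55},
    {"index_name": "api_docs", "path": "docs/errors.md", "score": 0.32},
]
print(top_hits_by_index(sample_hits, min_score=0.5))
```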