Document Index Tools
Manage creation, updates, queries, and lifecycle operations for document indexes, ensuring scalable and robust content organization.
Overview
Tool Name | Purpose |
---|---|
document_index_tools | Facilitates document creation, maintenance, querying, and indexing. It supports organizing large collections so that content stays searchable as it changes. With metadata tagging, document additions and removals, and natural language querying, teams can build workflows tailored to their needs. |
Functions Available
- CREATE_INDEX: Establishes a new document index and organizes content for streamlined search and retrieval.
- LIST_INDICES: Lists all available document indexes to verify or review their existence.
- ADD_DOCUMENTS: Inserts additional documents into an existing index, enabling dynamic updates.
- LIST_DOCUMENTS: Provides a list of all documents contained within a specific index.
- SEARCH: Performs keyword-based or phrase-based lookups across the indexed content.
- ASK: Interprets natural language queries, generating dynamic insights from the index.
- DELETE_DOCUMENT: Removes a single document from an index without deleting the entire collection.
- DELETE_INDEX: Permanently deletes an entire document index, clearing all associated content.
- RENAME_INDEX: Renames a document index to reflect reorganization or updated naming conventions.
Key Features
Flexible Index Creation
Build new indexes and customize them with metadata or a specific index type (static/dynamic).
Scalable Content Updates
Dynamically add or remove documents in existing indexes to keep content aligned with changing workflows.
Advanced Retrieval & Queries
Search for terms, apply metadata filters, or perform natural language queries for deeper insights.
Lifecycle Management
Rename or delete entire indexes and manage content lifecycle within large repositories.
Input Parameters for Each Function
CREATE_INDEX
Parameters
Name | Definition | Format |
---|---|---|
index_name | A unique name for the new document index (required). Example: "legal_documents_archive". | String |
documents | List of document contents to include (required). Example: ["Contract 1 text", "Contract 2"]. | List of String |
metadata | (Optional) JSON object of index-level metadata tags. Example: {"category": "contracts"}. | JSON Object |
index_type | (Optional) "dynamic" or "static" (default: "static"); indicates whether the index is expected to receive later updates. | String |
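The parameter table above can be read as a small validation contract. The sketch below assembles CREATE_INDEX arguments as a plain dictionary before they are handed to the tool. The helper name build_create_index_args is hypothetical, and the rule that documents must be non-empty is an assumption; the table only says the parameter is required.

```python
from typing import Any, Dict, List, Optional

# The two index types named in the parameter table.
ALLOWED_INDEX_TYPES = {"static", "dynamic"}


def build_create_index_args(
    index_name: str,
    documents: List[str],
    metadata: Optional[Dict[str, Any]] = None,
    index_type: str = "static",  # documented default
) -> Dict[str, Any]:
    """Validate and assemble CREATE_INDEX arguments (hypothetical helper)."""
    if not isinstance(index_name, str) or not index_name:
        raise ValueError("index_name must be a non-empty string")
    # Assumption: an empty document list is rejected.
    if not documents or not all(isinstance(d, str) for d in documents):
        raise ValueError("documents must be a non-empty list of strings")
    if index_type not in ALLOWED_INDEX_TYPES:
        raise ValueError(f"index_type must be one of {sorted(ALLOWED_INDEX_TYPES)}")

    args: Dict[str, Any] = {
        "index_name": index_name,
        "documents": list(documents),
        "index_type": index_type,
    }
    # Only include metadata when the caller supplied it.
    if metadata is not None:
        args["metadata"] = dict(metadata)
    return args


args = build_create_index_args(
    "legal_documents_archive",
    ["Contract 1 text", "Contract 2"],
    metadata={"category": "contracts"},
)
print(args)
```

Validating locally like this catches a misspelled index_type before any call is made, which may be easier to debug than an error returned by the tool.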
LIST_INDICES
Parameters
(No parameters required; returns names of all active document indexes.)
ADD_DOCUMENTS
Parameters
Name | Definition | Format |
---|---|---|
index_name | The index to update with new documents (required). | String |
documents | List of documents to add (required). | List of String |
metadata | (Optional) JSON object of metadata to attach to the added documents. | JSON Object |
LIST_DOCUMENTS
Parameters
Name | Definition | Format |
---|---|---|
index_name | Name of the index whose document list is retrieved (required). | String |
SEARCH
Parameters
Name | Definition | Format |
---|---|---|
index_name | Name of the index to search (required). | String |
query | Keywords or phrases to find (required). | String |
filters | (Optional) JSON object to filter results by metadata. Example: {"year": "2023"}. | JSON Object |
top_n | (Optional) Number of results to return (default: 10). | Integer |
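As an illustration of how the optional SEARCH parameters combine, the following sketch builds the argument dictionary and applies the documented default of 10 for top_n. The helper name build_search_args is hypothetical, and treating top_n values below 1 as errors is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Optional

DEFAULT_TOP_N = 10  # default listed in the parameter table


def build_search_args(
    index_name: str,
    query: str,
    filters: Optional[Dict[str, Any]] = None,
    top_n: int = DEFAULT_TOP_N,
) -> Dict[str, Any]:
    """Assemble SEARCH arguments (hypothetical helper)."""
    if not index_name:
        raise ValueError("index_name is required")
    if not query or not query.strip():
        raise ValueError("query is required")
    # Assumption: top_n must be a positive integer (bool is excluded on purpose).
    if isinstance(top_n, bool) or not isinstance(top_n, int) or top_n < 1:
        raise ValueError("top_n must be a positive integer")

    args: Dict[str, Any] = {
        "index_name": index_name,
        "query": query.strip(),
        "top_n": top_n,
    }
    # Metadata filters are optional; omit the key when none are given.
    if filters:
        args["filters"] = dict(filters)
    return args


search_args = build_search_args(
    "legal_documents_archive",
    "termination clause",
    filters={"year": "2023"},
)
print(search_args)
```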
ASK
Parameters
Name | Definition | Format |
---|---|---|
index_name | Name of the index for the natural language query (required). | String |
question | A question about the indexed content (required). | String |
DELETE_DOCUMENT
Parameters
Name | Definition | Format |
---|---|---|
index_name | Name of the index containing the doc to delete (required). | String |
document_id | Unique identifier of the document to remove (required). Obtain it from LIST_DOCUMENTS. | String |
DELETE_INDEX
Parameters
Name | Definition | Format |
---|---|---|
index_name | Name of the document index to remove permanently (required). | String |
RENAME_INDEX
Parameters
Name | Definition | Format |
---|---|---|
current_name | Existing name of the index (required). | String |
new_name | New name for the index (required). | String |
Use Cases
- Comprehensive Document Management
  - Create indexes, add documents, and search or ask for context-based retrieval.
  - Example: Building an "All_Product_Manuals" index for quick reference of product user guides.
- Dynamic Querying & Insights
  - SEARCH for targeted keywords or use ASK to interpret natural language queries.
  - Example: “Which documents mention ‘Snowflake pipeline improvements’?”
- Document Lifecycle Management
  - Remove outdated files via DELETE_DOCUMENT and retire entire indexes with DELETE_INDEX.
  - Example: Deleting older draft specs from "technical_specs_index" while retaining final docs.
- Collaborative Access
  - LIST_DOCUMENTS shows everything in an index, giving multiple departments a shared view of its contents.
  - Example: Display the entire "finance_reports_index" to confirm Q2 budgets are uploaded.
- Index Organization
  - Use RENAME_INDEX to reorganize or unify naming conventions over time.
  - Example: Changing "marketing_plans_2023" to "marketing_plans_current" after updates.
Workflow/How It Works
- Step 1 - Create or List Indexes
  - Use CREATE_INDEX to organize documents into a cohesive, searchable unit.
  - Use LIST_INDICES to review all existing indexes.
- Step 2 - Add or Modify Content
  - For dynamic indexes, use ADD_DOCUMENTS to expand coverage or DELETE_DOCUMENT to remove old files.
- Step 3 - Search or Query
  - Use SEARCH for direct keyword-based lookups and ASK for natural language questions across the index.
- Step 4 - Manage Index Lifecycle
  - Use LIST_DOCUMENTS for an overview, RENAME_INDEX for reorganization, or DELETE_INDEX to retire the entire set.
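To make the four steps concrete, here is a toy in-memory model of the lifecycle. It is not the tool's implementation: the class InMemoryIndexStore, the doc_N identifier scheme, and the substring-based search are all illustrative assumptions. Rejecting ADD_DOCUMENTS on a static index is also an assumption drawn from the "for dynamic indexes" wording above.

```python
from typing import Dict, List


class InMemoryIndexStore:
    """Toy model of the index lifecycle (illustrative only)."""

    def __init__(self) -> None:
        self._indexes: Dict[str, dict] = {}
        self._next_id = 1

    def _new_id(self) -> str:
        doc_id = f"doc_{self._next_id}"  # hypothetical ID format
        self._next_id += 1
        return doc_id

    def create_index(self, index_name: str, documents: List[str],
                     index_type: str = "static") -> None:
        if index_name in self._indexes:
            raise ValueError(f"index {index_name!r} already exists")
        docs = {self._new_id(): text for text in documents}
        self._indexes[index_name] = {"type": index_type, "docs": docs}

    def list_indices(self) -> List[str]:
        return sorted(self._indexes)

    def add_documents(self, index_name: str, documents: List[str]) -> None:
        index = self._indexes[index_name]
        # Assumption: only dynamic indexes accept later additions.
        if index["type"] != "dynamic":
            raise ValueError("cannot add documents to a static index")
        for text in documents:
            index["docs"][self._new_id()] = text

    def list_documents(self, index_name: str) -> List[str]:
        return list(self._indexes[index_name]["docs"])

    def search(self, index_name: str, query: str, top_n: int = 10) -> List[str]:
        # Naive case-insensitive substring match standing in for real search.
        needle = query.lower()
        docs = self._indexes[index_name]["docs"]
        hits = [doc_id for doc_id, text in docs.items() if needle in text.lower()]
        return hits[:top_n]

    def delete_document(self, index_name: str, document_id: str) -> None:
        del self._indexes[index_name]["docs"][document_id]

    def rename_index(self, current_name: str, new_name: str) -> None:
        if new_name in self._indexes:
            raise ValueError(f"index {new_name!r} already exists")
        self._indexes[new_name] = self._indexes.pop(current_name)

    def delete_index(self, index_name: str) -> None:
        del self._indexes[index_name]


store = InMemoryIndexStore()
# Step 1: create a dynamic index.
store.create_index("marketing_plans_2023",
                   ["Q1 launch plan", "Q2 budget plan"], index_type="dynamic")
# Step 2: add content.
store.add_documents("marketing_plans_2023", ["Q3 launch retrospective"])
# Step 3: search, then remove the first hit.
hits = store.search("marketing_plans_2023", "launch")
store.delete_document("marketing_plans_2023", hits[0])
# Step 4: rename and review.
store.rename_index("marketing_plans_2023", "marketing_plans_current")
remaining = store.list_documents("marketing_plans_current")
print(store.list_indices(), remaining)
```

The model keeps document IDs stable across a rename, which is one plausible reading of RENAME_INDEX reorganizing an index "without re-creating" it.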
Integration Relevance
- Team Knowledge Repositories: Combine with project_manager_tools to manage multi-project docs in a single index.
- Data Enrichment: Pair with database_tools to store query outputs or logs in an indexed environment.
- Workflow Automation: Integrate with process_scheduler_tools to schedule index updates or cleanup tasks.
- Testing & QA: Link with manage_tests_tools for storing test logs or documentation, ensuring easy future retrieval.
Configuration Details
- Index Naming: Use consistent and unique names to reflect the domain (e.g., “Legal_Archive_2023”).
- Metadata: Provide well-structured metadata for improved filtering and search relevance.
- Index Type: Default is "static", but choose "dynamic" if frequent doc additions or removals are expected.
Limitations or Notes
- Document Volume: Index performance may degrade when very large document sets or very large individual documents are added; consider splitting them.
- Data Confidentiality: Apply access restrictions or encryption to indexes that hold sensitive information.
- Renaming Risks: Changing an index name might break references in existing workflows or scripts.
- Irreversible Deletions: Deleting indexes or documents is permanent; confirm that backups exist and that nothing still references the content before proceeding.
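For the Document Volume note, one way to split a large set is to send documents in bounded batches, for example across several ADD_DOCUMENTS calls. The sketch below shows only the batching step. The helper batch_documents is hypothetical, and the limits max_docs and max_chars are illustrative values; this document does not state actual thresholds.

```python
from typing import Iterator, List


def batch_documents(
    documents: List[str],
    max_docs: int = 100,        # illustrative limit, not a documented one
    max_chars: int = 200_000,   # illustrative limit, not a documented one
) -> Iterator[List[str]]:
    """Yield batches bounded by document count and total character length."""
    if max_docs < 1 or max_chars < 1:
        raise ValueError("max_docs and max_chars must be positive")

    batch: List[str] = []
    size = 0
    for doc in documents:
        # Close the current batch if adding this document would exceed a limit.
        if batch and (len(batch) >= max_docs or size + len(doc) > max_chars):
            yield batch
            batch, size = [], 0
        # A single oversized document still gets a batch of its own.
        batch.append(doc)
        size += len(doc)
    if batch:
        yield batch


batches = list(batch_documents(["a" * 4, "b" * 4, "c" * 4], max_docs=2))
print([len(b) for b in batches])  # [2, 1]
```

Each yielded batch could then be passed as the documents argument of a separate call, keeping any single update small.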
Output
- Index & Document Management
  - Confirmations for creation, listing, renaming, or deletion of indexes/documents, plus error messages if parameters are invalid.
- Search & Query Results
  - JSON-formatted outputs containing matched documents, relevant snippets, or references.
How It Works
You begin by creating an index (CREATE_INDEX) with a name and set of documents. If the index is dynamic, additional documents can be appended (ADD_DOCUMENTS). Searching can be direct (SEARCH) or more question-oriented (ASK). Over time, documents or entire indexes can be removed (DELETE_DOCUMENT or DELETE_INDEX), reflecting your content lifecycle. If naming conventions shift, RENAME_INDEX helps reorganize indexes without re-creating them from scratch.
Example
- Create: _create_index(index_name="Research_Archive", documents=["Paper1 text", "Paper2 text"], index_type="dynamic")
- Add: _add_documents_to_index(index_name="Research_Archive", documents=["Paper3 text"])
- Search: _search(index_name="Research_Archive", query="Quantum entanglement", top_n=5)
- Ask: _ask(index_name="Research_Archive", question="What are the main findings on quantum entanglement?")
- List: _list_documents(index_name="Research_Archive")
- Delete: _delete_document(index_name="Research_Archive", document_id="doc_34567") or _delete_index(index_name="Research_Archive")