
Overview

Tool Name

_cortex_search

Purpose

The _cortex_search tool lets you build and query semantic search services using Snowflake Cortex AI. Unlike traditional keyword matching, Cortex Search understands meaning and context, returning relevant results even when exact terms don’t match. It is well suited to product catalogs, document libraries, knowledge bases, and customer support systems.

Functions Available

  1. _cortex_search: Manages Cortex Search services: creating AI-powered semantic search over table data, executing natural language queries, and maintaining search indexes. The action parameter selects which operation runs.

Key Features

Semantic Understanding

Find results based on meaning, not just keywords. The service accounts for synonyms, context, and intent.

Natural Language Queries

Search using conversational language without complex query syntax.

Metadata Filtering

Combine semantic search with structured filters on metadata columns.

Fast Indexing

Optimized vector-based indexing for sub-second query response times.

Easy Management

Create, refresh, monitor, and delete search services with simple commands.

Input Parameters for Each Function

Parameters
Name | Definition | Format
action | Operation to perform. Values: create_service, query, list_services, describe_service, drop_service, refresh_service, get_status. | String (required)
service_name | Unique identifier for the search service. | String
table_name | Source Snowflake table to index (for create_service). | String
search_column | Column containing text content to search (for create_service). | String
metadata_columns | Additional columns to return with search results (for create_service). | List of Strings
warehouse | Snowflake warehouse for indexing operations (for create_service). | String
query | Natural language search text (for query action). | String
limit | Maximum number of results to return (for query action). Default: 10. | Integer
Choose a search_column with rich text content (descriptions, summaries, full text) for best semantic results. Short codes or IDs work poorly.
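
The parameter table ties most parameters to a single action, so a call can be checked before it is sent. The sketch below is a minimal illustration of that idea. `build_request`, `ACTIONS`, and `ACTION_ONLY_PARAMS` are hypothetical names, not part of the tool, and the sketch does not guess which parameters are mandatory beyond `action`.

```python
# Hypothetical helper for illustration only; not part of _cortex_search.
ACTIONS = {
    "create_service", "query", "list_services", "describe_service",
    "drop_service", "refresh_service", "get_status",
}

# Parameters that the table above associates with exactly one action.
ACTION_ONLY_PARAMS = {
    "table_name": "create_service",
    "search_column": "create_service",
    "metadata_columns": "create_service",
    "warehouse": "create_service",
    "query": "query",
    "limit": "query",
}


def build_request(action, **params):
    """Return keyword arguments for a _cortex_search call, or raise ValueError."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    for name in params:
        if name != "service_name" and name not in ACTION_ONLY_PARAMS:
            raise ValueError(f"unknown parameter: {name!r}")
        owner = ACTION_ONLY_PARAMS.get(name)
        if owner is not None and owner != action:
            raise ValueError(f"{name!r} applies to {owner!r}, not {action!r}")
    return {"action": action, **params}


request = build_request(
    "query",
    service_name="product_catalog",
    query="budget laptops for students",
    limit=5,
)
```

The returned dictionary can be passed on as `_cortex_search(**request)`. Rejecting a mismatched parameter early, such as `table_name` on a `query` call, surfaces the mistake before any Snowflake compute is used.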

Use Cases

  1. Product Discovery: Enable customers to find products using natural descriptions like “eco-friendly kitchen appliances” or “budget laptops for students.”
  2. Document Retrieval: Search company documentation, policies, or knowledge bases with queries like “remote work policies” or “data retention guidelines.”
  3. Customer Support: Find relevant help articles, FAQs, or past tickets based on problem descriptions rather than exact keywords.
  4. Content Recommendation: Match user interests to relevant articles, research papers, or media content based on semantic similarity.
  5. Data Exploration: Help analysts discover datasets or tables by searching descriptions and metadata with natural language.

Workflow/How It Works

  1. Step 1: Identify Source Data. Choose a Snowflake table with a text column rich enough for semantic search (descriptions, documents, summaries).
  2. Step 2: Create Search Service. Run create_service, specifying the table_name, search_column, and metadata_columns to index.
    _cortex_search(
        action="create_service",
        service_name="product_catalog",
        table_name="products",
        search_column="description",
        metadata_columns=["product_id", "name", "category", "price"],
        warehouse="COMPUTE_WH"
    )
    
  3. Step 3: Wait for Indexing. Snowflake builds the vector index in the background. Use get_status to monitor progress.
  4. Step 4: Query with Natural Language. Execute searches using conversational queries:
    _cortex_search(
        action="query",
        service_name="product_catalog",
        query="wireless headphones with noise cancellation",
        limit=5
    )
    
  5. Step 5: Refresh as Needed. When source data changes, run refresh_service to update the index.
  6. Step 6: Monitor and Maintain. Use list_services and describe_service to track all search services and their configurations.
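
Step 3 asks you to wait for indexing before querying. One way to do that is to poll get_status until the service reports that it is ready. The sketch below shows the pattern only. The response shape is an assumption: this page does not document it, so the `"status"` key and the `"READY"` value are placeholders. `wait_until_ready` and `fake_tool` are hypothetical names, and `fake_tool` stands in for `_cortex_search` so the example runs on its own.

```python
import time


def wait_until_ready(search_tool, service_name, poll_seconds=5.0,
                     timeout_seconds=600.0):
    """Poll get_status until the service looks ready, or raise TimeoutError.

    ASSUMPTION: the response is a dict with a "status" key that equals
    "READY" once indexing has finished. Adjust to the real response shape.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = search_tool(action="get_status", service_name=service_name)
        if response.get("status") == "READY":
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{service_name} not ready after {timeout_seconds} seconds"
            )
        time.sleep(poll_seconds)


# Stand-in for the real tool: reports two in-progress polls, then ready.
_responses = iter([
    {"status": "INDEXING"},
    {"status": "INDEXING"},
    {"status": "READY"},
])


def fake_tool(**kwargs):
    return next(_responses)


final = wait_until_ready(fake_tool, "product_catalog", poll_seconds=0.0)
```

A timeout keeps a failed build from blocking the workflow indefinitely. In practice you would also want to stop early if get_status returns an error message.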

Integration Relevance

  • data_connector_tools for exploring tables and identifying good search candidates.
  • _query_database to prepare and clean source data before indexing.
  • file_manager_tools to log search queries and results for analytics.
  • project_manager_tools to track search service creation and maintenance tasks.
  • image_tools to build search UIs with visual result previews.

Configuration Details

  • Search Column Selection: Choose columns with substantive text (100+ characters ideal). Avoid short codes or categorical values.
  • Metadata Columns: Include fields users need to see or filter on (IDs, names, categories, dates).
  • Warehouse Sizing: Larger warehouses speed up initial indexing; small warehouses suffice for queries.
  • Table Requirements: Source table must be in Snowflake (not external tables or views with complex logic).
  • Refresh Strategy: Schedule refresh_service after ETL runs to keep index current.
Cortex Search services consume compute during indexing and storage for vector embeddings. Monitor costs and limit services to high-value use cases.
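
The search column guidance above (substantive text, ideally 100+ characters, no short codes) can be turned into a quick check on a sample of values before you pay for indexing. This is a rough sketch: `looks_searchable` is a hypothetical helper, and the 80% share threshold is an illustrative assumption, not a documented rule.

```python
def looks_searchable(samples, min_chars=100, min_share=0.8):
    """Return True if enough sampled values are substantive text.

    `samples` is a list of values drawn from the candidate column.
    None and blank values count as too short. `min_share` is an
    illustrative threshold chosen for this sketch.
    """
    if not samples:
        return False
    long_enough = sum(
        1 for value in samples
        if value is not None and len(str(value).strip()) >= min_chars
    )
    return long_enough / len(samples) >= min_share


descriptions = [
    "Over-ear wireless headphones with active noise cancellation, "
    "a 30-hour battery, and a foldable design for travel."
] * 9 + [None]
sku_codes = ["SKU-1042", "SKU-1043", "SKU-1044"]

good_candidate = looks_searchable(descriptions)
poor_candidate = looks_searchable(sku_codes)
```

Here the description column passes because nine of ten sampled values exceed 100 characters, while the SKU column fails. The sample itself could come from a small query run with _query_database.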

Limitations or Notes

  1. Snowflake Cortex Availability: Requires Snowflake account with Cortex features enabled.
  2. Language Support: Optimized for English; other languages may have reduced accuracy.
  3. Column Type Constraints: search_column must contain text (VARCHAR, STRING); cannot index binary or complex types.
  4. Table Size: Very large tables (100M+ rows) may have longer indexing times and higher costs.
  5. Query Complexity: Simple natural language works best; complex boolean logic or regex not supported.
  6. No Partial Updates: Changes to source data require full refresh_service; incremental updates not available.
  7. Result Ranking: Uses AI-based relevance; exact ranking algorithm not customizable.

Supported Actions

create_service - Create new semantic search service
query - Execute natural language search
list_services - Show all search services
describe_service - Get service configuration details
drop_service - Delete a search service
refresh_service - Update index with new table data
get_status - Check indexing status and health

Not Supported

❌ External tables or views as source (must be native Snowflake tables)
❌ Real-time incremental indexing (requires full refresh)
❌ Custom relevance scoring or ranking algorithms
❌ Boolean operators (AND, OR, NOT) in queries
❌ Fuzzy matching or regex patterns
❌ Cross-table joins in search (single table per service)
❌ Image, audio, or video content search

Output

  • create_service: Confirmation with service name, indexed columns, and status.
  • query: List of matching results with metadata, ranked by semantic relevance.
  • list_services: Table of all services with names, source tables, and creation dates.
  • describe_service: Detailed configuration including search column, metadata columns, and index statistics.
  • refresh_service: Status update with refresh timestamp and row count.
  • get_status: Health check with indexing progress, last refresh, and error messages if any.
  • drop_service: Confirmation of service deletion.
  • Errors: Structured messages with resolution hints (e.g., invalid column names, warehouse unavailable).

Best Practices

Optimize Search Columns

Use columns with rich, descriptive text (200-500 words ideal). Concatenate multiple fields if needed.
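
As a sketch of the concatenation idea, the helper below joins several fields of a row into one text value that could populate a dedicated search column. `build_search_text` and the field names are hypothetical. How the combined text is written back to the source table is outside this example.

```python
def build_search_text(row, fields, separator=". "):
    """Join the non-empty values of `fields` from a row dict into one string."""
    parts = []
    for field in fields:
        value = row.get(field)
        if value is None:
            continue
        text = str(value).strip()
        if text:  # skip blank strings so separators do not pile up
            parts.append(text)
    return separator.join(parts)


row = {
    "name": "Trail Runner 2",
    "summary": "Lightweight running shoe",
    "details": None,
    "care": "  Machine washable  ",
}
search_text = build_search_text(row, ["name", "summary", "details", "care"])
```

Skipping empty fields avoids indexing stray separators or the literal text "None".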

Select Smart Metadata

Include only the columns users need to see or filter on. Too many columns slow down results; too few limit usability.

Schedule Refreshes

Automate refresh_service after data updates to keep search current without manual intervention.
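
A minimal sketch of that automation, assuming the ETL job is available as a Python callable: run the load first, then refresh every dependent service. `refresh_after_etl` and `fake_tool` are hypothetical names. `fake_tool` stands in for `_cortex_search`, and its return value is invented for the demonstration.

```python
def refresh_after_etl(run_etl, search_tool, service_names):
    """Run the ETL step, then refresh each service; return responses by name.

    If run_etl raises, the exception propagates and no refresh is attempted,
    so the index is not rebuilt from a partially loaded table.
    """
    run_etl()
    return {
        name: search_tool(action="refresh_service", service_name=name)
        for name in service_names
    }


calls = []


def fake_tool(**kwargs):
    # Record the call and return a made-up response.
    calls.append(kwargs)
    return {"refreshed": kwargs["service_name"]}


responses = refresh_after_etl(lambda: None, fake_tool, ["product_search"])
```

The same function can be called from whatever scheduler already runs the ETL job.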

Monitor Performance

Log query patterns and latency to identify slow searches or opportunities for optimization.
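
One lightweight way to collect that data is to wrap each query call with a timer. The sketch below keeps the log in memory. `timed_query`, `slowest`, and `fake_tool` are hypothetical names, and `fake_tool` stands in for `_cortex_search` so the example is self-contained.

```python
import time


def timed_query(search_tool, query_log, service_name, query, limit=10):
    """Run a query action, record its latency in query_log, return the results."""
    started = time.perf_counter()
    results = search_tool(
        action="query", service_name=service_name, query=query, limit=limit
    )
    query_log.append({
        "service_name": service_name,
        "query": query,
        "latency_seconds": time.perf_counter() - started,
    })
    return results


def slowest(query_log, n=3):
    """Return the n log entries with the highest latency, slowest first."""
    return sorted(
        query_log, key=lambda entry: entry["latency_seconds"], reverse=True
    )[:n]


def fake_tool(**kwargs):
    # Stand-in for the real tool; returns an empty result list.
    return []


query_log = []
results = timed_query(
    fake_tool, query_log, "product_search", "affordable wireless earbuds"
)
```

The log entries could then be written out with file_manager_tools for later analysis.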

Example: Complete Product Search Setup

# 1. Create search service over product catalog
_cortex_search(
    action="create_service",
    service_name="product_search",
    table_name="RETAIL_DB.PRODUCTS.CATALOG",
    search_column="description",
    metadata_columns=["product_id", "name", "category", "price", "rating"],
    warehouse="COMPUTE_WH"
)

# 2. Check indexing status
_cortex_search(
    action="get_status",
    service_name="product_search"
)

# 3. Execute semantic search
results = _cortex_search(
    action="query",
    service_name="product_search",
    query="affordable wireless earbuds with long battery life",
    limit=10
)

# 4. Refresh after daily ETL
_cortex_search(
    action="refresh_service",
    service_name="product_search"
)

Ready to build intelligent search into your applications? Start by identifying a table with rich text content and create your first Cortex Search service! 🚀