Cortex Search Tools

Overview

Tool Name

_cortex_search

Purpose

The _cortex_search tool lets you build and query semantic search services using Snowflake Cortex AI. Unlike traditional keyword matching, Cortex Search understands meaning and context, so it returns relevant results even when exact terms don’t match. It is well suited to product catalogs, document libraries, knowledge bases, and customer support systems.

Functions Available

_cortex_search: Manages Cortex Search services: creating AI-powered semantic search over table data, executing natural language queries, and maintaining search indexes. The action parameter selects which operation runs.

Key Features

Semantic Understanding

Find results based on meaning, not just keywords. The service accounts for synonyms, context, and intent.

Natural Language Queries

Search using conversational language without complex query syntax.

Metadata Filtering

Combine semantic search with structured filters on metadata columns.

Fast Indexing

Optimized vector-based indexing for sub-second query response times.

Easy Management

Create, refresh, monitor, and delete search services with simple commands.

Input Parameters for Each Function

`_cortex_search`

Parameters

| Name | Definition | Format |
| --- | --- | --- |
| action | Operation to perform. Values: `create_service`, `query`, `list_services`, `describe_service`, `drop_service`, `refresh_service`, `get_status`. | String (required) |
| service_name | Unique identifier for the search service. | String |
| table_name | Source Snowflake table to index (for `create_service`). | String |
| search_column | Column containing text content to search (for `create_service`). | String |
| metadata_columns | Additional columns to return with search results (for `create_service`). | List of Strings |
| warehouse | Snowflake warehouse for indexing operations (for `create_service`). | String |
| query | Natural language search text (for `query` action). | String |
| limit | Maximum number of results to return (for `query` action). Default: 10. | Integer |

Choose a search_column with rich text content (descriptions, summaries, full text) for best semantic results. Short codes or IDs work poorly.
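A small pre-flight check can catch a missing parameter before the tool is invoked. The sketch below is illustrative only: the `REQUIRED_PARAMS` mapping and the `check_call` helper are hypothetical names, and the set of parameters treated as required for each action is inferred from the "(for ...)" notes in the table, not from a documented rule of the tool.

```python
from typing import Any

# Assumption: required parameters per action, inferred from the parameter table.
# The tool itself may enforce different rules.
REQUIRED_PARAMS: dict[str, set[str]] = {
    "create_service": {"service_name", "table_name", "search_column", "warehouse"},
    "query": {"service_name", "query"},
    "list_services": set(),
    "describe_service": {"service_name"},
    "drop_service": {"service_name"},
    "refresh_service": {"service_name"},
    "get_status": {"service_name"},
}


def check_call(action: str, **params: Any) -> list[str]:
    """Return a list of problems found in a planned _cortex_search call."""
    if action not in REQUIRED_PARAMS:
        return [f"unknown action: {action!r}"]
    # Report every required parameter that is absent or empty.
    problems = [
        f"missing parameter: {name}"
        for name in sorted(REQUIRED_PARAMS[action])
        if params.get(name) in (None, "")
    ]
    limit = params.get("limit")
    if limit is not None and (not isinstance(limit, int) or limit < 1):
        problems.append("limit must be a positive integer")
    return problems
```

An empty list means the planned call looks complete under these assumptions; anything else can be shown to the user before the call is made.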

Use Cases

Product Discovery: Enable customers to find products using natural descriptions like “eco-friendly kitchen appliances” or “budget laptops for students.”
Document Retrieval: Search company documentation, policies, or knowledge bases with queries like “remote work policies” or “data retention guidelines.”
Customer Support: Find relevant help articles, FAQs, or past tickets based on problem descriptions rather than exact keywords.
Content Recommendation: Match user interests to relevant articles, research papers, or media content based on semantic similarity.
Data Exploration: Help analysts discover datasets or tables by searching descriptions and metadata with natural language.

Workflow/How It Works

Step 1: Identify Source Data. Choose a Snowflake table with a text column rich enough for semantic search (descriptions, documents, summaries).

Step 2: Create Search Service. Run create_service, specifying the table_name, search_column, and metadata_columns to index.

_cortex_search(
    action="create_service",
    service_name="product_catalog",
    table_name="products",
    search_column="description",
    metadata_columns=["product_id", "name", "category", "price"],
    warehouse="COMPUTE_WH"
)

Step 3: Wait for Indexing. Snowflake builds the vector index in the background. Use get_status to monitor progress.
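Monitoring can be wrapped in a simple polling loop. The following is a minimal sketch, not the tool's documented behavior: it assumes the tool is available as a callable passed in as `search`, that get_status returns a dictionary with a `status` field, and that `READY` and `FAILED` are among its values. All three are assumptions to adjust against the actual response.

```python
import time
from typing import Any, Callable


def wait_until_ready(
    search: Callable[..., dict[str, Any]],
    service_name: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
) -> dict[str, Any]:
    """Poll get_status until the service reports ready, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = search(action="get_status", service_name=service_name)
        state = str(status.get("status", "")).upper()  # assumed field name
        if state == "READY":  # assumed value
            return status
        if state == "FAILED":  # assumed value
            raise RuntimeError(f"indexing failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{service_name} not ready after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

A fixed polling interval keeps the sketch short; for large tables a longer interval or a backoff would reduce the number of status calls.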

Step 4: Query with Natural Language. Execute searches using conversational queries:

_cortex_search(
    action="query",
    service_name="product_catalog",
    query="wireless headphones with noise cancellation",
    limit=5
)

Step 5: Refresh as Needed. When source data changes, run refresh_service to update the index.

Step 6: Monitor and Maintain. Use list_services and describe_service to track all search services and their configurations.

Integration Relevance

data_connector_tools for exploring tables and identifying good search candidates.
_query_database to prepare and clean source data before indexing.
file_manager_tools to log search queries and results for analytics.
project_manager_tools to track search service creation and maintenance tasks.
image_tools to build search UIs with visual result previews.

Configuration Details

Search Column Selection: Choose columns with substantive text (100+ characters ideal). Avoid short codes or categorical values.
Metadata Columns: Include fields users need to see or filter on (IDs, names, categories, dates).
Warehouse Sizing: Larger warehouses speed up initial indexing; small warehouses suffice for queries.
Table Requirements: Source table must be in Snowflake (not external tables or views with complex logic).
Refresh Strategy: Schedule refresh_service after ETL runs to keep index current.

Cortex Search services consume compute during indexing and storage for vector embeddings. Monitor costs and limit services to high-value use cases.

Limitations or Notes

Snowflake Cortex Availability: Requires Snowflake account with Cortex features enabled.
Language Support: Optimized for English; other languages may have reduced accuracy.
Column Type Constraints: search_column must contain text (VARCHAR, STRING); cannot index binary or complex types.
Table Size: Very large tables (100M+ rows) may have longer indexing times and higher costs.
Query Complexity: Simple natural language works best; complex boolean logic or regex not supported.
No Partial Updates: Changes to source data require full refresh_service; incremental updates not available.
Result Ranking: Uses AI-based relevance; exact ranking algorithm not customizable.

Supported Actions

✅ create_service - Create new semantic search service
✅ query - Execute natural language search
✅ list_services - Show all search services
✅ describe_service - Get service configuration details
✅ drop_service - Delete a search service
✅ refresh_service - Update index with new table data
✅ get_status - Check indexing status and health

Not Supported

❌ External tables or views as source (must be native Snowflake tables)
❌ Real-time incremental indexing (requires full refresh)
❌ Custom relevance scoring or ranking algorithms
❌ Boolean operators (AND, OR, NOT) in queries
❌ Fuzzy matching or regex patterns
❌ Cross-table joins in search (single table per service)
❌ Image, audio, or video content search

Output

create_service: Confirmation with service name, indexed columns, and status.
query: List of matching results with metadata, ranked by semantic relevance.
list_services: Table of all services with names, source tables, and creation dates.
describe_service: Detailed configuration including search column, metadata columns, and index statistics.
refresh_service: Status update with refresh timestamp and row count.
get_status: Health check with indexing progress, last refresh, and error messages if any.
drop_service: Confirmation of service deletion.
Errors: Structured messages with resolution hints (e.g., invalid column names, warehouse unavailable).
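Because query results come back with their metadata, they can be narrowed further on the client side. The sketch below assumes the results arrive as a list of dictionaries keyed by metadata column name (here `category` and `price`, borrowed from the product examples); the actual result shape is not specified on this page, and `filter_results` is a hypothetical helper.

```python
from typing import Any, Iterable, Optional


def filter_results(
    results: Iterable[dict[str, Any]],
    category: Optional[str] = None,
    max_price: Optional[float] = None,
) -> list[dict[str, Any]]:
    """Keep results that match the metadata constraints, preserving rank order."""
    kept = []
    for row in results:
        if category is not None and row.get("category") != category:
            continue
        price = row.get("price")
        # Rows without a price are dropped whenever a price ceiling is requested.
        if max_price is not None and (price is None or float(price) > max_price):
            continue
        kept.append(row)
    return kept
```

Filtering after the fact can return fewer rows than the requested limit, so it may be worth asking for a larger limit when strict filters are applied.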

Best Practices

Optimize Search Columns

Use columns with rich, descriptive text (200-500 words ideal). Concatenate multiple fields if needed.
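One way to concatenate fields is to generate a SQL expression and use it when preparing the source table. The helper below is a sketch under assumptions: `combined_text_expr` is a hypothetical name, the column names are passed through unquoted and unvalidated, and `COALESCE` is added because Snowflake's `CONCAT_WS` returns NULL when any argument is NULL.

```python
def combined_text_expr(columns: list[str], separator: str = " ") -> str:
    """Build a SQL expression that joins several text columns into one."""
    if not columns:
        raise ValueError("at least one column is required")
    sep = separator.replace("'", "''")  # escape single quotes for a SQL literal
    # Replace NULLs with empty strings so one missing field does not blank the row.
    parts = ", ".join(f"COALESCE({col}, '')" for col in columns)
    return f"CONCAT_WS('{sep}', {parts})"
```

For example, `combined_text_expr(["name", "description"])` yields an expression that could be selected into a new text column and then named as the search_column.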

Select Smart Metadata

Include only the columns users need to see or filter on. Too many columns slow down results; too few limit usability.

Schedule Refreshes

Automate refresh_service after data updates to keep search current without manual intervention.
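Since every refresh rebuilds the full index, an automated job may want to skip the refresh when the ETL run changed nothing. The sketch below assumes the tool is passed in as a callable `search` and that the ETL step is a function returning the number of rows it wrote; both `etl_then_refresh` and `run_etl` are hypothetical.

```python
from typing import Any, Callable


def etl_then_refresh(
    run_etl: Callable[[], int],
    search: Callable[..., Any],
    service_name: str,
) -> bool:
    """Run the ETL step, then refresh the search index only if rows changed."""
    rows_changed = run_etl()  # hypothetical: returns the number of rows written
    if rows_changed == 0:
        return False  # nothing new to index, so skip the full refresh
    search(action="refresh_service", service_name=service_name)
    return True
```

The return value reports whether a refresh was triggered, which a scheduler can record alongside the ETL run.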

Monitor Performance

Log query patterns and latency to identify slow searches or opportunities for optimization.
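Latency logging can be added with a thin wrapper around the query call. This is a minimal sketch: it assumes the tool is passed in as a callable `search`, and the logger name and the `timed_query` helper are illustrative choices, not part of the tool.

```python
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("cortex_search.queries")  # illustrative logger name


def timed_query(
    search: Callable[..., Any], service_name: str, query: str, limit: int = 10
) -> tuple[Any, float]:
    """Run a query through `search`, log its latency, and return (results, seconds)."""
    start = time.perf_counter()
    results = search(action="query", service_name=service_name, query=query, limit=limit)
    elapsed = time.perf_counter() - start
    logger.info(
        "service=%s limit=%d latency=%.3fs query=%r", service_name, limit, elapsed, query
    )
    return results, elapsed
```

Logging the query text alongside the latency makes it possible to spot both slow searches and frequently repeated phrasing; omit the text if queries may contain sensitive data.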

Example: Complete Product Search Setup

# 1. Create search service over product catalog
_cortex_search(
    action="create_service",
    service_name="product_search",
    table_name="RETAIL_DB.PRODUCTS.CATALOG",
    search_column="description",
    metadata_columns=["product_id", "name", "category", "price", "rating"],
    warehouse="COMPUTE_WH"
)

# 2. Check indexing status
_cortex_search(
    action="get_status",
    service_name="product_search"
)

# 3. Execute semantic search
results = _cortex_search(
    action="query",
    service_name="product_search",
    query="affordable wireless earbuds with long battery life",
    limit=10
)

# 4. Refresh after daily ETL
_cortex_search(
    action="refresh_service",
    service_name="product_search"
)
