Overview
Tool Name
Purpose
The _cortex_search tool enables you to build and query semantic search services using Snowflake Cortex AI. Unlike traditional keyword matching, Cortex Search understands meaning and context, returning relevant results even when exact terms don’t match. It is well suited to product catalogs, document libraries, knowledge bases, and customer support systems.
Functions Available
_cortex_search: Manages Cortex Search services for creating AI-powered semantic search over table data, executing natural language queries, and maintaining search indexes. Controlled by the action parameter.
Key Features
Semantic Understanding
Find results based on meaning, not just keywords—understands synonyms, context, and intent.
Natural Language Queries
Search using conversational language without complex query syntax.
Metadata Filtering
Combine semantic search with structured filters on metadata columns.
Fast Indexing
Optimized vector-based indexing for sub-second query response times.
Easy Management
Create, refresh, monitor, and delete search services with simple commands.
Input Parameters for Each Function
_cortex_search
Parameters
Name | Definition | Format |
---|---|---|
action | Operation to perform. Values: create_service, query, list_services, describe_service, drop_service, refresh_service, get_status. | String (required) |
service_name | Unique identifier for the search service. | String |
table_name | Source Snowflake table to index (for create_service). | String |
search_column | Column containing text content to search (for create_service). | String |
metadata_columns | Additional columns to return with search results (for create_service). | List of Strings |
warehouse | Snowflake warehouse for indexing operations (for create_service). | String |
query | Natural language search text (for query action). | String |
limit | Maximum number of results to return (for query action). Default: 10. | Integer |
Choose a search_column with rich text content (descriptions, summaries, full text) for best semantic results. Short codes or IDs work poorly.
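The parameter table can be read as a small contract: action is always required, and the other fields matter only for specific actions. The Python sketch below builds and checks a parameter dict on that basis. The build_request helper is hypothetical, and the per-action required fields are an assumption inferred from the "(for ...)" notes in the table, not a documented rule of the tool.

```python
from typing import Any, Dict

# Action names as listed in the parameter table.
ACTIONS = {
    "create_service", "query", "list_services", "describe_service",
    "drop_service", "refresh_service", "get_status",
}

# Assumption: the table only marks `action` as required; these per-action
# requirements are inferred from the "(for ...)" notes, not documented.
REQUIRED = {
    "create_service": ("service_name", "table_name", "search_column"),
    "query": ("service_name", "query"),
}


def build_request(action: str, **params: Any) -> Dict[str, Any]:
    """Return a parameter dict for _cortex_search, or raise ValueError."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Treat absent or empty values as missing.
    missing = [name for name in REQUIRED.get(action, ()) if not params.get(name)]
    if missing:
        raise ValueError(f"{action} is missing: {', '.join(missing)}")
    request = {"action": action, **params}
    if action == "query":
        request.setdefault("limit", 10)  # default stated in the table
    return request
```

Checking the payload locally keeps obvious mistakes, such as a misspelled action, from reaching the tool at all.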
Use Cases
- Product Discovery: Enable customers to find products using natural descriptions like “eco-friendly kitchen appliances” or “budget laptops for students.”
- Document Retrieval: Search company documentation, policies, or knowledge bases with queries like “remote work policies” or “data retention guidelines.”
- Customer Support: Find relevant help articles, FAQs, or past tickets based on problem descriptions rather than exact keywords.
- Content Recommendation: Match user interests to relevant articles, research papers, or media content based on semantic similarity.
- Data Exploration: Help analysts discover datasets or tables by searching descriptions and metadata with natural language.
Workflow/How It Works
- Step 1: Identify Source Data. Choose a Snowflake table with a text column rich enough for semantic search (descriptions, documents, summaries).
- Step 2: Create Search Service. Run create_service, specifying the table_name, search_column, and metadata_columns to index.
- Step 3: Wait for Indexing. Snowflake builds the vector index in the background. Use get_status to monitor progress.
- Step 4: Query with Natural Language. Execute searches using conversational queries, such as “budget laptops for students.”
- Step 5: Refresh as Needed. When source data changes, run refresh_service to update the index.
- Step 6: Monitor and Maintain. Use list_services and describe_service to track all search services and their configurations.
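Steps 3 and 4 depend on the index being ready before the first query. Below is a minimal Python sketch of the polling part. It assumes a hypothetical send callable that delivers a parameter dict to _cortex_search and returns a dict. The "status" key and the "READY" and "FAILED" values are illustrative guesses, because the response format is not specified here.

```python
import time
from typing import Any, Callable, Dict


def wait_until_ready(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    service_name: str,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Poll get_status until the service reports ready or the budget runs out."""
    for _ in range(max_polls):
        status = send({"action": "get_status", "service_name": service_name})
        # Assumption: the response carries a "status" field with these values.
        if status.get("status") == "READY":
            return status
        if status.get("status") == "FAILED":
            raise RuntimeError(f"indexing failed: {status.get('error')}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"{service_name} not ready after {max_polls} polls")
```

A bounded number of polls avoids waiting forever if indexing stalls. The interval and limit shown are arbitrary and would need tuning to the table size.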
Integration Relevance
- data_connector_tools for exploring tables and identifying good search candidates.
- _query_database to prepare and clean source data before indexing.
- file_manager_tools to log search queries and results for analytics.
- project_manager_tools to track search service creation and maintenance tasks.
- image_tools to build search UIs with visual result previews.
Configuration Details
- Search Column Selection: Choose columns with substantive text (100+ characters ideal). Avoid short codes or categorical values.
- Metadata Columns: Include fields users need to see or filter on (IDs, names, categories, dates).
- Warehouse Sizing: Larger warehouses speed up initial indexing; small warehouses suffice for queries.
- Table Requirements: Source table must be in Snowflake (not external tables or views with complex logic).
- Refresh Strategy: Schedule refresh_service after ETL runs to keep the index current.
Cortex Search services consume compute during indexing and storage for vector embeddings. Monitor costs and limit services to high-value use cases.
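The search column guidance above can be checked before paying for an index. The Python sketch below tests whether a sample of column values averages at least 100 characters. The is_good_search_column helper is hypothetical, and treating the 100-character figure as a hard threshold is an assumption; the guidance only calls it ideal.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_AVG_CHARS = 100  # taken from the "100+ characters" guidance


def is_good_search_column(samples: Sequence[Optional[str]]) -> bool:
    """Return True when non-empty sampled values average MIN_AVG_CHARS or more."""
    lengths = [len(s.strip()) for s in samples if s and s.strip()]
    if not lengths:
        return False  # nothing but NULLs or blanks in the sample
    return mean(lengths) >= MIN_AVG_CHARS
```

The samples could come from a small SELECT against the candidate table; a column of short codes fails the check, while a description column passes.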
Limitations or Notes
- Snowflake Cortex Availability: Requires Snowflake account with Cortex features enabled.
- Language Support: Optimized for English; other languages may have reduced accuracy.
- Column Type Constraints: search_column must contain text (VARCHAR, STRING); cannot index binary or complex types.
- Table Size: Very large tables (100M+ rows) may have longer indexing times and higher costs.
- Query Complexity: Simple natural language works best; complex boolean logic or regex not supported.
- No Partial Updates: Changes to source data require a full refresh_service; incremental updates are not available.
- Result Ranking: Uses AI-based relevance; exact ranking algorithm not customizable.
Supported Actions
✅ create_service - Create new semantic search service
✅ query - Execute natural language search
✅ list_services - Show all search services
✅ describe_service - Get service configuration details
✅ drop_service - Delete a search service
✅ refresh_service - Update index with new table data
✅ get_status - Check indexing status and health
Not Supported
❌ External tables or views as source (must be native Snowflake tables)
❌ Real-time incremental indexing (requires full refresh)
❌ Custom relevance scoring or ranking algorithms
❌ Boolean operators (AND, OR, NOT) in queries
❌ Fuzzy matching or regex patterns
❌ Cross-table joins in search (single table per service)
❌ Image, audio, or video content search
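Because boolean operators and regex patterns are not supported, it can help to flag such queries before sending them. The Python sketch below is a rough client-side heuristic, not behavior of the tool itself. The query_warnings helper and both patterns are assumptions chosen for illustration.

```python
import re
from typing import List

# Uppercase AND/OR/NOT as standalone words are treated as boolean operators;
# lowercase "and" in ordinary phrasing is left alone.
_BOOLEAN = re.compile(r"\b(?:AND|OR|NOT)\b")
# Characters that usually signal regex or wildcard syntax (a heuristic only).
_PATTERN = re.compile(r"[*^\[\]{}|\\]")


def query_warnings(query: str) -> List[str]:
    """Return warnings for syntax the search service does not support."""
    warnings = []
    if _BOOLEAN.search(query):
        warnings.append("boolean operators are not supported")
    if _PATTERN.search(query):
        warnings.append("regex or wildcard patterns are not supported")
    return warnings
```

A caller might show these warnings to the user and suggest rephrasing the query in plain language.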
Output
- create_service: Confirmation with service name, indexed columns, and status.
- query: List of matching results with metadata, ranked by semantic relevance.
- list_services: Table of all services with names, source tables, and creation dates.
- describe_service: Detailed configuration including search column, metadata columns, and index statistics.
- refresh_service: Status update with refresh timestamp and row count.
- get_status: Health check with indexing progress, last refresh, and error messages if any.
- drop_service: Confirmation of service deletion.
- Errors: Structured messages with resolution hints (e.g., invalid column names, warehouse unavailable).
Best Practices
Optimize Search Columns
Use columns with rich, descriptive text (200-500 words ideal). Concatenate multiple fields if needed.
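Concatenating fields can be sketched as follows. This Python helper joins the non-empty values of several fields into one labeled text blob. The function name, the field names, and the "name: value" layout are all illustrative assumptions. In practice the combined column would need to exist in the source table before indexing.

```python
from typing import List, Mapping, Sequence


def build_search_text(row: Mapping[str, object], fields: Sequence[str]) -> str:
    """Join the non-empty values of `fields` into one newline-separated string."""
    parts: List[str] = []
    for name in fields:
        value = row.get(name)
        text = "" if value is None else str(value).strip()
        if text:  # skip NULLs and blank strings
            parts.append(f"{name}: {text}")
    return "\n".join(parts)
```

Labeling each part with its field name keeps some context in the combined text; whether that improves results for a given dataset would need testing.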
Select Smart Metadata
Include only columns users need to see or filter on: too many slow down results, while too few limit usability.
Schedule Refreshes
Automate refresh_service after data updates to keep search current without manual intervention.
Monitor Performance
Log query patterns and latency to identify slow searches or opportunities for optimization.
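One way to collect this data is to wrap each query in a timer. The Python sketch below assumes a hypothetical send callable that forwards a parameter dict to _cortex_search. The logger name and log format are arbitrary choices.

```python
import logging
import time
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("cortex_search")


def timed_query(
    send: Callable[[Dict[str, Any]], Any],
    service_name: str,
    query: str,
    limit: int = 10,
) -> Tuple[Any, float]:
    """Run a query action and return (response, latency in milliseconds)."""
    request = {
        "action": "query",
        "service_name": service_name,
        "query": query,
        "limit": limit,
    }
    start = time.perf_counter()
    response = send(request)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Log the query text and latency so slow searches can be found later.
    logger.info("service=%s query=%r latency_ms=%.1f", service_name, query, elapsed_ms)
    return response, elapsed_ms
```

The measured time includes everything send does, so it reflects end-to-end latency as the caller sees it rather than index lookup time alone.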
Example: Complete Product Search Setup
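A minimal sketch of what such a setup might look like, written as the sequence of parameter dicts passed to _cortex_search. Every name here (the service, table, columns, and warehouse) is hypothetical, and how the dicts are actually submitted depends on the environment hosting the tool.

```python
SERVICE = "product_search"  # hypothetical service name

steps = [
    {
        "action": "create_service",
        "service_name": SERVICE,
        "table_name": "PRODUCTS",        # hypothetical source table
        "search_column": "DESCRIPTION",  # hypothetical rich-text column
        "metadata_columns": ["PRODUCT_ID", "NAME", "CATEGORY"],
        "warehouse": "SEARCH_WH",        # hypothetical warehouse
    },
    # Check indexing progress before querying.
    {"action": "get_status", "service_name": SERVICE},
    {
        "action": "query",
        "service_name": SERVICE,
        "query": "eco-friendly kitchen appliances",
        "limit": 5,
    },
    # Re-index after the source table changes.
    {"action": "refresh_service", "service_name": SERVICE},
]

for step in steps:
    print(step["action"], sorted(key for key in step if key != "action"))
```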
Ready to build intelligent search into your applications? Start by identifying a table with rich text content and create your first Cortex Search service! 🚀