Overview

Tool Name

data_connector_tools

Purpose

The data_connector_tools are dedicated to establishing and managing database connections, querying databases, and exploring metadata. With this toolset, users can efficiently connect to, interact with, and manage data sources for structured querying and analysis.

Functions Available

  1. _query_database: Executes SQL queries against connected databases.Allows row limit constraints or overriding default limits.

  2. _add_database_connection: Creates a new database connection using specific credentials and connection strings. Integrates structured databases for querying and metadata exploration.

  3. _delete_database_connection: Removes an existing database connection from the system.

  4. _list_database_connections : Retrieves a list of all accessible database connections for users, along with connection metadata.

  5. _search_metadata: Searches database metadata to identify schemas, tables, columns, and other components. Can narrow search results by database, schema, or table.

  6. _data_explorer: Enables narrowing and quick exploration of database schemas or objects. Useful for discovering table relationships and refining queries.

  7. _get_full_table_details: Provides complete structural details (e.g., columns, types, relationships) for a specific table in the database.

Key Features

Execute SQL Queries

Run queries against connected databases with optional row limits and override capabilities.

Manage Connections

Add, remove, and list database connections, including associated metadata.

Metadata Exploration

Search database schemas, tables, and columns to discover structures for ETL or analytics.

Detailed Table Information

Retrieve full table details—columns, data types, relationships—to refine workflows.

Input Parameters for Each Function

1. _query_database

Parameters

NameDefinitionFormat
connection_idSpecifies the database connection used (required).String
querySQL query to execute (required).String
max_rowsDefines the limit on rows returned (optional).Integer
max_rows_overrideBoolean to override max_rows if set to true (default: false).Boolean

2. _add_database_connection

Parameters

NameDefinitionFormat
connection_idUnique identifier for the new database connection (required).String
connection_stringSQLAlchemy-compliant connection string for the database (required).String
allowed_bot_ids(Optional) List of bot IDs authorized to access this connection (or "*" for all).List
description(Optional) Descriptor for the connection’s purpose.String

3. _delete_database_connection

Parameters

NameDefinitionFormat
connection_idID of the database connection to delete (required).String

4. _list_database_connections

Parameters (No parameters required; retrieves all visible and accessible database connections.)

5. _search_metadata

Parameters

NameDefinitionFormat
search_stringString pattern to search within metadata (required).String
database(Optional) Restrict search to a specific database.String
schema(Optional) Limit search to a specific schema.String
table(Optional) Restrict search to a specific table.String
top_n(Optional) Maximum number of metadata results to return.Integer

6. _data_explorer

Parameters

NameDefinitionFormat
search_string(Optional) Partial string to explore metadata results.String
database(Optional) Specify database to scope metadata exploration.String
schema(Optional) Focus on a specific schema.String
table(Optional) Refine results to a particular table group.String
top_n(Optional) Number of top results for quick exploration.Integer

7. _get_full_table_details

Parameters

NameDefinitionFormat
connection_idDatabase connection ID for the table (required).String
databaseTarget database containing the table (required).String
schemaSchema of the desired table (required).String
tableName of the table to retrieve details for (required).String

Use Cases

  1. Query Execution

    • Run SQL queries to retrieve subsets of data for further analysis or reporting.

    • Example: Executing a query to count records in a table or retrieve sales trends.

  2. Metadata Exploration

    • Discover existing schemas or table structures with _search_metadata to design custom queries.

    • Example: Finding column names in a sales DB to refine joins across multiple tables.

  3. Database Integration

    • Establish new database connections with _add_database_connection for new integration pipelines.

    • Example: Adding an external financial database to a centralized data platform.

  4. Table Details Retrieval

    • Use _get_full_table_details for granular information about table structure.

    • Example: Inspect columns, data types, constraints for ETL processes.

When handling large queries, always set an appropriate max_rows value or consider using max_rows_override carefully to manage resource utilization effectively.

Workflow/How It Works

  1. Step 1: Add Database Connections

    • Create connections via _add_database_connection, providing valid credentials and descriptions.
  2. Step 2: Explore Metadata

    • Use _search_metadata or _data_explorer to understand table structures or discover columns.
  3. Step 3: Execute Queries

    • Run SQL commands with _query_database, applying max_rows or overrides for large results.
  4. Step 4: Retrieve Table Details

    • Use _get_full_table_details for in-depth info on table schemas or relationships.
  5. Step 5: Manage Connections

    • Remove obsolete connections with _delete_database_connection or check status via _list_database_connections.

Integration Relevance

  • Process Automation: Pair with process_manager_tools to automate query execution and metadata checks.

  • Snowflake Workflows: Integrate command outputs with snowflake_tools for efficient file staging/ingestion.

  • Visualization Tools: Export query results to analytics platforms or dashboards for KPIs.

Configuration Details

  • Use accurate connection strings per SQLAlchemy standards when adding or modifying connections.

  • Ensure required permissions for querying databases and metadata exploration.

  • Large queries should consider max_rows to control memory usage.

Limitations or Notes

  1. Row Limitations

    • Without specifying max_rows, datasets may be truncated by default. Consider overrides for larger queries.
  2. Connection Management

    • Unused connections can impact performance—remove them with _delete_database_connection.
  3. Query Errors

    • Validate table/column names in SQL statements to avoid execution errors.

Always confirm that permissions are sufficient for metadata exploration and data retrieval to prevent unexpected query failures.

Output

  • Connection Management

    • Confirmations or error messages after adding, deleting, or listing connections.
  • Query & Metadata Results

    • JSON-formatted query data or structured metadata (schemas, tables, columns).
  • Table Details

    • Full breakdown of columns, constraints, and structure for specific tables.

Example on Streamlit

  • Run SQL queries to retrieve subsets of data for further analysis or reporting.

In this example, we’ll be tasking the Genbot with creating a data-driven dashboard for a professional baseball team’s analytics department.

Example: Executing a query to count records in a table or retrieve sales trends.

Metadata Exploration

Discover existing schemas or table structures with _search_metadata to design custom queries.

Example: Finding column names in a sales DB to refine joins across multiple tables.

Example on Genesis API

User: What are the top teams by runs and ERA?
   * *Example:* Adding an external financial database to a centralized data platform.

Genbot: 🧰 Using tool: SearchMetadata...
🧰 Using tool: RunQuery...
🧰 Using tool: RunQuery...
The results are in! Here's what we found for the years 2010-2015:
---
### Top 5 Teams by Total Runs
| Year | Team   | Total Runs | 
|------|--------|------------| 
| 2011 | BOS    | 875        | 
| 2013 | BOS    | 853        | 
| 2010 | BOS    | 818        | 
| 2015 | BOS    | 748        | 
| 2012 | BOS    | 734        | 
---
### Top 5 Teams by Best Earned Run Average (ERA)
| Year | Team   | Average ERA | 
|------|--------|-------------| 
| 2010 | BOS    | 4.00        | 
| 2011 | BOS    | 4.00        | 
| 2013 | BOS    | 4.00        | 
| 2015 | BOS    | 4.00        | 
| 2014 | BOS    | 5.00        | 
---
### Observations:
1. Boston (BOS) dominated both categories, appearing consistently at the top across these years.
2. ERA averages are relatively stable around 4.00, with a spike to 5.00 in 2014.
3. Total runs peaked in 2011 (875 runs).
---
Would you like to do any further analytics (e.g., calculate win ratios, combine these results, or create a dashboard)? Let me know! ⚾😊

Table Details Retrieval

  1. Use _get_full_table_details for granular information about table structure.

  2. Example: Inspect columns, data types, and constraints for ETL processes.

When handling large queries, always set an appropriate max_rows value or consider using max_rows_override carefully to manage resource utilization effectively.

Example on Slack

  • Large queries should consider max_rows to control memory usage.

database_tools – Assists in discovering database metadata, locating database tables, and running SQL queries on a database.

In this example, we will be creating a graph showing the $ we spent on Snowflake per day for the last 120 days.

Along with an explanation of how the Genbot did this analysis while mentioning the tools used.

Validate table/column names in SQL statements to avoid execution errors.

Genbots With Database Tools Have Their Own Workspace

Always confirm that permissions are sufficient for metadata exploration and data retrieval to prevent unexpected query failures.

A workspace schema is a specialized schema tailored for a specific user or purpose within a broader database environment. It acts as a dedicated space for you to create and manage database objects like tables, views, stages, and other elements.

Output

How To Use

Before using the workspace schema, you can ask the Genbot that has access to the database tools to explain what it is.

In this example, we are asking Eve, “How would I tell you to use that workspace schema?”

Remember that Genbots only have access to this if they have the database tool. You can grant Genbots with whichever tool required for them to be exceptional at their role.