Overview

Tool Name

code_executor_tools

Purpose

The code_executor_tools enable secure execution of Python code within Genesis Data Agents. They support dynamic calculations, data transformations, API integrations, and custom logic without pre-defined functions, and are well suited to ad-hoc analysis, prototyping, data science workflows, and extending agent capabilities with custom code on the fly.

Functions Available

  1. python_exec: Execute Python code in a secure sandboxed environment with access to common libraries, data sources, and file systems.

Key Features

Secure Execution

Run code in an isolated sandbox with resource limits and security constraints to prevent system abuse.

Rich Library Access

Pre-loaded with pandas, numpy, requests, and other common data science and utility libraries.

Data Source Integration

Direct access to configured databases, file systems, and APIs within executed code.

File System Access

Read and write files to Git repositories and storage locations from within executed code.

Dynamic Variables

Pass variables into code execution and retrieve results for further processing.

Error Handling

Comprehensive error messages with stack traces for debugging failed executions.

Input Parameters for Each Function

python_exec

Parameters
Name       | Definition                                                                               | Format
code       | Python code to execute. Can be single line or multi-line block.                         | String (required)
variables  | Dictionary of variables to make available in the execution context.                     | Object
timeout    | Maximum execution time in seconds (default: 30, max: 300).                              | Integer
return_var | Name of variable to return from execution context (default: returns all variables).    | String
libraries  | Additional libraries to import (beyond pre-loaded defaults).                            | Array of Strings
safe_mode  | Enable additional security restrictions (default: True).                                | Boolean
Use the variables parameter to pass data from the agent context into your code. This is more efficient than embedding large data directly in code strings.
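
A minimal sketch combining these parameters (the data and values are illustrative):

python_exec(
    code="""
discounted = [price * (1 - discount_rate) for price in prices]
result = {'discounted_prices': discounted, 'total': round(sum(discounted), 2)}
""",
    variables={'prices': [100, 250, 75], 'discount_rate': 0.1},
    timeout=15,
    return_var='result',
    safe_mode=True
)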

Use Cases

  1. Ad-Hoc Data Analysis - Perform quick calculations, statistical analysis, or data exploration without creating permanent functions or scripts.
  2. Data Transformation - Execute custom transformation logic on datasets that does not fit into standard SQL or built-in functions.
  3. API Integration - Make HTTP requests, parse responses, and integrate with external APIs dynamically based on user requirements.
  4. Prototyping & Testing - Test algorithms, validate assumptions, or prototype solutions before implementing them as permanent features.
  5. Custom Business Logic - Implement organization-specific calculations, rules, or validations that vary by use case or customer.

Workflow/How It Works

  1. Step 1: Simple Calculation - Execute basic Python for quick results:
    python_exec(
        code="result = 42 * 365; print(f'Days in 42 years: {result}')"
    )
    
  2. Step 2: Data Analysis with Pandas - Analyze data using pre-loaded libraries:
    python_exec(
        code="""
    import pandas as pd
    import numpy as np
    
    data = {
        'product': ['A', 'B', 'C', 'D', 'E'],
        'sales': [150, 200, 175, 300, 225],
        'cost': [100, 150, 125, 200, 175]
    }
    
    df = pd.DataFrame(data)
    df['profit'] = df['sales'] - df['cost']
    df['margin'] = (df['profit'] / df['sales'] * 100).round(2)
    
    summary = {
        'total_sales': df['sales'].sum(),
        'total_profit': df['profit'].sum(),
        'avg_margin': df['margin'].mean(),
        'top_product': df.loc[df['profit'].idxmax(), 'product']
    }
    
    print("Sales Analysis:")
    print(df)
    print(f"Summary: {summary}")
    """
    )
    
  3. Step 3: Pass Variables from Context - Use data from previous operations:
    python_exec(
        code="""
    import pandas as pd
    
    df = pd.DataFrame(customer_data)
    
    df['segment'] = pd.cut(
        df['total_spend'], 
        bins=[0, 1000, 5000, float('inf')],
        labels=['Bronze', 'Silver', 'Gold']
    )
    
    segment_summary = df.groupby('segment').agg({
        'customer_id': 'count',
        'total_spend': 'sum'
    }).round(2)
    
    print(segment_summary)
    result = segment_summary.to_dict()
    """,
        variables={'customer_data': customer_data},
        return_var='result'
    )
    
  4. Step 4: API Integration - Make external API calls and process responses:
    python_exec(
        code="""
    import requests
    
    response = requests.get(
        'https://api.openweathermap.org/data/2.5/weather',
        params={
            'q': city_name,
            'appid': api_key,
            'units': 'metric'
        }
    )
    
    if response.status_code == 200:
        weather_data = response.json()
        result = {
            'temperature': weather_data['main']['temp'],
            'humidity': weather_data['main']['humidity'],
            'description': weather_data['weather'][0]['description']
        }
        print(f"Weather in {city_name}: {result['temperature']}°C")
    else:
        result = {'error': f"API call failed: {response.status_code}"}
    """,
        variables={'city_name': 'London', 'api_key': 'your_api_key'},
        return_var='result'
    )
    
  5. Step 5: File Operations - Read and write files within execution:
    python_exec(
        code="""
    import json
    
    with open('/workspace/config.json', 'r') as f:
        config = json.load(f)
    
    config['last_run'] = '2024-01-15'
    config['status'] = 'completed'
    
    with open('/workspace/config.json', 'w') as f:
        json.dump(config, f, indent=2)
    
    print(f"Config updated: {config}")
    result = config
    """,
        return_var='result'
    )
    
  6. Step 6: Complex Data Processing - Implement multi-step algorithms:
    python_exec(
        code="""
    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(transaction_data)
    df['date'] = pd.to_datetime(df['date'])
    df = df.sort_values('date')
    
    df['rolling_avg_7d'] = df['amount'].rolling(window=7).mean()
    df['rolling_sum_30d'] = df['amount'].rolling(window=30).sum()
    
    mean = df['amount'].mean()
    std = df['amount'].std()
    df['z_score'] = (df['amount'] - mean) / std
    df['is_anomaly'] = abs(df['z_score']) > 3
    
    anomalies = df[df['is_anomaly']]
    summary = {
        'total_transactions': len(df),
        'anomaly_count': len(anomalies),
        'anomaly_dates': anomalies['date'].dt.strftime('%Y-%m-%d').tolist(),
        'avg_7d_trend': float(df['rolling_avg_7d'].iloc[-1])
    }
    
    print(f"Detected {len(anomalies)} anomalies")
    result = summary
    """,
        variables={'transaction_data': transactions},
        return_var='result',
        timeout=60
    )
    

Integration Relevance

  • data_connector_tools to fetch data from databases, then process with custom Python code.
  • file tools to read/write files before or after code execution.
  • github_connector_tools / gitlab_connector_tools to version control generated scripts.
  • image_tools to process images or generate visualizations from code execution results.
  • web_access_tools to fetch data via HTTP within executed code.
  • project_manager_tools to execute custom code as part of automated mission tasks.

Configuration Details

  • Execution Environment: Python 3.9+ with isolated namespace per execution.
  • Pre-loaded Libraries: pandas, numpy, requests, json, datetime, os, sys, re, math, statistics.
  • Resource Limits: CPU time limit (default 30s), memory limit (default 512MB).
  • File System Access: Read/write to designated workspace directories and Git repositories.
  • Network Access: HTTP/HTTPS requests allowed; other protocols restricted.
  • Security Mode: Restricted access to system operations, subprocess execution, and dangerous functions.
  • Return Value: Can return primitive types, dictionaries, lists, or serializable objects.
Code execution happens in a sandboxed environment, but always validate and sanitize user-provided code to prevent malicious operations. Never execute untrusted code without review.

Limitations or Notes

  1. Execution Timeout: Default 30 seconds, maximum 300 seconds (5 minutes). Long-running operations may be terminated.
  2. Memory Limits: Default 512MB. Large dataset operations may exceed limits and fail.
  3. No Subprocess: Cannot spawn subprocesses or execute system commands for security.
  4. Library Restrictions: Cannot install arbitrary packages; limited to pre-loaded libraries.
  5. Network Limitations: Only HTTP/HTTPS allowed; no raw socket access or other protocols.
  6. File System Scope: Access limited to designated directories; cannot access system files.
  7. Serialization: Return values must be JSON-serializable or primitive types.
  8. State Persistence: Each execution is isolated; no state persists between calls. Pass results forward explicitly via the variables parameter (see the sketch after this list).
  9. Standard Output: Print statements captured but may be truncated for very large outputs.
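
Because no state persists, results from one call must be passed into the next explicitly. A minimal sketch of chaining two executions (variable names are illustrative; the access pattern mirrors the complete workflow example further below):

first = python_exec(
    code="result = [x * 2 for x in raw_values]",
    variables={'raw_values': [1, 2, 3]},
    return_var='result'
)

# Feed the previous result into the next, isolated execution
second = python_exec(
    code="result = sum(doubled)",
    variables={'doubled': first['result']},
    return_var='result'
)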

Supported Operations

Basic Python operations - Variables, functions, loops, conditionals
Data manipulation - pandas DataFrames, numpy arrays, list/dict operations
File I/O - Read/write text and binary files in workspace
HTTP requests - GET, POST, PUT, DELETE with requests library
JSON/CSV processing - Parse and generate structured data formats
Date/time calculations - datetime, timedelta operations
Regular expressions - Pattern matching and text processing
Mathematical operations - Statistics, linear algebra, calculations
String manipulation - Formatting, parsing, encoding
Exception handling - Try/except blocks and error recovery

Not Supported

❌ Installing new packages (pip install) during execution
❌ Subprocess execution (subprocess, os.system, etc.)
❌ System file access (outside workspace)
❌ Database connections (use data_connector_tools instead)
❌ Multi-threading or multiprocessing
❌ Raw socket programming
❌ Modifying Python environment or interpreter
❌ Accessing environment variables (except explicitly passed)
❌ Infinite loops (terminated by timeout)
❌ Heavy machine learning model training (use dedicated tools)

Output

  • Success: Execution result with returned variable(s), stdout output, and execution time.
  • Variables: Dictionary of all variables in the execution context (unless a specific return_var is specified).
  • Standard Output: Captured print statements and logging output.
  • Execution Time: Time taken to execute code in seconds.
  • Errors: Exception type, error message, and full stack trace for debugging.
  • Warnings: Any resource warnings (approaching timeout, high memory usage).

Best Practices

Small & Focused

Keep code blocks focused on single tasks. Break complex operations into multiple executions.

Error Handling

Always use try/except blocks for operations that might fail (API calls, file I/O, parsing).
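
For example, a minimal sketch that catches failures inside the executed code (reusing the workspace config path from the workflow above):

python_exec(
    code="""
import json

try:
    with open('/workspace/config.json', 'r') as f:
        result = json.load(f)
except (FileNotFoundError, json.JSONDecodeError) as e:
    # Surface the failure as data instead of letting the execution raise
    result = {'error': str(e)}
""",
    return_var='result'
)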

Resource Awareness

Monitor execution time and memory usage. Use sampling for large datasets.
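
For large inputs, sampling inside the executed code is one way to stay within the limits. An illustrative sketch (the records variable and amount column are made up):

python_exec(
    code="""
import pandas as pd

# records is assumed to be a list of row dicts passed in via variables
df = pd.DataFrame(records)

# Work on a 10% sample to stay within time and memory limits
sample = df.sample(frac=0.1, random_state=42)
result = {
    'sample_rows': len(sample),
    'estimated_total_amount': float(sample['amount'].sum() * 10)
}
""",
    variables={'records': records},
    return_var='result'
)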

Data Validation

Validate input data and function parameters before processing to avoid runtime errors.

Logging

Use print statements to track execution progress and debug issues.

Return Values

Always assign results to variables for retrieval. Do not rely solely on print output.
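
A minimal contrast (the access pattern mirrors the complete workflow example below):

# Relying only on print output: nothing is available for downstream steps
python_exec(code="print(sum([1, 2, 3]))")

# Assigning to a variable and naming it in return_var makes the value retrievable
totals = python_exec(
    code="result = sum([1, 2, 3])",
    return_var='result'
)
print(totals['result'])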

Example: Complete Data Analysis Workflow

# Step 1: Fetch data from database
query_result = _query_database(
    connection_id="analytics_db",
    query="SELECT customer_id, order_date, order_amount, product_category FROM orders WHERE order_date >= '2024-01-01' LIMIT 10000",
    max_rows=10000
)

# Step 2: Perform complex analysis
analysis_result = python_exec(
    code="""
import pandas as pd
import numpy as np

df = pd.DataFrame(data)
df['order_date'] = pd.to_datetime(df['order_date'])
df['order_amount'] = pd.to_numeric(df['order_amount'], errors='coerce')
df = df.dropna()

print(f"Loaded {len(df)} orders")

customer_stats = df.groupby('customer_id').agg({
    'order_amount': ['sum', 'mean', 'count'],
    'order_date': ['min', 'max']
}).reset_index()

customer_stats.columns = ['customer_id', 'total_spend', 'avg_order', 'order_count', 'first_order', 'last_order']
customer_stats['lifetime_days'] = (customer_stats['last_order'] - customer_stats['first_order']).dt.days

def segment_customer(row):
    if row['total_spend'] > 5000:
        return 'VIP'
    elif row['total_spend'] > 2000:
        return 'Gold'
    elif row['total_spend'] > 500:
        return 'Silver'
    else:
        return 'Bronze'

customer_stats['segment'] = customer_stats.apply(segment_customer, axis=1)

category_stats = df.groupby('product_category').agg({
    'order_amount': ['sum', 'mean', 'count'],
    'customer_id': 'nunique'
}).reset_index()

category_stats.columns = ['category', 'total_revenue', 'avg_order', 'order_count', 'unique_customers']
category_stats = category_stats.sort_values('total_revenue', ascending=False)

daily_sales = df.groupby(df['order_date'].dt.date)['order_amount'].agg(['sum', 'count']).reset_index()
daily_sales.columns = ['date', 'revenue', 'orders']
daily_sales['avg_order_value'] = (daily_sales['revenue'] / daily_sales['orders']).round(2)
daily_sales['revenue_7d_avg'] = daily_sales['revenue'].rolling(window=7).mean().round(2)
daily_sales['revenue_growth'] = daily_sales['revenue'].pct_change() * 100

summary = {
    'total_revenue': float(df['order_amount'].sum()),
    'total_orders': len(df),
    'unique_customers': df['customer_id'].nunique(),
    'avg_order_value': float(df['order_amount'].mean()),
    'segments': customer_stats['segment'].value_counts().to_dict(),
    'top_category': category_stats.iloc[0]['category'],
    'top_category_revenue': float(category_stats.iloc[0]['total_revenue'])
}

# Convert datetime columns to ISO date strings so the returned value stays JSON-serializable
customer_stats['first_order'] = customer_stats['first_order'].dt.strftime('%Y-%m-%d')
customer_stats['last_order'] = customer_stats['last_order'].dt.strftime('%Y-%m-%d')
daily_sales['date'] = daily_sales['date'].astype(str)

result = {
    'summary': summary,
    'customer_segments': customer_stats.to_dict('records'),
    'category_performance': category_stats.to_dict('records'),
    'daily_trends': daily_sales.tail(30).to_dict('records')
}
""",
    variables={'data': query_result['data']},
    return_var='result',
    timeout=60
)

print(f"Summary: {analysis_result['result']['summary']}")

Advanced Features

Statistical Analysis

python_exec(
    code="""
import numpy as np
from scipy import stats

data = np.array(values)

result = {
    'mean': float(np.mean(data)),
    'median': float(np.median(data)),
    'std': float(np.std(data)),
    'min': float(np.min(data)),
    'max': float(np.max(data)),
    'q25': float(np.percentile(data, 25)),
    'q75': float(np.percentile(data, 75))
}

stat, p_value = stats.normaltest(data)
result['is_normal'] = p_value > 0.05
result['normality_p_value'] = float(p_value)

print(f"Distribution analysis: {result}")
""",
    variables={'values': [23, 45, 67, 89, 34, 56, 78, 90, 12, 45]},
    libraries=['scipy'],
    return_var='result'
)

Text Processing

python_exec(
    code="""
import re
from collections import Counter

text = text_data.lower()
text = re.sub(r'[^a-z0-9 ]', '', text)
words = text.split()

stop_words = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for'}
words = [w for w in words if w not in stop_words and len(w) > 2]

word_freq = Counter(words)
top_words = word_freq.most_common(20)

result = {
    'total_words': len(words),
    'unique_words': len(set(words)),
    'top_words': [{'word': w, 'count': c} for w, c in top_words]
}

print(f"Analyzed {result['total_words']} words")
""",
    variables={'text_data': long_text_content},
    return_var='result'
)

Data Validation

python_exec(
    code="""
import pandas as pd

df = pd.DataFrame(dataset)

validation_results = {
    'row_count': len(df),
    'column_count': len(df.columns),
    'missing_values': df.isnull().sum().to_dict(),
    'duplicate_rows': df.duplicated().sum(),
    'data_types': df.dtypes.astype(str).to_dict()
}

errors = []

required_cols = ['customer_id', 'order_date', 'amount']
missing_cols = [col for col in required_cols if col not in df.columns]
if missing_cols:
    errors.append(f"Missing required columns: {missing_cols}")

if 'amount' in df.columns:
    negative_amounts = (df['amount'] < 0).sum()
    if negative_amounts > 0:
        errors.append(f"Found {negative_amounts} negative amounts")

if 'order_date' in df.columns:
    try:
        pd.to_datetime(df['order_date'])
    except Exception:
        errors.append("Invalid date format in order_date column")

validation_results['errors'] = errors
validation_results['is_valid'] = len(errors) == 0

print(f"Validation {'passed' if validation_results['is_valid'] else 'failed'}")
result = validation_results
""",
    variables={'dataset': data_to_validate},
    return_var='result'
)

API Pagination Handler

python_exec(
    code="""
import requests
import time

def fetch_paginated_data(base_url, api_key, max_pages=10):
    all_data = []
    page = 1
    
    while page <= max_pages:
        response = requests.get(
            base_url,
            params={'page': page, 'per_page': 100},
            headers={'Authorization': f'Bearer {api_key}'}
        )
        
        if response.status_code != 200:
            print(f"Error on page {page}: {response.status_code}")
            break
        
        data = response.json()
        if not data.get('items'):
            break
        
        all_data.extend(data['items'])
        print(f"Fetched page {page}: {len(data['items'])} items")
        
        if not data.get('has_next'):
            break
        
        page += 1
        time.sleep(0.5)
    
    return all_data

result = fetch_paginated_data(api_url, token)
print(f"Total items fetched: {len(result)}")
""",
    variables={'api_url': 'https://api.example.com/data', 'token': 'your_token'},
    return_var='result',
    timeout=120
)

Troubleshooting

Execution Timeouts

  • Reduce dataset size or use sampling
  • Optimize algorithms (vectorize operations with numpy/pandas)
  • Increase timeout parameter (up to 300 seconds)
  • Break into smaller execution steps
  • Use generators or lazy evaluation for large data

Memory Limit Errors

  • Process data in chunks instead of loading all at once
  • Use pandas iterator options (chunksize parameter)
  • Delete large intermediate variables with del variable
  • Use more memory-efficient data types (int8 vs int64)
  • Consider using data_connector_tools for heavy processing

Import Errors

  • Verify library is in pre-loaded list
  • Check for typos in import statement
  • Use alternative built-in libraries when possible
  • Request library addition from Genesis team
  • Implement functionality manually if simple

File Not Found Errors

  • Verify file path is relative to workspace
  • Check file exists using os.path.exists()
  • Ensure file was created in previous step
  • Use absolute paths within workspace directories
  • Check file permissions and access rights

Serialization Errors

  • Convert numpy arrays to lists: array.tolist()
  • Convert pandas DataFrames to dicts: df.to_dict()
  • Use float() for numpy float types
  • Use int() for numpy integer types
  • Convert datetime to ISO strings: dt.isoformat()
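
An illustrative sketch of these conversions (the data is made up):

python_exec(
    code="""
import pandas as pd
import numpy as np
from datetime import datetime

df = pd.DataFrame({'amount': np.array([10, 20, 30], dtype=np.int64)})

# Convert numpy, pandas, and datetime objects to JSON-serializable types before returning
result = {
    'amounts': df['amount'].tolist(),
    'records': df.to_dict('records'),
    'total': float(df['amount'].sum()),
    'count': int(len(df)),
    'generated_at': datetime.now().isoformat()
}
""",
    return_var='result'
)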

API Request Failures

  • Check internet connectivity (if applicable)
  • Verify API endpoint URL is correct
  • Check authentication credentials
  • Add error handling with try/except
  • Check for rate limiting and add delays
  • Verify API response format matches expectations

Syntax Errors

  • Check for proper indentation (use 4 spaces)
  • Verify quotes are properly closed
  • Check for balanced parentheses/brackets
  • Test code in local Python environment first
  • Use syntax highlighter to identify issues

Security Considerations

Code Execution Risks: Always validate user-provided code before execution. Never execute code from untrusted sources without thorough review.

Sandbox Protections

The code executor implements multiple security layers:
  • Restricted Imports: Cannot import subprocess or call os.system, eval, exec, compile, or other dangerous functions
  • File System Isolation: Access limited to designated workspace directories
  • Network Filtering: Only HTTP/HTTPS; no raw sockets or other protocols
  • Resource Limits: CPU time and memory caps prevent denial-of-service
  • Namespace Isolation: Each execution has isolated variable scope
  • No Privilege Escalation: Cannot access system files or escalate permissions

Safe Coding Practices

Input Validation

Validate and sanitize all user inputs before using them in code execution.
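
One illustrative safeguard is to pass user-supplied values in through the variables parameter instead of interpolating them into the code string (user_input is hypothetical):

# Risky: untrusted text becomes part of the executed code
# python_exec(code=f"name = '{user_input}'")

# Safer: the value is passed as data and never parsed as code
python_exec(
    code="result = name.strip().lower()",
    variables={'name': user_input},  # user_input is a hypothetical user-supplied string
    return_var='result'
)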

Error Containment

Use try/except blocks to prevent error propagation and information leakage.

Resource Monitoring

Monitor execution time and memory usage. Set appropriate timeouts.

Audit Logging

Log all code executions with user context for security auditing.

Performance Optimization

Efficient Data Processing

# Inefficient - Row-by-row iteration
python_exec(
    code="""
result = []
for row in data:
    result.append(row['value'] * 2)
"""
)

# Efficient - Vectorized operations
python_exec(
    code="""
import pandas as pd
df = pd.DataFrame(data)
result = (df['value'] * 2).tolist()
"""
)

Memory Management

python_exec(
    code="""
import pandas as pd

chunks = []
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    processed = chunk[chunk['amount'] > 100]
    chunks.append(processed)

result = pd.concat(chunks, ignore_index=True)
del chunks
"""
)