Overview
Tool Name
process_scheduler_tools
Purpose
The process_scheduler_tools enable scheduling and automation of recurring tasks within Genesis Data Agents. Create scheduled jobs that execute prompts, trigger workflows, run data pipelines, and perform automated operations at specified intervals or times. Well suited for ETL automation, report generation, monitoring tasks, data refreshes, and any operation that needs to run on a schedule.
Functions Available
scheduler_action
Manage scheduled jobs including creation, execution, monitoring, and history tracking, with support for cron, interval, and one-time scheduling.
Key Features
Flexible Scheduling
Support for cron expressions, interval-based scheduling, and one-time execution dates.
Prompt Execution
Schedule natural language prompts to be executed by specified bots at scheduled times.
Job Management
Create, list, pause, resume, modify, and delete scheduled jobs with full lifecycle control.
Execution History
Track job run history with timestamps, outputs, and execution status for auditing.
Manual Triggers
Execute scheduled jobs on-demand without waiting for the next scheduled time.
Error Handling
Automatic retry logic, failure notifications, and detailed error logging.
Input Parameters for Each Function
scheduler_action
Parameters
Name | Definition | Format |
---|---|---|
action | Scheduler operation to perform. Values: STATUS, ADD_JOB, LIST_JOBS, GET_JOB, REMOVE_JOB, PAUSE_JOB, RESUME_JOB, RUN_JOB, MODIFY_JOB, GET_HISTORY, GET_RUN, CLEAR_HISTORY, KILL_RUN. | String (required) |
job_id | Unique identifier for the scheduled job. Required for most job-specific operations. | String |
what_to_do_prompt | Natural language prompt describing the task to execute. This is the ACTUAL TASK, not a scheduling request. | String |
trigger | Scheduling configuration as JSON object. Types: cron, interval, date. | Object |
job_bot_id | Bot ID that will execute the scheduled job (defaults to current bot). | String |
job_thread_id | Thread ID for job execution context (optional). | String |
name | Human-readable name for the job (optional). | String |
coalesce | Whether to run missed jobs as single execution (default: False). | Boolean |
max_instances | Maximum concurrent instances of job (default: 1). | Integer |
misfire_grace_time | Seconds to wait before considering job misfired (default: None). | Integer |
run_id | Specific run identifier for retrieving execution details or killing runs. | String |
limit | Maximum number of history records to return (default: 100). | Integer |
offset | Number of history records to skip for pagination (default: 0). | Integer |
The what_to_do_prompt should describe the actual task to perform, not the scheduling itself. For example: “Check for new data and update the dashboard” NOT “Schedule a check every 5 minutes”.
Use Cases
- Automated Data Refreshes: Schedule regular data extracts, transformations, and loads to keep dashboards and reports current.
- Report Generation: Generate and distribute daily, weekly, or monthly reports automatically at specified times.
- System Monitoring: Check system health, data quality, pipeline status, or SLA compliance at regular intervals.
- Data Quality Checks: Run validation jobs to detect anomalies, missing data, or schema changes on a schedule.
- Batch Processing: Schedule resource-intensive operations during off-peak hours to optimize system performance.
Workflow/How It Works
Step 1: Check Scheduler Status
Verify scheduler is running and get statistics:
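The examples in the steps below are minimal sketches that assume the tool is invoked as a Python-style call to scheduler_action; the exact invocation mechanism depends on your agent environment, and all job IDs, names, and prompts are illustrative.

```python
# Confirm the scheduler is up and view job/run statistics.
scheduler_action(action="STATUS")
```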
Step 2: Create Cron-Based Job
Schedule a job using cron expression (runs every day at 9 AM):
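For example:

```python
# Daily job at 09:00 UTC using a cron trigger.
scheduler_action(
    action="ADD_JOB",
    job_id="daily_sales_report",
    name="Daily Sales Report",
    what_to_do_prompt="Query the sales table and email a summary to the analytics team",
    trigger={"type": "cron", "hour": 9, "minute": 0},
)
```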
Step 3: Create Interval-Based Job
Schedule a job to run every 30 minutes:
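For example:

```python
# Refresh job every 30 minutes using an interval trigger.
scheduler_action(
    action="ADD_JOB",
    job_id="dashboard_refresh",
    what_to_do_prompt="Check for new data and update the dashboard",
    trigger={"type": "interval", "minutes": 30},
)
```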
Step 4: Create One-Time Job
Schedule a job for specific date/time:
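For example; the run_date field name follows APScheduler's date-trigger convention and is an assumption, so verify it against your deployment:

```python
# One-time job at a specific UTC timestamp.
scheduler_action(
    action="ADD_JOB",
    job_id="quarter_end_close",
    what_to_do_prompt="Run the quarter-end close validation checks and report the results",
    trigger={"type": "date", "run_date": "2025-03-31T23:00:00"},  # field name assumed
)
```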
Step 5: List All Jobs
View all scheduled jobs:
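For example:

```python
# Returns every job with its trigger type, next run time, and status.
scheduler_action(action="LIST_JOBS")
```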
Step 6: Get Job Details
Retrieve specific job configuration:
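For example, using the job created in Step 2:

```python
# Fetch the full configuration and metadata for one job.
scheduler_action(action="GET_JOB", job_id="daily_sales_report")
```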
Step 7: Run Job Manually
Trigger immediate execution:
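For example:

```python
# Run the job now without changing its schedule.
scheduler_action(action="RUN_JOB", job_id="daily_sales_report")
```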
Step 8: View Execution History
Check job run history:
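For example:

```python
# Page through the most recent runs for one job.
scheduler_action(action="GET_HISTORY", job_id="daily_sales_report", limit=20, offset=0)
```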
Step 9: Pause/Resume Jobs
Temporarily disable or re-enable jobs:
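For example:

```python
# Pause the job, then re-enable it later.
scheduler_action(action="PAUSE_JOB", job_id="daily_sales_report")
scheduler_action(action="RESUME_JOB", job_id="daily_sales_report")
```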
Step 10: Modify Existing Job
Update job configuration:
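For example, shifting the daily job to 07:00 UTC:

```python
# Update the trigger of an existing job.
scheduler_action(
    action="MODIFY_JOB",
    job_id="daily_sales_report",
    trigger={"type": "cron", "hour": 7, "minute": 0},
)
```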
Integration Relevance
- data_connector_tools: schedule automated data extraction and loading operations.
- airflow_tools: trigger Airflow DAGs on schedule or as part of workflow orchestration.
- dbt_action: schedule dbt model runs for regular data transformations.
- github_connector_tools / gitlab_connector_tools: schedule automated commits or deployments.
- slack_tools: send scheduled notifications, reports, or alerts.
- project_manager_tools: schedule task execution and mission progress updates.
Configuration Details
- Scheduler Type: APScheduler with persistent job store (survives restarts).
- Time Zone: Jobs run in UTC by default; specify timezone in cron/date triggers.
- Persistence: Job definitions and history stored in database.
- Concurrency: Configurable max_instances per job (default: 1).
- Missed Jobs: The coalesce option determines whether missed jobs run once or are skipped.
- Grace Period: misfire_grace_time allows late job execution within a defined grace window.
- Job Storage: Job run outputs and logs retained based on retention policy.
Trigger Types
Cron Trigger
Schedule jobs using cron-like expressions:
- Every day at 9 AM:
{"type": "cron", "hour": 9, "minute": 0}
- Every Monday at 8:30 AM:
{"type": "cron", "hour": 8, "minute": 30, "day_of_week": "mon"}
- First day of every month:
{"type": "cron", "hour": 6, "minute": 0, "day": 1}
- Every hour:
{"type": "cron", "minute": 0}
- Every 15 minutes:
{"type": "cron", "minute": "*/15"}
Interval Trigger
Schedule jobs at regular intervals:
- Every 30 seconds:
{"type": "interval", "seconds": 30}
- Every 5 minutes:
{"type": "interval", "minutes": 5}
- Every 2 hours:
{"type": "interval", "hours": 2}
- Every day:
{"type": "interval", "days": 1}
Date Trigger
Schedule one-time job execution at a specific date and time (see the example below).
For cron triggers, be mindful of timezone settings. All times default to UTC unless explicitly specified. Convert local times to UTC or specify the timezone in the trigger configuration.
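A possible one-time trigger, assuming an APScheduler-style run_date field (verify the exact field name in your deployment):
{"type": "date", "run_date": "2025-03-31T23:00:00"}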
Limitations or Notes
- Execution Environment: Jobs execute in bot context with access to bot’s tools and permissions.
- Long-Running Jobs: Jobs with execution time > 5 minutes may be terminated (configure timeout if needed).
- Concurrency: max_instances controls how many copies of a job can run simultaneously.
- Missed Executions: If scheduler is down during scheduled time, coalesce determines behavior.
- History Retention: Job run history retained for configurable period (default: 90 days).
- Job Limits: Recommended maximum of 100 active jobs per bot for performance.
- Prompt Complexity: Keep what_to_do_prompt clear and focused; complex multi-step operations may need breaking down.
- Thread Context: If job_thread_id not specified, each run creates new thread context.
Supported Actions
✅ STATUS - Get scheduler status and statistics
✅ ADD_JOB - Create new scheduled job
✅ LIST_JOBS - List all scheduled jobs
✅ GET_JOB - Get specific job details
✅ REMOVE_JOB - Delete scheduled job
✅ PAUSE_JOB - Temporarily disable job
✅ RESUME_JOB - Re-enable paused job
✅ RUN_JOB - Execute job immediately
✅ MODIFY_JOB - Update job configuration
✅ GET_HISTORY - View job execution history
✅ GET_RUN - Get specific run details and output
✅ CLEAR_HISTORY - Delete old history records
✅ KILL_RUN - Terminate running job execution
Not Supported
❌ Sub-second scheduling (minimum interval: 1 second)
❌ Conditional triggers based on external events (use a separate monitoring job)
❌ Cross-bot job dependencies (implement in job prompt logic)
❌ Job priority or queue management
❌ Distributed job execution across multiple scheduler instances
❌ Job chaining or workflow DAGs (use Airflow for complex workflows)
❌ Dynamic schedule modification during execution
Output
- ADD_JOB: Job ID, next run time, and creation confirmation.
- LIST_JOBS: Table of jobs with ID, name, trigger type, next run time, and status.
- GET_JOB: Complete job configuration including trigger, bot assignment, and metadata.
- GET_HISTORY: List of execution records with run ID, start time, end time, status, and output summary.
- GET_RUN: Complete run details including full output, error messages, and execution context.
- RUN_JOB: Run ID and confirmation of manual execution trigger.
- STATUS: Scheduler health, active jobs count, running jobs, and pending executions.
- Errors: Detailed error messages with troubleshooting guidance.
Best Practices
Clear Prompts
Write specific, actionable prompts. Good: “Query sales table and email summary”. Bad: “Do sales stuff”.
Appropriate Intervals
Don’t over-schedule. Balance freshness needs with system load. Consider impact of frequent executions.
Error Handling
Include error handling instructions in prompts: “If query fails, log error and notify team”.
Monitor History
Regularly review execution history to catch failures, performance issues, or unexpected behavior.
Use Descriptive Names
Name jobs clearly to indicate purpose. Makes management and troubleshooting easier.
Test Before Scheduling
Run jobs manually first to verify they work correctly before setting them on a schedule.
Example: Complete Scheduling Workflow
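A sketch of an end-to-end workflow, assuming the same Python-style calls to scheduler_action used above; all job IDs and prompts are illustrative:

```python
# 1. Confirm the scheduler is running.
scheduler_action(action="STATUS")

# 2. Create a weekday job that refreshes data at 06:00 UTC.
scheduler_action(
    action="ADD_JOB",
    job_id="weekday_refresh",
    name="Weekday Data Refresh",
    what_to_do_prompt="Refresh the reporting tables; if any load fails, log the error and notify the data team",
    trigger={"type": "cron", "hour": 6, "minute": 0, "day_of_week": "mon-fri"},  # weekday range assumes APScheduler-style cron fields
)

# 3. Test it immediately instead of waiting for the next scheduled run.
scheduler_action(action="RUN_JOB", job_id="weekday_refresh")

# 4. Review the outcome of the test run.
scheduler_action(action="GET_HISTORY", job_id="weekday_refresh", limit=1)

# 5. Pause the job during a maintenance window, then resume it.
scheduler_action(action="PAUSE_JOB", job_id="weekday_refresh")
scheduler_action(action="RESUME_JOB", job_id="weekday_refresh")
```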
Advanced Features
Conditional Execution
Implement conditional logic in job prompts, for example: “Check whether new files arrived in the staging area; if none, exit without taking further action.”
Multi-Step Workflows
Chain multiple operations in a single scheduled job, for example: “Extract yesterday’s orders, load them into the warehouse, then post a summary to Slack.”
Dynamic Scheduling
Create jobs programmatically based on configuration, for example by adding one refresh job per table listed in a config file.
Error Recovery Jobs
Create monitoring jobs that check for and recover from failures, for example: “Review the last pipeline run; if it failed, rerun it and notify the on-call channel.”
Scheduled Cleanup Jobs
Maintain system health with automated cleanup, for example a weekly job that runs CLEAR_HISTORY and removes stale temporary data.
Monitoring & Alerting
Track Job Health
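A minimal sketch, assuming the same Python-style calling convention as above: schedule a dedicated monitoring job whose prompt reviews recent runs and raises alerts.

```python
# A daily health-check job that audits other scheduled jobs.
scheduler_action(
    action="ADD_JOB",
    job_id="job_health_check",
    name="Daily Job Health Check",
    what_to_do_prompt=(
        "Review yesterday's scheduled job runs; if any failed, "
        "summarize the errors and notify the data team"
    ),
    trigger={"type": "cron", "hour": 7, "minute": 0},
)
```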
Troubleshooting
Job Not Executing at Expected Time
- Verify job is not paused (check status with GET_JOB)
- Check scheduler status (use STATUS action)
- Verify trigger configuration is correct
- Check for timezone mismatches (cron times are UTC by default)
- Review misfire_grace_time setting
- Check if max_instances limit is reached
Job Execution Fails Immediately
- Review error message in GET_RUN output
- Verify what_to_do_prompt is valid and clear
- Check if bot has required permissions and tools
- Test prompt manually before scheduling
- Review job execution history for patterns
- Ensure bot_id specified is valid and active
Job Runs But Produces Wrong Results
- Review full output using GET_RUN action
- Check if prompt is ambiguous or unclear
- Verify input data and assumptions are correct
- Test prompt manually to reproduce issue
- Add more specific instructions to prompt
- Review execution context and variables
Missed Job Executions
- Check if scheduler was running during scheduled time
- Review coalesce setting (determines missed job behavior)
- Check system logs for scheduler downtime
- Verify misfire_grace_time is appropriate
- Consider using interval triggers for critical jobs
- Monitor scheduler health proactively
Cannot Add New Job
- Verify job_id is unique (not already in use)
- Check trigger configuration format is valid
- Ensure what_to_do_prompt is provided
- Verify bot_id exists if specified
- Check for scheduler capacity limits
- Review error message for specific validation failures
Job Running Too Long
- Check if job is actually stuck or just slow
- Review GET_RUN output for progress
- Use KILL_RUN to terminate if truly stuck
- Optimize prompt to break into smaller tasks
- Consider increasing timeout if legitimate long runtime
- Add progress logging to identify bottlenecks
History Not Showing Recent Runs
- Allow time for execution to complete
- Check if job actually executed (verify next_run_time updated)
- Verify run wasn’t cleared by CLEAR_HISTORY
- Check offset/limit parameters for pagination
- Review retention policy settings
- Query with larger limit to see all history
Scheduler Architecture
Understanding the scheduler components:
Key Components
- Scheduler Engine: Core APScheduler managing job lifecycle and execution timing
- Job Store: Persistent storage for job definitions and configurations
- Executor: Thread pool that runs scheduled jobs
- Trigger Conditions: Cron, interval, or date-based scheduling logic
- Run History: Database storing execution records, outputs, and status
- Bot Instance: Target bot that executes the what_to_do_prompt
Performance Considerations
Avoid Over-Scheduling
Too many frequent jobs can overwhelm the system. Use appropriate intervals based on actual needs.
Optimize Prompts
Keep prompts focused and efficient. Break complex operations into separate jobs if needed.
Monitor Resource Usage
Track job execution times and resource consumption. Optimize or reschedule resource-intensive jobs.
Clean History Regularly
Use CLEAR_HISTORY periodically to prevent history table bloat and maintain performance.
Comparison: Scheduling Options
Feature | Process Scheduler | Airflow | Cron (System) |
---|---|---|---|
Setup Complexity | ✅ Simple | ⚠️ Complex | ✅ Simple |
Scheduling Options | ✅ Cron, interval, date | ✅ Cron-based | ✅ Cron only |
Job Dependencies | ⚠️ In prompt logic | ✅ Native DAG support | ❌ Manual |
Execution History | ✅ Built-in database | ✅ Metadata database | ⚠️ Logs only |
UI Management | ✅ API-based | ✅ Web UI | ⚠️ Config files |
Monitoring | ✅ GET_HISTORY API | ✅ Rich monitoring | ⚠️ Manual log checking |
Error Handling | ✅ Captured in history | ✅ Retry logic | ⚠️ Manual |
Use Case | Simple scheduled tasks | Complex workflows | System maintenance |
Use process_scheduler_tools for straightforward scheduled tasks within Genesis. For complex multi-step workflows with dependencies, consider airflow_tools. For simple system maintenance, native cron may suffice.
Migration Guide
From Cron to Process Scheduler
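A hedged before/after sketch: a crontab entry that ran a shell script at 02:00 every night, and a roughly equivalent scheduled job (the script path and prompt are illustrative):

```python
# Before (system crontab):
#   0 2 * * *  /opt/etl/nightly_load.sh
# After, as a Genesis scheduled job:
scheduler_action(
    action="ADD_JOB",
    job_id="nightly_load",
    name="Nightly Load",
    what_to_do_prompt="Run the nightly load: extract source data, load the warehouse tables, and report row counts",
    trigger={"type": "cron", "hour": 2, "minute": 0},
)
```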
Benefits of Migration
- ✅ Execution history and monitoring
- ✅ Error handling and retry logic
- ✅ Easy modification without server access
- ✅ Centralized job management
- ✅ No need for script files on server