Overview
Tool Name
Purpose
The system_stats_tools group provides real-time performance telemetry and health insights. Use it to fetch current server time and timezone, gather CPU, memory, disk, network, and process metrics, and build a reliable picture of platform load and stability.
Key Features & Functions
Current Time & Timezone
Retrieve authoritative server datetime, timezone, and sync context for logs and distributed jobs.
CPU & Load
Capture CPU utilization and load averages to spot saturation and sizing gaps.
Memory Utilization
Track total, used, cached, and available memory to prevent swapping.
Disk & I/O
Inspect free space and IOPS to protect pipelines from storage contention.
Network Throughput
Observe interface bytes per second and error counters for connectivity health.
Processes & Uptime
Surface top consumers, uptime, and boot time for triage and audits.
Health Signals
Combine metrics into health indicators and alerts for proactive action.
Input Parameters for Each Function
get_server_datetime
Parameters
No parameters.
get_system_stats
Parameters
No parameters.
Use Cases
- High-load monitoring: Track CPU, memory, and I/O while ETL or ML jobs run to detect bottlenecks.
- Capacity planning: Export daily snapshots to trend headroom and justify hardware or quota changes.
- Incident triage: Compare current metrics with baseline to isolate the noisy neighbor or failing disk.
- SLA verification: Correlate job runtimes with system pressure to validate performance agreements.
- Time synchronization checks: Confirm server time and timezone for consistent, debuggable logs.
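The incident-triage case above amounts to comparing a current snapshot against a stored baseline. The following is a minimal Python sketch, assuming snapshots have already been flattened into dictionaries of numeric values. The field names (`cpu_percent`, `mem_used_percent`, `disk_free_percent`), the helper `find_deviations`, and the 20 percent tolerance are illustrative assumptions, not the actual output schema of get_system_stats.

```python
def find_deviations(baseline, current, tolerance=0.20):
    """Return metrics whose relative change from baseline exceeds tolerance.

    Both arguments are flat dicts of metric name -> number. The field names
    are hypothetical; map them from whatever get_system_stats returns.
    """
    deviations = {}
    for name, base_value in baseline.items():
        if name not in current:
            continue  # metric missing in this snapshot (availability varies by OS)
        if base_value == 0:
            continue  # relative change is undefined for a zero baseline
        change = (current[name] - base_value) / abs(base_value)
        if abs(change) > tolerance:
            deviations[name] = round(change, 3)
    return deviations


# Hypothetical snapshots: CPU has roughly doubled and disk free space has dropped.
baseline = {"cpu_percent": 40.0, "mem_used_percent": 60.0, "disk_free_percent": 50.0}
current = {"cpu_percent": 82.0, "mem_used_percent": 63.0, "disk_free_percent": 20.0}
print(find_deviations(baseline, current))
```

With these sample values, only the CPU and disk metrics are reported, because memory moved by 5 percent, which is inside the tolerance.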
Workflow/How It Works
- Fetch time context with get_server_datetime to stamp logs and verify timezone.
- Pull point-in-time metrics using get_system_stats for CPU, memory, disk, network, and processes.
- Store snapshots in your monitoring datastore for trend analysis.
- Compare to baseline to detect regressions after deploys or config changes.
- Alert on thresholds such as CPU greater than 85 percent for 5 minutes or disk free less than 15 percent.
- Align time sources across nodes to keep multi-host traces and metrics comparable.
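The alerting step can be sketched as a small state machine that fires only when a breach has persisted for the whole hold period. This is one possible approach in Python, not the tool's built-in behavior. The class `SustainedThreshold`, the helper `disk_free_low`, and the sample readings are hypothetical; the 85 percent, 5 minute, and 15 percent values come from the example thresholds above.

```python
class SustainedThreshold:
    """Report a breach only after the value has stayed above a limit long enough."""

    def __init__(self, limit, hold_seconds):
        self.limit = limit
        self.hold_seconds = hold_seconds
        self.breach_started = None  # timestamp of the first sample in the current breach

    def update(self, timestamp, value):
        """Record one sample; return True once the breach has lasted hold_seconds."""
        if value <= self.limit:
            self.breach_started = None  # any healthy sample resets the timer
            return False
        if self.breach_started is None:
            self.breach_started = timestamp
        return timestamp - self.breach_started >= self.hold_seconds


def disk_free_low(free_percent, minimum=15.0):
    """Instantaneous check: True when free disk space is below the minimum."""
    return free_percent < minimum


# Hypothetical CPU readings taken every 60 seconds: (seconds, percent).
cpu_alert = SustainedThreshold(limit=85.0, hold_seconds=300)
readings = [(0, 90.0), (60, 91.0), (120, 88.0), (180, 95.0), (240, 87.0), (300, 92.0)]
fired = [cpu_alert.update(t, v) for t, v in readings]
print(fired)  # only the final sample completes the 5-minute hold
print(disk_free_low(12.5))
```

Tracking only the start of the breach keeps memory use constant, at the cost of treating a single healthy sample as a full reset.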
Integration Relevance
- genesis_job_tools to correlate background job activity with resource spikes.
- system_manager_tools to verify health before maintenance and after restarts.
- harvester_tools to watch metadata crawls for resource pressure.
- data_connector_tools to contextualize slow queries with host contention.
- Any tool that schedules heavy workloads benefits from pre- and post-run stats.
Configuration Details
- Ensure the runtime has permissions for OS metrics and sensors.
- Choose a sampling cadence that balances fidelity and overhead.
- Normalize units and field names when exporting to external observability stacks.
- Keep timezone consistent across environments to simplify comparisons.
- Retain history according to compliance and troubleshooting needs.
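Normalizing units and field names before export might look like the following Python sketch. The raw field names (`mem_total_kb`, `disk_free_gb`, `cpu_pct`), the target names, and the `FIELD_MAP` table are all assumptions for illustration; substitute the names your deployment and observability stack actually use.

```python
# Hypothetical mapping: raw field -> (normalized field, multiplier to base unit).
FIELD_MAP = {
    "mem_total_kb": ("memory.total_bytes", 1024),
    "mem_used_kb": ("memory.used_bytes", 1024),
    "disk_free_gb": ("disk.free_bytes", 1024 ** 3),
    "cpu_pct": ("cpu.utilization_ratio", 0.01),
}


def normalize(raw):
    """Rename known fields and convert them to base units (bytes, 0-1 ratios)."""
    normalized = {}
    for key, value in raw.items():
        if key in FIELD_MAP:
            new_name, factor = FIELD_MAP[key]
            normalized[new_name] = value * factor
        else:
            normalized[key] = value  # pass unknown fields through unchanged
    return normalized


sample = {"mem_total_kb": 2048, "cpu_pct": 50, "host": "node-a"}
print(normalize(sample))
```

Keeping the mapping in one table makes it easy to review unit conversions in a single place when a new exporter is added.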
Limitations or Notes
- Metrics availability varies by OS, virtualization layer, and permissions.
- High-frequency polling can distort readings and consume resources.
- Some hardware sensors may be missing in cloud or containerized environments.
- Network counters may exclude virtual interfaces without elevated access.
- Real-time values fluctuate, so smooth readings or evaluate them over a short window before alerting.
- Large fleets require aggregation and sampling to stay efficient.
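The note on fluctuating values can be illustrated with a simple moving average, one common smoothing choice among several. This is a minimal sketch in Python; the class `MovingAverage`, the window size of 3, and the sample readings are assumptions for illustration.

```python
from collections import deque


class MovingAverage:
    """Average of the most recent `size` samples."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.window = deque(maxlen=size)  # old samples fall off automatically

    def add(self, value):
        """Add a sample and return the average over the current window."""
        self.window.append(value)
        return sum(self.window) / len(self.window)


# Hypothetical CPU readings with one short spike.
smoother = MovingAverage(size=3)
smoothed = [smoother.add(v) for v in [20.0, 22.0, 98.0, 21.0, 23.0]]
print([round(v, 1) for v in smoothed])
```

The single 98 percent spike is damped to under 50 percent in the smoothed series, so a threshold of 85 percent would not fire on it.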
Output
- Server DateTime: Current timestamp, timezone, and sync context.
- CPU Metrics: Utilization by core and load averages.
- Memory Metrics: Total, used, free, buffers, cache.
- Disk Metrics: Free space, read and write rates, and latency where available.
- Network Metrics: Per-interface throughput and error counts.
- Processes: Top consumers by CPU or memory with PIDs.
- Uptime: System uptime and last boot time.
- Health Summary: Optional composite indicators and threshold flags.
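A composite health summary with threshold flags could be derived along these lines. This Python sketch is not the tool's actual scoring: the function `health_summary`, the field names, the limits, and the three status levels are all assumptions chosen for illustration.

```python
def health_summary(metrics, limits):
    """Flag each metric that exceeds its limit and derive an overall status.

    Both arguments are dicts keyed by metric name; higher values are worse.
    Metrics without a reading are skipped rather than flagged.
    """
    flags = {
        name: metrics[name] > limit
        for name, limit in limits.items()
        if name in metrics
    }
    breached = sum(flags.values())
    if breached == 0:
        status = "ok"
    elif breached == 1:
        status = "warning"
    else:
        status = "critical"
    return {"status": status, "flags": flags}


# Hypothetical limits and readings, all expressed as "percent used".
limits = {"cpu_percent": 85.0, "mem_used_percent": 90.0, "disk_used_percent": 85.0}
metrics = {"cpu_percent": 91.0, "mem_used_percent": 70.0, "disk_used_percent": 60.0}
print(health_summary(metrics, limits))
```

Expressing every metric as "percent used" keeps the comparison direction uniform, so a free-space reading would need to be converted before it is passed in.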

