The Business Analyst Agent frames questions, aligns metric definitions, discovers data, produces validated queries & summaries, and turns findings into actionable tasks and artifacts.
Purpose
Translate business questions into precise metrics and reproducible analyses.
Scope
Requirements & metric definitions, data discovery, queries/validation, visuals & summaries, backlog of questions.
Design Goals
Grounded SQL/tables/docs, traceable assumptions, read-first safe defaults on production data.
Typical Use Cases
Typical Use Cases
- Define & validate KPI packs (e.g., PMPM, denial rate, readmission, ALOS)
- Build query prototypes for dashboards and ad-hoc decisions
- Reconcile metric discrepancies across teams/sources
- Prepare analysis briefs for stakeholder reviews
- Support model design with clarified business rules & testable definitions
Inputs & Prerequisites
Inputs & Prerequisites
Access
- Read access to connected sources (e.g.,
hcls_demo_1_sources; Snowflake if used) - Optional: Google Drive service account for Sheets/Docs
- Prior docs (requirements, mapping tables, ERDs, glossary)
- Stakeholders and decision timelines
- Default schema:
GENESIS.EVE_WORKSPACE - Default stage:
GENESIS.EVE_WORKSPACE.MY_STAGE
Keep a lightweight glossary handy—synonyms & allowed values speed up alignment and validation.
Core Workflows
Core Workflows
1
1) Frame the question
Clarify decision, timeframe, grain, filters, and acceptance criteria. Identify stakeholders and review cadence.
2
2) Discover data
Enumerate candidate tables/columns, keys, and constraints. Link to existing models/mappings and known lineage.
3
3) Define metrics
Specify name, formula, grain, dimensions, and filters. Record rationale, caveats, and test cases.
4
4) Generate analyses
Write SQL prototypes with safe
LIMITs. Produce aggregations & comparisons (time/cohort/segment). Add compact visuals.5
5) Validate & iterate
Reconcile results across sources/samples. Log open questions. Mark metrics as
proposed / accepted / rejected.6
6) Publish deliverables
Export query library, KPI dictionary, and results. Attach artifacts to the mission and notify stakeholders.
Every published KPI has acceptance criteria and a validation query.
Default Outputs
Default Outputs
- Analysis brief (objectives, assumptions, highlights, risks)
- KPI/metric dictionary with acceptance criteria
- SQL query library (organized by topic/metric)
- Result tables or CSV/XLSX/Sheets exports
- Lightweight visuals & narrative summary
- Backlog of tasks (open questions, data issues, follow-ups)
Tools & Permissions
Tools & Permissions
Commonly used
- project_manager_tools, data_connector_tools, snowflake_tools
- document_index_tools, file_manager_tools, artifact_manager_tools
- google_drive_tools, git_action, slack_tools, web_access_tools, delegate_work
- Read-first on production; workspace writes only when requested
- No external sharing without explicit approval
Grant the least privilege needed; review access during each milestone.
Safety & Operational Notes
Safety & Operational Notes
- Enforce
LIMITs and filters; avoid full scans/cross joins - Mask PHI/PII in outputs; exclude raw personal data by default
- Confirm before creating persistent tables; use the workspace schema
- Record assumptions and data quality caveats in the brief
Configuration
Configuration
- Naming: metric IDs, folder/tags for the query library
- Sampling defaults and cost guards
- Export modes: CSV/XLSX/Sheets; directory/stage locations
- Review cadence and Slack channels for updates
- Glossary/linking to shared business terms
Example Workflow — Healthcare Data Pipeline (Summary)
Example Workflow — Healthcare Data Pipeline (Summary)
Mission: Healthcare Data Pipeline (
Minimal result preview
proj_Eve_eCvaHE)Available source tables (hcls_demo_1_sources.main):- CLAIMS: claim header info incl. dates, amounts, status
- CLAIM_DETAILS: service-level details with procedure/diagnosis codes
- PATIENTS: patient demographics
- PROVIDERS: provider directory & specialty
- PMPM allowed cost by claim type (last 6 months)
- Inpatient 30-day readmission rate by diagnosis code (last 6 months)
- Grain/dims:
- PMPM → month, claim_type
- Readmission → discharge month, diagnosis_code
- Filters:
- PMPM →
CLAIM_STATUS = 'PAID' - Readmission →
CLAIM_TYPE = 'INPATIENT', non-null admission/discharge
- PMPM →
- Acceptance:
- PMPM within ±1% of finance rollup (if available)
- Readmission within ±0.5 pp of quality reference (if available)
- Eligibility not present → approx member-months by distinct
PATIENT_IDper month - Use
CLAIM_DETAILS.DIAGNOSIS_CODEas DRG proxy when DRG absent - Provider keys available for future drilldowns
| metric_id | name | formula | grain | dims | filters |
|---|---|---|---|---|---|
| pmpm_allowed | PMPM Allowed Cost | sum(INSURANCE_PAID + PATIENT_RESPONSIBILITY) / members | month | claim_type | CLAIM_STATUS = ‘PAID’ |
| readmit_30d_rate | 30d Readmission Rate (IP) | sum(readmit_30d) / count(*) | month | diagnosis_code | CLAIM_TYPE = ‘INPATIENT’ |
| month_start | claim_type | total_allowed | members | pmpm_allowed |
|---|---|---|---|---|
| 2025-03-01 | INPATIENT | 1,234,567.89 | 8,152 | 151.52 |
| 2025-03-01 | OUTPATIENT | 987,654.32 | 10,044 | 98.33 |

