The DBT Building Agent automates project scaffolding, model generation (staging/intermediate/marts), tests, documentation, and safe execution, turning specs and source metadata into reproducible dbt assets.
Posture: read-first on production; write in dev/test or a workspace schema. External sharing only with explicit approval.

Overview

Purpose

Automate dbt scaffolding; generate staging/intermediate/marts models; add tests/docs; run builds and capture artifacts for review.

Scope

Project scaffolding, source registration, models (staging/base, intermediate joins, marts), tests at scale, docs/exposures, seeds & snapshots, compile/run/test/build.

Design

Deterministic, idempotent generation; safety-first defaults; clear lineage via ref()/source(); YAML-managed tests/docs.
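
To make "deterministic, idempotent generation" concrete, here is a minimal Python sketch of how a staging model could be rendered from column metadata so that the same input always produces byte-identical SQL. The function name `render_staging_model`, the source name `raw`, and the table `patients` are hypothetical illustrations, not part of the agent's documented interface. Sorting columns by name is an assumed design choice.

```python
from __future__ import annotations


def render_staging_model(source_name: str, table_name: str, columns: list[str]) -> str:
    """Render a staging model that selects from a dbt source().

    Hypothetical helper: the agent's real generator is not documented here.
    """
    # Lower-case, de-duplicate, and sort the columns so that metadata arriving
    # in a different order (or with repeats) still yields the same file text.
    ordered = sorted({c.lower() for c in columns})
    select_list = ",\n".join(f"    {c}" for c in ordered)
    # Quadrupled braces in an f-string emit the literal "{{" and "}}" of Jinja.
    return (
        "select\n"
        f"{select_list}\n"
        f"from {{{{ source('{source_name}', '{table_name}') }}}}\n"
    )


if __name__ == "__main__":
    print(render_staging_model("raw", "patients", ["patient_id", "name"]))
```

Because the output depends only on the inputs, regenerating the model on an unchanged source produces no diff in the repository, which is what makes reruns safe to review.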

Typical use cases

  • Bootstrap a new dbt project from connected warehouse metadata
  • Generate staging and marts from a mapping specification
  • Introduce tests and docs at scale to an existing project
  • Migrate SQL pipelines into modular dbt models with macros and contracts

Inputs and prerequisites

  • Data access: read access to source schemas; dev/test write access for models
  • Specifications: mapping tables/specs, naming conventions, test coverage targets
  • Repo access: branch to write code (feature branch by default)
  • Environment: dbt profile/adapter (Cloud or Core)

Core workflows

  1. Discover sources: enumerate schemas/tables/columns; infer keys and relationships from metadata.
  2. Scaffold project: create model folders; write packages.yml and dbt_project.yml.
  3. Generate models: staging per source table, intermediate joins, marts aligned to specs.
  4. Configure tests & docs: produce schema.yml with tests; add descriptions and exposures.
  5. Execute & validate: dbt compile/run/test; capture artifacts; open tasks for any issues.
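
As an illustration of the "infer keys and relationships from metadata" part of the discovery step, the sketch below applies one simple naming heuristic. It assumes that a table such as `patients` is keyed by `patient_id`, and that the same column name appearing in another table is a foreign-key candidate. The heuristic, the function name, and the sample tables are all assumptions; real inference would likely also consult declared constraints and profiling results.

```python
from __future__ import annotations


def infer_relationships(tables: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """Return (child_table, child_column, parent_table, parent_column) candidates.

    Hypothetical heuristic based only on column names.
    """
    candidates = []
    # Iterate in sorted order so the output is stable across runs.
    for parent, parent_cols in sorted(tables.items()):
        # Assumed convention: drop one trailing "s" and append "_id".
        singular = parent[:-1] if parent.endswith("s") else parent
        key = f"{singular}_id"
        if key not in parent_cols:
            continue
        for child, child_cols in sorted(tables.items()):
            if child != parent and key in child_cols:
                candidates.append((child, key, parent, key))
    return candidates


if __name__ == "__main__":
    metadata = {
        "patients": ["patient_id", "name"],
        "encounters": ["encounter_id", "patient_id", "admitted_at"],
    }
    for rel in infer_relationships(metadata):
        print(rel)
```

Each candidate maps naturally onto a `relationships` test, but it is only a suggestion and should be confirmed against the data before being enforced.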

Default outputs

  • Code tree: dbt_project.yml, packages.yml, models/, macros/, seeds/, snapshots/
  • Models: staging (stg_*), intermediate, marts (dims/facts)
  • Artifacts: manifest.json, run_results.json, and optionally catalog.json
  • Test and build summaries suitable for review
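
One way to turn run_results.json into a review-ready summary is sketched below in Python. It relies only on the `results` list and the per-result `unique_id` and `status` fields of that artifact. The function name and the abbreviated sample payload are hypothetical, and a real artifact carries many more fields.

```python
from __future__ import annotations

import json
from collections import Counter


def summarize_run_results(raw: str) -> dict:
    """Count results by status and list the nodes that need attention."""
    data = json.loads(raw)
    results = data.get("results", [])
    counts = Counter(r.get("status", "unknown") for r in results)
    # "error" and "fail" are treated here as the statuses worth a follow-up task.
    failing = sorted(
        r["unique_id"] for r in results if r.get("status") in {"error", "fail"}
    )
    return {
        "total": len(results),
        "by_status": dict(counts),
        "needs_attention": failing,
    }


if __name__ == "__main__":
    # Hand-written, abbreviated sample; not output from a real run.
    sample = json.dumps({
        "results": [
            {"unique_id": "model.demo.stg_patients", "status": "success"},
            {"unique_id": "test.demo.unique_stg_patients_patient_id", "status": "fail"},
        ]
    })
    print(summarize_run_results(sample))
```

In a dbt Core project the artifact is normally written under the `target/` directory, so the same function could be fed the contents of `target/run_results.json` after a run.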

Tools and permissions

  • Common: project_manager_tools, data_connector_tools, dbt (Core/Cloud), git_action, artifact_manager_tools, file_manager_tools
  • Optional: snowflake_tools, google_drive_tools, slack_tools, document_index_tools, delegate_work
  • Posture: read-first on production; writes in dev/test or workspace; external sharing only with approval

Safety and operational notes

  • Use YAML configs for materializations; avoid destructive changes without confirmation.
  • Enforce safe limits during profiling; avoid wide cross joins.
  • Mask sensitive data in logs/exports; record assumptions and caveats.
  • Confirm before creating persistent tables; prefer workspace/dev schemas.
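
The "confirm before writing" posture can be pictured as a small guard in front of the dbt command plan. The Python sketch below is illustrative only: the target names in `ALLOWED_WRITE_TARGETS`, the `confirmed` flag, and the function name are assumptions. The only dbt feature it relies on is the `--target` CLI option.

```python
from __future__ import annotations

# Assumed names for targets where writes are allowed without extra confirmation.
ALLOWED_WRITE_TARGETS = frozenset({"dev", "test", "workspace"})


def plan_dbt_commands(target: str, confirmed: bool = False) -> list[list[str]]:
    """Return the compile/run/test commands for a target, or refuse.

    Hypothetical guard: any target outside the allowed set needs explicit
    confirmation before a plan is produced at all.
    """
    if target not in ALLOWED_WRITE_TARGETS and not confirmed:
        raise PermissionError(
            f"target '{target}' is not a dev/test/workspace target; "
            "explicit confirmation is required"
        )
    # Commands are returned as argument lists so they can be passed to
    # subprocess.run without shell interpolation.
    return [["dbt", step, "--target", target] for step in ("compile", "run", "test")]


if __name__ == "__main__":
    for cmd in plan_dbt_commands("dev"):
        print(" ".join(cmd))
```

Returning a plan rather than executing it keeps the guard easy to test and lets a reviewer see exactly what would run.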

Configuration

  • Naming: stg_, dim_, fct_; snake_case
  • Materializations: view/table/incremental via YAML
  • Tests: mandatory (not_null, unique); optional (relationships, accepted_values)
  • Environments: adapter-specific configs per target
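
The naming and mandatory-test settings above could be enforced at generation time along these lines. This Python sketch checks the stg_/dim_/fct_ snake_case convention and emits a minimal schema.yml entry with not_null and unique on a key column. The function name, the regular expression, and the `stg_patients` example are assumptions. It writes the `tests:` key to match this document's wording, although recent dbt versions also accept `data_tests:`.

```python
from __future__ import annotations

import re

# Assumed reading of the convention: an approved prefix followed by snake_case.
NAME_PATTERN = re.compile(r"^(stg|dim|fct)_[a-z][a-z0-9_]*$")


def render_schema_yml(model_name: str, key_column: str) -> str:
    """Render a schema.yml entry with the mandatory tests on the key column."""
    if not NAME_PATTERN.match(model_name):
        raise ValueError(f"model name '{model_name}' violates the naming convention")
    lines = [
        "version: 2",
        "",
        "models:",
        f"  - name: {model_name}",
        "    columns:",
        f"      - name: {key_column}",
        "        tests:",
        "          - not_null",
        "          - unique",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_schema_yml("stg_patients", "patient_id"))
```

Optional tests such as relationships or accepted_values need extra arguments (a target model, a value list), so they are left out of this minimal sketch.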

Example workflow (Healthcare Data Pipeline)

This illustration uses the mission’s tables in hcls_demo_1_sources.main. Examples are SQLite-friendly; adapt functions for your adapter as needed.

ERD (agent-driven: from specs & metadata → reproducible dbt assets)