Overview
Tool Name
github_connector_tools
Purpose
The github_connector_tools enable direct integration between Genesis Data Agents and GitHub repositories. Manage repositories, files, branches, pull requests, and workflows programmatically, all within your data agent conversations. Well suited to automated documentation, code generation, data pipeline versioning, and collaborative data projects.
Functions Available
- git_action: Core Git operations including repository management, branching, committing, and history tracking.
- file: File operations for reading, writing, searching, and managing repository content.
- git_action (GitHub-specific actions): GitHub platform features including pull requests, remote repository management, and credential storage.
Key Features
Repository Management
Clone, create, list, and delete repositories with full version control integration.
Branch Operations
Create, switch, and merge branches, and manage parallel development workflows.
Pull Request Automation
Create, review, merge, and close pull requests programmatically for collaborative workflows.
File Management
Read, write, search, copy, move, and delete files with pattern matching and bulk operations.
Commit History
Access commit logs, fetch history, and track changes across repository timelines.
Secure Authentication
Store GitHub credentials securely with support for Personal Access Tokens (PAT).
Input Parameters for Each Function
git_action
Parameters
Name | Definition | Format |
---|---|---|
action | Git operation to perform. Values: commit, get_history, create_branch, switch_branch, get_status, clone_repo, list_repos, pull, push, create_pull_request, etc. | String (required) |
repo_id | Repository identifier. Required for most operations except list_repos and credential storage. | String |
commit_message | Message for commit action. | String |
branch_name | Name of branch for create/switch/pull/push operations. | String |
url | Git repository URL for cloning or remote operations. | String |
max_count | Maximum number of commits to return in history (for get_history). | Integer |
file
Parameters
Name | Definition | Format |
---|---|---|
action | File operation. Values: read, write, delete, list, find, search, copy, move, commit, etc. | String (required) |
repo_id | Repository identifier where files are located. | String (required) |
file_path | Relative path to file in repository (no leading /). | String |
content | Content to write to file (for write action). | String |
pattern | Glob pattern for find/search operations (e.g., *.py, **/*.md). | String |
source_patterns | List of glob patterns for bulk copy/move/delete operations. | Array of Strings |
target_path | Destination directory for copy/move operations. | String |
message | Commit message (for commit action). | String |
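The parameters in the two tables above map onto keyword arguments. The following is a minimal sketch, assuming the tools can be called in the Python-style form this page uses elsewhere; the git_action and file functions here are local stand-ins that only record their arguments, and the repository name, file path, and commit message are illustrative.

```python
from functools import partial

calls = []  # every simulated tool call is recorded here


def git_action(**params):
    """Stand-in for the real git_action tool; it only records its arguments."""
    calls.append(("git_action", params))


def file(**params):
    """Stand-in for the real file tool; it only records its arguments."""
    calls.append(("file", params))


# Bind repo_id once so both tools operate on the same repository.
REPO_ID = "data-pipeline-prod"  # illustrative repository name
repo_git = partial(git_action, repo_id=REPO_ID)
repo_file = partial(file, repo_id=REPO_ID)

# file_path is relative, with forward slashes and no leading slash.
repo_file(action="write", file_path="docs/readme.md", content="# Pipeline\n")
repo_git(action="commit", commit_message="docs: add readme")
repo_git(action="get_history", max_count=5)
```

Binding repo_id once is only a convenience; passing it explicitly on every call works the same way.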
Use repo_id consistently across git_action and file operations to work within the same repository context. List available repos with git_action(action="list_repos").
Use Cases
- Automated Documentation: Generate and commit data dictionaries, analysis reports, or pipeline documentation directly to GitHub repositories.
- Code Generation & Versioning: Create dbt models, SQL scripts, or Python notebooks and version them in GitHub for team collaboration.
- Data Pipeline Deployment: Clone configuration repositories, update parameters, commit changes, and create pull requests for review.
- Collaborative Data Projects: Share analysis notebooks, datasets, and findings with team members through structured GitHub workflows.
- Backup & Disaster Recovery: Automatically commit critical configurations, metadata, and artifacts to GitHub for version-controlled backups.
Workflow/How It Works
- Step 1: Authenticate with GitHub. Store GitHub credentials securely using a Personal Access Token.
- Step 2: Create or Clone Repository. Start with a new repository or clone an existing one.
- Step 3: Manage Files. Create, read, and organize files within the repository.
- Step 4: Commit Changes. Track changes with descriptive commit messages.
- Step 5: Create Pull Request. Collaborate with your team through pull requests.
- Step 6: Push to GitHub. Sync local changes to the remote repository.
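The six steps above can be sketched as a sequence of tool calls. This is an illustration under assumptions, not the tool's confirmed interface: git_action and file are local stand-ins that record their arguments, the repository name, URL, and branch name are placeholders, and only parameters listed in the parameter tables are used.

```python
steps = []  # recorded (tool, params) pairs, in call order


def git_action(**params):
    """Stand-in for the real git_action tool."""
    steps.append(("git_action", params))


def file(**params):
    """Stand-in for the real file tool."""
    steps.append(("file", params))


REPO = "sales-analytics"              # placeholder repository name
BRANCH = "feature/monthly-report"     # placeholder feature branch

# Step 1: store credentials. The parameter that carries the token is not
# documented on this page, so it is deliberately left out.
git_action(action="store_github_credentials")

# Step 2: clone an existing repository (placeholder URL).
git_action(action="clone_repo", repo_id=REPO,
           url="https://github.com/example-org/sales-analytics.git")

# Step 3: work on a feature branch and write a file.
git_action(action="create_branch", repo_id=REPO, branch_name=BRANCH)
file(action="write", repo_id=REPO, file_path="reports/monthly.md",
     content="# Monthly report\n")

# Step 4: commit with a descriptive message.
git_action(action="commit", repo_id=REPO,
           commit_message="docs: add monthly report")

# Step 5: open a pull request for the branch.
git_action(action="create_pull_request", repo_id=REPO, branch_name=BRANCH)

# Step 6: push local changes to the remote.
git_action(action="push", repo_id=REPO, branch_name=BRANCH)
```

The call order mirrors the steps as listed. If the pull request has to reference a branch that already exists on GitHub, the push would need to come before the pull request.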
Integration Relevance
- project_manager_tools to track data projects and link GitHub repositories to missions.
- data_connector_tools to export analysis results and commit them to version control.
- dbt_action to manage dbt projects in GitHub with full version control.
- file_manager_tools to organize artifacts before committing to repositories.
- slack_tools to notify teams when pull requests are created or commits are pushed.
Configuration Details
- Personal Access Token: Generate from GitHub Settings → Developer settings → Personal access tokens. Required scopes: repo, workflow.
- Repository Naming: Use lowercase with hyphens (e.g., data-pipeline-prod) for consistency.
- Branch Strategy: Use feature branches (feature/, bugfix/) for changes; protect the main branch.
- Commit Messages: Follow the conventional commits format: feat:, fix:, docs:, refactor:.
- File Paths: Always use forward slashes (/) and relative paths without a leading slash.
- Large Files: GitHub has a 100 MB file size limit; use Git LFS for larger artifacts or store them in data platforms.
Never commit sensitive credentials, API keys, or passwords to GitHub repositories. Use environment variables or secret management systems instead.
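The naming, commit-message, and file-path conventions in this section can be checked before any tool call is made. The helpers below are a minimal sketch with hypothetical names; they are not part of github_connector_tools, and the commit check covers only the four prefixes listed above (it does not handle scoped forms such as feat(api):).

```python
import re

# Lowercase letters and digits, separated by single hyphens.
REPO_NAME = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
COMMIT_PREFIXES = ("feat:", "fix:", "docs:", "refactor:")


def is_valid_repo_name(name: str) -> bool:
    """True if the name is lowercase with hyphens, e.g. data-pipeline-prod."""
    return REPO_NAME.fullmatch(name) is not None


def is_conventional_commit(message: str) -> bool:
    """True if the message starts with one of the prefixes listed above."""
    return message.startswith(COMMIT_PREFIXES)


def normalize_file_path(path: str) -> str:
    """Convert backslashes to forward slashes and drop any leading slash."""
    return path.replace("\\", "/").lstrip("/")


print(is_valid_repo_name("data-pipeline-prod"))
print(is_conventional_commit("docs: add data dictionary"))
print(normalize_file_path("/docs\\readme.md"))
```

Running such checks up front keeps malformed names and paths from reaching the repository in the first place.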
Limitations or Notes
- File Size Limits: GitHub limits individual files to 100 MB and recommends keeping repositories under 1 GB.
- API Rate Limits: GitHub API has rate limits (5,000 requests/hour for authenticated users); batch operations when possible.
- Private Repository Access: Requires appropriate permissions and PAT scopes for private repos.
- Binary Files: Git is optimized for text files; large binary files can slow repository performance.
- Merge Conflicts: Automatic merging may fail with conflicts; manual resolution required.
- Branch Protection: Protected branches may prevent direct pushes; use pull requests instead.
- Network Dependency: All operations require internet connectivity to GitHub servers.
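One way to respect the file size limit listed above is to check sizes locally before writing or committing. The sketch below uses a hypothetical helper name and assumes the 100 MB limit is counted as 100 × 1024 × 1024 bytes; adjust the constant if your account or plan differs.

```python
import os

# Assumption: the 100 MB limit is treated as 100 MiB.
MAX_FILE_BYTES = 100 * 1024 * 1024


def find_oversized(paths, limit=MAX_FILE_BYTES):
    """Return (path, size_in_bytes) pairs for files larger than `limit`."""
    oversized = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit:
            oversized.append((path, size))
    return oversized
```

Files returned by find_oversized are candidates for Git LFS or for storage in a data platform rather than the repository.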
Supported Actions
✅ clone_repo - Clone repository from GitHub URL
✅ create_repo - Create new local repository
✅ list_repos - Show all available repositories
✅ commit - Commit staged changes
✅ push - Push commits to remote
✅ pull - Pull changes from remote
✅ create_branch - Create new branch
✅ switch_branch - Switch to different branch
✅ get_status - Show repository status
✅ get_history - View commit history
✅ create_pull_request - Create PR on GitHub
✅ merge_pull_request - Merge PR
✅ store_github_credentials - Save authentication
✅ list_remote_repos - List repos on GitHub
✅ file operations - Full CRUD on repository files
Not Supported
❌ Git submodules or subtrees
❌ Git LFS (Large File Storage) direct management
❌ GitHub Actions workflow execution
❌ GitHub Issues or Projects management
❌ Repository webhooks configuration
❌ GitHub Pages deployment
❌ Repository transfer or ownership changes
❌ Advanced merge strategies (squash, rebase from tool)
Output
- clone_repo: Confirmation with repository path and clone details.
- commit: Commit hash, files changed, and commit message confirmation.
- push: Push status, branch name, and remote URL.
- get_history: List of commits with hash, author, date, and message.
- get_status: Current branch, staged/unstaged files, and sync status.
- create_pull_request: PR number, URL, and creation confirmation.
- file operations: Success confirmation, file paths, and content previews.
- list_repos: Table of repositories with names, paths, and status.
- Errors: Detailed error messages with resolution guidance (e.g., authentication failures, merge conflicts).
Best Practices
Commit Often
Make small, focused commits with clear messages. They are easier to review and revert, and they keep the history understandable.
Branch Strategy
Use feature branches for development. Keep main/master stable and deployable.
Pull Before Push
Always pull latest changes before pushing to avoid conflicts and ensure smooth integration.
Meaningful Messages
Write descriptive commit messages explaining what and why, not just what changed.
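The "Pull Before Push" practice above can be wrapped in a small helper so the two calls always happen in that order. This is a sketch with a hypothetical helper name (sync_branch); git_action is a local stand-in that records its arguments, and the repository and branch names are illustrative.

```python
calls = []  # recorded parameter dicts, in call order


def git_action(**params):
    """Stand-in for the real git_action tool."""
    calls.append(params)


def sync_branch(repo_id, branch_name):
    """Pull the latest remote changes, then push local commits."""
    git_action(action="pull", repo_id=repo_id, branch_name=branch_name)
    git_action(action="push", repo_id=repo_id, branch_name=branch_name)


sync_branch("data-pipeline-prod", "feature/add-data-dictionary")
```

With the real tool, it would be sensible to inspect get_status between the two calls, since a pull can surface merge conflicts that need manual resolution.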
Example: Complete Data Project Workflow
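As one possible illustration of such a workflow, the sketch below generates a Markdown data dictionary and versions it on a feature branch. Everything project-specific is an assumption: the repository, branch, table, and column names are invented, render_data_dictionary is a hypothetical helper, and git_action and file are local stand-ins that only record their arguments.

```python
calls = []  # recorded (tool, params) pairs, in call order


def git_action(**params):
    """Stand-in for the real git_action tool."""
    calls.append(("git_action", params))


def file(**params):
    """Stand-in for the real file tool."""
    calls.append(("file", params))


def render_data_dictionary(table, columns):
    """Render (name, type, description) rows as a Markdown table."""
    lines = [
        f"# Data dictionary: {table}",
        "",
        "| Column | Type | Description |",
        "|---|---|---|",
    ]
    lines += [f"| {name} | {dtype} | {desc} |" for name, dtype, desc in columns]
    return "\n".join(lines) + "\n"


REPO = "analytics-docs"                 # illustrative repository
BRANCH = "feature/orders-dictionary"    # illustrative feature branch
columns = [
    ("order_id", "INTEGER", "Primary key"),
    ("amount", "DECIMAL", "Order total"),
]

git_action(action="create_branch", repo_id=REPO, branch_name=BRANCH)
# Switch explicitly; this page does not say whether create_branch switches.
git_action(action="switch_branch", repo_id=REPO, branch_name=BRANCH)
file(action="write", repo_id=REPO, file_path="docs/orders.md",
     content=render_data_dictionary("orders", columns))
git_action(action="commit", repo_id=REPO,
           commit_message="docs: add orders data dictionary")
git_action(action="push", repo_id=REPO, branch_name=BRANCH)
```

A pull request for the branch could follow with create_pull_request once the team is ready to review.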
Advanced Features
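The three operations covered in this section (bulk copy, content search, and deeper history) might look like the calls below. This is a sketch under assumptions: git_action and file are local stand-ins that record their arguments, the repository name and glob patterns are illustrative, and only parameters from the parameter tables are used.

```python
calls = []  # recorded (tool, params) pairs, in call order


def git_action(**params):
    """Stand-in for the real git_action tool."""
    calls.append(("git_action", params))


def file(**params):
    """Stand-in for the real file tool."""
    calls.append(("file", params))


REPO = "data-pipeline-prod"  # illustrative repository name

# Bulk copy: all SQL and YAML files under models/ into archive/.
file(action="copy", repo_id=REPO,
     source_patterns=["models/**/*.sql", "models/**/*.yml"],
     target_path="archive")

# Search: limit the search to Markdown files. The parameter that carries
# the text to search for is not listed on this page, so it is omitted.
file(action="search", repo_id=REPO, pattern="**/*.md")

# Deep history: request more commits than a short default listing.
git_action(action="get_history", repo_id=REPO, max_count=200)
```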
Bulk File Operations
Copy multiple files matching patterns.
Search Across Files
Find specific content in repository.
Fetch Deep History
Get more commit history for analysis.
Troubleshooting
Authentication Failed
- Verify Personal Access Token is valid and not expired
- Check token has required scopes: repo, workflow
- Re-store credentials with store_github_credentials
Push Rejected
- Pull latest changes first: git_action(action="pull")
- Check for merge conflicts in get_status
- Verify you have write permissions to repository
File Not Found
- Verify file_path uses forward slashes (/)
- Check path is relative without leading slash
- Use file(action="list") to see available files
Branch Already Exists
- Switch to existing branch instead of creating
- Use get_branch to check current branch
- Delete old branch or use different name