drzero

Codex CLI skill bundle for Dr. Zero autonomous repository improvement. Sixteen workflow skills dispatch to domain specialist agents for analysis, autonomous improvement, swarm coordination, execution, configuration, and status monitoring.

v0.1.0
Codex

By Thomas Hudak

Plugin Structure

Agents: 19
Skills: 16
Commands: 0
Hooks: 0
Rules: 0

Installation

Install from a release zip

Dr. Zero skills are bundled with the OTC Awesome LLM Codex marketplace plugin. Download otc-awesome-llm-codex-plugin-<version>.zip from the matching GitHub Release, then register the unzipped marketplace root:

unzip otc-awesome-llm-codex-plugin-8.31.0.zip -d /tmp/otc-awesome-llm-codex-plugin
codex plugin marketplace add /tmp/otc-awesome-llm-codex-plugin

See the full Codex Getting Started guide for marketplace setup details, available MCP tools, and install targets.

[Figure: Dr. Zero plugin overview showing the autonomous repository improvement workflow.] Dr. Zero uses a two-phase architecture: dual-scoring curriculum learning followed by multi-agent swarm execution.
[Figure: Dr. Zero help and usage commands in the terminal.] Use the Dr. Zero skills to analyze, improve, configure, and monitor your repository.

Documentation

Dr. Zero for Codex — Autonomous Repository Improvement

Status: Experimental Alpha (Internal Use Only)

Dr. Zero for Codex brings the autonomous repository improvement workflow to the Codex CLI as a native skill bundle. It shares the same two-phase architecture (GRPO/HRPO dual-scoring from arXiv:2601.07055) as the Claude Code plugin but adapts it for the Codex agent framework with TOML-based custom agents and skill-template entry points.

Dr. Zero Codex Skills

The bundle includes 16 workflow skills, each a user-facing entry point that dispatches to Codex custom agents:

| Skill | Description | Example |
| --- | --- | --- |
| drzero-help | Show setup steps, modes, and examples. | @drzero help |
| drzero | Default autonomous improvement loop. | @drzero /drzero |
| drzero-autonomous | Explicit autonomous propose/solve loop. | @drzero drzero-autonomous "Improve test coverage" |
| drzero-swarm | Coordinated multi-agent build work. | @drzero drzero-swarm "Build a Pokemon API in FastAPI" |
| drzero-morty | Small direct tasks with minimal ceremony. | @drzero drzero-morty "Update the README" |
| drzero-pickle | Smallest viable CI or fix change. | @drzero drzero-pickle "Make CI pass with the smallest diff" |
| drzero-council | Architecture or design debate before implementation. | @drzero drzero-council "Should this be a plugin skill or script?" |
| drzero-cronenberg | Parallel implementation variants for comparison. | @drzero drzero-cronenberg "Try three caching approaches" |
| drzero-portal-gun | Cross-repo coordination. | @drzero drzero-portal-gun "Update auth across these services" |
| drzero-citadel | Governed release-sensitive work with quality gates. | @drzero drzero-citadel "Prepare payment changes for release" |
| drzero-unity | Peer-to-peer parallel cleanup. | @drzero drzero-unity "Fix lint errors across the repo" |
| drzero-analysis | Read-only architecture, quality, or risk analysis. | @drzero drzero-analysis "Review test coverage gaps" |
| drzero-execution | Execute a specific WorkItem through domain routing. | @drzero drzero-execution "Implement WorkItem DZ-12" |
| drzero-status | Show current Dr. Zero session status. | @drzero drzero-status |
| drzero-config | Initialize, show, validate, and edit shared YAML config. | @drzero drzero-config validate |
| drzero-ping | Health check plugin config, scripts, and runtime. | @drzero drzero-ping |

Agent Architecture

Dr. Zero for Codex uses 19 custom agents defined as TOML files in codex/agents/:

| Agent | Role |
| --- | --- |
| dr0-architecture | Architecture review and design decisions |
| dr0-backend | Backend service implementation |
| dr0-compliance | Regulatory and policy compliance |
| dr0-database | Database schema and query optimization |
| dr0-devops | CI/CD pipelines and tooling |
| dr0-documentation | Documentation generation and maintenance |
| dr0-frontend | Frontend implementation and UX |
| dr0-gitops | Git workflow and branch management |
| dr0-implementation | General-purpose implementation |
| dr0-infrastructure | Cloud infrastructure and IaC |
| dr0-monitoring | Observability, alerting, and dashboards |
| dr0-networking | Network configuration and connectivity |
| dr0-orchestrator | Coordinates swarm execution and parallel dispatch |
| dr0-performance | Performance profiling and optimization |
| dr0-proposer | Generates WorkItems in Phase 1 (scored by HRPO) |
| dr0-secrets | Secrets management and rotation |
| dr0-security | Three-headed review: Security, Quality, Savage |
| dr0-solver | Attempts WorkItems in Phase 2 (scored by GRPO) |
| dr0-testing | Test strategy, generation, and coverage |

Each agent is a Codex custom agent (codex/agents/dr0-{domain}.toml) dispatched by the orchestrator or solver based on WorkItem domain routing.
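
The file-naming convention above can be sketched as a small lookup. This is a minimal illustration and not the bundle's actual router: the helper name `agent_file_for` and the fallback to the general-purpose implementation agent are assumptions. Only the `codex/agents/dr0-{domain}.toml` pattern and the agent names come from this page.

```python
from pathlib import PurePosixPath

# Agent names from the table above (the "dr0-" prefix is added below).
KNOWN_DOMAINS = {
    "architecture", "backend", "compliance", "database", "devops",
    "documentation", "frontend", "gitops", "implementation",
    "infrastructure", "monitoring", "networking", "orchestrator",
    "performance", "proposer", "secrets", "security", "solver", "testing",
}


def agent_file_for(domain: str, fallback: str = "implementation") -> PurePosixPath:
    """Map a WorkItem domain to an agent definition file (hypothetical helper)."""
    name = domain.strip().lower()
    if name not in KNOWN_DOMAINS:
        # Assumption: unknown domains fall back to the general-purpose agent.
        name = fallback
    return PurePosixPath("codex/agents") / f"dr0-{name}.toml"


print(agent_file_for("Database"))  # codex/agents/dr0-database.toml
print(agent_file_for("billing"))   # codex/agents/dr0-implementation.toml
```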

Two-Phase Architecture

Phase 1: Dual-Scoring Curriculum Learning (arXiv:2601.07055)

  1. Proposer analyzes CI failures, lint errors, and test gaps to generate WorkItems
  2. Solver attempts each WorkItem, producing code patches and running acceptance tests
  3. HRPO scores the proposer: format compliance + difficulty calibration
  4. GRPO scores the solver: binary acceptance-test success
  5. Difficulty auto-adjusts to target a 50% success rate
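
The scoring loop above might look roughly like the sketch below. The binary solver score and the 50% target come from the steps listed here. Everything else is an illustrative assumption, not the formulas used by the dr0 package or the paper: the function names, the equal weighting of the two HRPO terms, the linear calibration term, and the proportional difficulty update.

```python
from __future__ import annotations


def grpo_score(acceptance_test_passed: bool) -> float:
    """Binary solver score: 1.0 if the acceptance test passed, else 0.0."""
    return 1.0 if acceptance_test_passed else 0.0


def hrpo_score(format_ok: bool, solver_success_rate: float, target: float = 0.5) -> float:
    """Proposer score = format term + calibration term (assumed shape)."""
    format_term = 1.0 if format_ok else 0.0
    # Calibration is highest when the solver's success rate sits on the target.
    worst_gap = max(target, 1.0 - target)
    calibration_term = 1.0 - abs(solver_success_rate - target) / worst_gap
    return format_term + calibration_term


def adjust_difficulty(difficulty: float, outcomes: list[bool],
                      target: float = 0.5, step: float = 0.2) -> float:
    """Nudge difficulty toward the target success rate (assumed proportional rule)."""
    if not outcomes:
        return difficulty
    success_rate = sum(grpo_score(o) for o in outcomes) / len(outcomes)
    # Too many successes -> harder tasks; too many failures -> easier tasks.
    new_difficulty = difficulty + step * (success_rate - target)
    return min(1.0, max(0.0, new_difficulty))


# Three WorkItems all solved: difficulty rises for the next iteration.
print(round(adjust_difficulty(0.5, [True, True, True]), 3))
```

With `target_success_rate: 0.5`, a proposer whose tasks are always solved (or never solved) earns no calibration credit in this sketch, which pushes it toward tasks the solver can do about half the time.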

Phase 2: Agent Swarm Execution

  • Orchestrator coordinates domain specialist agents via Codex subagent dispatch
  • Domain agents execute domain-specific tasks (up to 6 concurrent)
  • The security agent reviews changes through three lenses: security, quality, and savage
  • Output: Production-ready changes with tests passing
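
The concurrency cap can be illustrated with a bounded thread pool. This is a sketch only: `run_domain_agent` is a hypothetical stand-in for a real Codex subagent dispatch, and using Python threads is an assumption made for illustration. The limit of 6 mirrors `agents.max_threads`.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 6  # mirrors agents.max_threads (default: 6)


def run_domain_agent(domain: str, task: str) -> dict:
    """Hypothetical placeholder for dispatching one task to a domain agent."""
    return {"domain": domain, "task": task, "status": "done"}


def dispatch(work: list[tuple[str, str]], max_threads: int = MAX_THREADS) -> list[dict]:
    """Run (domain, task) pairs with at most max_threads in flight at once."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(run_domain_agent, d, t) for d, t in work]
        # Collect results in submission order, not completion order.
        return [f.result() for f in futures]


results = dispatch([
    ("backend", "Add pagination to the list endpoint"),
    ("testing", "Cover the pagination edge cases"),
    ("documentation", "Document the new query parameters"),
])
print([r["domain"] for r in results])
```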

Differences from Claude Code Plugin

| Feature | Claude Code | Codex |
| --- | --- | --- |
| Asset type | Plugin (.claude-plugin/) | Skill bundle (codex/skills-templates/) |
| Agent format | Markdown (.md) | TOML (.toml) |
| Entry points | Commands (/drzero:) | Commands (@drzero /drzero) |
| Agent dispatch | Task tool with SDK precedence | Codex subagent system |
| Configuration | drzero.yml in project root | drzero.yml in project root |
| Installation | Claude Code Plugin Install | Codex Marketplace Plugin Install |
| Parallelism | agents.max_threads (default: 6) | agents.max_threads (default: 6) |

Configuration

Settings stored in drzero.yml in the project root:

# DrZero + Agent Swarm Configuration
version: "1.0"

dr_zero:
  max_iterations: 3
  tasks_per_iteration: 3
  target_success_rate: 0.5

  proposer:
    type: heuristic
    persona: healthcare

  solver:
    backend: codex
    temperature: 0.7

  production_features:
    enable_checkpoints: true
    enable_rbac: true
    enable_audit: true
    enable_telemetry: true
    enable_cat_defense: true
    enable_rate_limiting: true

  terminal:
    all_tests_pass: true
    lint_clean: true
    coverage_threshold: 80
    max_diff_lines: 500

agent_swarm:
  domains:
    - orchestration
    - architecture
    - implementation
    - testing
    - security
    - documentation
    - monitoring
    - secrets
    - gitops
    - devops
    - infrastructure
    - database
    - frontend
    - backend
    - compliance
    - performance
    - networking

  orchestrator:
    agent: orchestration
    quality_reviewer: security
    definition_of_done:
      - tests_pass
      - lint_clean
      - docs_updated
      - security_cleared
      - pr_ready

  turns:
    default: 1
    max: 5

  prompts:
    next_step: |
      Based on the current state of the codebase and the work completed in the previous turn:
      1. Review the changes made
      2. Identify remaining work
      3. Propose the next actionable step

      Focus on incremental progress toward the definition of done.

    quality_check: |
      Review the implementation for:
      - Test coverage and passing status
      - Code quality and lint cleanliness
      - Security vulnerabilities
      - Documentation completeness

      Provide specific, actionable feedback.

    done_criteria: |
      Evaluate if the implementation is PR ready to merge:
      - All tests pass
      - Lint is clean
      - Security cleared
      - Documentation updated
      - No obvious regressions

github:
  enabled: false
  issue_labels:
    - dr-zero
    - autonomous
  auto_create_pr: false

observability:
  audit_log_dir: logs/
  metrics_dir: logs/metrics/
  log_level: INFO
  export_formats:
    - prometheus
    - json
    - csv
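
One way to read the `terminal` block in the configuration above is as a set of stop conditions checked after each iteration. The sketch below assumes that reading. `RunState` and `unmet_terminal_conditions` are hypothetical names, and the comparison directions (coverage at or above the threshold, diff size at or below the maximum) are assumptions rather than documented behavior.

```python
from __future__ import annotations

from dataclasses import dataclass

# Values copied from the dr_zero.terminal block above.
TERMINAL = {
    "all_tests_pass": True,
    "lint_clean": True,
    "coverage_threshold": 80,
    "max_diff_lines": 500,
}


@dataclass
class RunState:
    """Hypothetical snapshot of a run after one iteration."""
    tests_pass: bool
    lint_clean: bool
    coverage_percent: float
    diff_lines: int


def unmet_terminal_conditions(state: RunState, terminal: dict = TERMINAL) -> list[str]:
    """Return the names of terminal conditions the run does not yet satisfy."""
    unmet = []
    if terminal.get("all_tests_pass") and not state.tests_pass:
        unmet.append("all_tests_pass")
    if terminal.get("lint_clean") and not state.lint_clean:
        unmet.append("lint_clean")
    if state.coverage_percent < terminal.get("coverage_threshold", 0):
        unmet.append("coverage_threshold")
    if state.diff_lines > terminal.get("max_diff_lines", float("inf")):
        unmet.append("max_diff_lines")
    return unmet


print(unmet_terminal_conditions(RunState(True, False, 72.5, 640)))
```

An empty list would mean every configured condition holds and the loop could stop before reaching `max_iterations`.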

Security Features

  • Scope boundary validation: Prevents path traversal and CAT attacks
  • Command whitelisting: Validates acceptance_test commands against a safe tool list
  • Anti-hallucination enforcement: All GRPO/HRPO scores are computed only by the dr0 Python package
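
The first two checks can be approximated as follows. This is a minimal sketch under stated assumptions, not the plugin's implementation: the `SAFE_TOOLS` allow-list, the rejected shell metacharacters, and both function names are hypothetical.

```python
import shlex
from pathlib import Path

SAFE_TOOLS = {"pytest", "ruff", "npm", "make"}  # hypothetical allow-list
FORBIDDEN_CHARS = set(";|&`$<>\n")              # assumed: block command chaining


def is_within_scope(repo_root: str, candidate: str) -> bool:
    """True if candidate resolves to a path inside repo_root (no traversal)."""
    root = Path(repo_root).resolve()
    # Joining with an absolute candidate discards root, so it is rejected below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def is_allowed_command(command: str, safe_tools=SAFE_TOOLS) -> bool:
    """True if the command starts with an allow-listed tool and has no chaining."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    return bool(tokens) and tokens[0] in safe_tools


print(is_within_scope("/tmp/repo", "src/app.py"))
print(is_within_scope("/tmp/repo", "../secrets.txt"))
print(is_allowed_command("pytest -q tests/"))
print(is_allowed_command("pytest -q; rm -rf /"))
```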

License

Internal Use Only — Optum Tech Compute