Retour aux articles

Dockdock-Go – Security Gatekeeper for Harbor Registries

A detailed technical retrospective of Dockdock-Go, a Master 1 student project implementing multi-layer security scanning and human approval gates for Harbor container registries

Level Sony
Security Kubernetes Harbor Rust DevSecOps Container Security
Dockdock-Go – Security Gatekeeper for Harbor Registries
Table des matières

This article documents a Master 1 student project completed in 2023–2024, written from the perspective of December 2024. It provides an honest technical retrospective including architecture decisions, implementation challenges, technical debt, and lessons learned.


Written in December 2024

Team: 4 students (1× Rust backend, 2× infra/Kubernetes, 1× frontend/docs (me) )

Effective Duration: ≈9 months shared with courses and other projects

Current Status: Functional Proof-of-Concept in local demo, not production-maintained

1. The Real Problem We Wanted to Solve

In most organizations using Harbor as a private registry:

  • Developers (or CI pipelines) push images directly
  • Built-in vulnerability scanning (Trivy or Clair) often triggers after the push
  • There’s generally no systematic static malware detection
  • Dynamic (runtime) analysis is virtually non-existent before admission
  • There’s no mandatory human checkpoint for critical images
  • Rejections aren’t centrally tracked in an exploitable way

Our objectives:

  1. Enforce explicit request + human approval before admission to sensitive projects
  2. Stack multiple analysis layers: vulnerabilities + static malware + runtime behavior
  3. Centralize results and decisions in the existing Harbor interface
  4. Keep the system light enough to remain realistic in a student context

2. Static Architecture

Architecture globale de Dockdock-Go

3. Detailed Workflow (user)

sequenceDiagram
    participant Dev
    participant UI as Harbor UI + Dockdock-Go
    participant API
    participant Harbor
    participant K8s
    participant Dyn as Dynamic Service
    participant DB
    participant Analyst

    Dev->>UI: Submit admission request<br>image:tag · source · target project
    UI->>+API: POST /api/v1/requests {…}

    API->>Harbor: Replicate artifact → temporary project
    Note over API,Harbor: Async operation → polling status

    API->>K8s: Create YaraHunter Job with image:tag
    K8s-->>API: Logs → PVC → API reads JSON/text

    API->>Harbor: Trigger scan → polling (5s) up to 90s max
    loop Polling scan results
        Harbor-->>API: Complete Trivy report
    end

    API->>Dyn: POST /analyze {image, tag, rules}
    Dyn-->>API: {runtime flags, logs?}

    API->>DB: INSERT request + analyses + flags<br>status = PENDING
    API-->>UI: 201 Created + refresh list

    loop Human decision (minutes to days)
        Analyst->>UI: List pending requests
        Analyst->>UI: Open details → aggregated flags + raw reports
        Analyst->>UI: PATCH /requests/{id} {status: APPROVED|REJECTED, comment?}
    end

    alt APPROVED
        API->>Harbor: Final replication temp → target project
        API->>DB: UPDATE status = ACCEPTED, approved_at = now()
    else REJECTED
        API->>DB: UPDATE status = REJECTED, rejected_at = now(), reason = …
        API->>Harbor: (future option) delete temp artifact
    end

    API-->>UI: Update visible to all

4. Notable Implementation Points

Mandatory Image Import into K3s

YaraHunter only scans what’s locally present → manual docker save | sudo k3s ctr images import - required. Attempted automation via sidecar crictl pull → abandoned (auth and permission issues).

Flag Normalization

Single flags table:

- type (enum)

  • vuln: detected vulnerability (CVE, dependency, config, etc.)
  • malware: malware / IOC detection
  • dynamic: dynamic analysis result (sandbox, behavior)
  • policy: rule violation (compliance, access, secrets, etc.)
  • other: generic category

Allowed values:

  • vuln | malware | dynamic | policy | other

- severity (low/medium/high/critical)

Standardized severity level:

  • 🟢 low
  • 🟠 medium
  • 🔴 high
  • critical

- details (JSONB)

Flexible and conventionally typed data according to type.

Examples of useful keys (case-dependent):

{
  "title": "string",
  "description": "string",
  "source": "string",
  "confidence": 0.0,
  "timestamp": "2026-02-24T12:34:56Z",
  "cve": "CVE-2026-12345",
  "cwe": "CWE-79",
  "package": "string",
  "version": "string",
  "fix_version": "string",
  "ioc": {
    "hash": "string",
    "domain": "string",
    "ip": "string",
    "url": "string"
  },
  "policy_id": "string",
  "rule": "string",
  "control": "string",
  "evidence": [
    {
      "type": "log|snippet|stacktrace|file|command|http",
      "summary": "string",
      "data": "string",
      "path": "string",
      "line_start": 0,
      "line_end": 0,
      "timestamp": "2026-02-24T12:34:56Z"
    }
  ],
  "recommendation": "string",
  "references": ["https://example.com/reference-1", "https://example.com/reference-2"]
}

- raw_path (string)

Optional path to the raw artifact (e.g., on PVC) if needed:

  • complete logs
  • sandbox dump
  • suspicious file
  • original scanner report

Example:

  • /pvc/scans/2026-02-24/job-123/report.json
  • /pvc/artifacts/sample-abc123.bin

Harbor Scan Polling

Tokio loop with interval(5s) + max counter → configurable timeout via environment variable.

YaraHunter Parsing

Patched binary to write structured JSON to /output/report.json → mounted via PVC (persistent volume claim) → read by API.

Image Status

Late refactor: moved status from images table to static_analyses for better normalization.

5. Structural Limitations & Technical Debt (December 2024 Status)

  • API completely open (no authentication) → critical if exposed
  • Entire workflow is synchronous in initial request → no queue
  • Dynamic analysis service never completed → mocked responses
  • Strong dependency on mounting ~/.kube/config → impossible multi-cluster without rework
  • No notifications (email / Slack / webhook) after decision
  • Frontend breaks as soon as Harbor upgrades beyond v2.9
  • Tests: only unit tests on domain (~80 cases) → zero integration / E2E tests
  • No monitoring (limited structured logs, zero Prometheus metrics)

6. What We Would Do Radically Differently Today

  1. Never patch the Harbor frontend → prefer webhooks + dedicated UI or plugin
  2. Use Kyverno or OPA Gatekeeper for native Kubernetes admission policies
  3. Integrate Cosign + Fulcio + Rekor (Sigstore) for signing & SLSA attestations from the start
  4. Orchestrate scans with Argo Workflows or Tekton Pipelines (visual + retry + parallelism)
  5. Add Prometheus + Loki / Grafana from day one
  6. Migrate to axum instead of Actix-web (better maintained and more ergonomic)
  7. Make malware scanner serverless (Keda / Knative) for scale-to-zero
  8. Provide a Kubernetes CRD for admission requests (operator pattern)

7. Conclusion – Real Project Value

The code produced is not industrializable as-is: it’s an ambitious POC with significant technical debt.

But the project was extremely formative:

  • Understanding Harbor’s concrete limitations in depth
  • Realizing how crucial async and queues are in supply-chain security workflows
  • Learning that the “human approval gate” often remains more powerful and flexible than 100% automatic blocking
  • Discovering the pitfalls of patching an open-source monolith (especially the frontend)
  • Working with Rust in realistic conditions, Kubernetes Jobs, reverse-engineering third-party APIs, parsing unstructured reports
  • Experiencing the difficulty of coordinating 4 people on a distributed project with academic deadlines

If we were to propose a serious follow-up today, we’d start with:

  • Harbor + webhooks
  • Kyverno / Gatekeeper for admission policies
  • Argo Workflows for scan orchestration
  • Lightweight dedicated UI (or integration in Backstage / Port)
  • Complete Sigstore for provenance

But for an end-of-Master-1 project, we’re quite proud of the result: a working demo that impresses juries and taught us enormously about container security and distributed systems architecture.


Links (unmaintained archives):

Technical questions welcome (DB schema, Rust extracts, YaraHunter patch, etc.).

Happy reading and successful security projects!

Commentaires