
The security pipeline

Every .deb package that enters Repod passes through a seven-step validation pipeline before it becomes available to apt install. This page explains the design rationale behind the pipeline, what each step actually checks, and what the failure modes mean: not how to configure it, but why it is structured the way it is.

Why a pipeline?

The traditional approach to private APT repositories is "upload and serve". A developer runs reprepro includedeb jammy mypackage.deb and the package is immediately available to every machine pointing at the repository. This works well for small teams with high trust. It is a significant supply chain risk for anything else.

Supply chain attacks against package repositories have become one of the most effective vectors for compromising infrastructure at scale. The attacker does not need to compromise the target systems directly; they compromise the package being deployed to those systems. A malicious dependency, a trojanized binary, a package with a known but unpatched CVE: all of these can be introduced through the upload path if that path has no validation.

The Repod pipeline enforces a mandatory validation contract: no package can bypass the checks. There is no "fast path", no admin override that skips steps, and no mechanism to promote a quarantined file directly to the APT tree without it passing through the pipeline. This is not an accident; it is documented as a feature in the section on bypassing below.

The pipeline is synchronous for individual uploads (the API call blocks until validation completes) and runs inline for each import from external sources. This means validation latency is directly visible to the caller, which creates natural pressure to keep the tools fast and the policies well-tuned.

Pipeline overview

flowchart TD
    Upload["📦 .deb received\n/repos/staging/incoming/"]

    Upload --> S1["Step 1 – Format validation\ndpkg-deb --info"]
    S1 -->|Fail| R1["❌ Rejected\nmoved to quarantine/"]
    S1 -->|Pass| S2["Step 2 – SHA-256 provenance\nvs Packages.gz index"]
    S2 -->|Fail| R2["❌ Rejected\n(tampered file)"]
    S2 -->|Pass| S3["Step 3 – Antivirus\nClamAV clamscan"]
    S3 -->|Virus found| R3["❌ Rejected + quarantined"]
    S3 -->|Clean| S4["Step 4 – CVE scan\nGrype + NVD + KEV"]
    S4 -->|block policy| R4["❌ Rejected\ncve_status: blocked"]
    S4 -->|review policy| PR["⏳ Pending review\nmoved to pool/\nstatus: pending_review"]
    S4 -->|warn / allow| S5["Step 5 – GPG signature\ngpg --verify"]
    S5 -->|Invalid sig| R5["❌ Rejected"]
    S5 -->|Valid / absent| S6["Step 6 – Dependency check\nagainst pool/"]
    S6 -->|Missing deps| W6["⚠️ Warning\n(non-blocking by default)"]
    S6 -->|All present| S7["Step 7 – EPSS + CISA KEV\nenrichment"]
    W6 --> S7
    S7 --> Indexed["✅ Indexed\nreprepro → dists/\nstatus: indexed"]

    style R1 fill:#fdd,stroke:#c00
    style R2 fill:#fdd,stroke:#c00
    style R3 fill:#fdd,stroke:#c00
    style R4 fill:#fdd,stroke:#c00
    style R5 fill:#fdd,stroke:#c00
    style PR fill:#ffd,stroke:#aa0
    style W6 fill:#fff3cd,stroke:#856404
    style Indexed fill:#d4edda,stroke:#155724

EPSS + KEV enrichment timing

EPSS scores and CISA KEV flags are fetched during the CVE scan step (Step 4) and used to inform CVE policy decisions. Step 7 in the diagram represents the enrichment result being embedded in the manifest and reflected in the review queue display; it is not a separate gate.

Step-by-step breakdown

Step 1 – Format validation

Tool: dpkg-deb --info

The file is first verified to be a structurally valid Debian package. dpkg-deb --info parses the package control archive, extracts the control file, and reports the package name, version, architecture, and declared dependencies. If the file is truncated, corrupted, or not a .deb at all, this command returns a non-zero exit code.

On failure: The file is moved to /repos/staging/quarantine/ immediately. No further steps run. The pipeline short-circuits on format failure because subsequent tools (ClamAV, Grype) expect valid .deb input.

Why it matters: It prevents malformed files from wasting scan time and, more importantly, from triggering unexpected behavior in the scanning tools. Tools like clamscan and grype have been known to behave unpredictably on malformed input. Rejecting early avoids that surface.
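As a rough illustration, the short-circuit behavior of this step could look like the following Python sketch. The validate_format and parse_control_fields helpers are hypothetical, not Repod's actual code; only the dpkg-deb --info invocation and its non-zero-exit-on-corruption behavior come from the text above.

```python
import subprocess

def parse_control_fields(info_output):
    """Parse 'Key: value' lines from dpkg-deb --info output (illustrative)."""
    fields = {}
    for line in info_output.splitlines():
        line = line.strip()
        if ":" in line:
            key, _, value = line.partition(":")
            if key and " " not in key:  # skip the prose header lines
                fields[key] = value.strip()
    return fields

def validate_format(deb_path):
    """Step 1 sketch: non-zero exit from dpkg-deb means 'not a valid .deb'."""
    proc = subprocess.run(
        ["dpkg-deb", "--info", deb_path],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        return False, {}  # short-circuit: caller moves the file to quarantine/
    return True, parse_control_fields(proc.stdout)
```

Returning the parsed control fields here is a design convenience: the package name, version, and Depends value are needed again in later steps, so the pipeline can carry them forward instead of re-parsing.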


Step 2 – SHA-256 provenance

Tool: Python hashlib.sha256 + comparison against the upstream Packages.gz index

When a package is imported from an upstream source (a security mirror, an Ubuntu archive), Repod records the expected SHA-256 from that source's Packages.gz file at import time. During validation, the file's actual SHA-256 is recomputed and compared against that reference.

For manually uploaded packages (where no upstream reference exists), this step passes with a note that provenance is unverifiable. The checksum is still computed and recorded in the manifest for later auditing.

On failure: Immediate rejection. A SHA-256 mismatch means the file as received does not match what the upstream source published. This is the primary defense against man-in-the-middle attacks on package transit.

Why it matters: TLS protects transit between your server and the upstream mirror, but it does not protect against a compromised mirror serving a different file at the same path. The SHA-256 from Packages.gz (which is itself signed by the upstream GPG key) provides an independent integrity signal.
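A minimal sketch of the comparison, using Python's hashlib as the text describes. The verify_sha256 helper and its return shape are illustrative assumptions; the chunked read simply avoids loading a large .deb into memory at once.

```python
import hashlib

def verify_sha256(path, expected_hex=None):
    """Step 2 sketch: recompute the file's SHA-256 and compare to the
    upstream reference from Packages.gz.

    expected_hex is None for manual uploads (no upstream reference): the
    checksum is still computed and recorded, and the step passes with a
    provenance note. Returns (passed, actual_hex, note)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    actual = h.hexdigest()
    if expected_hex is None:
        return True, actual, "provenance unverifiable (manual upload)"
    if actual != expected_hex.lower():
        return False, actual, "sha256 mismatch: reject"
    return True, actual, "sha256 verified against Packages.gz"
```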


Step 3 – Antivirus scan

Tool: clamscan with the ClamAV daily signature database at /var/lib/clamav/

The entire .deb file, including its embedded data archive, is scanned for known malware signatures. ClamAV's daily.cld database covers tens of thousands of malware families, backdoors, and exploit kits. The freshclam daemon (or a scheduled manual update) keeps this database current.

On failure (virus detected): clamscan returns exit code 1 with the threat name. The file is moved to quarantine and a FAILURE audit entry is written. The threat name is recorded in both the audit log and the manifest step result.

On tool error: If clamscan is unavailable or returns an unexpected exit code, the step is recorded as a warning but does not block the pipeline. This prevents an AV database update failure from halting all uploads.

Why it matters: Malware embedded in .deb packages is a known attack vector. The postinst maintainer script runs as root during installation; a trojanized package can execute arbitrary code on every machine that installs it. AV scanning catches known threats before they enter the distribution chain.
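The exit-code handling above can be sketched as follows. clamscan's documented convention is 0 = no virus found, 1 = virus(es) found, and other codes for errors; the classify_clamscan helper and the step-result dict shape are hypothetical, not Repod's actual structures.

```python
def classify_clamscan(returncode, stdout=""):
    """Sketch of Step 3's outcome handling.

    clamscan convention: 0 = clean, 1 = virus found, other = tool error.
    Tool errors become non-blocking warnings so that an AV database
    failure does not halt all uploads."""
    if returncode == 0:
        return {"passed": True, "blocking": False, "note": "clean"}
    if returncode == 1:
        # Infected lines look like "<path>: <ThreatName> FOUND"
        threat = "unknown"
        for line in stdout.splitlines():
            if line.endswith("FOUND") and ": " in line:
                threat = line.split(": ", 1)[1].removesuffix(" FOUND")
                break
        return {"passed": False, "blocking": True, "note": f"virus: {threat}"}
    return {"passed": True, "blocking": False,
            "note": f"clamscan error (exit {returncode}), recorded as warning"}
```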


Step 4 – CVE scan

Tool: grype with the NVD, GitHub Advisory Database, and CISA KEV sources

Grype analyzes the .deb's contents (the installed files, libraries, and their versions) against its vulnerability database. The --distro flag is passed with the target distribution codename (jammy, bookworm, etc.) to improve match accuracy for distribution-specific patching.

Grype outputs structured JSON with one entry per match: CVE ID, severity, CVSS score, affected component, fix state, and available fix versions. Repod processes this output and applies the CVE policy defined in settings.json.

CVE policy application: The policy maps each severity level to one of four actions. The policy is evaluated per-severity across all matches, and the most restrictive outcome wins:

  • block – pipeline fails, package goes to quarantine
  • review – pipeline does not fail, but the package enters pending_review status and is not promoted to APT until a human approves
  • warn – package proceeds, the warning is recorded in the manifest and the UI
  • allow – no action; the CVE is recorded but ignored by the policy engine

On block policy triggered: cve_status is set to blocked, result.passed becomes False, and the file is quarantined. The full CVE list is embedded in the rejection audit entry.

On review policy triggered: cve_status is set to pending_review. The pipeline considers this a pass (the file is not quarantined), but the package is stored with status: pending_review and is withheld from the APT index until a human decision is made. See CVE review workflow.

Why it matters: CVE scanning at upload time catches known vulnerabilities before they reach production. Combined with the EPSS and KEV enrichment (see Step 7), the policy engine can distinguish between a theoretical vulnerability and one that is actively being exploited.
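The "most restrictive outcome wins" rule can be sketched as a small Python function. The evaluate_cve_policy name, the match dict shape, and the ACTION_RANK table are illustrative; the policy dict stands in for the severity-to-action mapping from settings.json.

```python
# Restrictiveness order, per the action list above: block > review > warn > allow
ACTION_RANK = {"block": 3, "review": 2, "warn": 1, "allow": 0}

def evaluate_cve_policy(matches, policy):
    """Apply a severity -> action policy to a list of CVE matches.

    Each match only needs a "severity" key here; real Grype output also
    carries the CVE ID, CVSS score, fix state, and more. The single most
    restrictive action across all matches decides the pipeline outcome."""
    outcome = "allow"
    for match in matches:
        action = policy.get(match["severity"].lower(), "warn")
        if ACTION_RANK[action] > ACTION_RANK[outcome]:
            outcome = action
    return outcome
```

With the recommended defaults from the policy table below, a package containing one medium and one high CVE would resolve to review (pending_review), while any critical CVE resolves to block (quarantine).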


Step 5 – GPG signature verification

Tool: gpg --verify against the shared keyring at /repos/gnupg

If a .sig or .asc file is present alongside the uploaded .deb (same filename with a signature suffix), Repod verifies it against the keys in the GPG keyring. If no signature file is present, this step passes with a note that no signature was provided.

GPG verification is currently optional by design: not all packages in the wild carry a detached signature, and requiring one would make it impossible to upload packages from sources that do not sign their individual files (as opposed to signing their repository metadata). This may be made stricter through configuration in future versions.

On failure: A present but invalid signature is a hard failure. An absent signature is a soft pass with a note.

Why it matters: For packages imported from trusted internal sources, a GPG signature provides a chain of custody guarantee: the binary was produced by the expected key. An invalid signature on a file that claimed to be signed is a stronger signal than a missing signature on one that made no such claim.
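A sketch of the optional-signature logic, assuming the detached signature sits next to the .deb with a .sig or .asc suffix as the text describes. The verify_signature helper is hypothetical; only the gpg --verify call and the hard-fail/soft-pass split come from the step description.

```python
import subprocess
from pathlib import Path

def verify_signature(deb_path, keyring_home="/repos/gnupg"):
    """Step 5 sketch: absent signature -> soft pass with a note;
    present-but-invalid signature -> hard failure."""
    sig = None
    for suffix in (".sig", ".asc"):
        candidate = Path(deb_path + suffix)
        if candidate.exists():
            sig = candidate
            break
    if sig is None:
        return True, "no signature provided"
    proc = subprocess.run(
        ["gpg", "--homedir", keyring_home, "--verify", str(sig), deb_path],
        capture_output=True,
    )
    if proc.returncode != 0:
        return False, "invalid signature: reject"
    return True, "signature verified"
```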


Step 6 – Dependency resolution

Tool: dpkg-deb -f <file> Depends + filename search in /repos/pool/

The Depends: field from the package's control file is parsed into a structured list of package names (with optional version constraints). Each declared dependency is checked against the contents of pool/, specifically whether a .deb with a matching name exists.

On failure (missing dependencies): By default, this is a non-blocking warning. The step is recorded as failed, but because strict_deps=False the overall pipeline result remains passed: true. The missing dependencies are recorded in the manifest's deps_missing field, which is exposed in the UI.

If strict_deps=True is passed (not the default for uploads), missing dependencies become a hard failure.

Why it matters: A package that declares dependencies that are not in the local repository will fail to install with apt install. This check surfaces that problem before the package is ever deployed, giving the operator the opportunity to import the missing dependencies first. In air-gapped environments, this is particularly valuable because there is no fallback to an upstream mirror.
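A rough sketch of the dependency check, including Debian's alternative syntax (a | b). The parse_depends and missing_deps helpers are illustrative; they match by name only and discard version constraints, consistent with the step description above ("whether a .deb with a matching name exists").

```python
def parse_depends(field):
    """Split a Depends: field into groups of alternatives, dropping version
    constraints: "libc6 (>= 2.31), a | b" -> [["libc6"], ["a", "b"]]."""
    groups = []
    for clause in field.split(","):
        alts = [alt.split("(")[0].strip() for alt in clause.split("|")]
        alts = [a for a in alts if a]
        if alts:
            groups.append(alts)
    return groups

def missing_deps(field, pool_names, strict=False):
    """Step 6 sketch: a group is satisfied if ANY alternative is in pool/.
    Returns (missing_groups, blocking); blocking only when strict=True."""
    missing = [g for g in parse_depends(field)
               if not any(a in pool_names for a in g)]
    return missing, (strict and bool(missing))
```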


Step 7 – EPSS + CISA KEV enrichment

Sources: FIRST.org EPSS API (api.first.org/data/1.0/epss) and CISA KEV feed (cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json)

For every CVE found in Step 4, two additional signals are fetched:

  • EPSS score: A probability (0.0–1.0) representing the likelihood that this CVE will be exploited in the wild within the next 30 days. Published daily by FIRST.org.
  • CISA KEV flag: Whether this CVE appears in the CISA Known Exploited Vulnerabilities catalog, meaning it is already being actively exploited by threat actors.

Both sources are cached on disk with a 24-hour TTL in /repos/security/kev_cache.json and /repos/security/epss_cache.json. This allows Repod to function in air-gapped environments after an initial population, and prevents rate-limiting from external APIs during high-volume upload sessions.

On enrichment failure: Non-blocking. If the external APIs are unreachable and the cache is stale, enrichment is skipped and CVEs are recorded without EPSS/KEV data. This is explicitly a graceful degradation: enrichment is additional context, not a gate.

Why it matters: CVSS scores measure theoretical severity; EPSS measures actual exploitation likelihood. A CVE with CVSS 9.8 but EPSS 0.003 is technically severe but statistically unlikely to be targeted. A CVE with CVSS 7.2 but EPSS 0.94 represents a near-certain exploitation attempt somewhere in the world right now. The CISA KEV flag is even more concrete: it means the exploitation is confirmed. These signals allow the review queue to be prioritized meaningfully rather than generating uniform panic about all high-severity findings.
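The 24-hour TTL cache behavior can be sketched like this. load_enrichment_cache is a hypothetical helper, and using the cache file's mtime as the freshness signal is an assumption; the text does not say how Repod tracks staleness.

```python
import json
import time
from pathlib import Path

TTL_SECONDS = 24 * 3600  # 24-hour TTL, per the cache description above

def load_enrichment_cache(path, now=None):
    """Return cached EPSS/KEV data if the file exists and is fresh, else None.

    A None return means: refetch from the API, or skip enrichment entirely
    if the network is unavailable (graceful degradation, not a gate)."""
    p = Path(path)
    if not p.exists():
        return None
    now = time.time() if now is None else now
    if now - p.stat().st_mtime > TTL_SECONDS:
        return None  # stale
    return json.loads(p.read_text())
```

In an air-gapped deployment, the caller would simply never see a fresh fetch succeed after initial population, so the TTL check is the only thing standing between "recent enough" data and no enrichment at all.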

CVE policy configuration

The policy table below shows which actions are available at each severity level and what Repod recommends as a starting point.

| Severity | Available actions | Recommended default | Rationale |
|---|---|---|---|
| critical | block, review, warn, allow | block | Critical CVEs have a CVSS base score ≥ 9.0. Combined with CISA KEV prevalence at this level, automatic blocking is appropriate. |
| high | block, review, warn, allow | review | High CVEs (7.0–8.9) are significant but may have mitigating factors. Human review is appropriate. |
| medium | block, review, warn, allow | warn | Medium CVEs are common in production software. Automatic blocking creates too much friction; a visible warning is sufficient. |
| low | block, review, warn, allow | allow | Low CVEs have minimal practical impact and high false-positive rates at this severity. Recording them is valuable; acting on them automatically is not. |
| negligible | block, review, warn, allow | allow | Informational only. |

Policy and CISA KEV interaction

Even if a CVE's severity-level policy is warn or allow, the presence of a CISA KEV flag is surfaced prominently in the review queue and manifest. A KEV-flagged CVE at medium severity (or one with a high EPSS score) can be escalated manually regardless of the policy engine's automatic decision.

Package statuses

| Status | Meaning | Who can see it | Next action |
|---|---|---|---|
| indexed | Package passed all checks and is available via apt install | All authenticated users; APT clients | Normal use; monitoring for new CVEs |
| pending_review | Package passed format/AV/integrity checks but has CVEs requiring human approval | admin, maintainer, auditor in the UI; not visible to APT clients | CISO/admin reviews CVE details and approves or rejects |
| quarantined | Package was rejected by the pipeline or by a human decision | admin, maintainer, auditor in the UI only | Investigate the rejection reason; re-upload a fixed version |
| rejected | Explicit human rejection from the review queue (distinct from automatic quarantine) | admin, maintainer, auditor | File moved to staging/quarantine/; decision recorded permanently |

Performance

The pipeline execution time is dominated by the CVE scan (Grype) and the antivirus scan (ClamAV). Typical timings on a modest server:

| Step | Typical duration |
|---|---|
| Format validation | < 1 second |
| SHA-256 provenance | < 1 second |
| ClamAV scan | 3–15 seconds depending on package size |
| Grype CVE scan | 15–90 seconds depending on package complexity and DB cache state |
| GPG verification | < 1 second |
| Dependency check | < 2 seconds |
| EPSS + KEV enrichment | 1–5 seconds (network) or < 1 second (cache hit) |

Total: typically 20–120 seconds per package.

For large batch imports (the security sync job that fetches packages from upstream mirrors), validation runs sequentially per package within a single sync job. The scheduler is designed to run overnight (03:00 by default) specifically to avoid competing with interactive upload traffic. Individual upload requests are rate-limited to 20 per minute per client.

Bypassing the pipeline

There is no way to bypass the pipeline without modifying the source code.

There is no --skip-validation flag, no admin API endpoint that promotes a file directly, and no environment variable that disables steps. The pipeline is invoked unconditionally in run_validation_pipeline(), which is called by both the upload router and the import router before any file is moved to pool/.

This is a feature, not a limitation.

The value of the security pipeline is its unconditional nature. A bypass mechanism, even one restricted to the admin role, creates a policy exception path that can be abused, that must be audited separately, and that erodes the compliance story ("all packages are validated before serving"). The answer to "we need to deploy this urgently and the scanner is slow" is to tune the pipeline (configure an appropriate policy, pre-populate the Grype database) rather than to add a shortcut around it.

Emergency deployment

If a package must be deployed before the pipeline can complete normally β€” for example, a zero-day patch with a known EPSS-high CVE in a dependency β€” the correct path is: upload the package (it enters pending_review), then use the CISO review queue to approve it immediately with a documented justification. This preserves the audit trail and the separation of duties, while unblocking the deployment.