Audit log: file-on-PVC sink doesn't reach central logging and isn't tamper-evident #35

Open
opened 2026-05-25 23:39:58 +00:00 by navicore · 0 comments

Summary

anz writes audit events (AuditEvent in src/audit.rs) to a local file
(audit.log, on the PVC in the homelab deployment). In a Kubernetes cluster
whose log pipeline scrapes container stdout (our Loki + otel-collector-logs
setup), this means the audit events never leave the node, while ordinary
operational stdout does. The file-based sink also provides no meaningful
tamper-resistance over stdout.

How it surfaced

Wiring up navidocs (docs.navicore.tech) machine-to-machine auth via the new
client_credentials grant: a failed attempt (wrong client secret) returned
401 but produced nothing in kubectl logs for anz — because the audit
event goes to the audit-log file, not stdout. (A tracing::warn! has since been
added to the client_credentials reject path, which does reach stdout/Loki —
that is currently the only audit-ish signal that leaves the host.)

The problem

  • File vs stdout is not an integrity boundary. Whoever can forge anz's
    stdout is the anz process (or node root), and that same principal can edit
    audit.log in place. Neither forces an attacker to cross a boundary twice.
  • In our cluster the file is the weaker choice. The log collector tails
    container stdout (/var/log/pods/...), not in-container PVC files, so
    audit.log is a single unreplicated copy sitting next to the audited system,
    while stdout would flow off-host to Loki.
  • A dedicated audit log is legitimately justified by separation of concerns,
    stable schema, independent retention, and compliance framing, but not by
    tamper-resistance, which is where the current design reads backwards.

Where audit integrity actually comes from (increasing assurance)

  1. Off-host, fast, into a store anz's own principal can't reach. Cheapest
    win: also emit each AuditEvent to stdout as a structured JSON line so Loki
    ingests it.
  2. Append-only / access-segregated sink — object storage with object-lock /
    WORM, or a Loki tenant the anz service account can't delete from.
  3. Tamper-evidence at the source — per-event sequence number + hash chain
    (prev_hash), optionally signed, so deletion/edits are detectable even if
    the sink is compromised. A plain appendable file can't offer this.
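
To make item 3 concrete, here is a minimal sketch of a seq + prev_hash chain. All names (ChainedEvent, fnv1a64, append, verify) are hypothetical and not taken from src/audit.rs. The hash is a non-cryptographic FNV-1a placeholder so the sketch needs only the standard library; a real chain would presumably use SHA-256 or similar, optionally with a signature.

```rust
// Sketch only: ChainedEvent, fnv1a64, append and verify are hypothetical names,
// not the real AuditEvent API in src/audit.rs.

/// Placeholder 64-bit FNV-1a hash. NOT cryptographic; a real chain would use
/// a cryptographic hash such as SHA-256.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

#[derive(Debug, Clone, PartialEq)]
struct ChainedEvent {
    seq: u64,
    prev_hash: u64,
    payload: String,
}

impl ChainedEvent {
    /// Hash over seq, prev_hash and payload; becomes the next event's prev_hash.
    fn hash(&self) -> u64 {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.seq.to_be_bytes());
        buf.extend_from_slice(&self.prev_hash.to_be_bytes());
        buf.extend_from_slice(self.payload.as_bytes());
        fnv1a64(&buf)
    }
}

/// Append an event, linking it to the current tail of the chain.
fn append(chain: &mut Vec<ChainedEvent>, payload: &str) {
    let (seq, prev_hash) = match chain.last() {
        Some(last) => (last.seq + 1, last.hash()),
        None => (0, 0),
    };
    chain.push(ChainedEvent { seq, prev_hash, payload: payload.to_string() });
}

/// Return the index of the first event that breaks the chain, if any.
fn verify(chain: &[ChainedEvent]) -> Option<usize> {
    let mut expected_seq = 0u64;
    let mut expected_prev = 0u64;
    for (i, ev) in chain.iter().enumerate() {
        if ev.seq != expected_seq || ev.prev_hash != expected_prev {
            return Some(i);
        }
        expected_seq = ev.seq + 1;
        expected_prev = ev.hash();
    }
    None
}

fn main() {
    let mut chain = Vec::new();
    append(&mut chain, "client_credentials rejected");
    append(&mut chain, "token issued");
    append(&mut chain, "token revoked");
    assert_eq!(verify(&chain), None);

    // Editing an event in place breaks the link held by its successor.
    chain[1].payload = "nothing happened".to_string();
    assert_eq!(verify(&chain), Some(2));
}
```

A chain like this detects edits and mid-stream deletions, but truncating events from the tail is only detectable if the latest seq and hash are also recorded somewhere the attacker cannot reach, which is why it complements rather than replaces items 1 and 2.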

Proposed (incremental)

  • Dual-emit AuditEvent to stdout as structured JSON (in addition to, or
    instead of, the file) so it reaches central logging. Smallest,
    highest-value change.
  • Document that the audit-log file is local/structured retention, not an
    integrity control.
  • (Later) Add seq + prev_hash to AuditEvent for tamper-evidence;
    consider signing.
  • (Later) Ship to an append-only / access-segregated sink.
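
The first item could look roughly like the sketch below. It assumes a stand-in struct (AuditRecord) with made-up fields, since the real AuditEvent fields are not shown here, and it hand-rolls JSON so it runs with the standard library alone; the real code would more likely serialize the existing type with whatever serializer the project already uses. The names escape_json, to_json_line and dual_emit are hypothetical.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for AuditEvent; the real fields in src/audit.rs may differ.
struct AuditRecord {
    event_type: String,
    client_id: String,
    outcome: String,
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one event as a single-line JSON object. The "kind" field is an
/// assumed convention for telling audit lines apart from operational logs.
fn to_json_line(ev: &AuditRecord) -> String {
    format!(
        "{{\"kind\":\"audit\",\"event_type\":\"{}\",\"client_id\":\"{}\",\"outcome\":\"{}\"}}",
        escape_json(&ev.event_type),
        escape_json(&ev.client_id),
        escape_json(&ev.outcome)
    )
}

/// Write the same line to every sink (for example the audit file and stdout).
fn dual_emit(ev: &AuditRecord, sinks: &mut [&mut dyn Write]) -> io::Result<()> {
    let line = to_json_line(ev);
    for sink in sinks.iter_mut() {
        writeln!(sink, "{}", line)?;
        sink.flush()?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let ev = AuditRecord {
        event_type: "client_credentials_rejected".to_string(),
        client_id: "navidocs".to_string(),
        outcome: "denied".to_string(),
    };
    let mut file_sink: Vec<u8> = Vec::new(); // stands in for audit.log
    let mut stdout = io::stdout();
    let mut sinks: [&mut dyn Write; 2] = [&mut file_sink, &mut stdout];
    dual_emit(&ev, &mut sinks)?;
    Ok(())
}
```

Emitting one object per line keeps the event intact when the collector tails container stdout line by line, and a marker field such as "kind" gives the pipeline something to filter or route on.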

References

  • src/audit.rs: AuditEvent, AuditLogger (file sink), LogEventParams
  • Context: navidocs client_credentials integration; docs/ROADMAP.md Known Gaps.