CI runners don't have kopia binaries; the command's dependency-check
exited before reaching the streaming branch, so both stream regression
tests failed there even though the code path under test was correct.
Mock DependencyHelper.exists() (and the DR manager) so the test
focuses on the streaming-Console-print branch the v7.6.3 fix targets.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rich's Console.print() does not accept an `err=` kwarg; every
`disaster-recovery export --stream` invocation blew up with TypeError
before any ZIP byte hit stdout. Fix: bind a separate
`Console(stderr=True)` for the 3 streaming-path messages.
Reproduced over SSH against the testlab. Regression covered by 2 new
tests in test_disaster_recovery_commands.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0038 — finishes the Plan 0029 story for the direct SFTP wizard,
which still shipped the broken `--path=user@host:path` shape and an
invalid `--sftp-port` flag until now. Verified locally: Kopia rejects
`--sftp-port` with `unknown long flag`; the bug never triggered because
port 22 (default) skipped the branch.
Single source of truth: new `helpers/backend_helper.py::build_sftp_kopia_params()`
+ `ensure_known_hosts()`. SFTP wizard, Tailscale wizard, and
`rebuild_kopia_params()` repair hook all feed through it.
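A minimal sketch of what a shared builder of this kind might look like. The signature and argument names are assumptions for illustration, not the real `build_sftp_kopia_params()`; only the five flags named elsewhere in this log are emitted, and port handling is deliberately left out.

```python
import shlex


def build_sftp_kopia_params(host, username, remote_path, keyfile, known_hosts):
    """Hypothetical builder: one flag per value, never HOST:PATH inside --path."""
    flags = {
        "--path": remote_path,
        "--host": host,
        "--username": username,
        "--keyfile": keyfile,
        "--known-hosts": known_hosts,
    }
    # The leading token names the backend; every value is shell-quoted.
    parts = ["sftp"] + [f"{flag}={shlex.quote(str(value))}" for flag, value in flags.items()]
    return " ".join(parts)


params = build_sftp_kopia_params(
    host="nas.example",
    username="backup",
    remote_path="/srv/kopia",
    keyfile="/root/.ssh/id_ed25519",
    known_hosts="/root/.ssh/known_hosts",
)
```

Routing every wizard and the repair hook through one function like this is what keeps the flag shape from drifting between call sites.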
`advanced config repair-kopia-params` is now backend-dispatched —
`BackendBase.rebuild_kopia_params(credentials)` is the entrypoint, each
backend overrides if it has a credentials-based rebuild path. New
`MissingCredentialsError` lets each backend declare its required keys
without the command hardcoding them. No `if backend == "sftp"` branches
left in the command.
SFTP wizard now also persists a `[credentials]` block so future configs
are repairable through the same flow as Tailscale.
Doctor sanity regex extended from `--path=…` to `--path[=\s]…` — the
v7.0.0–v7.6.0 direct-SFTP wizard wrote the space form, which the
Plan 0029 regex did not catch.
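To illustrate why the character class matters, here is a small sketch. The `(\S+:\S+)` tail is an assumed stand-in for the elided part of the real pattern, and the sample strings are made up.

```python
import re

# Assumed tails; the real doctor regex is elided in the message above.
OLD = re.compile(r"--path=(\S+:\S+)")      # equals form only
NEW = re.compile(r"--path[=\s](\S+:\S+)")  # equals form or space form

equals_form = "sftp --path=backup@nas.example:/srv/kopia"
space_form = "sftp --path backup@nas.example:/srv/kopia"
canonical = "sftp --path=/srv/kopia --host=nas.example"

old_hits = [bool(OLD.search(s)) for s in (equals_form, space_form, canonical)]
new_hits = [bool(NEW.search(s)) for s in (equals_form, space_form, canonical)]
```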
1244 tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI surfaced two issues my local lint missed:
1. After the sudo_helper migration, `os` is no longer used anywhere in
`disaster_recovery_manager.py` (was only there for `os.environ.get`
and `os.chown` — both now go through the helper). ruff F401 caught
it. Dropped.
2. `test_dr_export.py::test_export_sets_ownership` patched
`pwd.getpwnam` and `os.chown` directly — testing the OS-level
implementation, not the new helper-mediated contract. Re-anchored
the patches at the helper boundary
(`kopi_docka.helpers.sudo_helper.os.chown`) and set SUDO_UID/SUDO_GID
env-vars instead of mocking pwd. Same intent, new seam.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0037. The SUDO_USER / SUDO_UID / SUDO_GID handling was duplicated
at 11 sites across 4 files with three different levels of validation:
some checked SUDO_USER against a shell-injection regex, others didn't.
New module kopi_docka/helpers/sudo_helper.py exposes one typed API:
@dataclass SudoUserInfo: name, uid, gid, home, invoked_with_sudo
def get_sudo_user_info() -> SudoUserInfo
def chown_to_sudo_user(path) -> None
def find_in_sudo_user_home(relative) -> Optional[Path]
def sudo_user_home_path(relative) -> Optional[Path]
All 11 sites migrated:
- cores/disaster_recovery_manager.py (2 sites)
- cores/restore_manager.py (2 sites)
- helpers/file_operations.py (1 site)
- backends/rclone.py (4 sites)
Side benefit: the four previously-unvalidated SUDO_USER reads now go
through the same shell-injection validation as the other sites, at
zero cost for legitimate usernames. backends/rclone.py no longer needs
`import os` either.
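A rough sketch of the validation idea, not the actual module: the username pattern, the injectable `env` parameter, and the omission of the `home` field are all assumptions made to keep the example short and testable.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed allow-list for usernames; the real regex may differ.
_SAFE_USERNAME = re.compile(r"[a-z_][a-z0-9_-]*")


@dataclass(frozen=True)
class SudoUserInfo:
    name: Optional[str]
    uid: Optional[int]
    gid: Optional[int]
    invoked_with_sudo: bool


def get_sudo_user_info(env: Optional[Mapping[str, str]] = None) -> SudoUserInfo:
    """Read SUDO_* once, validate once, and hand back a typed result."""
    env = os.environ if env is None else env
    name = env.get("SUDO_USER")
    if not name or not _SAFE_USERNAME.fullmatch(name):
        return SudoUserInfo(None, None, None, False)
    try:
        uid, gid = int(env["SUDO_UID"]), int(env["SUDO_GID"])
    except (KeyError, ValueError):
        # Missing or garbage UID/GID is treated like "not under sudo".
        return SudoUserInfo(None, None, None, False)
    return SudoUserInfo(name, uid, gid, True)
```

Because every caller goes through one function, a site cannot skip the username check by accident.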
+18 unit tests in tests/unit/test_helpers/test_sudo_helper.py covering
under-sudo / no-sudo / shell-injection / path-traversal / garbage-UID /
chown success+failure / find/path-build variants. All 1234 prior tests
unchanged (behavioral parity).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five remaining naive datetime.now().isoformat() call sites turned up
in a read-only codebase scan after v7.5.2 fixed the snapshot-tag
timestamps. All emit ISO strings that round-trip through
datetime.fromisoformat(); without a timezone they parse as naive and
break comparisons with tz-aware values downstream (same class of bug
that crashed the restore wizard in v7.5.2).
* cores/backup_manager.py:101 — BackupMetadata.timestamp goes into
/backup/kopi-docka/metadata/*.json
* cores/backup_volume_handler.py:125, 207 — TAR-mode snapshot tags
(legacy backup path)
* cores/disaster_recovery_manager.py:697 — created_at in DR-bundle
recovery-info.json
* cores/disaster_recovery_manager.py:1069 — timestamp in DR-bundle
backup-status.json
All five now emit datetime.now(timezone.utc).isoformat().
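The bug class can be reproduced with the standard library alone. This is only a demonstration of the round-trip behavior, not code from the repository.

```python
from datetime import datetime, timezone

# Round-trip both forms through ISO strings, as the metadata files do.
naive = datetime.fromisoformat(datetime.now().isoformat())
aware = datetime.fromisoformat(datetime.now(timezone.utc).isoformat())

try:
    naive < aware
    comparable = True
except TypeError:
    # "can't compare offset-naive and offset-aware datetimes"
    comparable = False
```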
Pre-existing user-facing local-time strftime() calls used in filenames
(config-backup-*, restore-rollback tarballs, DR-bundle output
filenames) stay naive on purpose — they represent local wall-clock
time, not machine-parseable timestamps.
Two new regression tests cover the DR-bundle path; the existing v7.5.2
snapshot-timestamp tests cover the backup_manager paths. No user-
visible behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Editing kopia_params in /etc/kopi-docka.json from one backend to
another (e.g. rclone -> sftp) followed by `advanced repo init`
silently kept Kopia talking to the old backend: initialize() only
checked is_connected() and treated any active connection as
"we're done", regardless of which backend it pointed to. The user
saw "Repository initialized successfully" while their new SFTP
target stayed empty.
Add _current_storage_type() that reads storage.type from Kopia's
connect-config and _expected_storage_type() that takes the first
token of kopia_params. In initialize(), compare both before the
is_connected() shortcut; on mismatch, log a warning and call
disconnect() so the subsequent create/connect actually retargets
to the new backend.
The check is intentionally limited to storage type — a path/host
swap inside the same backend still hits the shortcut and needs a
manual disconnect. That keeps the change minimal and focused on
the observed regression.
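The comparison could be sketched roughly as follows. Function names are simplified, and the sketch assumes Kopia's reported storage type equals the first token of kopia_params, which may not hold for every backend.

```python
import shlex
from typing import Optional


def expected_storage_type(kopia_params: str) -> Optional[str]:
    """First token of kopia_params, e.g. 'sftp' or 'rclone'."""
    tokens = shlex.split(kopia_params)
    return tokens[0] if tokens else None


def needs_disconnect(current_type: Optional[str], kopia_params: str) -> bool:
    """True only when both types are known and they differ."""
    expected = expected_storage_type(kopia_params)
    return current_type is not None and expected is not None and current_type != expected
```

When either side is unknown the sketch stays on the existing shortcut, which mirrors the intent of keeping the change minimal.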
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The restore wizard crashed with "can't compare offset-naive and
offset-aware datetimes" whenever the snapshot list contained one
tag-timestamp from pre-v7.5.2 backup_manager (naive, written via
datetime.now().isoformat()) and one fallback timestamp from the
restore_manager exception/default branch (aware, datetime.now(
timezone.utc)). sort() then crashed before any restore points
were shown.
Introduce _parse_snapshot_timestamp() in restore_manager that
always returns a tz-aware UTC datetime — naive legacy tags are
treated as UTC, parse errors fall back to now(utc). Use it from
both _find_restore_points() and _find_restore_points_for_machine().
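A sketch of a parser with the behavior described above (the real `_parse_snapshot_timestamp()` may differ in details such as logging):

```python
from datetime import datetime, timezone


def parse_snapshot_timestamp(raw) -> datetime:
    """Always return a tz-aware UTC datetime."""
    try:
        parsed = datetime.fromisoformat(raw)
    except (TypeError, ValueError):
        # Unparseable tag: fall back to "now" so sorting still works.
        return datetime.now(timezone.utc)
    if parsed.tzinfo is None:
        # Legacy naive tag: interpret as UTC.
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


points = sorted(
    parse_snapshot_timestamp(raw)
    for raw in ("2025-01-02T03:04:05", "2025-01-02T03:04:05+02:00")
)
```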
In backup_manager, write new snapshot-tag timestamps and the
networks metadata backup_timestamp as datetime.now(timezone.utc).
isoformat() so future tags round-trip cleanly without ever needing
the legacy-as-UTC assumption.
Existing snapshots remain restorable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related issues in the disaster-recovery script generator:
* The legacy generic `else` branch printed "Unsupported auto-connect
for this repository scheme" and `exit 1` — even though the
file-restore steps that ran before it had succeeded. Every SFTP /
Tailscale recovery looked like a failure to the user.
* recover.sh had no real SFTP branch at all.
Now there's an explicit `elif repo_type == "sftp":` block that builds
a non-interactive `kopia repository connect sftp --path=… --host=…
--username=… --keyfile=… --known-hosts=…` from the connection info in
recovery-info.json. The block also guards on $KEYFILE being readable
before connecting — missing key prints a "install with mode 600 and
re-run" warning and exits 0 (since the file-restore portion already
succeeded). Unknown backends now also exit 0 with a manual-connect
hint instead of failing hard.
Also extends `_extract_repo_from_status` for SFTP to capture
port/username/keyfile/knownHostsFile (not just host/path) so the
generated connect call is complete. Adds module-level `sha256_file()`
helper used by the SHA256 fingerprints in instructions / doctor.
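For reference, a helper of this kind is typically a chunked read over `hashlib`. The chunk size and signature here are assumptions, not the actual `sha256_file()`.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Hex SHA-256 of a file, read in chunks so large files stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```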
Plan 0030 / Phases 1 + 2. +11 unit tests covering the new branches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section 5.1 used to print a generated `sudo sed -i 's#…#…#' /etc/...`
line. Replaced with a one-line pointer to the new
`advanced config repair-kopia-params` subcommand — easier to read, no
shell-escaping foot-guns, no copy/paste mistakes.
The `_build_sftp_migration_command` helper is removed entirely; the
two unit tests that exercised it are replaced by a single
TestBackendSanityHint case asserting doctor surfaces the new command
(and no longer emits `sudo sed -i`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A one-command path to the v7.4.0 migration that v7.4.0 itself only
handed out as a sed snippet. Reads the still-correct [credentials]
section, rebuilds kopia_params in Kopia-SFTP's canonical
--path / --host / --username / --keyfile / --known-hosts shape, and
writes it back through the existing atomic Config.save().
Behavior:
- Shows an old-vs-new diff and prompts for confirmation.
- `--dry-run` to preview, `--yes` to skip the prompt.
- Idempotent: a second run says "already in the canonical shape".
- Refuses non-SFTP backends and configs missing remote_path / peer
FQDN / ssh_key — those need the full wizard, not a parameter
rebuild.
Follow-up to Plan 0029. The Kopia repository itself is untouched —
only the local config string changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New Section 5.1 "Backend Sanity" in `kopi-docka doctor`. Targets the
three legacy-wizard fingerprints from v7.0.0 – v7.3.13:
- --path=HOST:PATH (host embedded in path)
- missing --username
- missing --keyfile / --sftp-password
When any are present, doctor prints a copy/paste-ready `sed` command
populated with the user's actual peer FQDN, ssh user, key path, etc.
from [credentials] — so the fix is one command, not a documentation
trail. Section is silent on healthy configs and on non-SFTP backends.
Plan 0029 / Phase 3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two correctness fixes to the Tailscale wizard, both surfaced during a
live rclone+GDrive → Tailscale-SFTP migration.
* setup_interactive() now builds kopia_params as Kopia's SFTP CLI
actually expects — separate --path / --host / --username / --keyfile
/ --known-hosts flags. The pre-v7.4 form shipped --path=HOST:PATH and
omitted --username and --keyfile entirely; Kopia accepted the form
at `repository connect` but every subsequent snapshot hung
indefinitely. A new _ensure_known_hosts() helper runs ssh-keyscan up
front so the very first systemd/cron-driven connect doesn't stall on
a host-key prompt.
* _mirror_key_to_persistent_path() classifies the remote's SSH layout
via inode comparison (`stat -c '%d:%i' /root/.ssh /boot/config/ssh/root`)
before writing. On Unraid 6.12+ those two paths share an inode
(symlink/bind-mount) and the legacy mirror — `touch
/boot/config/ssh/root` — actually treated the persistent directory
as a file, which on real systems failed and would have clobbered
/root, /known_hosts etc. if it ever succeeded. Cases handled:
- unraid-modern-symlinked → no-op (already persistent)
- unraid-modern-separate → write to /boot/config/ssh/root/authorized_keys
- unraid-legacy → write to /boot/config/ssh/root (file)
- standard-linux → no-op
- unknown → log + skip
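The inode comparison at the heart of the classification can be illustrated locally. The wizard runs `stat` on the remote over SSH; this sketch is only a local analogue using `os.stat`, and the helper name is hypothetical.

```python
import os


def same_inode(path_a: str, path_b: str) -> bool:
    """True if both paths resolve to the same (device, inode) pair."""
    try:
        a, b = os.stat(path_a), os.stat(path_b)
    except OSError:
        # A missing or unreadable path cannot be classified as "same".
        return False
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

A symlinked or bind-mounted pair compares equal here, which is the signal that writing to the "persistent" path would really be writing to the live directory.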
Plan 0029 / Phase 1 + Phase 4. 13 new test cases cover both paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ensure_repository() used to print "rclone cold-start can take 60-120 s
on Google Drive" in the connect spinner for every backend, which is
plainly misleading for Tailscale/SFTP runs that connect in well under
a second. The hint is now gated on kopia_params starting with `rclone`.
Plan 0029 / Phase 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live `--log-level debug` output on a real rclone+GDrive system showed
that `kopi-docka backup --dry-run` paid 98 s for a cold
`kopia repository status` round-trip just to render a simulation
report, plus another 4 s on a `force_refresh=True` preflight that
bypassed a cache populated 4 s earlier. And those 98-102 s of black
terminal had no spinner — looked indistinguishable from a hang.
Fixes:
- Dry-run skips ensure_repository() entirely. A --dry-run is a pure
simulation; consulting kopia repository status served no purpose
beyond paying the rclone cold-start tax. Testlab: 98 s → 0.4 s.
- Preflight in _run_backup() drops force_refresh=True. The cache TTL
is 60 s and the call sequence is tight, so a cached read is
exactly what we want — still verifies connectivity within this
backup run, but doesn't pay double on slow backends.
- Rich spinner around the is_connected() and connect() calls in
ensure_repository() with an explicit "rclone cold-start can take
60-120 s on Google Drive" hint. The wait is physically unavoidable
(kopia spawns rclone-serve, OAuth refresh, first GDrive API hit)
but the user now sees that something is happening.
The dry-run regression is pinned: test_backup_dry_run now sets
mock_repo.is_connected.side_effect = AssertionError(...) so any
future code that re-introduces the dry-run status call fails loudly.
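The pinning technique can be shown in isolation. `run_backup` below is a hypothetical stand-in for the command under test, not the real code path.

```python
from unittest.mock import Mock


def run_backup(repo, dry_run: bool) -> str:
    """Hypothetical stand-in: dry-run must never touch the repository."""
    if dry_run:
        return "simulated"
    if not repo.is_connected():
        repo.connect()
    return "backed up"


mock_repo = Mock()
# Any call to is_connected() now raises, so a regression fails loudly.
mock_repo.is_connected.side_effect = AssertionError("dry-run must not query the repository")

result = run_backup(mock_repo, dry_run=True)
```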
1132 unit tests passing. Live verified on the testlab: dry-run in
0.4 s, real backup unchanged in shape (~3 min for one nginx unit on
rclone), no force_refresh spam in --log-level debug output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0028 (v7.3.0) made the global Kopia policy "idempotent on every
connect". True at the Kopia layer — `kopia policy set --global` with
identical values is a no-op for Kopia itself — but NOT at the network
layer. On rclone backends `policy set` is a full repository metadata
round-trip every time. A live system reported 296 SECONDS for a single
retention re-write where every value already matched.
apply_global_defaults() (called on every connect()) and
update_global_retention() (called by `advanced snapshot retention set`)
now both:
1. Call `kopia policy show --global --json` first — that's a read,
~1-2 s.
2. Compare every retention/compression value against what we'd
write.
3. Skip the multi-minute write entirely if everything already
matches.
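The three steps above can be sketched with injected reader and writer callables. The function name, the return convention, and the key names in the usage lines are assumptions for illustration.

```python
from typing import Any, Callable, Mapping, Optional


def sync_policy(
    desired: Mapping[str, Any],
    read_policy: Callable[[], Optional[Mapping[str, Any]]],
    write_policy: Callable[[Mapping[str, Any]], None],
) -> bool:
    """Return True if the expensive write was issued, False if it was skipped."""
    current = read_policy()  # cheap read
    if current is not None and all(current.get(k) == v for k, v in desired.items()):
        return False  # everything matches: skip the slow round-trip
    write_policy(desired)  # drift, or the read failed: write
    return True


writes = []
desired = {"keepLatest": 5, "keepDaily": 7}
skipped = sync_policy(desired, lambda: {"keepLatest": 5, "keepDaily": 7}, writes.append)
drifted = sync_policy(desired, lambda: {"keepLatest": 10, "keepDaily": 7}, writes.append)
unreadable = sync_policy(desired, lambda: None, writes.append)
```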
Live measured on the rclone+GDrive testlab:
Before (v7.3.8): retention set with already-matching values → 286 s
After (v7.3.9): retention set with already-matching values → 4 s
The long write still happens — and only happens — when retention or
compression actually drift between kopi-docka.json and the Kopia repo.
That's the one case where the round-trip is worth it.
Tests: 1132 unit tests passing. TestUpdateGlobalRetention rewritten
with read+write side_effect arrays; 5 new cases pin the contract
(skip when matches, write when drifted, write when show fails,
write-failure still returns False, single-value drift triggers write).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A real data-loss footgun surfaced on a live system. Sequence:
1. User has a working /etc/kopi-docka.json (root-owned 0600, typical
multi-user install with password_file pointing at another root-only
file).
2. User runs e.g. `kopi-docka advanced snapshot retention set --latest 5`
without sudo by accident.
3. _find_config_file() walks the default search order, hits
/etc/kopi-docka.json, sees it exists, os.access(..., R_OK) returns
False, logs a warning, and FALLS THROUGH.
4. The fall-through branch creates a brand-new
~/.config/kopi-docka/config.json from config_template.json — with
the default password "kopia-docka" and the default repo path
"filesystem --path /backup/kopia-repository".
5. From there on kopi-docka finds the user-scoped config first (search
order is user → root) and never touches the /etc one again. Backups
silently run against a nonexistent repo, the password is wrong, DR
bundles export against the wrong config, and the user has two
drifted configs without knowing.
Fix: _find_config_file() collects unreadable existing paths across the
search pass and raises PermissionError if any were found, instead of
silently creating a second config. The error message lists the
unreadable path(s) and — when one is under /etc/ — explicitly suggests
running with sudo. The only way to get a second config is now to ask
for it explicitly via --config.
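A simplified sketch of the "raise instead of silently falling through" idea. The function signature, the injectable `is_readable` check, and the message wording are assumptions; the real `_find_config_file()` may order things differently.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_config_file(
    candidates: Iterable[str],
    is_readable: Callable[[Path], bool] = lambda p: os.access(p, os.R_OK),
) -> Optional[Path]:
    unreadable = []
    for candidate in candidates:
        path = Path(candidate)
        if not path.exists():
            continue
        if is_readable(path):
            return path
        unreadable.append(path)  # remember it instead of moving on silently
    if unreadable:
        hint = " Try running with sudo." if any(str(p).startswith("/etc/") for p in unreadable) else ""
        listed = ", ".join(str(p) for p in unreadable)
        raise PermissionError(f"Config exists but is not readable: {listed}.{hint}")
    return None  # nothing exists at all: caller may create a fresh config
```

Returning `None` only when no candidate exists is what keeps an unreadable `/etc` config from being shadowed by a freshly generated one.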
Two new tests in tests/unit/test_helpers/test_config.py pin the "raise
instead of silent fallback" + the sudo-hint contracts. 1130 unit tests
passing.
Live reproduction verified: with a root:root 0600 file under /tmp and
the default search paths monkeypatched to point at it, Config() now raises with
the exact message from the docstring; no second file is created.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two UX problems on a live rclone+GDrive system (where `kopia policy set
--global` takes up to 5 minutes for one round-trip):
1. An empty `kopi-docka advanced snapshot retention set` would silently
overwrite the existing config with Typer's hard-coded defaults
(--latest 10 --daily 7 …), and burn a 30-90 s metadata round-trip
doing it. Every flag is now Optional[int] defaulting to None; with
no flags the command shows a yellow "Nothing to change" panel and
exits. --force overrides for scripts that genuinely want the no-op
re-write.
2. `retention set --daily 14` used to clobber the other five values
with Typer defaults. Now an omitted flag means "keep what's in the
config": the effective values are the existing values overridden by
whichever flags were passed.
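The merge rule for optional flags can be sketched as a dictionary overlay. Names and keys are illustrative only.

```python
from typing import Dict, Optional


def has_changes(**flags: Optional[int]) -> bool:
    """False when every flag was omitted (the 'Nothing to change' case)."""
    return any(value is not None for value in flags.values())


def effective_retention(current: Dict[str, int], **flags: Optional[int]) -> Dict[str, int]:
    """Omitted flags (None) keep the stored value; explicit flags override it."""
    overrides = {name: value for name, value in flags.items() if value is not None}
    return {**current, **overrides}


current = {"latest": 10, "daily": 7, "weekly": 4}
merged = effective_retention(current, latest=None, daily=14, weekly=None)
```

Testing against `None` rather than truthiness matters here, since an explicit `0` is a legitimate value that must not be mistaken for "omitted".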
Plus: the long kopia round-trip now runs under a Rich spinner with an
explicit "this typically takes 30-90 s on rclone backends" hint. On the
user's GDrive setup the call took 286 s; without the spinner that looks
identical to a hang.
Tests: 1128 passing. Four new TestCmdRetentionSet cases (explicit args,
kopia failure, partial args preserve, no-args without/with --force) plus
three updated CLI-layer cases for the new Optional[int] signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs uncovered by the post-v7.3.0 E2E testlab run.
1) KopiaRepository.connect() short-circuit broke Plan 0028's "idempotent
on every connect" promise. When is_connected() returned True (the
normal case for any backup run after the first), apply_global_defaults
was skipped. The config file's retention values and Kopia's actual
global policy could drift indefinitely — a kopi-docka.json edit
simply didn't reach the repo until a manual `kopia policy set`.
Fix: also run apply_global_defaults() on the already-connected
branch. Idempotent at the Kopia layer (`kopia policy set --global`
with identical values is a no-op).
2) advanced snapshot retention show reported "Kopia policy unavailable" on
healthy repos. _display_retention looked up Kopia's global policy
under the "retentionPolicy" key; kopia policy show --global --json
puts it under the top-level "retention" key. Old key always returned
None. Fix: read "retention" first, accept legacy "retentionPolicy"
defensively.
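The key lookup from the second fix might look roughly like this; the helper name is hypothetical.

```python
from typing import Any, Mapping, Optional


def extract_retention(policy: Mapping[str, Any]) -> Optional[Mapping[str, Any]]:
    """Prefer the top-level 'retention' key, accept legacy 'retentionPolicy'."""
    for key in ("retention", "retentionPolicy"):
        value = policy.get(key)
        if value:
            return value
    return None  # rendered as "Kopia policy unavailable"
```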
Verification (live testlab):
- Before fix: config retention 3/0/7/4/6/1, Kopia repo 10/48/7/4/24/3.
- kopi-docka backup --unit test-stack-nginx ran → apply_global_defaults
actually fired this time → Kopia retention now matches config exactly.
- advanced snapshot retention show panel renders both rows with the same
numbers; no more "Kopia policy unavailable" line.
Tests: 1124 passing. New cases:
- test_already_connected_still_reapplies_global_defaults
(pins the connect() fix — replaces the old short-circuit assertion)
- TestCmdRetentionShow.test_renders_kopia_values_from_retention_key
- TestCmdRetentionShow.test_accepts_legacy_retentionPolicy_key_too
- TestCmdRetentionShow.test_empty_policy_renders_unavailable
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three loose ends after Plan 0028 finalize:
1. Docs
- docs/CONFIGURATION.md: replace the old "Policy State Cache" /
"smart-skip" section with a "Global Retention Policy (since v7.3.0)"
explanation. Trim the historical v5.3.0 path-mismatch story to one
paragraph (Plan 0028 makes the per-path-vs-virtual-path question
moot anyway).
- docs/ARCHITECTURE.md: update BackupManager step list, method
table, and the backup sequence diagram to show
_collect_backup_sources() + sequential repo.create_snapshots(),
instead of the parallel ThreadPool + _ensure_policies path.
- docs/diagrams/04_sequenceDiagram.mmd: regenerated to match.
- docs/architecture_components.json: KopiaPolicyManager method list
updated (list_policies / delete_* in, set_*_for_target out);
KopiaRepository gains create_snapshots.
2. Migration
- New _maybe_cleanup_legacy_state_files() in KopiaRepository removes
the obsolete ~/.config/kopi-docka/policy_state.json (smart-skip
cache from v7.2.0). Idempotent, runs on first _run() call after
upgrade; OSError is downgraded to debug — housekeeping, not a
hard requirement.
- Live-verified on the rclone+GDrive testlab: created an empty
policy_state.json, ran doctor → file removed with the expected
INFO log line, doctor reports clean state.
3. Config templates / dead fields
- kopi_docka/templates/config_template.json: drop parallel_workers
and task_timeout (no longer consumed under Plan 0028).
- kopi_docka/helpers/config.py: keep both Pydantic fields but mark
them "Deprecated since v7.3.0 — ignored" in the description so
existing kopi-docka.json files still validate.
- BackupManager.__init__: remove the dead self.max_workers
assignment that survived Phase 3.
- DryRunReport: drop the "Parallel Workers" line from system info
and config review sections.
Tests: 1121 unit tests pass. New TestLegacyStateCleanup (3 tests) pins
the file-removal contract. test_dry_run_manager.py updated to assert
"Parallel Workers" is absent from the report.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this patch, `advanced policy prune` only deleted *orphaned*
per-path policies (entries whose path no longer matched any snapshot).
With Plan 0028 making per-path policies obsolete, that left an awkward
gap: doctor flagged every leftover per-path policy as "Legacy" and
pointed users at `policy prune` — but prune refused to touch a policy
as long as a snapshot still lived at that path.
cmd_prune now removes every per-path entry on this host/user under a
kopi-docka-managed prefix (/var/lib/docker/volumes/,
/var/cache/kopi-docka/staging/), regardless of snapshot state. The
Plan 0024 cross-host safety guards (host == socket.gethostname(),
user == getpass.getuser(), known prefix) are kept — a foreign host's
policies on a shared repo or a user's custom path are never touched
and now surface in a separate "Skipped (safety)" table.
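The three safety guards can be expressed as a single predicate. The target dictionary keys (`host`, `userName`, `path`) are an assumption about the policy-list shape, and the function name is hypothetical.

```python
import getpass
import socket

MANAGED_PREFIXES = ("/var/lib/docker/volumes/", "/var/cache/kopi-docka/staging/")


def is_prunable(target: dict, hostname=None, username=None) -> bool:
    """Only this host, this user, and a kopi-docka-managed prefix qualify."""
    hostname = hostname or socket.gethostname()
    username = username or getpass.getuser()
    return (
        target.get("host") == hostname
        and target.get("userName") == username
        and str(target.get("path", "")).startswith(MANAGED_PREFIXES)
    )
```

Anything failing the predicate would land in the "Skipped (safety)" table rather than being deleted.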
Tests rewritten (TestPolicyPruneOrphanDetection →
TestPolicyPruneLegacyCleanup, 6 tests): matching-snapshot path is now
pruned; foreign host / unknown prefix stays. End-to-end verified on the
rclone+GDrive testlab — one leftover legacy policy detected, pruned in
one batch, doctor reports "Global-only — clean state".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0028 Phase 3. The backup hot path now goes through a single
KopiaRepository.create_snapshots(sources) call.
- KopiaRepository.create_snapshots(sources) iterates a list of
BackupSource sequentially, returns one snapshot ID per source (empty
string for per-source failures so callers can map them back to a
kind/volume). Docstring references kopia/kopia#1725 — when upstream
ships native multi-path snapshot create, swap the body and all
callers stay unchanged.
- backup_unit() restructured: discovery via _collect_backup_sources()
runs BEFORE containers are stopped (docker inspect needs them alive;
failing early also avoids unnecessary stops), then the snapshot loop
runs after stop in one create_snapshots() call. Volume metadata
(volumes_backed_up, networks_backed_up, docker_config_backed_up,
kopia_snapshot_ids) is reconstructed from the returned ID list
zipped with source tags.
- ThreadPoolExecutor and self.max_workers gone — sequential by user
preference (safety/logs > performance on VPS-class hardware).
Kopia's --max-parallel-snapshots is the right knob if parallelism is
needed later.
- TAR-mode volumes keep their legacy per-volume path via
volume_handler.backup_volume because they pipe through stdin and
can't be expressed as a BackupSource; the fallback runs sequentially
too.
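The sequential contract (one ID per source, empty string on a per-source failure) can be sketched as below. The `create_one` callable and the broad `except` are assumptions standing in for the real Kopia call and its error handling.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class BackupSource:
    path: str
    kind: str
    tags: Dict[str, str] = field(default_factory=dict)
    description: str = ""


def create_snapshots(
    sources: Iterable[BackupSource],
    create_one: Callable[[BackupSource], str],
) -> List[str]:
    """Snapshot sources in order; a failure yields '' at that position."""
    ids: List[str] = []
    for source in sources:
        try:
            ids.append(create_one(source))
        except Exception:
            ids.append("")  # keep positions aligned with the input list
    return ids
```

Because the result list stays aligned with the input, callers can `zip(sources, ids)` to map each ID, or each failure, back to its kind and volume.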
Tests: 1115 unit tests pass, coverage 52.06 %. New
test_create_snapshots.py pins the empty-input / partial-failure /
sequential-order contract. test_workflow.py rewritten end-to-end
against the new flow. test_backup_manager.py TestParallelBackup
collapsed into TestSequentialSnapshotLoop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0028 Phase 2. Pure structural refactor — no behaviour change.
- New BackupSource dataclass in types.py captures the (path, kind, tags,
description) tuple kopia snapshot create needs, ahead of the snapshot
loop itself.
- BackupManager grows four _collect_*_sources() helpers:
_collect_recipe_sources, _collect_network_sources,
_collect_docker_config_sources, _collect_volume_sources. Each takes
the side-effects that used to live inline in _backup_* (staging dir
prep, docker inspect, secret redaction, network export) and returns
the BackupSource it staged.
- The legacy _backup_recipes / _backup_networks / _backup_docker_config
wrappers now delegate to the collector and only own the
repo.create_snapshot() call. Their return shape is unchanged so
backup_unit() doesn't move yet.
- Aggregate _collect_backup_sources() returns the full ordered source
list backup_unit() would snapshot (recipes → networks → docker_config →
volumes), gated on backup_scope exactly like the live code path. This
is the entry point Phase 3 will plug into repo.create_snapshots().
Tests: 1112 unit tests pass. New test_backup_source_collection.py adds
12 focused tests for path / tag construction in isolation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 0028 Phase 1. The backup hot path now never writes per-path Kopia
policies; the global policy at KopiaRepository.connect() / initialize()
covers every snapshot via Kopia's policy inheritance tree.
- KopiaRepository.connect() now calls apply_global_defaults() on every
successful connect (idempotent — Kopia treats identical --global writes
as a no-op). Retention changes in kopi-docka.json reach Kopia on the
next run without a manual step.
- BackupManager loses _ensure_policies, _apply_target_policy,
auto_prune_orphaned_policies, the policy_state attribute, and all
PolicyStateManager imports.
- KopiaPolicyManager loses set_retention_for_target and
set_compression_for_target — list_policies / delete_policy /
delete_policies_batch stay for `advanced policy prune` legacy cleanup.
- helpers/policy_state.py module deleted; smart-skip apparatus gone.
- doctor `_check_policy_alignment` reports any remaining per-path policies
as "Legacy" with a hint to run `kopi-docka advanced policy prune`.
- systemd templates' comments updated; constants.py retention-policy
block trimmed to a one-liner referencing Plan 0028.
Tests: 1100 unit tests pass, coverage 51.91 % (>= 40 % gate). New
TestConnect block in test_repository_manager.py pins the
connect()→apply_global_defaults() contract. test_backup_manager_policies.py
inverted: now asserts the per-path setters are GONE.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New tests (41 total):
- tests/unit/test_helpers/test_policy_state.py (15) — PolicyStateManager
round-trip, multi-profile isolation, corruption recovery,
deterministic hashing (sort-keys regardless of dict iteration order)
- tests/unit/test_cores/test_backup_manager_policies.py (15) —
_ensure_policies staging removal, smart-skip first-vs-second-run,
hash-not-recorded-on-failure, retention-change-triggers-reapply,
auto_prune_orphaned_policies safety (foreign host/user/prefix never
touched, snapshot paths preserved, flat snap['path'] key, pruned
targets removed from smart-skip state)
- tests/unit/test_cores/test_repository_manager.py (11 added) —
_get_rclone_args is rclone-only and honors config override,
_maybe_patch_repo_config_for_rclone is one-shot per process, no-op
for non-rclone backends, no-op when already at desired value, doesn't
raise on corrupt config files
Adjusted existing tests:
- test_backup_manager.py TestEnsurePolicies → TestEnsurePoliciesVolumes:
rewrote 5 stale tests that asserted on the removed static_targets
policies; deleted 2 tests that specifically verified the now-removed
staging path matching behavior. Coverage of the live behavior is in
test_backup_manager_policies.py.
- test_backup_manager.py + test_error_handling.py: helpers now
initialize manager.policy_state = Mock() so BackupManager's __new__
setup pattern still works.
- test_repository_manager.py make_repository(): set
_rclone_timeout_patched=True so _run()'s migration hook is a no-op in
unrelated tests; migration-specific tests use their own setup.
KopiaRepository._maybe_patch_repo_config_for_rclone now uses getattr()
for the flag — defensive against tests that bypass __init__.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KopiaRepository.list_snapshots() returns flat dicts (snap["path"]),
not nested under "source". Both `advanced policy prune` and
doctor's `_check_policy_alignment` were reading
snap.get("source", {}).get("path", ""), producing an always-empty
snapshot-paths set: doctor over-reported orphans on healthy repos,
and policy prune would have deleted every per-path policy.
Fix is a 1-liner in each call site plus a regression test that
pins the correct flat-dict shape.
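A small illustration of the shape mismatch, with made-up sample data and hypothetical helper names:

```python
def snapshot_paths_buggy(snapshots):
    # Old lookup: assumes a nested {"source": {"path": ...}} shape.
    return {s.get("source", {}).get("path", "") for s in snapshots} - {""}


def snapshot_paths(snapshots):
    # Fixed lookup: list_snapshots() returns flat dicts.
    return {s["path"] for s in snapshots if s.get("path")}


snapshots = [{"id": "k1", "path": "/var/lib/docker/volumes/db/_data"}]
policies = {"/var/lib/docker/volumes/db/_data"}

orphans_buggy = policies - snapshot_paths_buggy(snapshots)
orphans_fixed = policies - snapshot_paths(snapshots)
```

With the old lookup every policy looks orphaned, which is exactly why prune would have removed them all.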
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- KopiaCommandError: replaces bare RuntimeError in _run(), carries returncode + stderr_tail (UTF-8 safe)
- is_connected(force_refresh=True): bypasses 60s cache for pre-flight check; negative result also cached
- BackendUnreachableError: new exception in backends/base.py (subclass of ConnectionError)
- Pre-flight backend check in backup_unit() before container teardown; containers NOT stopped on failure
- BackupErrorDetail dataclass in types.py; BackupMetadata.error_details persisted in JSON (backward-compat)
- Verbose failure notifications: phase, exit code, stderr tail in fenced code block (Markdown-injection-safe)
- Markdown body_format for all Apprise sends; services without Markdown degrade gracefully
- send_connectivity_alert() and send_missed_backup_alert() added to NotificationManager
- MissedBackupChecker: time-based overdue detection using MetadataReader; per-unit threshold overrides
- Post-run missed-backup check after every backup_unit(); alert suppression via missed_state.json
- Doctor section 8 "Backup Freshness": age per unit, OVERDUE units highlighted
- Config: notifications.verbose, notifications.preflight_check, alerting.missed_backup.*
- Config templates and examples updated; migration guide in INSTALLATION.md
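As a sketch of the first item, an exception carrying the exit code and a decoded stderr tail could look like this. The base class, constructor signature, and tail length are assumptions, not the actual `KopiaCommandError`.

```python
class KopiaCommandError(RuntimeError):
    """Hypothetical shape: exit code plus the last bytes of stderr, decoded safely."""

    def __init__(self, returncode: int, stderr: bytes, tail_bytes: int = 2000):
        self.returncode = returncode
        # Slicing bytes can cut a multi-byte character; errors="replace"
        # keeps the decode from raising in that case.
        self.stderr_tail = stderr[-tail_bytes:].decode("utf-8", errors="replace")
        super().__init__(f"kopia exited with {returncode}: {self.stderr_tail}")
```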
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Policy targets used relative paths (e.g. recipes/myunit) but snapshots
were created with absolute staging paths (/var/cache/kopi-docka/staging/
recipes/myunit). Kopia never matched these — retention was silently
broken for all non-volume snapshots. Docker-config had no policy at all.
Also adds a doctor check (Section 7) that detects policy-snapshot path
mismatches at runtime.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move volume backup logic (_backup_volume, _backup_volume_direct,
_backup_volume_tar) into dedicated BackupVolumeHandler class.
BackupManager delegates to self.volume_handler. Public API unchanged.
Step 1 of 7 in handler extraction refactoring.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New `kopi-docka history` command to browse past backups from stored
metadata JSONs. Includes table view, detail panels, filters (--unit,
--failed, --last, --since), statistics (--stats), and JSON output
(--json). No root privileges required.
Adds MetadataReader helper and BackupMetadata.from_dict() for reusable
metadata deserialization. 43 new tests, 95%/99% coverage.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
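A rough sketch of what reusable, backward-compatible metadata deserialization can look like. Only the names `BackupMetadata`, `from_dict`, and `error_details` come from these commit messages; the other fields and the "ignore unknown keys" policy are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class BackupMetadata:
    # unit_name, timestamp and success are hypothetical fields for this sketch.
    unit_name: str
    timestamp: str
    success: bool = True
    # Older JSON files lack this key, so it needs a default.
    error_details: List[Dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "BackupMetadata":
        """Build an instance from stored JSON, skipping keys this version does not know."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Defaults on newer fields let old metadata files load unchanged, and filtering unknown keys keeps an older reader from failing on files written by a newer version.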
Plan 0020: Centralize all subprocess→kopia calls through KopiaRepository._run()
so Kopia CLI changes only need fixing in one file.
- Extended _run() with extra_env and config_file parameters
- DR-Manager: replaced 3× direct subprocess→kopia with self.repo.status()
- DR-Manager: replaced subprocess.run(["hostname"]) with socket.gethostname()
- KopiaRepository: routed set_repo_password(), verify_password(),
create_filesystem_repo_at_path() through _run()
- Documented repo_helper.detect_existing_cloud_repo() as intentional exception
- Updated tests to mock repo.status() instead of subprocess.run()
- 805 tests passing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
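The centralization idea above, sketched under assumptions: one function builds the argv and environment for every Kopia call, so a CLI change is absorbed in a single place. The function names and the `base_env` parameter are hypothetical; `--config-file` and the `KOPIA_PASSWORD` variable are existing Kopia conventions, but how the real `_run()` applies them is not shown in the commit.

```python
import os
import socket
import subprocess


def build_invocation(args, *, extra_env=None, config_file=None, base_env=None):
    """Return (argv, env) for a kopia call without executing anything."""
    cmd = ["kopia", *args]
    if config_file:
        cmd.append(f"--config-file={config_file}")
    env = dict(os.environ if base_env is None else base_env)
    env.update(extra_env or {})  # e.g. {"KOPIA_PASSWORD": "..."}
    return cmd, env


def run(args, **kwargs):
    """Single choke point: every kopia subprocess call goes through here."""
    cmd, env = build_invocation(args, **kwargs)
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)


def local_hostname() -> str:
    """Standard-library replacement for spawning the `hostname` binary."""
    return socket.gethostname()
```

Keeping `build_invocation` free of side effects also makes it testable on machines without a kopia binary.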
- Add DockerRunBuilder helper to parse inspect.json and reconstruct docker run commands
- Integrate automatic container reconstruction in restore workflow
- Interactive prompts to start containers immediately after restore
- Support for all common Docker parameters (ports, volumes, env, networks, capabilities, etc.)
- Filter Docker-injected environment variables
- Check for existing containers before attempting start
- Add 30+ comprehensive unit tests
- Update documentation (CHANGELOG, USAGE, FEATURES)
- Bump version to 6.1.0
Closes #59
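As a rough illustration of reconstructing a `docker run` command from inspect data, the sketch below reads a few well-known `docker inspect` fields. The function name, the covered options, and the set of filtered environment variables are assumptions; the real DockerRunBuilder supports far more parameters.

```python
import shlex

# Assumption: variables treated as Docker- or image-provided and therefore dropped.
INJECTED_ENV = {"PATH", "HOSTNAME", "HOME"}


def build_docker_run(inspect: dict) -> str:
    """Rebuild a docker run command line from one `docker inspect` entry."""
    config = inspect.get("Config") or {}
    host = inspect.get("HostConfig") or {}
    parts = ["docker", "run", "-d", "--name", inspect["Name"].lstrip("/")]

    for container_port, bindings in (host.get("PortBindings") or {}).items():
        for binding in bindings or []:
            parts += ["-p", f"{binding.get('HostPort', '')}:{container_port}"]
    for bind in host.get("Binds") or []:
        parts += ["-v", bind]
    for item in config.get("Env") or []:
        if item.split("=", 1)[0] not in INJECTED_ENV:
            parts += ["-e", item]
    for cap in host.get("CapAdd") or []:
        parts += ["--cap-add", cap]

    parts.append(config["Image"])
    return shlex.join(parts)  # quotes each argument for safe shell use
```

The `or {}` / `or []` guards matter because `docker inspect` emits `null` for unset collections such as `CapAdd`.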
- Add ProcessLock helper using fcntl.flock() for global backup lock
- Lock file at /run/kopi-docka.lock with /tmp fallback
- Graceful skip with 'Backup already running (PID: X)' message
- Auto-release on process termination (kernel-managed)
- Fix ImportError in setup wizard for Kopia installation
- Remove dead cmd_install_deps import
- Add interactive 3-option menu for Kopia installation
- Support official Kopia installer script
- Add 16 unit tests for ProcessLock
- Bump version to 6.0.2
Closes #61
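A minimal sketch of a non-blocking global lock built on `fcntl.flock()`, along the lines of the ProcessLock described above. The class body, the `holder_pid` attribute, and the fallback file name under /tmp are assumptions; only the /run/kopi-docka.lock path and the general behaviour come from the commit message.

```python
import fcntl
import os


class ProcessLock:
    def __init__(self, primary="/run/kopi-docka.lock", fallback="/tmp/kopi-docka.lock"):
        self._candidates = (primary, fallback)
        self._handle = None
        self.holder_pid = None  # PID text read from the file when the lock is busy

    def acquire(self) -> bool:
        """Try to take the lock; return False if another holder has it."""
        for path in self._candidates:
            try:
                handle = open(path, "a+")
            except OSError:
                continue  # location not writable for this user; try the next one
            try:
                fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                handle.seek(0)
                self.holder_pid = handle.read().strip() or None
                handle.close()
                return False
            handle.seek(0)
            handle.truncate()
            handle.write(str(os.getpid()))
            handle.flush()
            self._handle = handle
            return True
        raise OSError("no usable lock file location")

    def release(self) -> None:
        if self._handle is not None:
            fcntl.flock(self._handle, fcntl.LOCK_UN)
            self._handle.close()
            self._handle = None
```

Because the kernel drops a flock when the owning file description is closed, a crashed process cannot leave a stale lock behind; the PID written into the file is informational only.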
- test_version_command_format now searches for 'Kopi-Docka' line anywhere in output
- Config auto-creation message may appear before version output in fresh environments
- Added v6.0.0 to expected version list
- test_workflow.py: Update track_start/track_stop functions to accept
service_handler parameter (matches new _start_containers/_stop_containers
signatures in backup_manager.py)
- test_dependency_commands.py: Update test_check_no_config assertion to
accept either 'No configuration found' or 'repository not connected'
messages since behavior depends on context state
- test_disaster_recovery_manager.py: Mock run_command instead of
subprocess.run for _create_encrypted_archive tests (code now uses
run_command from ui_utils for process tracking)
- Fix vulture 'unused frame variable' in safe_exit_manager.py:100
by adding 'del frame' statement (signal handler signature requires
the parameter but it's unused)
- Fix test_backup_commands.py tests by adding DependencyManager mock
to prevent tests from failing due to missing docker/kopia in CI
environment - tests were hitting check_hard_gate() which checks
for real dependencies
- Add docker_config_snapshots field to RestorePoint type
- Update _find_restore_points() to recognize docker_config snapshots
- Add _get_backup_scope() method to read scope from snapshot tags
- Add _show_scope_warnings() method to display scope-specific warnings
- Display warning panel for MINIMAL scope restores
- Display info message when docker_config snapshots are present
- Integrate scope checking into _restore_unit() workflow
- Add 10 comprehensive unit tests for scope detection
- All restore_manager tests passing
Enables users to see warnings about backup limitations when restoring.
Minimal scope backups will show clear warnings about missing recipes.
Docker config backups show instructions for manual restore.
Related to plan_0008: Backup Scope FULL implementation - Task 4
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
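The scope detection above might look roughly like the following. The `backup_scope` tag key is named in a related commit; the plain-dict tag shape, the function names, the message wording, and returning `None` for untagged snapshots are assumptions for this sketch.

```python
from typing import List, Mapping, Optional


def get_backup_scope(tags: Mapping[str, str]) -> Optional[str]:
    """Read the scope from snapshot tags; None means the snapshot carries no scope tag."""
    value = tags.get("backup_scope")
    return value.lower() if value else None


def scope_warnings(scope: Optional[str], has_docker_config: bool) -> List[str]:
    """Collect the messages to show before restoring a unit."""
    messages = []
    if scope == "minimal":
        messages.append("Minimal scope backup: recipes are not part of this restore point.")
    if has_docker_config:
        messages.append("Docker config snapshots found: restore them manually.")
    return messages
```

Returning a list instead of printing keeps the decision logic separate from the warning panel that renders it.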
- Add _backup_docker_config() method to BackupManager
- Backs up /etc/docker/daemon.json when present
- Backs up /etc/systemd/system/docker.service.d/ when present
- Add docker_config_backed_up field to BackupMetadata
- Integrate docker_config backup in backup_unit() for FULL scope only
- Add 7 comprehensive unit tests for docker_config functionality
- Handles permission errors gracefully (non-fatal)
- All 88 backup_manager tests passing
Docker config backup is opt-in via --scope full flag. Errors are
logged but don't fail the backup. This enables full disaster recovery
including Docker daemon configuration.
Related to plan_0008: Backup Scope FULL implementation - Task 2
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
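A sketch of the opt-in, non-fatal selection step described above: pick up the Docker daemon configuration only for FULL scope, and only where the paths exist. The two paths come from the commit message; the function name, the logger, and the string form of the scope are hypothetical.

```python
import logging
from pathlib import Path

log = logging.getLogger("docker_config_sketch")

DOCKER_CONFIG_PATHS = (
    Path("/etc/docker/daemon.json"),
    Path("/etc/systemd/system/docker.service.d"),
)


def docker_config_sources(scope: str, candidates=DOCKER_CONFIG_PATHS) -> list:
    """Return the existing config paths to back up, or [] when scope is not full."""
    if scope.lower() != "full":
        return []
    found = []
    for path in candidates:
        try:
            if path.exists():
                found.append(path)
        except OSError as exc:
            # Permission problems are logged and skipped so the backup continues.
            log.warning("Skipping %s: %s", path, exc)
    return found
```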
- Add backup_scope parameter to _backup_volume(), _backup_recipes(), _backup_networks()
- Include "backup_scope" tag in all snapshot metadata
- Update backup_unit() to pass scope to all backup methods
- Add comprehensive unit tests for backup_scope tag verification
- Update existing tests to support new backup_scope parameter
This enables simple scope tracking for restore validation and
sets the foundation for implementing docker_config backup.
Related to plan_0008: Backup Scope FULL implementation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>