kopi-docka

mirror of https://github.com/TZERO78/kopi-docka.git synced 2026-06-19 07:37:12 +00:00

Author	SHA1	Message	Date
TZERO78	0e30400884	test(dr): mock kopia dependency in stream tests for CI CI runners don't have kopia binaries; the command's dependency-check exited before reaching the streaming branch, so both stream regression tests failed there even though the code path under test was correct. Mock DependencyHelper.exists() (and the DR manager) so the test focuses on the streaming-Console-print branch the v7.6.3 fix targets. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 08:59:50 +00:00
TZERO78	6aa1ec2d49	release: v7.6.3 — fix DR --stream Console.print(err=True) crash Rich's Console.print() does not accept an `err=` kwarg; every `disaster-recovery export --stream` invocation blew up with TypeError before any ZIP byte hit stdout. Fix: bind a separate `Console(stderr=True)` for the 3 streaming-path messages. Reproduced over SSH against the testlab. Regression covered by 2 new tests in test_disaster_recovery_commands.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 08:56:29 +00:00
TZERO78	ff024a54d6	fix(sftp): canonical Kopia params + backend-dispatched repair (Plan 0038) Plan 0038 — finishes the Plan 0029 story for the direct SFTP wizard, which still shipped the broken `--path=user@host:path` shape and an invalid `--sftp-port` flag until now. Verified locally: Kopia rejects `--sftp-port` with `unknown long flag`; the bug never triggered because port 22 (default) skipped the branch. Single source of truth: new `helpers/backend_helper.py::build_sftp_kopia_params()` + `ensure_known_hosts()`. SFTP wizard, Tailscale wizard, and `rebuild_kopia_params()` repair hook all feed through it. `advanced config repair-kopia-params` is now backend-dispatched — `BackendBase.rebuild_kopia_params(credentials)` is the entrypoint, each backend overrides if it has a credentials-based rebuild path. New `MissingCredentialsError` lets each backend declare its required keys without the command hardcoding them. No `if backend == "sftp"` branches left in the command. SFTP wizard now also persists a `[credentials]` block so future configs are repairable through the same flow as Tailscale. Doctor sanity regex extended from `--path=…` to `--path[=\s]…` — the v7.0.0–v7.6.0 direct-SFTP wizard wrote the space form, which the Plan 0029 regex did not catch. 1244 tests passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 06:54:24 +00:00
TZERO78	30a6ad0898	fix: ci — drop unused os import + migrate test mocks to sudo_helper boundary CI surfaced two issues my local lint missed: 1. After the sudo_helper migration, `os` is no longer used anywhere in `disaster_recovery_manager.py` (was only there for `os.environ.get` and `os.chown` — both now go through the helper). ruff F401 caught it. Dropped. 2. `test_dr_export.py::test_export_sets_ownership` patched `pwd.getpwnam` and `os.chown` directly — testing the OS-level implementation, not the new helper-mediated contract. Re-anchored the patches at the helper boundary (`kopi_docka.helpers.sudo_helper.os.chown`) and set SUDO_UID/SUDO_GID env-vars instead of mocking pwd. Same intent, new seam. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 17:28:45 +00:00
TZERO78	58f2b44625	refactor: extract sudo_helper, replace 11 duplicated SUDO_USER patterns Plan 0037. The SUDO_USER / SUDO_UID / SUDO_GID handling was duplicated at 11 sites across 4 files with three different validation levels: some checked SUDO_USER against a shell-injection regex, others didn't. New module kopi_docka/helpers/sudo_helper.py exposes one typed API: @dataclass SudoUserInfo: name, uid, gid, home, invoked_with_sudo def get_sudo_user_info() -> SudoUserInfo def chown_to_sudo_user(path) -> None def find_in_sudo_user_home(relative) -> Optional[Path] def sudo_user_home_path(relative) -> Optional[Path] All 11 sites migrated: - cores/disaster_recovery_manager.py (2 sites) - cores/restore_manager.py (2 sites) - helpers/file_operations.py (1 site) - backends/rclone.py (4 sites) Side benefit: the four previously-unvalidated SUDO_USER reads now go through the same shell-injection validation as the other sites, at zero cost for legitimate usernames. backends/rclone.py no longer needs `import os` either. +18 unit tests in tests/unit/test_helpers/test_sudo_helper.py covering under-sudo / no-sudo / shell-injection / path-traversal / garbage-UID / chown success+failure / find/path-build variants. All 1234 prior tests unchanged (behavioral parity). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 17:25:04 +00:00
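A minimal sketch of what such a typed helper could look like. The field names come from the commit message above; the username allow-list, the `env` parameter, the `/home/<name>` guess, and the fallback behaviour are assumptions for illustration, not the project's actual code.

```python
import os
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Assumed allow-list; the real shell-injection regex is not shown in the log.
_SAFE_USERNAME = re.compile(r"[a-z_][a-z0-9_-]{0,31}")
_DECIMAL = re.compile(r"[0-9]+")


@dataclass(frozen=True)
class SudoUserInfo:
    name: str
    uid: int
    gid: int
    home: Path
    invoked_with_sudo: bool


def get_sudo_user_info(env: Optional[Mapping[str, str]] = None) -> SudoUserInfo:
    """Resolve the invoking user from SUDO_* variables, validating each value."""
    env = os.environ if env is None else env
    name = env.get("SUDO_USER", "")
    uid = env.get("SUDO_UID", "")
    gid = env.get("SUDO_GID", "")
    # Accept only a plain username and plain ASCII decimal IDs.
    if _SAFE_USERNAME.fullmatch(name) and _DECIMAL.fullmatch(uid) and _DECIMAL.fullmatch(gid):
        # Hypothetical home lookup; a real helper would more likely consult pwd.
        return SudoUserInfo(name, int(uid), int(gid), Path("/home") / name, True)
    # Not under sudo, or the values look suspicious: describe the current process.
    return SudoUserInfo(
        name=env.get("USER", "root"),
        uid=os.getuid(),
        gid=os.getgid(),
        home=Path.home(),
        invoked_with_sudo=False,
    )
```

Passing the environment in as a mapping keeps the validation testable without touching `os.environ`.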
TZERO78	a9daf7de63	chore: tz-aware datetime cleanup (mop-up after v7.5.2) Five remaining naive datetime.now().isoformat() call sites turned up in a read-only codebase scan after v7.5.2 fixed the snapshot-tag timestamps. All emit ISO strings that round-trip through datetime.fromisoformat(); without a timezone they parse as naive and break comparisons with tz-aware values downstream (same class of bug that crashed the restore wizard in v7.5.2). * cores/backup_manager.py:101 — BackupMetadata.timestamp goes into /backup/kopi-docka/metadata/.json cores/backup_volume_handler.py:125, 207 — TAR-mode snapshot tags (legacy backup path) * cores/disaster_recovery_manager.py:697 — created_at in DR-bundle recovery-info.json * cores/disaster_recovery_manager.py:1069 — timestamp in DR-bundle backup-status.json All five now emit datetime.now(timezone.utc).isoformat(). Pre-existing user-facing local-time strftime() calls used in filenames (config-backup-*, restore-rollback tarballs, DR-bundle output filenames) stay naive on purpose — they represent local wall-clock time, not machine-parseable timestamps. Two new regression tests cover the DR-bundle path; the existing v7.5.2 snapshot-timestamp tests cover the backup_manager paths. No user- visible behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 15:07:37 +00:00
TZERO78	5196d0acda	fix(repo): detect backend mismatch in initialize() and reconnect Editing kopia_params in /etc/kopi-docka.json from one backend to another (e.g. rclone -> sftp) followed by `advanced repo init` silently kept Kopia talking to the old backend: initialize() only checked is_connected() and treated any active connection as "we're done", regardless of which backend it pointed to. The user saw "Repository initialized successfully" while their new SFTP target stayed empty. Add _current_storage_type() that reads storage.type from Kopia's connect-config and _expected_storage_type() that takes the first token of kopia_params. In initialize(), compare both before the is_connected() shortcut; on mismatch, log a warning and call disconnect() so the subsequent create/connect actually retargets to the new backend. The check is intentionally limited to storage type — a path/host swap inside the same backend still hits the shortcut and needs a manual disconnect. That keeps the change minimal and focused on the observed regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 14:32:56 +00:00
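The comparison described above can be sketched roughly as follows. The function names are illustrative, and the sketch assumes the first token of `kopia_params` matches the `storage.type` string Kopia records for the connection.

```python
import shlex
from typing import Optional


def expected_storage_type(kopia_params: str) -> Optional[str]:
    """First token of kopia_params, e.g. 'sftp' in 'sftp --path=/backups'."""
    tokens = shlex.split(kopia_params)
    return tokens[0] if tokens else None


def needs_reconnect(current_type: Optional[str], kopia_params: str) -> bool:
    """True when the active connection points at a different backend type."""
    expected = expected_storage_type(kopia_params)
    if current_type is None or expected is None:
        return False  # nothing to compare; leave the decision to the caller
    return current_type != expected
```

As the commit notes, this only catches a change of storage type; a host or path swap inside the same backend compares equal and is not detected.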
TZERO78	17cbb99fe3	fix(restore): normalize snapshot timestamps to tz-aware UTC The restore wizard crashed with "can't compare offset-naive and offset-aware datetimes" whenever the snapshot list contained one tag-timestamp from pre-v7.5.2 backup_manager (naive, written via datetime.now().isoformat()) and one fallback timestamp from the restore_manager exception/default branch (aware, datetime.now( timezone.utc)). sort() then crashed before any restore points were shown. Introduce _parse_snapshot_timestamp() in restore_manager that always returns a tz-aware UTC datetime — naive legacy tags are treated as UTC, parse errors fall back to now(utc). Use it from both _find_restore_points() and _find_restore_points_for_machine(). In backup_manager, write new snapshot-tag timestamps and the networks metadata backup_timestamp as datetime.now(timezone.utc). isoformat() so future tags round-trip cleanly without ever needing the legacy-as-UTC assumption. Existing snapshots remain restorable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 14:32:47 +00:00
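A small sketch of the normalization this commit describes, using only the standard library. The public name and exact error handling are assumptions; the rules (naive means UTC, parse errors fall back to now) are taken from the message above.

```python
from datetime import datetime, timezone


def parse_snapshot_timestamp(raw: str) -> datetime:
    """Always return a tz-aware UTC datetime."""
    try:
        parsed = datetime.fromisoformat(raw)
    except (TypeError, ValueError):
        return datetime.now(timezone.utc)  # unparseable tag: fall back to now
    if parsed.tzinfo is None:
        # Legacy naive tag: treated as UTC.
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Mixed naive and aware inputs now sort without raising TypeError.
points = sorted(
    parse_snapshot_timestamp(t)
    for t in ("2026-05-24T14:00:00", "2026-05-24T16:00:00+02:00")
)
```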
TZERO78	96b5d2c537	fix(dr): recover.sh handles SFTP backends without false-fail exit Two related issues in the disaster-recovery script generator: * The legacy generic `else` branch printed "Unsupported auto-connect for this repository scheme" and `exit 1` — even though the file-restore steps that ran before it had succeeded. Every SFTP / Tailscale recovery looked like a failure to the user. * recover.sh had no real SFTP branch at all. Now there's an explicit `elif repo_type == "sftp":` block that builds a non-interactive `kopia repository connect sftp --path=… --host=… --username=… --keyfile=… --known-hosts=…` from the connection info in recovery-info.json. The block also guards on $KEYFILE being readable before connecting — missing key prints a "install with mode 600 and re-run" warning and exits 0 (since the file-restore portion already succeeded). Unknown backends now also exit 0 with a manual-connect hint instead of failing hard. Also extends `_extract_repo_from_status` for SFTP to capture port/username/keyfile/knownHostsFile (not just host/path) so the generated connect call is complete. Adds module-level `sha256_file()` helper used by the SHA256 fingerprints in instructions / doctor. Plan 0030 / Phases 1 + 2. +11 unit tests covering the new branches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 14:04:39 +00:00
TZERO78	996da9e264	refactor(doctor): point Backend Sanity at repair-kopia-params command Section 5.1 used to print a generated `sudo sed -i 's#…#…#' /etc/...` line. Replaced with a one-line pointer to the new `advanced config repair-kopia-params` subcommand — easier to read, no shell-escaping foot-guns, no copy/paste mistakes. The `_build_sftp_migration_command` helper is removed entirely; the two unit tests that exercised it are replaced by a single TestBackendSanityHint case asserting doctor surfaces the new command (and no longer emits `sudo sed -i`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 13:07:50 +00:00
TZERO78	7c4d7dd149	feat(config): advanced config repair-kopia-params A one-command path to the v7.4.0 migration that v7.4.0 itself only handed out as a sed snippet. Reads the still-correct [credentials] section, rebuilds kopia_params in Kopia-SFTP's canonical --path / --host / --username / --keyfile / --known-hosts shape, and writes it back through the existing atomic Config.save(). Behavior: - Shows an old-vs-new diff and prompts for confirmation. - `--dry-run` to preview, `--yes` to skip the prompt. - Idempotent: a second run says "already in the canonical shape". - Refuses non-SFTP backends and configs missing remote_path / peer FQDN / ssh_key — those need the full wizard, not a parameter rebuild. Follow-up to Plan 0029. The Kopia repository itself is untouched — only the local config string changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 13:07:43 +00:00
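The rebuild step could look something like the sketch below. The flag set is the canonical shape named in the commit; the credential key names, the function names, and the exception class here are hypothetical stand-ins.

```python
import shlex
from typing import Mapping, Tuple

# Hypothetical key names; the log only says the values come from [credentials].
REQUIRED_KEYS = ("remote_path", "host", "username", "keyfile", "known_hosts")


class MissingCredentialsError(ValueError):
    """Illustrative stand-in; raised when a required credential is absent."""


def build_sftp_kopia_params(credentials: Mapping[str, str]) -> str:
    """Build kopia_params with separate --path / --host / ... flags."""
    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        raise MissingCredentialsError("missing credentials: " + ", ".join(missing))
    q = shlex.quote  # quote values so spaces in paths survive later splitting
    return (
        f"sftp --path={q(credentials['remote_path'])}"
        f" --host={q(credentials['host'])}"
        f" --username={q(credentials['username'])}"
        f" --keyfile={q(credentials['keyfile'])}"
        f" --known-hosts={q(credentials['known_hosts'])}"
    )


def repair_kopia_params(old: str, credentials: Mapping[str, str]) -> Tuple[str, bool]:
    """Return (new_params, changed); changed is False on an idempotent re-run."""
    new = build_sftp_kopia_params(credentials)
    return new, new != old
```

Returning a `changed` flag lets a caller print the "already in the canonical shape" message instead of writing the config again.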
TZERO78	18e100e7f2	feat(doctor): detect broken SFTP kopia_params and suggest migration New Section 5.1 "Backend Sanity" in `kopi-docka doctor`. Targets the three legacy-wizard fingerprints from v7.0.0 – v7.3.13: - --path=HOST:PATH (host embedded in path) - missing --username - missing --keyfile / --sftp-password When any are present, doctor prints a copy/paste-ready `sed` command populated with the user's actual peer FQDN, ssh user, key path, etc. from [credentials] — so the fix is one command, not a documentation trail. Section is silent on healthy configs and on non-SFTP backends. Plan 0029 / Phase 3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 12:31:17 +00:00
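A rough sketch of how the three fingerprints might be detected. The regular expression and function name are assumptions rather than doctor's actual check; the pattern accepts both `--path=` and `--path ` followed by `HOST:PATH` or `user@HOST:PATH`.

```python
import re
from typing import List

# Assumed pattern: --path, then '=' or whitespace, then [user@]host:path.
_HOST_IN_PATH = re.compile(r"--path[=\s]+(?:[^\s@/]+@)?[^\s:/]+:\S+")


def sftp_param_problems(kopia_params: str) -> List[str]:
    """Return the legacy-wizard fingerprints found in an SFTP kopia_params string."""
    if kopia_params.split()[:1] != ["sftp"]:
        return []  # silent on non-SFTP backends
    problems = []
    if _HOST_IN_PATH.search(kopia_params):
        problems.append("host embedded in --path")
    if "--username" not in kopia_params:
        problems.append("missing --username")
    if "--keyfile" not in kopia_params and "--sftp-password" not in kopia_params:
        problems.append("missing --keyfile / --sftp-password")
    return problems
```

An empty list corresponds to the "silent on healthy configs" behaviour described above.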
TZERO78	5fd71a8927	fix(tailscale): correct kopia SFTP params + inode-aware Unraid mirror Two correctness fixes to the Tailscale wizard, both surfaced during a live rclone+GDrive → Tailscale-SFTP migration. * setup_interactive() now builds kopia_params as Kopia's SFTP CLI actually expects — separate --path / --host / --username / --keyfile / --known-hosts flags. The pre-v7.4 form shipped --path=HOST:PATH and omitted --username and --keyfile entirely; Kopia accepted the form at `repository connect` but every subsequent snapshot hung indefinitely. A new _ensure_known_hosts() helper runs ssh-keyscan up front so the very first systemd/cron-driven connect doesn't stall on a host-key prompt. * _mirror_key_to_persistent_path() classifies the remote's SSH layout via inode comparison (`stat -c '%d:%i' /root/.ssh /boot/config/ssh/root`) before writing. On Unraid 6.12+ those two paths share an inode (symlink/bind-mount) and the legacy mirror — `touch /boot/config/ssh/root` — actually treated the persistent directory as a file, which on real systems failed and would have clobbered /root, /known_hosts etc. if it ever succeeded. Cases handled: - unraid-modern-symlinked → no-op (already persistent) - unraid-modern-separate → write to /boot/config/ssh/root/authorized_keys - unraid-legacy → write to /boot/config/ssh/root (file) - standard-linux → no-op - unknown → log + skip Plan 0029 / Phase 1 + Phase 4. 13 new test cases cover both paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 12:31:09 +00:00
TZERO78	eddfd7693d	fix(backup): show rclone cold-start hint only for rclone backend ensure_repository() used to print "rclone cold-start can take 60-120 s on Google Drive" in the connect spinner for every backend, which is plainly misleading for Tailscale/SFTP runs that connect in well under a second. The hint is now gated on kopia_params starting with `rclone`. Plan 0029 / Phase 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 12:30:54 +00:00
TZERO78	7aa79c1fa3	release: v7.3.10 — backup: dry-run skip + cached preflight + spinner Live `--log-level debug` output on a real rclone+GDrive system showed that `kopi-docka backup --dry-run` paid 98 s for a cold `kopia repository status` round-trip just to render a simulation report, plus another 4 s on a `force_refresh=True` preflight that bypassed a cache populated 4 s earlier. And those 98-102 s of black terminal had no spinner — looked indistinguishable from a hang. Fixes: - Dry-run skips ensure_repository() entirely. A --dry-run is a pure simulation; consulting kopia repository status served no purpose beyond paying the rclone cold-start tax. Testlab: 98 s → 0.4 s. - Preflight in _run_backup() drops force_refresh=True. The cache TTL is 60 s and the call sequence is tight, so a cached read is exactly what we want — still verifies connectivity within this backup run, but doesn't pay double on slow backends. - Rich spinner around the is_connected() and connect() calls in ensure_repository() with an explicit "rclone cold-start can take 60-120 s on Google Drive" hint. The wait is physically unavoidable (kopia spawns rclone-serve, OAuth refresh, first GDrive API hit) but the user now sees that something is happening. The dry-run regression is pinned: test_backup_dry_run now sets mock_repo.is_connected.side_effect = AssertionError(...) so any future code that re-introduces the dry-run status call fails loudly. 1132 unit tests passing. Live verified on the testlab: dry-run in 0.4 s, real backup unchanged in shape (~3 min for one nginx unit on rclone), no force_refresh spam in --log-level debug output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 08:20:58 +00:00
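The regression pin mentioned above relies on a mock that fails loudly if it is ever called. A self-contained illustration of that technique, where `run_backup` is a simplified stand-in and not the project's real entry point:

```python
from unittest.mock import Mock


def run_backup(repo, dry_run: bool) -> str:
    """Simplified stand-in: a dry run must never touch the repository."""
    if dry_run:
        return "simulated"
    if not repo.is_connected():
        repo.connect()
    return "backed up"


mock_repo = Mock()
# Any call to is_connected() during a dry run raises and fails the test.
mock_repo.is_connected.side_effect = AssertionError("dry-run must not query the repository")

result = run_backup(mock_repo, dry_run=True)
```

If a later change reintroduces a status call on the dry-run path, the `AssertionError` surfaces immediately instead of showing up as a slow test.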
TZERO78	e0743db073	release: v7.3.9 — apply_global_defaults / update_global_retention: read-before-write Plan 0028 (v7.3.0) made the global Kopia policy "idempotent on every connect". True at the Kopia layer — `kopia policy set --global` with identical values is a no-op for Kopia itself — but NOT at the network layer. On rclone backends `policy set` is a full repository metadata round-trip every time. A live system reported 296 SECONDS for a single retention re-write where every value already matched. apply_global_defaults() (called on every connect()) and update_global_retention() (called by `advanced snapshot retention set`) now both: 1. Call `kopia policy show --global --json` first — that's a read, ~1-2 s. 2. Compare every retention/compression value against what we'd write. 3. Skip the multi-minute write entirely if everything already matches. Live measured on the rclone+GDrive testlab: Before (v7.3.8): retention set with already-matching values → 286 s After (v7.3.9): retention set with already-matching values → 4 s The long write still happens — and only happens — when retention or compression actually drift between kopi-docka.json and the Kopia repo. That's the one case where the round-trip is worth it. Tests: 1132 unit tests passing. TestUpdateGlobalRetention rewritten with read+write side_effect arrays; 5 new cases pin the contract (skip when matches, write when drifted, write when show fails, write-failure still returns False, single-value drift triggers write). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 07:55:08 +00:00
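The read-before-write contract can be sketched generically. Here `read_current` and `write` are injected callables standing in for the `kopia policy show` and `kopia policy set` round-trips; their names and signatures are assumptions.

```python
from typing import Callable, Mapping, Optional


def apply_if_drifted(
    desired: Mapping[str, int],
    read_current: Callable[[], Optional[Mapping[str, int]]],
    write: Callable[[Mapping[str, int]], None],
) -> bool:
    """Write only when the stored values differ; return True if a write happened."""
    current = read_current()  # cheap read first
    if current is not None and all(current.get(k) == v for k, v in desired.items()):
        return False  # everything matches: skip the expensive write
    # Either the read failed (None) or at least one value drifted.
    write(desired)
    return True
```

A failed read is treated as drift, which matches the "write when show fails" test case listed in the commit.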
TZERO78	ac00ad0090	fix: v7.3.8 — CRITICAL: don't silently create a second config when /etc one is unreadable A real data-loss footgun surfaced on a live system. Sequence: 1. User has a working /etc/kopi-docka.json (root-owned 0600, typical multi-user install with password_file pointing at another root-only file). 2. User runs e.g. `kopi-docka advanced snapshot retention set --latest 5` without sudo by accident. 3. _find_config_file() walks the default search order, hits /etc/kopi-docka.json, sees it exists, os.access(..., R_OK) returns False, logs a warning, and FALLS THROUGH. 4. The fall-through branch creates a brand-new ~/.config/kopi-docka/config.json from config_template.json — with the default password "kopia-docka" and the default repo path "filesystem --path /backup/kopia-repository". 5. From there on kopi-docka finds the user-scoped config first (search order is user → root) and never touches the /etc one again. Backups silently run against a nonexistent repo, the password is wrong, DR bundles export against the wrong config, and the user has two drifted configs without knowing. Fix: _find_config_file() collects unreadable existing paths across the search pass and raises PermissionError if any were found, instead of silently creating a second config. The error message lists the unreadable path(s) and — when one is under /etc/ — explicitly suggests running with sudo. The only way to get a second config is now to ask for it explicitly via --config. Two new tests in tests/unit/test_helpers/test_config.py pin the "raise instead of silent fallback" + the sudo-hint contracts. 1130 unit tests passing. Live reproduction verified: with a root:root 0600 file at /tmp and the default search paths monkeypatched at it, Config() now raises with the exact message from the docstring; no second file is created. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 07:33:25 +00:00
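A minimal sketch of the "raise instead of silent fallback" behaviour. The function signature and the injectable `is_readable` check are assumptions made to keep the example testable; the real `_find_config_file()` may differ.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_config_file(
    candidates: Iterable[Path],
    is_readable: Callable[[Path], bool] = lambda p: os.access(p, os.R_OK),
) -> Optional[Path]:
    """Return the first readable config; refuse to continue past unreadable ones."""
    unreadable = []
    for path in candidates:
        if not path.exists():
            continue
        if is_readable(path):
            return path
        unreadable.append(path)  # exists but cannot be read: remember it
    if unreadable:
        listed = ", ".join(str(p) for p in unreadable)
        hint = " Try running with sudo." if any(str(p).startswith("/etc/") for p in unreadable) else ""
        raise PermissionError(f"Config exists but is not readable: {listed}.{hint}")
    return None  # nothing found at all; the caller may create a fresh config
```

Only the "nothing exists anywhere" case returns `None`, so a second config is never created while an unreadable one is present.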
TZERO78	8f9ec3ec6d	release: v7.3.7 — retention set spinner + partial-update + safety prompt Two UX problems on a live rclone+GDrive system (where `kopia policy set --global` takes up to 5 minutes for one round-trip): 1. An empty `kopi-docka advanced snapshot retention set` would silently overwrite the existing config with Typer's hard-coded defaults (--latest 10 --daily 7 …), and burn a 30-90 s metadata round-trip doing it. Every flag is now Optional[int] defaulting to None; with no flags the command shows a yellow "Nothing to change" panel and exits. --force overrides for scripts that genuinely want the no-op re-write. 2. `retention set --daily 14` used to clobber the other five values with Typer defaults. Now an omitted flag means "keep what's in the config" — the effective values are template * user-flags. Plus: the long kopia round-trip now runs under a Rich spinner with an explicit "this typically takes 30-90 s on rclone backends" hint. On the user's GDrive setup the call took 286 s; without the spinner that looks identical to a hang. Tests: 1128 passing. Four new TestCmdRetentionSet cases (explicit args, kopia failure, partial args preserve, no-args without/with --force) plus three updated CLI-layer cases for the new Optional[int] signature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 07:27:52 +00:00
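The partial-update rule ("an omitted flag means keep what's in the config") comes down to a merge that ignores `None`. A sketch under that assumption, with an illustrative function name:

```python
from typing import Dict, Optional


def merge_retention(current: Dict[str, int], **flags: Optional[int]) -> Dict[str, int]:
    """Overlay only the flags the user actually passed onto the current values."""
    # Compare against None, not truthiness: 0 is a legitimate retention value.
    overrides = {key: value for key, value in flags.items() if value is not None}
    return {**current, **overrides}


current = {"latest": 10, "daily": 7, "weekly": 4}
updated = merge_retention(current, daily=14, latest=None, weekly=None)
nothing_to_change = merge_retention(current, daily=None) == current
```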
TZERO78	eb72ad4c7b	fix: v7.3.1 — connect() actually re-applies global defaults, retention show key Two bugs uncovered by the post-v7.3.0 E2E testlab run. 1) KopiaRepository.connect() short-circuit broke Plan 0028's "idempotent on every connect" promise. When is_connected() returned True (the normal case for any backup run after the first), apply_global_defaults was skipped. The config file's retention values and Kopia's actual global policy could drift indefinitely — a kopi-docka.json edit simply didn't reach the repo until a manual `kopia policy set`. Fix: also run apply_global_defaults() on the already-connected branch. Idempotent at the Kopia layer (`kopia policy set --global` with identical values is a no-op). 2) admin snapshot retention show reported "Kopia policy unavailable" on healthy repos. _display_retention looked up Kopia's global policy under the "retentionPolicy" key; kopia policy show --global --json puts it under the top-level "retention" key. Old key always returned None. Fix: read "retention" first, accept legacy "retentionPolicy" defensively. Verification (live testlab): - Before fix: config retention 3/0/7/4/6/1, Kopia repo 10/48/7/4/24/3. - kopi-docka backup --unit test-stack-nginx ran → apply_global_defaults actually fired this time → Kopia retention now matches config exactly. - admin snapshot retention show panel renders both rows with the same numbers; no more "Kopia policy unavailable" line. Tests: 1124 passing. New cases: - test_already_connected_still_reapplies_global_defaults (pins the connect() fix — replaces the old short-circuit assertion) - TestCmdRetentionShow.test_renders_kopia_values_from_retention_key - TestCmdRetentionShow.test_accepts_legacy_retentionPolicy_key_too - TestCmdRetentionShow.test_empty_policy_renders_unavailable Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 05:55:18 +00:00
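The key lookup fix in item 2 might be expressed like this. The helper name is hypothetical; the two key names are the ones given in the commit message.

```python
from typing import Any, Mapping, Optional


def extract_retention(policy: Mapping[str, Any]) -> Optional[Mapping[str, Any]]:
    """Read 'retention' first, then fall back to the legacy 'retentionPolicy' key."""
    for key in ("retention", "retentionPolicy"):
        value = policy.get(key)
        if value:
            return value
    return None  # caller renders "Kopia policy unavailable"
```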
TZERO78	de2b570e9d	docs+migration: align Plan 0028 surface — docs, templates, dead config fields Three loose ends after Plan 0028 finalize: 1. Docs - docs/CONFIGURATION.md: replace the old "Policy State Cache" / "smart-skip" section with a "Global Retention Policy (since v7.3.0)" explanation. Trim the historical v5.3.0 path-mismatch story to one paragraph (Plan 0028 makes the per-path-vs-virtual-path question moot anyway). - docs/ARCHITECTURE.md: update BackupManager step list, method table, and the backup sequence diagram to show _collect_backup_sources() + sequential repo.create_snapshots(), instead of the parallel ThreadPool + _ensure_policies path. - docs/diagrams/04_sequenceDiagram.mmd: regenerated to match. - docs/architecture_components.json: KopiaPolicyManager method list updated (list_policies / delete_* in, set_*_for_target out); KopiaRepository gains create_snapshots. 2. Migration - New _maybe_cleanup_legacy_state_files() in KopiaRepository removes the obsolete ~/.config/kopi-docka/policy_state.json (smart-skip cache from v7.2.0). Idempotent, runs on first _run() call after upgrade; OSError is downgraded to debug — housekeeping, not a hard requirement. - Live-verified on the rclone+GDrive testlab: created an empty policy_state.json, ran doctor → file removed with the expected INFO log line, doctor reports clean state. 3. Config templates / dead fields - kopi_docka/templates/config_template.json: drop parallel_workers and task_timeout (no longer consumed under Plan 0028). - kopi_docka/helpers/config.py: keep both Pydantic fields but mark them "Deprecated since v7.3.0 — ignored" in the description so existing kopi-docka.json files still validate. - BackupManager.__init__: remove the dead self.max_workers assignment that survived Phase 3. - DryRunReport: drop the "Parallel Workers" line from system info and config review sections. Tests: 1121 unit tests pass. New TestLegacyStateCleanup (3 tests) pins the file-removal contract. 
test_dry_run_manager.py updated to assert "Parallel Workers" is absent from the report. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 05:22:43 +00:00
TZERO78	d45197a981	fix(policy-prune): become a true legacy cleanup (Plan 0028 UX gap) Before this patch, `advanced policy prune` only deleted orphaned per-path policies (entries whose path no longer matched any snapshot). With Plan 0028 making per-path policies obsolete, that left an awkward gap: doctor flagged every leftover per-path policy as "Legacy" and pointed users at `policy prune` — but prune refused to touch a policy as long as a snapshot still lived at that path. cmd_prune now removes every per-path entry on this host/user under a kopi-docka-managed prefix (/var/lib/docker/volumes/, /var/cache/kopi-docka/staging/), regardless of snapshot state. The Plan 0024 cross-host safety guards (host == socket.gethostname(), user == getpass.getuser(), known prefix) are kept — a foreign host's policies on a shared repo or a user's custom path are never touched and now surface in a separate "Skipped (safety)" table. Tests rewritten (TestPolicyPruneOrphanDetection → TestPolicyPruneLegacyCleanup, 6 tests): matching-snapshot path is now pruned; foreign host / unknown prefix stays. End-to-end verified on the rclone+GDrive testlab — one leftover legacy policy detected, pruned in one batch, doctor reports "Global-only — clean state". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 05:13:27 +00:00
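The safety guards can be sketched as a single predicate. The two prefixes are taken from the commit; the target dictionary keys (`host`, `userName`, `path`) are an assumption about the shape of Kopia's policy-list output, and the function name is illustrative.

```python
import getpass
import socket
from typing import Mapping, Optional

MANAGED_PREFIXES = ("/var/lib/docker/volumes/", "/var/cache/kopi-docka/staging/")


def is_prunable(
    target: Mapping[str, str],
    host: Optional[str] = None,
    user: Optional[str] = None,
) -> bool:
    """True only for per-path policies on this host/user under a managed prefix."""
    host = host or socket.gethostname()
    user = user or getpass.getuser()
    return (
        target.get("host") == host
        and target.get("userName") == user
        and target.get("path", "").startswith(MANAGED_PREFIXES)
    )
```

Anything that fails the predicate would land in the "Skipped (safety)" table rather than being deleted.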
TZERO78	d48c8a041f	refactor(backup): phase 3 - sequential create_snapshots() entry point, ready for upstream multi-path Plan 0028 Phase 3. The backup hot path now goes through a single KopiaRepository.create_snapshots(sources) call. - KopiaRepository.create_snapshots(sources) iterates a list of BackupSource sequentially, returns one snapshot ID per source (empty string for per-source failures so callers can map them back to a kind/volume). Docstring references kopia/kopia#1725: when upstream ships native multi-path snapshot create, swap the body and all callers stay unchanged. - backup_unit() restructured: discovery via _collect_backup_sources() runs BEFORE containers are stopped (docker inspect needs them alive; failing early also avoids unnecessary stops), then the snapshot loop runs after stop in one create_snapshots() call. Volume metadata (volumes_backed_up, networks_backed_up, docker_config_backed_up, kopia_snapshot_ids) is reconstructed from the returned ID list zipped with source tags. - ThreadPoolExecutor and self.max_workers gone; sequential by user preference (safety/logs > performance on VPS-class hardware). Kopia's --max-parallel-snapshots is the right knob if parallelism is needed later. - TAR-mode volumes keep their legacy per-volume path via volume_handler.backup_volume because they pipe through stdin and can't be expressed as a BackupSource; the fallback runs sequentially too. Tests: 1115 unit tests pass, coverage 52.06 %. New test_create_snapshots.py pins the empty-input / partial-failure / sequential-order contract. test_workflow.py rewritten end-to-end against the new flow. test_backup_manager.py TestParallelBackup collapsed into TestSequentialSnapshotLoop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 04:51:23 +00:00
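The sequential contract (one ID per source, empty string on failure) can be illustrated as follows. The `BackupSource` fields follow the tuple named in the Phase 2 commit; the injected `create_one` callable is an assumption standing in for the actual Kopia call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class BackupSource:
    path: str
    kind: str
    tags: Dict[str, str] = field(default_factory=dict)
    description: str = ""


def create_snapshots(
    sources: Sequence[BackupSource],
    create_one: Callable[[BackupSource], str],
) -> List[str]:
    """Snapshot sources one at a time; return one ID per source, '' on failure."""
    ids: List[str] = []
    for source in sources:
        try:
            ids.append(create_one(source))
        except Exception:
            ids.append("")  # keep positions aligned so callers can zip with sources
    return ids
```

Because the result list always has the same length and order as the input, callers can `zip(sources, ids)` to rebuild per-kind metadata.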
TZERO78	0a4f36af90	refactor(backup): phase 2 - decouple source discovery from kopia execution Plan 0028 Phase 2. Pure structural refactor — no behaviour change. - New BackupSource dataclass in types.py captures the (path, kind, tags, description) tuple kopia snapshot create needs, ahead of the snapshot loop itself. - BackupManager grows four _collect__sources() helpers: _collect_recipe_sources, _collect_network_sources, _collect_docker_config_sources, _collect_volume_sources. Each takes the side-effects that used to live inline in _backup_ (staging dir prep, docker inspect, secret redaction, network export) and returns the BackupSource it staged. - The legacy _backup_recipes / _backup_networks / _backup_docker_config wrappers now delegate to the collector and only own the repo.create_snapshot() call. Their return shape is unchanged so backup_unit() doesn't move yet. - Aggregate _collect_backup_sources() returns the full ordered source list backup_unit() would snapshot (recipes → networks → docker_config → volumes), gated on backup_scope exactly like the live code path. This is the entry point Phase 3 will plug into repo.create_snapshots(). Tests: 1112 unit tests pass. New test_backup_source_collection.py adds 12 focused tests for path / tag construction in isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 04:25:50 +00:00
TZERO78	c35f2da1f2	refactor(policy): phase 1 - global-only retention, drop per-path policy apparatus Plan 0028 Phase 1. The backup hot path now never writes per-path Kopia policies; the global policy at KopiaRepository.connect() / initialize() covers every snapshot via Kopia's policy inheritance tree. - KopiaRepository.connect() now calls apply_global_defaults() on every successful connect (idempotent — Kopia treats identical --global writes as a no-op). Retention changes in kopi-docka.json reach Kopia on the next run without a manual step. - BackupManager loses _ensure_policies, _apply_target_policy, auto_prune_orphaned_policies, the policy_state attribute, and all PolicyStateManager imports. - KopiaPolicyManager loses set_retention_for_target and set_compression_for_target — list_policies / delete_policy / delete_policies_batch stay for `advanced policy prune` legacy cleanup. - helpers/policy_state.py module deleted; smart-skip apparatus gone. - doctor `_check_policy_alignment` reports any remaining per-path policies as "Legacy" with a hint to run `kopi-docka advanced policy prune`. - systemd templates' comments updated; constants.py retention-policy block trimmed to a one-liner referencing Plan 0028. Tests: 1100 unit tests pass, coverage 51.91 % (>= 40 % gate). New TestConnect block in test_repository_manager.py pins the connect()→apply_global_defaults() contract. test_backup_manager_policies.py inverted: now asserts the per-path setters are GONE. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 04:19:37 +00:00
TZERO78	466b5b7728	test: unit tests for Plan 0026 phases + adjust existing tests New tests (41 total): - tests/unit/test_helpers/test_policy_state.py (15) — PolicyStateManager round-trip, multi-profile isolation, corruption recovery, deterministic hashing (sort-keys regardless of dict iteration order) - tests/unit/test_cores/test_backup_manager_policies.py (15) — _ensure_policies staging removal, smart-skip first-vs-second-run, hash-not-recorded-on-failure, retention-change-triggers-reapply, auto_prune_orphaned_policies safety (foreign host/user/prefix never touched, snapshot paths preserved, flat snap['path'] key, pruned targets removed from smart-skip state) - tests/unit/test_cores/test_repository_manager.py (11 added) — _get_rclone_args is rclone-only and honors config override, _maybe_patch_repo_config_for_rclone is one-shot per process, no-op for non-rclone backends, no-op when already at desired value, doesn't raise on corrupt config files Adjusted existing tests: - test_backup_manager.py TestEnsurePolicies → TestEnsurePoliciesVolumes: rewrote 5 stale tests that asserted on the removed static_targets policies; deleted 2 tests that specifically verified the now-removed staging path matching behavior. Coverage of the live behavior is in test_backup_manager_policies.py. - test_backup_manager.py + test_error_handling.py: helpers now initialize manager.policy_state = Mock() so BackupManager's __new__ setup pattern still works. - test_repository_manager.py make_repository(): set _rclone_timeout_patched=True so _run()'s migration hook is a no-op in unrelated tests; migration-specific tests use their own setup. KopiaRepository._maybe_patch_repo_config_for_rclone now uses getattr() for the flag — defensive against tests that bypass __init__. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 18:54:27 +00:00
TZERO78	4c556d46a6	release: v7.1.5 — fix policy/doctor snapshot-path extraction KopiaRepository.list_snapshots() returns flat dicts (snap["path"]), not nested under "source". Both `advanced policy prune` and doctor's `_check_policy_alignment` were reading snap.get("source", {}).get("path", ""), producing an always-empty snapshot-paths set: doctor over-reported orphans on healthy repos, and policy prune would have deleted every per-path policy. Fix is a 1-liner in each call site plus a regression test that pins the correct flat-dict shape. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 14:03:53 +00:00
TZERO78	1d044b2220	feat: Alerting Overhaul — pre-flight check, verbose failures, missed-backup detection (Plan 0025) - KopiaCommandError: replaces bare RuntimeError in _run(), carries returncode + stderr_tail (UTF-8 safe) - is_connected(force_refresh=True): bypasses 60s cache for pre-flight check; negative result also cached - BackendUnreachableError: new exception in backends/base.py (subclass of ConnectionError) - Pre-flight backend check in backup_unit() before container teardown; containers NOT stopped on failure - BackupErrorDetail dataclass in types.py; BackupMetadata.error_details persisted in JSON (backward-compat) - Verbose failure notifications: phase, exit code, stderr tail in fenced code block (Markdown-injection-safe) - Markdown body_format for all Apprise sends; services without Markdown degrade gracefully - send_connectivity_alert() and send_missed_backup_alert() added to NotificationManager - MissedBackupChecker: time-based overdue detection using MetadataReader; per-unit threshold overrides - Post-run missed-backup check after every backup_unit(); alert suppression via missed_state.json - Doctor section 8 "Backup Freshness": age per unit, OVERDUE units highlighted - Config: notifications.verbose, notifications.preflight_check, alerting.missed_backup.* - Config templates and examples updated; migration guide in INSTALLATION.md Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-23 11:51:15 +00:00
Markus	4a30e75b5f	fix: Security hardening & documentation overhaul (Plan 0023) Squash merge of Plan 0023 — security hardening, robustness improvements, documentation overhaul. Phases: - 1A–1C: S1–S10 security fixes (shell injection, SUDO_USER, hook validation, fchmod race, sensitive stderr filtering, mkdtemp, KOPIA_PASSWORD) - 2A–2C: R1–R8 robustness (subprocess leak, JSON errors, bounds checks, SIGTERM grace) - 3A–3D: D1–D13 documentation overhaul Post-review fix: strip line-continuation formatting before shlex.split (Codex P1).	2026-04-11 12:51:47 +02:00
TZERO78	a746fa1907	fix: retention policies for recipes/networks/docker-config never applied Policy targets used relative paths (e.g. recipes/myunit) but snapshots were created with absolute staging paths (/var/cache/kopi-docka/staging/ recipes/myunit). Kopia never matched these — retention was silently broken for all non-volume snapshots. Docker-config had no policy at all. Also adds a doctor check (Section 7) that detects policy-snapshot path mismatches at runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 20:02:26 +00:00
TZERO78	6badef8ed4	refactor: extract BackupVolumeHandler from BackupManager Move volume backup logic (_backup_volume, _backup_volume_direct, _backup_volume_tar) into dedicated BackupVolumeHandler class. BackupManager delegates to self.volume_handler. Public API unchanged. Step 1 of 7 in handler extraction refactoring. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-22 17:08:37 +00:00
TZERO78	ee762f0583	feat: add backup history command (Plan 0021) New `kopi-docka history` command to browse past backups from stored metadata JSONs. Includes table view, detail panels, filters (--unit, --failed, --last, --since), statistics (--stats), and JSON output (--json). No root privileges required. Adds MetadataReader helper and BackupMetadata.from_dict() for reusable metadata deserialization. 43 new tests, 95%/99% coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-22 17:10:35 +01:00
TZERO78	064c7fa515	fix: bypass cleanup — route all Kopia calls through KopiaRepository._run() (v6.2.3) Plan 0020: Centralize all subprocess→kopia calls through KopiaRepository._run() so Kopia CLI changes only need fixing in one file. - Extended _run() with extra_env and config_file parameters - DR-Manager: replaced 3× direct subprocess→kopia with self.repo.status() - DR-Manager: replaced subprocess.run(["hostname"]) with socket.gethostname() - KopiaRepository: routed set_repo_password(), verify_password(), create_filesystem_repo_at_path() through _run() - Documented repo_helper.detect_existing_cloud_repo() as intentional exception - Updated tests to mock repo.status() instead of subprocess.run() - 805 tests passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-22 14:40:40 +01:00
TZERO78	4ee206edc0	feat: disaster recovery single-file encrypted ZIP export (#58) - New 'disaster-recovery export' subcommand with AES-256 encrypted ZIP - Single file replaces legacy 3-file bundle (tar.gz.enc + PASSWORD + README) - SSH stream mode (--stream) for zero-disk-footprint exports - Automatic passphrase generation (word-based or random) with confirmation - No external dependencies (pyzipper replaces tar/openssl) - Automatic file ownership (SUDO_USER) - Cross-platform extraction (7-Zip, WinZip, unzip) - Legacy command shows deprecation warning - 27 new unit tests for full coverage - Version bump to 6.2.0 Closes #58	2026-02-07 11:54:34 +01:00
Markus	0173e3925f	feat: automatic docker run reconstruction for standalone containers (#59) (#65) - Add DockerRunBuilder helper to parse inspect.json and reconstruct docker run commands - Integrate automatic container reconstruction in restore workflow - Interactive prompts to start containers immediately after restore - Support for all common Docker parameters (ports, volumes, env, networks, capabilities, etc.) - Filter Docker-injected environment variables - Check for existing containers before attempting start - Add 30+ comprehensive unit tests - Update documentation (CHANGELOG, USAGE, FEATURES) - Bump version to 6.1.0 Closes #59	2026-02-07 11:29:40 +01:00
TZERO78	e586c72583	merge: resolve conflict with main branch	2026-01-31 07:13:50 +01:00
TZERO78	1aa8f315d5	fix: prevent parallel backup execution + setup wizard Kopia install (#61) - Add ProcessLock helper using fcntl.flock() for global backup lock - Lock file at /run/kopi-docka.lock with /tmp fallback - Graceful skip with 'Backup already running (PID: X)' message - Auto-release on process termination (kernel-managed) - Fix ImportError in setup wizard for Kopia installation - Remove dead cmd_install_deps import - Add interactive 3-option menu for Kopia installation - Support official Kopia installer script - Add 16 unit tests for ProcessLock - Bump version to 6.0.2 Closes #61	2026-01-31 07:10:49 +01:00
Markus	38b70045e9	Merge pull request #60 from TZERO78/dev Release v6.0.1 - Bugfix Release	2026-01-05 09:28:22 +01:00
TZERO78	e10cec6f6a	fix(system): handle non-existent paths in disk space check (#57) - _disk_probe_base() now walks up directory tree to nearest existing parent - Prevents [Errno 2] crash on fresh installations - Handles edge cases: nested paths, remote URLs, filesystem root - Added 17 comprehensive unit tests	2026-01-05 09:21:12 +01:00
TZERO78	79309368e0	fix: version test handles config creation message in CI - test_version_command_format now searches for 'Kopi-Docka' line anywhere in output - Config auto-creation message may appear before version output in fresh environments - Added v6.0.0 to expected version list	2025-12-31 11:09:48 +01:00
TZERO78	d0912bc007	fix: ruff and test mock corrections for CI - Fix F821 undefined 'deps' -> 'dep_manager' in doctor_commands.py - Add TYPE_CHECKING import for MachineInfo in repository_manager.py - Auto-fix 67 f-strings without placeholders using ruff --fix - Fix F841 unused variables (lang, display_error, e) - Fix E722 bare except -> except OSError - Fix E731 lambda -> def function - Update test mocks: subprocess.run -> run_command - Fix _stop_containers/_start_containers signatures (add service_handler) - All 680 unit tests passing - ruff F401 and vulture checks passing	2025-12-31 11:03:09 +01:00
TZERO78	ccc28ad84c	fix: update test mocks for changed method signatures - test_workflow.py: Update track_start/track_stop functions to accept service_handler parameter (matches new _start_containers/_stop_containers signatures in backup_manager.py) - test_dependency_commands.py: Update test_check_no_config assertion to accept either 'No configuration found' or 'repository not connected' messages since behavior depends on context state - test_disaster_recovery_manager.py: Mock run_command instead of subprocess.run for _create_encrypted_archive tests (code now uses run_command from ui_utils for process tracking)	2025-12-31 10:35:25 +01:00
TZERO78	4d32f2f56e	fix: resolve CI failures for vulture and test mocks - Fix vulture 'unused frame variable' in safe_exit_manager.py:100 by adding 'del frame' statement (signal handler signature requires the parameter but it's unused) - Fix test_backup_commands.py tests by adding DependencyManager mock to prevent tests from failing due to missing docker/kopia in CI environment - tests were hitting check_hard_gate() which checks for real dependencies	2025-12-31 10:30:59 +01:00
TZERO78	6133001015	fix(ci): resolve failing checks - Remove unused imports (ruff F401): - config_commands.py: getpass, detect_existing_cloud_repo - repository_commands.py: get_backend_type, is_cloud_backend - Fix Mermaid parse error in 03_graph_LR.mmd: - Replace parentheses with bracket notation for nodes - Fix test_hooks_manager.py to mock run_command: - Tests were mocking subprocess.run but HooksManager uses run_command - Updated 5 tests to use correct mock target	2025-12-31 10:21:59 +01:00
TZERO78	5ec0d72982	fix: show kopia params in change-password panel	2025-12-31 00:01:58 +01:00
TZERO78	73c5c8e991	test(safety): add integration tests and manual testing guide Add 9 integration tests for real-world abort scenarios plus comprehensive manual testing guide for production validation. Integration Tests (9 tests): 1. TestBackupAbort (2 tests) - Container restart verification (real Docker container) - LIFO restart order validation 2. TestRestoreAbort (2 tests) - Containers stay stopped (data safety) - Temp directory cleanup verification 3. TestDisasterRecoveryAbort (2 tests) - Temp directory removal - Incomplete archive cleanup 4. TestProcessLayerTermination (2 tests) - Real subprocess SIGTERM termination - SIGKILL after SIGTERM timeout (process ignores SIGTERM) 5. TestSignalHandlerEndToEnd (1 test) - Full backup abort scenario with container restart Test Requirements: - Docker daemon running - Root access (sudo) - Tests skip gracefully without privileges Manual Testing Guide (8 procedures): 1. Backup Ctrl+C → container auto-restart 2. systemctl stop → SIGTERM handling (exit 143) 3. Restore Ctrl+C → containers stay stopped 4. DR Ctrl+C → temp cleanup 5. Zombie process prevention 6. Hook process termination 7. SIGKILL limitation (documented) 8. Double SIGINT → force exit Guide Includes: - Step-by-step test procedures - Expected results and verification steps - Troubleshooting guide - Integration test execution instructions - Success criteria checklist Files: - tests/integration/test_safe_exit_abort_scenarios.py (570 lines) - tests/MANUAL_TESTING_SAFE_EXIT.md (450 lines) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-30 20:28:55 +01:00
TZERO78	a75d597b35	test(safety): add comprehensive unit tests for SafeExitManager Add 47 unit tests covering all SafeExitManager components with 100% pass rate. Test Coverage: 1. SafeExitManager Core (5 tests) - Singleton pattern enforcement - Signal handler installation (SIGINT/SIGTERM) - Initial state validation - Instance reset for testing 2. Process Layer (8 tests) - Process register/unregister operations - SIGTERM → 5s wait → SIGKILL termination sequence - Process registry cleanup - Thread-safe concurrent register/unregister (200 processes, 4 threads) 3. Strategy Layer (6 tests) - Handler registration with priority sorting - Handler unregistration - Priority-based execution order - Error-tolerant handler execution (continues on failure) 4. ServiceContinuityHandler (7 tests) - Container tracking (register/unregister) - LIFO restart order verification - Error tolerance on container restart failure - Empty container list handling - Priority 10 verification 5. DataSafetyHandler (8 tests) - Temp directory tracking and cleanup - Stopped container tracking - Nonexistent directory handling - Error tolerance on cleanup failure - Priority 20 verification 6. CleanupHandler (7 tests) - Callback registration and execution - Custom names and priorities - Error tolerance on callback failure - Main callback execution - Priority 50 verification 7. Signal Handler Integration (4 tests) - SIGINT handling (exit code 130) - SIGTERM handling (exit code 143) - Nested cleanup prevention - Cleanup flag management 8. Thread Safety (2 tests) - Concurrent handler registration (200 handlers, 4 threads) - Concurrent process operations (100 processes, 4 threads) Test Results: ✅ 47/47 passing (100%) File: tests/unit/test_cores/test_safe_exit_manager.py (650 lines) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-30 20:28:38 +01:00
TZERO78	6df436fad4	chore(cleanup): dead code cleanup and quality improvements Major code cleanup initiative removing legacy code and improving quality: Removed (727 lines): - Delete kopi_docka/helpers/dependency_installer.py (493 lines) - Removed legacy DependencyInstaller class - Removed InstallStatus and InstallResult classes - Delete kopi_docka/helpers/os_detect.py (234 lines) - Removed orphaned OS detection module Code Quality: - Remove 6 vulture findings (unused imports and parameters) - Auto-fix 33 unused imports with ruff - Remove broken install-deps and show-deps commands Testing & Coverage: - Increase coverage threshold from 25% to 40% - Current coverage: 43.86% - All 646 tests passing Documentation: - Update docs/architecture_components.json - Remove DependencyInstaller entries - Remove auto_install and install_missing from DependencyManager CI/CD: - Add GitHub Actions workflow for tests and code quality - Add vulture dead code detection (min-confidence 80) - Add ruff unused import checks - Test on Python 3.10, 3.11, 3.12 Dependencies: - Add vulture>=2.0.0 to dev dependencies - Add ruff>=0.1.0 to dev dependencies This cleanup removes obsolete code from Plan 0007 (Hard/Soft Gate), enforces higher test coverage, and prevents future code quality issues through automated CI checks. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-29 22:01:10 +01:00
TZERO78	22614a26b3	feat(restore): add backup scope detection and warnings - Add docker_config_snapshots field to RestorePoint type - Update _find_restore_points() to recognize docker_config snapshots - Add _get_backup_scope() method to read scope from snapshot tags - Add _show_scope_warnings() method to display scope-specific warnings - Display warning panel for MINIMAL scope restores - Display info message when docker_config snapshots are present - Integrate scope checking into _restore_unit() workflow - Add 10 comprehensive unit tests for scope detection - All restore_manager tests passing Enables users to see warnings about backup limitations when restoring. Minimal scope backups will show clear warnings about missing recipes. Docker config backups show instructions for manual restore. Related to plan_0008: Backup Scope FULL implementation - Task 4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-28 20:57:35 +01:00
TZERO78	68853699e6	feat(backup): implement docker_config backup for FULL scope - Add _backup_docker_config() method to BackupManager - Backs up /etc/docker/daemon.json when present - Backs up /etc/systemd/system/docker.service.d/ when present - Add docker_config_backed_up field to BackupMetadata - Integrate docker_config backup in backup_unit() for FULL scope only - Add 7 comprehensive unit tests for docker_config functionality - Handles permission errors gracefully (non-fatal) - All 88 backup_manager tests passing Docker config backup is opt-in via --scope full flag. Errors are logged but don't fail the backup. This enables full disaster recovery including Docker daemon configuration. Related to plan_0008: Backup Scope FULL implementation - Task 2 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-28 20:43:17 +01:00
TZERO78	d4c8b84e0a	feat(backup): add backup_scope tag to all snapshots - Add backup_scope parameter to _backup_volume(), _backup_recipes(), _backup_networks() - Include "backup_scope" tag in all snapshot metadata - Update backup_unit() to pass scope to all backup methods - Add comprehensive unit tests for backup_scope tag verification - Update existing tests to support new backup_scope parameter This enables simple scope tracking for restore validation and sets the foundation for implementing docker_config backup. Related to plan_0008: Backup Scope FULL implementation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-28 20:26:51 +01:00
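
The global backup lock described in commit 1aa8f315d5 (fcntl.flock(), a skip message naming the holder PID, release handled by the kernel when the process ends) could look roughly like the sketch below. This is not the project's code: the class body, the method names `acquire`, `holder_pid` and `release`, and the temporary lock path used in `demo()` are assumptions made for illustration. Only the use of fcntl.flock() and the wording of the skip message come from the commit message. The sketch is Unix-only because it depends on the `fcntl` module.

```python
import fcntl
import os
import tempfile


class ProcessLock:
    """Hypothetical sketch of a global lock built on fcntl.flock()."""

    def __init__(self, path):
        self.path = path
        self._fd = None

    def acquire(self):
        """Try to take the lock without blocking; return True on success."""
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # LOCK_NB makes flock() raise instead of waiting for the holder.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            return False
        # Record our PID so a second invocation can report who holds the lock.
        os.ftruncate(fd, 0)
        os.write(fd, str(os.getpid()).encode())
        self._fd = fd
        return True

    def holder_pid(self):
        """Return the PID written by the current holder, or None if unreadable."""
        try:
            with open(self.path) as fh:
                return int(fh.read().strip())
        except (OSError, ValueError):
            return None

    def release(self):
        """Drop the lock explicitly; the kernel also drops it on process exit."""
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None


def demo():
    # A throwaway path stands in for the real lock file location.
    lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    first, second = ProcessLock(lock_path), ProcessLock(lock_path)
    first.acquire()
    if not second.acquire():
        print(f"Backup already running (PID: {second.holder_pid()})")
    first.release()
```

Because flock() locks belong to the open file description, the second `ProcessLock` in `demo()` is refused even though both objects live in the same process, which makes the behaviour easy to exercise in a unit test without spawning a child.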

72 Commits