793 Commits

Author SHA1 Message Date
chevron7 56881120b7 fix(stream): don't kill paused streams -- reap dead peers via TCP keepalive instead
1.9.0 added server.timeout = 300s to reap dead mobile connections (B3). But
Node's socket timeout fires on INACTIVITY, and a paused audio stream is
inactive (no bytes flow while backpressured) -- so a pause longer than the
timeout had the server destroy the stream's connection, forcing a reconnect
on resume. On both web and iOS that surfaced as 'I pause, then have to focus
the app for it to play again' after a multi-minute pause; pre-1.9.0 had no
such timeout, so paused streams survived (the exact D1 risk the spec flagged).

Reap genuinely dead/half-open peers (mobile network gone without FIN/RST) via
TCP keepalive instead: server.timeout = 0, and each connection gets
setKeepAlive(true, 30s) so the OS drops a socket once probes fail while a
paused-but-alive stream keeps answering and stays connected.
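
A minimal sketch of the change described above, assuming the server and sockets expose Node's `timeout` property and `setKeepAlive(enable, initialDelay)` method. `SocketLike`, `ServerLike`, and `reapDeadPeersViaKeepAlive` are hypothetical names for illustration, not the project's code.

```typescript
// Structural stand-ins for Node's net.Socket and http.Server (assumption:
// only the members used here matter), so the sketch needs no imports.
interface SocketLike {
  setKeepAlive(enable: boolean, initialDelayMs: number): unknown;
}
interface ServerLike {
  timeout: number;
  on(event: "connection", listener: (socket: SocketLike) => void): unknown;
}

const KEEPALIVE_INITIAL_DELAY_MS = 30_000;

function reapDeadPeersViaKeepAlive(server: ServerLike): void {
  // 0 disables the inactivity timeout, so a paused (silent) stream survives.
  server.timeout = 0;
  // Ask the OS to start keepalive probes after 30s of idle time per socket.
  server.on("connection", (socket) => {
    socket.setKeepAlive(true, KEEPALIVE_INITIAL_DELAY_MS);
  });
}
```

How quickly a dead peer is dropped after probing starts depends on the operating system's keepalive settings, which this sketch does not touch.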
v1.9.0
2026-06-17 12:36:34 -05:00
chevron7 690778fd5f fix(discover): upsert DiscoveryAlbum so re-processing a week doesn't lose records
Production showed 24 unique-constraint violations on
DiscoveryAlbum(userId, weekStartDate, rgMbid) in 18h: the scan-completion and
reconciliation paths can both create Discovery records for the same album in
the same week, so the second create threw, rolled back the transaction, and
dropped that album's DiscoveryTrack records. Upsert makes it idempotent --
an existing record is left untouched and the track loop fills any gaps.
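
The create-versus-upsert difference can be sketched with an in-memory table keyed on (userId, weekStartDate, rgMbid). `DiscoveryAlbumTable` and its methods are hypothetical stand-ins for the real database model, used only to show the idempotency.

```typescript
interface DiscoveryAlbumKey {
  userId: string;
  weekStartDate: string;
  rgMbid: string;
}
interface DiscoveryAlbumRow extends DiscoveryAlbumKey {
  id: number;
}

class DiscoveryAlbumTable {
  private readonly rows = new Map<string, DiscoveryAlbumRow>();
  private nextId = 1;

  private keyOf(key: DiscoveryAlbumKey): string {
    return JSON.stringify([key.userId, key.weekStartDate, key.rgMbid]);
  }

  // Plain create: a second call with the same key throws, like the
  // unique-constraint violation seen in production.
  create(key: DiscoveryAlbumKey): DiscoveryAlbumRow {
    const k = this.keyOf(key);
    if (this.rows.has(k)) throw new Error("unique constraint violation");
    const row: DiscoveryAlbumRow = { ...key, id: this.nextId++ };
    this.rows.set(k, row);
    return row;
  }

  // Upsert with an empty update: an existing row is returned untouched.
  upsert(key: DiscoveryAlbumKey): DiscoveryAlbumRow {
    return this.rows.get(this.keyOf(key)) ?? this.create(key);
  }
}
```

With the upsert form, whichever of the two paths runs second simply receives the existing row and can continue filling in tracks.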
2026-06-16 13:50:52 -05:00
chevron7 baa7ecd0bc chore(release): v1.9.0
Audio engine rewrite, audiobook session model, podcast auto-refresh recovery,
functional settings, and the stream/QoL hardening from this cycle. Full notes
in CHANGELOG.md.
2026-06-15 15:03:13 -05:00
chevron7 3bf7563ffa fix: review remediations -- scan overlap guard, Subsonic star best-effort, podcast upsert
- Library auto-sync cron skips enqueuing when a scan is already active/waiting,
  so it can't stack a redundant full rescan behind a manual or webhook scan.
- Subsonic star.view is now best-effort: it attempts every id, skips missing
  tracks (P2003), logs genuine failures, and never early-returns mid-loop
  (which left some tracks starred while reporting failure). It reports an error
  only when a real failure occurred and nothing got starred.
- refreshPodcastFeed upserts episodes on (podcastId, guid) instead of
  find-then-create, closing a TOCTOU race between the manual refresh route and
  the auto-refresh job that could throw on the unique constraint.
- Onboarding: rename the shadowing 'user' var in the recovery path for clarity.
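
The best-effort star.view rule from the list above can be sketched as a loop that never early-returns. This is a synchronous illustration with hypothetical names (`starBestEffort`, `starOne`); the real handler presumably awaits database writes, and only the P2003 code comes from the commit message.

```typescript
interface StarOutcome {
  starred: string[];
  skipped: string[];
  failed: string[];
  reportError: boolean;
}

// Treat an error carrying code "P2003" as "track legitimately missing".
function isMissingTrack(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { code?: unknown }).code === "P2003";
}

function starBestEffort(
  ids: readonly string[],
  starOne: (id: string) => void,
  log: (message: string) => void = () => {},
): StarOutcome {
  const starred: string[] = [];
  const skipped: string[] = [];
  const failed: string[] = [];
  for (const id of ids) {
    try {
      starOne(id);
      starred.push(id);
    } catch (err) {
      if (isMissingTrack(err)) {
        skipped.push(id); // missing track: skip quietly
      } else {
        failed.push(id); // genuine failure: log it and keep going
        log(`star failed for ${id}`);
      }
    }
  }
  // Report an error only when something really failed and nothing was starred.
  return { starred, skipped, failed, reportError: failed.length > 0 && starred.length === 0 };
}
```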
2026-06-15 14:04:42 -05:00
chevron7 8d62f30151 fix(settings): Clear Caches uses an operational denylist, not a cache allowlist (review)
Review found the allowlist (9 prefixes) missed ~15 real cache namespaces
(homepage:, mixes:, search:, discovery:, colors:, preview:, fanart:, songlink:,
genres:, radio:, album:, artist:, playlists:, ...), so 'Clear Caches' was a
partial clear that would leave most caches stale. Inverted to a denylist that
spares only the operational namespaces (bull:, sess:, audio:, clap:,
enrichment, lock:, sse:) and clears everything else -- complete and drift-proof
as new caches are added. Verified read-only against production: clears 5210
cache keys, protects all 190 operational keys (queues, control plane).
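
The denylist inversion can be sketched as a key filter. The prefixes are the ones listed above, copied as written; treating each as a plain `startsWith` prefix and the names `isOperationalKey` and `selectKeysToClear` are assumptions for illustration.

```typescript
// Operational namespaces that must survive a cache clear.
const OPERATIONAL_PREFIXES: readonly string[] = [
  "bull:", "sess:", "audio:", "clap:", "enrichment", "lock:", "sse:",
];

function isOperationalKey(key: string): boolean {
  return OPERATIONAL_PREFIXES.some((prefix) => key.startsWith(prefix));
}

// Everything that is not operational counts as cache, so a new cache
// namespace is cleared without anyone having to register it.
function selectKeysToClear(keys: readonly string[]): string[] {
  return keys.filter((key) => !isOperationalKey(key));
}
```

The trade-off is the reverse of the allowlist: a new operational namespace has to be added to the denylist before it is safe from Clear Caches.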
2026-06-15 14:04:42 -05:00
chevron7 abfe4c82ba fix(enrichment): skip held jobId slots so the failed-job backoff actually retries (review)
Review found the 15-min grace was effectively inert: the phase parked an
entity as 'enriching'/'_queued' even when the add() no-op'd against a failed
jobId still held within the grace window -- removing it from selection until a
process restart, so the advertised auto-retry never happened. Each phase now
checks queue.getJob(jobId) and only enqueues + parks when the slot is actually
free; a held slot is skipped, leaving the entity selectable so it backs off
and genuinely retries once the grace clean frees the slot. Adds a test
asserting a held slot is skipped (no re-add, no park).
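
A rough sketch of the check-before-park ordering, using a synchronous stand-in for the queue. `SlotQueue` and `enqueueAndParkIfFree` are hypothetical; BullMQ's own `getJob` returns a promise, which the sketch deliberately simplifies away.

```typescript
// Synchronous stand-in for the queue (assumption for illustration only).
interface SlotQueue {
  getJob(jobId: string): object | undefined;
  add(jobId: string): void;
}

// Returns true when the job was enqueued and the entity parked, false when
// the jobId slot is still held and the entity should stay selectable.
function enqueueAndParkIfFree(queue: SlotQueue, jobId: string, park: () => void): boolean {
  if (queue.getJob(jobId) !== undefined) {
    return false; // held slot: no re-add, no park
  }
  queue.add(jobId);
  park();
  return true;
}
```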
2026-06-15 14:04:42 -05:00
chevron7 8f1239709c fix(ux): resolve Soulseek search spinner wedge and onboarding 'already taken' dead-end
- Soulseek search relied solely on an SSE 'complete' event to clear its
  spinner; if that event was dropped (connection blip, backend never emits it)
  the search UI spun forever. Add a 45s fallback that force-completes the
  search so the user sees whatever results arrived; late results still stream
  in via the store subscription.
- Onboarding's 'username already taken' path told the user to refresh, which
  can't recover the half-created account (the token never persisted). Instead
  attempt a login with the same credentials and continue: resume at step 2 if
  onboarding is unfinished, route home if already complete, or send to the
  normal sign-in for a 2FA account. A genuine password mismatch now gets a
  clear 'sign in instead' message rather than a dead end.
2026-06-15 09:00:14 -05:00
chevron7 84dc5a934d fix: surface Subsonic write failures, guard podcast sort, de-spam analyzer log
- Subsonic star.view swallowed every error and returned success, so a
  third-party app could star a track that never saved. Now only a P2003 FK
  violation (track legitimately missing) is absorbed; any other error is
  logged and returns a Subsonic error. Scrobble play-log failures are logged
  instead of silently discarded.
- The podcasts page sorted by author/title with a raw localeCompare on an
  optional field, so one feed with no author crashed the whole page via the
  error boundary. Comparators are now null-guarded.
- The audio analyzer re-logged the same 'N tracks permanently failed' warning
  every idle cycle (~50s) forever; it now logs only when the count changes.
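
For the podcast sort crash in the list above, a null-guarded comparator might look like the following. `compareOptionalText` and the `Feed` shape are hypothetical names; sorting missing values first is a choice made for the sketch, not something the commit states.

```typescript
// Compare two optional strings without calling localeCompare on a missing value.
function compareOptionalText(a: string | null | undefined, b: string | null | undefined): number {
  return (a ?? "").localeCompare(b ?? "");
}

interface Feed {
  title: string;
  author?: string | null;
}

function sortFeedsByAuthor(feeds: readonly Feed[]): Feed[] {
  // Copy first so the caller's array is not reordered in place.
  return [...feeds].sort((x, y) => compareOptionalText(x.author, y.author));
}
```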
2026-06-15 08:34:08 -05:00
chevron7 dd9b346bc9 feat(settings): make the transcode-cache-size and auto-sync controls functional
Two settings the UI presented as working did nothing. The transcode cache
size slider was saved to the DB but only ever read from the TRANSCODE_CACHE_MAX_GB
env var, which the save path never wrote -- so the slider was inert even
across the restart its own hint told the user to perform. It's now written to
.env on save, matching the restart-required contract.

The 'Auto sync library' toggle had zero readers because no periodic library
scan existed at all (scans were webhook/manual only). Adds a library-sync cron
(every 6h, gated on the autoSync setting) that enqueues a full scan so music
added outside the download pipeline is picked up automatically.
2026-06-15 08:34:08 -05:00
chevron7 07031f315d fix(enrichment): extend failed-job dedup backoff to artist, track, and vibe queues
The podcast dedup-on-failure trap was live on three more queues. The artist
and mood-tags phases never cleaned their queues at all, so a failed job's
jobId marker blocked re-queue until BullMQ's 24h removeOnFail age expired --
far slower than the worker's documented intent to re-pick-up a failed track.
The admin vibe start/retry routes cleaned only completed jobs, so 'Retry
failed embeddings' silently dropped tracks with a lingering failed job.

Automatic phases now clean completed (grace 0, immediately reusable on
success) and failed (15-min grace, so a permanently-failing entity retries
on a backoff instead of every 5s cycle). The manual admin retry routes clean
failed immediately -- the user asked to retry now. Adds a 3-test regression
suite asserting the grace-0-completed / grace-positive-failed split.
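
The grace split can be modeled with a small in-memory table of finished-job markers. This illustrates the policy only and is not BullMQ code; `JobIdSlots` and the two `cleanFor...` helpers are hypothetical names.

```typescript
type FinishedStatus = "completed" | "failed";

const FAILED_GRACE_MS = 15 * 60 * 1000;

class JobIdSlots {
  private readonly markers = new Map<string, { status: FinishedStatus; finishedAtMs: number }>();

  finish(jobId: string, status: FinishedStatus, nowMs: number): void {
    this.markers.set(jobId, { status, finishedAtMs: nowMs });
  }

  // A held marker means a new add() with this jobId would be a silent no-op.
  isHeld(jobId: string): boolean {
    return this.markers.has(jobId);
  }

  // Drop markers of the given status that are at least graceMs old.
  clean(graceMs: number, status: FinishedStatus, nowMs: number): void {
    this.markers.forEach((marker, jobId) => {
      if (marker.status === status && nowMs - marker.finishedAtMs >= graceMs) {
        this.markers.delete(jobId);
      }
    });
  }
}

// Automatic phases: success is reusable at once, failure backs off 15 minutes.
function cleanForAutomaticPhase(slots: JobIdSlots, nowMs: number): void {
  slots.clean(0, "completed", nowMs);
  slots.clean(FAILED_GRACE_MS, "failed", nowMs);
}

// Manual admin retry: the user asked to retry now, so failed clears at once.
function cleanForManualRetry(slots: JobIdSlots, nowMs: number): void {
  slots.clean(0, "completed", nowMs);
  slots.clean(0, "failed", nowMs);
}
```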
2026-06-15 08:34:08 -05:00
chevron7 ebb488aa85 fix(settings): make Clear Caches actually work, and scope it safely
The Clear Caches button never did anything: the handler used the node-redis
v4 scan signature (options object + { cursor, keys } result) against our
ioredis client, whose scan takes positional args and returns [cursor, keys].
Every call threw and cleared nothing -- which is why clearing the cache did
not dislodge the wedged podcast jobs.

Even had it run, "delete every key except sess:" would have wiped live
BullMQ queue state (bull:*, 200+ keys) and the enrichment/audio/clap control
plane. Replace that with an allowlist of genuine rebuildable caches
(MusicBrainz, cover art, Last.fm, Wikidata, Deezer, iTunes, hero images) and
delete in chunks. Verified read-only against production: clears ~5130 cache
keys, preserves all bull:/audio:/enrichment:/clap:/sess: keys.
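
A sketch of the ioredis-style scan loop, with positional arguments and a `[cursor, keys]` tuple result as described above. `ScanClient` is a hypothetical minimal interface rather than the full ioredis type, and `collectKeys` and `chunk` are illustrative helpers.

```typescript
// Minimal shape of the one scan overload used here (assumption).
interface ScanClient {
  scan(
    cursor: string,
    matchToken: "MATCH",
    pattern: string,
    countToken: "COUNT",
    count: number,
  ): Promise<[string, string[]]>;
}

// Walk the whole keyspace for a pattern; the scan is done when the
// returned cursor comes back as "0".
async function collectKeys(client: ScanClient, pattern: string, pageSize = 500): Promise<string[]> {
  const found: string[] = [];
  let cursor = "0";
  do {
    const [nextCursor, keys] = await client.scan(cursor, "MATCH", pattern, "COUNT", pageSize);
    found.push(...keys);
    cursor = nextCursor;
  } while (cursor !== "0");
  return found;
}

// Split keys into fixed-size groups so deletes can be issued in chunks.
function chunk<T>(items: readonly T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```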
2026-06-14 23:48:30 -05:00
chevron7 34dc43977b fix(enrichment): clear failed jobs before re-queue so a failed job can't wedge refresh
BullMQ keeps the jobId dedup marker for failed jobs, not just completed
ones. The podcast and vibe refresh phases cleaned only "completed", so a
single failed (or Redis-corrupted, data-less) job kept its jobId marker
forever -- every later add() with that jobId silently no-op'd and the entity
never refreshed again. In production all 4 podcasts had been frozen since a job
corruption event; the worker was throwing findUnique({ id: undefined }) on
data-less jobs.

Fix:
- podcast + vibe phases clean BOTH "completed" and "failed" so a failed
  job's jobId is reusable.
- podcast phase optimistically advances lastRefreshed for the selected feeds
  before queuing -- refreshPodcastFeed only advances it on success/304, so
  this gives a failing feed a real backoff window instead of being re-queued
  every cycle.
- podcast worker guards against corrupt/data-less jobs (clear error instead
  of a confusing Prisma undefined-id throw).

Adds a 5-test regression suite asserting the failed-set clean and the
claim-before-queue ordering. Production Redis cleared of the poisoned jobs.
2026-06-14 23:34:24 -05:00
chevron7 775cd6f57b fix(layout): mount ToastProvider above the audio providers
Phase B gave AudioControlsProvider a useToast() dependency, but ToastProvider
was nested inside ConditionalAudioProvider -- the hook threw on the first
authenticated render and AudioErrorBoundary blanked every page's content.
Caught by the nightly E2E suite (19 failures), invisible to tsc/lint/build/
prerender because none execute the authenticated client tree.
2026-06-11 16:32:13 -05:00
chevron7 550913bf29 feat(player): Phases D+E -- stream auth survival and opt-in diagnostics (audio remediation 4.2 + WS5)
A 24h-old session no longer kills playback: stream URL builders refresh the
token proactively when it expires within the hour, and a terminal code-4
media error is classified via a bounded /api/auth/me probe -- a stale token
refreshes and the stream reloads at position with no teardown, no skip, and
no logout. Token refresh now distinguishes a transient network failure
(tokens kept) from a server-rejected refresh (session cleared), so an
offline moment can never force a re-login (FE11/B9). The iOS trace logger
becomes opt-in (?ios_debug=1 flag only) with batched persistence instead of
a synchronous storage write per event (FE17).
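
The two token rules above reduce to small pure decisions. The sketch below uses hypothetical names (`shouldRefreshToken`, `RefreshResult`, `shouldClearSession`) and assumes expiry is available as a millisecond timestamp.

```typescript
const REFRESH_WINDOW_MS = 60 * 60 * 1000; // "expires within the hour"

// Refresh proactively when the token is expired or expires within the window.
function shouldRefreshToken(expiresAtMs: number, nowMs: number): boolean {
  return expiresAtMs - nowMs <= REFRESH_WINDOW_MS;
}

type RefreshResult =
  | { kind: "ok" }
  | { kind: "network-error" } // transient: keep tokens
  | { kind: "rejected" };     // server said no: clear the session

// Only a server-rejected refresh ends the session; an offline moment does not.
function shouldClearSession(result: RefreshResult): boolean {
  return result.kind === "rejected";
}
```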
2026-06-11 12:05:28 -05:00
chevron7 5138d1aa5c feat(player): Phase C playback engine -- one state machine, one recovery ladder (audio remediation 3.1-3.6)
The AudioController is rebuilt as a thin DOM shell around a pure policy
module (transition(snapshot, event) -> { snapshot, effects }, 199 unit
tests). One status drives all UI; the four overlapping recovery mechanisms
(3s watchdog, 10s stalled-grace, code-2 retry loop, AbortError reload) are
replaced by a single deadline-bounded ladder that distinguishes buffering
from stalling, parks instead of auto-playing while backgrounded, and resets
its attempt budget only after sustained progress. The transport is never
disabled: players accept taps in every state, and a wedged spinner is
structurally impossible (FE2). Native stalled events are ignored entirely
(1094/1094 were noise in the production trace). Truncated deliveries are
recovered at-position instead of advancing the queue (FE10). Lock-screen
pause now routes as a user pause (FE5); terminal network errors surface
uniformly with a working retry (FE7); ended->next keeps the synchronous
event-tail play for the iOS autoplay grant (FE13); mute uses audio.muted
(FE14); the dead prefetch hint and needs-resume plumbing are gone
(FE15/FE18). AudioContext bridge preserved verbatim.
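
To show the shape of transition(snapshot, event) -> { snapshot, effects }, here is a deliberately tiny sketch. The statuses, events, and effects are hypothetical and far smaller than the real policy module; only the function signature comes from the commit message.

```typescript
type Status = "idle" | "loading" | "playing" | "paused";
interface Snapshot {
  status: Status;
  src: string | null;
}
type PlayerEvent =
  | { type: "load"; src: string }
  | { type: "canplay" }
  | { type: "userPlay" }
  | { type: "userPause" };
type Effect =
  | { type: "setSrc"; src: string }
  | { type: "play" }
  | { type: "pause" };
interface Step {
  snapshot: Snapshot;
  effects: Effect[];
}

// Pure: no DOM access, no mutation. The shell applies the returned effects.
function transition(snapshot: Snapshot, event: PlayerEvent): Step {
  const unchanged: Step = { snapshot, effects: [] };
  switch (event.type) {
    case "load":
      return {
        snapshot: { status: "loading", src: event.src },
        effects: [{ type: "setSrc", src: event.src }],
      };
    case "canplay":
      return snapshot.status === "loading"
        ? { snapshot: { ...snapshot, status: "playing" }, effects: [{ type: "play" }] }
        : unchanged;
    case "userPlay":
      return snapshot.status === "paused"
        ? { snapshot: { ...snapshot, status: "playing" }, effects: [{ type: "play" }] }
        : unchanged;
    case "userPause":
      return snapshot.status === "playing"
        ? { snapshot: { ...snapshot, status: "paused" }, effects: [{ type: "pause" }] }
        : unchanged;
  }
}
```

Because the function touches no audio element, it can be unit-tested by feeding it events and comparing the returned snapshot and effects.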
2026-06-11 11:58:48 -05:00
chevron7 dc491dd22e fix(audiobook): Phase B session model -- BookSession + session-first controls (audio remediation 2.1-2.3)
Audiobooks now play through a BookSession with a required, verified track
map: every surface (play, chapter tap, seek, ended-advance, restore) does
its book-time math through one tested translator, and a book can never be
marked finished without an affirmed last file -- killing the multi-file
progress-wipe (FE1). Chapter taps start the book at the chapter via
playAudiobookAt riding load(seekTo) instead of racing React state (FE6).
The controller gains a generation-checked load(seekTo) + isTransitioning
shim (D5): start offsets ride their own load and can never leak across
media switches (FE8), and progress saves are suppressed during transitions
(FE9). Errors on audiobooks/podcasts save progress before teardown (FE12).
Same-src loads in flight are no longer restarted (FE16). The unsafe
kima_was_playing foreground auto-resume is removed (FE5 partial; intent
routing completes in Phase D). Adds vitest with a 33-test BookSession
suite (test:unit).
2026-06-11 11:39:51 -05:00
chevron7 51ee5bbd55 fix(stream): Phase A backend lifecycle and correctness (audio remediation 1.1-1.10)
Delete the per-user stream eviction that truncated actively-playing streams
(B1/B10); add server socket timeouts so dead peers cannot accumulate (B3);
run transcodes through the real queue with a 120s watchdog kill (B6); bound
the ABS proxy at 15s and cache track resolution for seeks (B2); replace the
1-year cache header with private/1h/must-revalidate plus conditional 304s
(B4); key the transcode cache on mtime equality + source size (B5); align
all range-serving surfaces on 416-or-ignore semantics per RFC 9110 (B8/B11);
fix the podcast stream rate-limit exemption (B7); release the play-log claim
on failed inserts (B12); cache audiobook track maps at sync time and expose
tracks/trackCount on list+series endpoints with an explicit tracksUnavailable
signal (FE1 backend half); fix the play-adjacent writer that left numTracks
NULL. Drop the never-read musicPath from AudioStreamingService.
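
One reading of "416-or-ignore" for a single byte range, sketched as a pure decision function: a malformed or unsupported Range header is ignored (full response), while a well-formed but unsatisfiable one yields 416 with `Content-Range: bytes */size`. `decideRange` and `RangeDecision` are hypothetical names, and multi-range requests are simply ignored here as a simplification.

```typescript
type RangeDecision =
  | { kind: "full" }                                   // serve 200, whole body
  | { kind: "partial"; start: number; end: number }    // serve 206
  | { kind: "unsatisfiable"; contentRange: string };   // serve 416

function decideRange(header: string | undefined, size: number): RangeDecision {
  if (header === undefined) return { kind: "full" };
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  // Not a single well-formed byte range: ignore the header.
  if (match === null || (match[1] === "" && match[2] === "")) return { kind: "full" };

  const unsatisfiable: RangeDecision = { kind: "unsatisfiable", contentRange: `bytes */${size}` };

  if (match[1] === "") {
    // Suffix form "bytes=-N": the last N bytes.
    const suffixLength = Number(match[2]);
    if (suffixLength === 0 || size === 0) return unsatisfiable;
    return { kind: "partial", start: Math.max(0, size - suffixLength), end: size - 1 };
  }

  const start = Number(match[1]);
  // "bytes=500-100" (last < first) is invalid, so it is ignored, not rejected.
  if (match[2] !== "" && Number(match[2]) < start) return { kind: "full" };
  if (start >= size) return unsatisfiable;
  const end = match[2] === "" ? size - 1 : Math.min(Number(match[2]), size - 1);
  return { kind: "partial", start, end };
}
```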
2026-06-11 11:24:20 -05:00
chevron7 2bd27c6738 chore(release): v1.8.2
iOS audiobook progress survives backgrounding/updates, the unsafe
auto-resume-after-interruption was removed (device-trace proven), faster track
starts, audiobook 416/sync-guard fixes, and the #168 podcast preview hang.
2026-06-10 10:20:44 -05:00
chevron7 70d0d68fd9 fix(podcast): widen preview client timeout to 25s for headroom
The backend preview can take ~18s worst case on the Deezer path (Deezer fetch +
iTunes resolve + the 8s RSS bound). A 20s client abort left only a 2s margin, so
a slow but answerable feed could be aborted and shown as a false timeout. 25s
keeps a clear gap over the server while still bounding the #168 hang.
2026-06-10 10:15:44 -05:00
chevron7 2753bb752d fix(audiobook): clean 416 on out-of-range seek + skip mis-cataloged libraries
- A seek past a file whose stored size is wrong made Audiobookshelf return 416,
  which axios surfaced as a 500. The service now lets 416 through and the route
  sends a clean 416 (Content-Range forwarded, upstream stream destroyed) instead
  of piping the upstream error body into the audio element.
- Sync now skips items with more than 1000 audio files: those are mis-cataloged
  libraries imported as one book (the source of multi-thousand-hour, tens-of-GB
  records that broke seeking). The guard keys on track count, not duration --
  legitimate omnibus editions can run 50-65h.
2026-06-10 10:15:44 -05:00
chevron7 da241b246a perf(stream): dedup play-logging and default to original quality
Two follow-ups from review of the critical-path trim:
- A synchronous in-process claim gates the now-background play-logging so two
  concurrent stream requests for the same track can't both insert a Play row
  inside the 30s window (the fire-and-forget change had widened that race).
- The no-settings-row quality fallback is now "original", matching the schema
  default, instead of "medium" -- a user without a settings row no longer gets a
  pointless first-play transcode.
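
The synchronous in-process claim from the list above can be sketched as a map of last-claim timestamps. Because the check and the set happen in one synchronous step, two concurrent requests on the same event loop cannot both win. `PlayLogClaims` and `tryClaim` are hypothetical names; the 30s window is from the commit message.

```typescript
const CLAIM_WINDOW_MS = 30_000;

class PlayLogClaims {
  private readonly claimedAtMs = new Map<string, number>();

  // Returns true for the first caller inside the window, false for the rest.
  tryClaim(userId: string, trackId: string, nowMs: number): boolean {
    const key = `${userId}:${trackId}`;
    const previous = this.claimedAtMs.get(key);
    if (previous !== undefined && nowMs - previous < CLAIM_WINDOW_MS) {
      return false; // someone already claimed this play recently
    }
    this.claimedAtMs.set(key, nowMs);
    return true;
  }
}
```

Only the request that wins the claim would go on to fire the background Play insert. A real implementation would also need to prune old entries, which the sketch leaves out.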
2026-06-10 10:15:44 -05:00
chevron7 d4e20a963c perf(stream): take play-logging and the settings read off the critical path
Measured from real device traces: fresh track start was ~2.3s vs ~25ms to
resume an already-loaded track. Part of that was the stream route doing
sequential DB work before the first byte -- a recent-play lookup, a play insert,
and a settings read, all awaited up front.

Fetch the track row and the quality setting in parallel (one round-trip, not
two), and fire the play-history logging in the background instead of awaiting it.
Neither needs to gate playback. The bulk of the remaining latency is client-side
buffering of multi-hour audiobook files seeking to a saved offset, tracked
separately.
2026-06-09 15:31:54 -05:00
chevron7 419b8580a3 revert(ios): drop auto-resume after interruption -- the guard can't work on iOS
The device trace proved it unsafe: navigator.mediaDevices "devicechange" never
fires on the user's iPhone (0 events across the whole capture), so the
route-change guard that was supposed to stop an earbud unplug from auto-resuming
to the speaker is permanently inert -- sinceRouteMs is always ~Date.now(). That
is the v1.7.12 regression with no working brake, and the user reproduced audio
restarting after pulling earbuds.

The AudioContext statechange handler goes back to re-claiming the playback
session category only (safe), never calling audio.play(). Removed the now-dead
intendsToPlay flag, route-change tracking, and devicechange listener. The trace
auto-upload auth fix and the audiobook position-save fix are unaffected and stay.
2026-06-09 14:49:48 -05:00
chevron7 e110b6d77e fix(audiobook): checkpoint progress on a real timer, not the throttled timeupdate
Progress was saved every 30s off the "timeupdate" event, but iOS throttles and
suspends that event when the PWA is backgrounded (screen off) -- the normal way
people listen to audiobooks. So a long screen-off session was never
checkpointed, and an app update (or crash) reverted to the moment the screen was
locked. The saved data was never lost; it just stopped advancing in the
background.

Replace the timeupdate-driven save with a 15s wall-clock setInterval that runs
while playing (started on "play", stopped on "pause"/"ended"), independent of
the media event iOS throttles. saveAudiobookProgress already de-dupes an
unchanged position and the tick is gated on isPlaying(), so paused/stalled ticks
are no-ops. Applies to podcasts too.
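
A sketch of the wall-clock checkpoint, assuming callbacks for the playing state, the current position, and the save. `ProgressCheckpointer` is a hypothetical name, and the unchanged-position check lives inside `tick` here only for illustration; the commit says the real de-dupe is in saveAudiobookProgress.

```typescript
class ProgressCheckpointer {
  private timer: ReturnType<typeof setInterval> | undefined;
  private lastSaved: number | undefined;
  private readonly isPlaying: () => boolean;
  private readonly position: () => number;
  private readonly save: (position: number) => void;
  private readonly intervalMs: number;

  constructor(
    isPlaying: () => boolean,
    position: () => number,
    save: (position: number) => void,
    intervalMs = 15_000,
  ) {
    this.isPlaying = isPlaying;
    this.position = position;
    this.save = save;
    this.intervalMs = intervalMs;
  }

  // Call on "play": starts the wall-clock timer once.
  start(): void {
    if (this.timer !== undefined) return;
    this.timer = setInterval(() => this.tick(), this.intervalMs);
  }

  // Call on "pause" / "ended".
  stop(): void {
    if (this.timer === undefined) return;
    clearInterval(this.timer);
    this.timer = undefined;
  }

  // One checkpoint attempt: a no-op while paused or when nothing moved.
  tick(): void {
    if (!this.isPlaying()) return;
    const current = this.position();
    if (current === this.lastSaved) return;
    this.lastSaved = current;
    this.save(current);
  }
}
```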
2026-06-08 07:23:36 -05:00
chevron7 b1daaaccf9 fix(ios): auto-resume audio after an OS interruption + repair the trace upload
Playback that an iOS interruption (call/notification) pauses now resumes when
the interruption ends, the behaviour other apps have.

- Track play intent separately from audio.paused: set on play/tryResume/
  swapAndPlay, cleared only by explicit pause/stop/cleanup. The native "pause"
  event an interruption fires does NOT clear it.
- The AudioContext statechange listener resumes on an interrupted -> running
  transition when intent is set and the element is paused. Gated hard: only that
  transition (not the initial bridge resume or a background suspend), never
  within 1.5s of an audio-route change (the v1.7.12 unplug-to-speaker
  regression), and never while a stall reload owns the resume.
- Repair the trace auto-upload: it POSTed to a requireAuth route without the
  Bearer token and swallowed the 401, so no iOS trace was ever captured. It now
  sends the token, so device testing finally yields real event data.

Reviewed by Opus/Sonnet passes. Known limits to confirm on-device: only fires if
WebKit returns the context to "running" (a context stuck "interrupted" -- the
force-quit symptom -- is not addressed here).
2026-06-06 12:29:58 -05:00
chevron7 b490ae771b fix(podcast): bound preview fetch so a slow feed can't hang the UI (#168)
The preview hook only stops spinning when the request resolves or rejects. The
RSS parse had a 30s timeout and the client had none, so a slow/dead feed left
the spinner up 30s+ with no error -- the "infinite loading" in #168 (the v1.7.13
fix only handled the error path, not the hang).

- Frontend: previewPodcast aborts after 20s, surfacing the existing error UI.
- Backend: the two preview RSS parses are bounded to 8s (non-critical, already
  falls through to partial data), so a slow feed returns the podcast quickly.
2026-06-06 12:29:58 -05:00
chevron7 fb25b0823e chore(release): v1.8.1
Optional slskd backend for Soulseek (#205, by gossip31) and the audio-analyzer
crash-loop fix (#204, by gossip31, plus a retry-count backstop).
v1.8.1
2026-06-05 20:19:05 -05:00
Silly Susan b6046a3601 feat(soulseek): configurable slskd backend via soulseekMode (#205)
Adds a soulseekMode (p2p|slskd) setting to route Soulseek through an external slskd REST instance, so slskd mode needs no Kima-side Soulseek credentials. Includes the review fixes: https transport, reconnect on backend change, slskdUrl validation, mode-aware connection test, queue position, bounded size cache. Closes #164. By gossip31.
2026-06-05 20:02:39 -05:00
chevron7 8fb792d213 fix(audio-analyzer): count worker crashes as retries so any crash quarantines
Complements #204 (gossip31's pre-decode ffmpeg gate). The pre-decode gate
catches corrupt files that SIGSEGV the decoder, but a worker that dies on any
other native fault (e.g. an Essentia analysis crash after a clean decode) still
left the track in 'processing' and got re-queued by the stale-cleanup sweep
WITHOUT incrementing analysisRetryCount -- so it could loop forever and never
reach the mark-failed/quarantine path.

_cleanup_stale_processing now increments analysisRetryCount when it resets a
crashed track, and marks tracks that have passed MAX_RETRIES as 'failed' (with a
reason) so they quarantine and surface in the permanently-failed accounting
instead of sitting in 'processing' limbo. Defense in depth behind the gate.
2026-06-05 14:58:31 -05:00
Silly Susan be1519822d fix(audio-analyzer): pre-decode ffmpeg gate to stop Essentia segfault crash-loop (#204)
Adds an ffmpeg integrity probe before MonoLoader so corrupt files that SIGSEGV Essentia become a normal load failure (and flow into the existing retry/quarantine) instead of crash-looping the worker. By gossip31.
2026-06-05 14:56:39 -05:00
chevron7 fdc7b3cfa1 fix(build): harden ML model downloads against curl timeouts
The model-download layer failed three recent builds (06-04 x2, 06-05) with curl
exit 28: --max-time caps the whole operation including retries, so a slow
GitHub-runner transfer trips it, and --retry does not retry a timeout. Switched
all 12 downloads to --retry-all-errors (retries timeouts/transient HTTP),
stall-based abort (--speed-limit 1024 --speed-time 60) instead of a hard total
cap, 5 retries, and -f so a bad HTTP response fails fast instead of saving a
corrupt model. The transformers==5.8.1 pin is unaffected and confirmed building.
v1.8.0
2026-06-05 10:34:48 -05:00
chevron7 545a488c67 docs(changelog): drop refuted Redis-memory caveat from #197
The reporter's redis INFO shows a healthy instance (33MB used, no maxmemory
limit, noeviction, zero evictions/rejected connections), ruling out the
memory-pressure hypothesis. The connection-readiness race the fix addresses is
the actual cause, so the hedge is removed.
2026-06-05 08:51:19 -05:00
chevron7 eb5883f13d chore(release): v1.8.0
iOS playback reliability rebuilt from a stable baseline, Vibe search Redis
hardening (#197), Discover Weekly correctness Phase 1, transformers pin, and iOS
audio diagnostics groundwork. Folds the never-released 1.7.16 into 1.8.0.
2026-06-04 19:38:26 -05:00
chevron7 12ebc8c9a5 fix(vibe): harden text-embedding bridge against busy/reconnecting Redis (#197)
The first #197 fix only hardened the pub/sub subscriber; a 3-model review panel
found it incomplete. This closes the rest:

- publish() now runs on a dedicated soft-options connection (enableOfflineQueue,
  infinite retries) instead of the strict shared client -- that strict publish
  was still throwing the same "Stream isn't writeable" error under load.
- subscriber lifecycle: terminal "end" drops the cache, a failed psubscribe
  disconnects the half-open socket instead of leaking it; transient drops
  self-heal via auto-reconnect.
- both subscribe and publish are time-bounded so an unreachable Redis fails the
  request instead of hanging indefinitely.
- analyzer failures ({success:false, embedding:null}, no error field) are now
  rejected cleanly instead of passing null into the pgvector cast (500).
- the analyzer publishes a failure response on internal exceptions so the caller
  fails fast instead of waiting out the full 15s timeout.

Reviewed by Opus/Sonnet/Haiku panels twice (original confirmed INCOMPLETE,
rewrite SHIP-WITH-CHANGES); surviving findings applied, two rejected with reason
(no publisher churn on transient error; keep setMaxListeners(0) to not re-trigger
the warning flood).

The reporter's 200k-track failure may also involve Redis memory pressure or
Python-analyzer saturation, which this makes tolerable but does not itself
resolve -- pending their redis INFO.
2026-06-04 17:42:40 -05:00
chevron7 a9cf54d0f2 fix(build): pin transformers==5.8.1 to stop nightly breaking against torch 2.5.1
The scheduled nightly off main failed (2026-06-04, and 2026-06-02 the same
way): a transformers release newer than 5.8.1 references torch.float8_e8m0fnu
(a dtype added in torch 2.7) at import time, so `from transformers import
BertModel` crashes against the pinned torch==2.5.1 and the Dockerfile
fail-fast check exits 1. The unpinned `transformers>=4.30.0` let pip resolve
to that bad release. Recent branch builds passed only because BuildKit reused
a cached pip layer from before that release was published.

Pinned to 5.8.1 -- the exact version running in prod against torch 2.5.1+cpu.
Bump only alongside a torch bump.
2026-06-04 08:35:50 -05:00
chevron7 4fa327aa8e fix(ios): re-claim audio session on resume + AudioContext statechange recovery
On the clean 439fa68 bridge baseline (band-aids reverted), add the two
high-confidence stability fixes the resume bug actually needs:

- setAudioSessionPlayback gains a `force` arg; play() now re-claims the
  iOS "playback" session category on every explicit resume, not just the
  first. The one-time latch was why iOS, after an earbud/Control-Center
  interruption, left the session with whatever app grabbed it (a
  sleep-sounds app started playing through it).
- A statechange listener on the bridge AudioContext re-claims the session
  when the OS ends an interruption and the context returns to running. It
  never calls play() -- auto-resume on a route change is the v1.7.12
  earbud-unplug-to-speaker regression.

Reviewed by two independent passes; their findings fixed here: play() now
actually passes force=true (the reclaim was a no-op without it); the
statechange listener + AudioContext are torn down in destroy() (no leak);
em-dash normalized.

Deliberately NOT re-adding the silent-playback watchdog (part of the
reverted band-aid stack) -- the debug instrumentation will show whether an
interrupted-context resume is still silent, and any further recovery will
be a minimal targeted fix on evidence, not another speculative layer.
2026-06-03 20:06:17 -05:00
chevron7 a2dc14a1b0 revert(ios): strip resume band-aids back to AudioContext bridge baseline
Reverts the daf6210 -> 7be3322 -> 1a9f6f4 cascade that piled onto the
bridge. Root regression was daf6210: it awaited setupAudioContextBridge
and bailed play()/tryResume with needs-resume whenever the context was
not "running" -- which forfeited the iOS user-gesture token AND returned
before audio.play() ever ran. So earbud/lock-screen resume went silent
or dead-ended on a Tap-to-resume prompt the lock screen cannot show, and
iOS eventually handed the audio session to another app. 7be3322 and
1a9f6f4 were band-aids on that regression.

Keeps 439fa68 (the bridge) so backgrounded/screen-off playback still
survives, and keeps the debug ring-buffer instrumentation. play() and
tryResume return to the baseline: fire the context resume in parallel,
always attempt audio.play(), preserve the gesture.
2026-06-03 18:30:04 -05:00
chevron7 a7e3a85803 debug(ios): auto-capture + auto-upload the audio ring buffer in standalone PWA
Temporary diagnostic for the earbud-resume bug: the installed iOS PWA has no URL
bar to set ?ios_debug=1 or reach /debug/ios-log, so capture is enabled
unconditionally on iOS standalone and the buffer auto-POSTs (debounced 3s) to
/api/debug/ios-log after each event burst. Revert once the resume bug is fixed.
2026-06-03 16:30:45 -05:00
chevron7 eaeb0d3588 chore: release v1.7.16
iOS earbud/MediaSession resume fix, Vibe text-search Redis subscriber fix (#197),
and Discover Weekly Phase-1 correctness rework. Bump frontend+backend to 1.7.16.
2026-06-02 23:24:37 -05:00
chevron7 81ac1b5c17 fix(vibe): text-embedding Redis subscriber survives a cold/slow connection (#197)
ensureSubscriber duplicated the parent Redis client, inheriting
enableOfflineQueue:false + maxRetriesPerRequest:0, so psubscribe threw 'Stream
isn't writeable' when the subscriber socket wasn't connected yet -- and the
rejected promise was cached, breaking vibe text search permanently until restart
(worsens with library size). The subscriber now gets its own offline queue +
retries, resets the cached promise on rejection, and drops it on 'end' so the
next request reconnects.
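
The cached-promise part of this fix can be sketched generically: memoize the in-flight promise, but forget it when it rejects so the next caller retries. `cachedWithReset` is a hypothetical helper, not the project's ensureSubscriber, and it leaves out the 'end' handling and the connection options.

```typescript
// Wrap an async factory so concurrent callers share one attempt, while a
// failed attempt is not cached forever.
function cachedWithReset<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      const attempt = factory();
      cached = attempt;
      // On rejection, clear the cache (only if it still holds this attempt).
      attempt.catch(() => {
        if (cached === attempt) cached = undefined;
      });
    }
    return cached;
  };
}
```

Callers still see the rejection of the attempt they awaited; only later calls get a fresh attempt.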
2026-06-02 23:23:08 -05:00
chevron7 e9e1176eb7 Merge fix/ios-earbud-resume: iOS earbud/MediaSession resume + smoke hardening (v1.7.16)
2026-06-02 23:22:06 -05:00
chevron7 3db809c3b8 Merge discovery-weekly-overhaul: Phase 1 correctness fixes (v1.7.16)
2026-06-02 23:22:06 -05:00
chevron7 6b435e4167 test(e2e): wait for playback to start before asserting transport in smoke
The smoke spec asserted the play/pause button state immediately after Play all,
racing the first audio load on a cold container (player stayed 'Not Playing').
Poll audio currentTime > 0 first. Surfaced while running the suite pre-v1.7.16.
2026-06-02 18:44:29 -05:00
chevron7 1a9f6f418a fix(ios): earbud/MediaSession resume preserves the user-gesture grant
The MediaSession 'play' action called controller.play(), which awaits the
AudioContext bridge BEFORE audio.play(). That await forfeits the iOS
user-activation token from the earbud click, so an interrupted/suspended
AudioContext never resumes -- and play() then returns (not throws) on
ctx-not-running, so the handler's reloadAndPlay() fallback never fired. Result:
earbud resume produced no audio, no native 'playing' event, no playbackState
update, and after repeated no-audio play actions iOS reassigned the audio
session to the next app.

Adds resumeFromGesture(): fires the context resume without awaiting it, calls
audio.play() synchronously in the gesture tail (mirrors swapAndPlay), and on any
rejection reloads the source to re-grab the hardware session instead of a silent
needs-resume. Wired only into the explicit MediaSession 'play' action, so it
cannot auto-resume on an ambiguous pause/route-change (the v1.7.12 earbud-unplug
-> speaker regression stays fixed). play()/tryResume()/pause/silent-watchdog
untouched. Diagnosed via 4-lens + adversary review (SHIP-AS-IS).

Requires on-device confirmation (?ios_debug=1); cannot be unit-verified.
2026-06-02 18:30:50 -05:00
chevron7 2032de9e3c Merge frontend-quality-audit -> v1.7.15
v1.7.15
2026-06-01 13:04:31 -05:00
chevron7 de21f4d862 revert(player): remove mobile mini-player gesture hint
The one-time swipe hint crowded the bottom edge above the mini-player and read
as clutter; the swipe behavior is intentional and discoverable enough without it.
Removes the hint state, markup, the markHintSeen calls (swipe behavior unchanged),
the now-unused useCallback import, and the hint-in keyframe.
2026-06-01 11:46:58 -05:00
chevron7 099a58da53 chore: release v1.7.15
Bump frontend and backend to 1.7.15. Backfill CHANGELOG with the previously
unlogged 1.7.13 (iOS audio overhaul) and 1.7.14 (#81 podcast refresh) releases,
and add 1.7.15 (frontend quality + UX overhaul, iOS backgrounded-playback fix,
desktop settings-panel fix).
2026-06-01 08:04:05 -05:00
chevron7 c715cee7b1 fix(panel): render registered settings content in the desktop UnifiedPanel
The desktop sidebar was rebuilt as UnifiedPanel but the externally-registered
settings content (discover settings gear, lyrics) was never ported -- clicking
the discover gear opened the panel to the activity feed instead of the settings,
because UnifiedPanel never read settingsContent or handled set-activity-panel-tab.
It now listens for that event, renders the registered settingsContent (which
carries its own header + back button), and resets to the feed on collapse. Fixes
discover settings and lyrics on desktop. Pre-existing since the sidebar rewrite.
2026-05-31 23:01:52 -05:00
chevron7 ea363d677e fix(a11y): bump Refine panel sort/per-page buttons to 44px touch targets
Completes the audit's deferred item -- the sort and per-page option buttons were
py-2 (~32px), below the 44px standard the rest of the sweep adopted.
2026-05-31 20:46:44 -05:00
chevron7 274f784ec8 fix(playlist): re-measure virtualizer scrollMargin on reflow
The scrollMargin useLayoutEffect only ran on [rows.length], so the offset went
stale when layout above the list reflowed (e.g. the responsive hero crossing the
md breakpoint, ~52px). Masked today by the 12-row overscan, but wrong and fragile
if overscan is tuned. Added a ResizeObserver re-measure. (ultrareview bug_003)
2026-05-31 20:10:51 -05:00