navidrome/persistence
Deluan Quintão 81a17f6bbb fix(search): normalization for non-NFKD Unicode letters (ø, æ, œ, ß) (#5413)
* fix(search): transliterate non-ASCII letters symmetrically in FTS5 path

Songs and artists with letters like ø, æ, œ, ß were unsearchable. The
query path in server/subsonic/searching.go transliterates with
sanitize.Accents (Øystein → Oystein), but the FTS5 tokenizer's
remove_diacritics 2 option only strips NFKD-decomposable marks. Atomic
letters with built-in strokes or ligatures survive tokenization, so
the query side and index side disagreed.

Apply sanitize.Accents on both sides:

- normalizeForFTS now also emits an ASCII-transliterated form for each
  word, so search_normalized contains the variant the query produces.
- buildFTS5Query transliterates the unquoted portion of the input so
  every caller (Subsonic, REST fullTextFilter) gets the same handling.
  Quoted phrases stay as typed, preserving phrase matches against the
  original title/artist columns.
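
The two changes above can be sketched roughly as follows. This is an
illustration only, not the code from this commit: asciiFold is a
hypothetical stand-in for sanitize.Accents that covers just the four
letters named here, and indexForms and foldQuery are made-up names
that mirror the roles of normalizeForFTS and buildFTS5Query.

```go
package main

import (
	"fmt"
	"strings"
)

// asciiFold is a hypothetical stand-in for sanitize.Accents. It handles
// only the letters mentioned in the commit message.
var asciiFold = strings.NewReplacer(
	"ø", "o", "Ø", "O",
	"æ", "ae", "Æ", "AE",
	"œ", "oe", "Œ", "OE",
	"ß", "ss",
)

// indexForms (assumed name) builds the text to index: each word as typed,
// followed by its ASCII form when the two differ.
func indexForms(text string) string {
	var out []string
	for _, w := range strings.Fields(text) {
		out = append(out, w)
		// Transliterate once per word and reuse the result.
		if t := asciiFold.Replace(w); t != w {
			out = append(out, t)
		}
	}
	return strings.Join(out, " ")
}

// foldQuery (assumed name) transliterates the unquoted parts of a query
// and leaves double-quoted phrases exactly as typed.
func foldQuery(q string) string {
	parts := strings.Split(q, `"`)
	for i := range parts {
		if i%2 == 0 { // even segments sit outside quotes
			parts[i] = asciiFold.Replace(parts[i])
		}
	}
	return strings.Join(parts, `"`)
}

func main() {
	fmt.Println(indexForms("Øystein Straße"))
	fmt.Println(foldQuery(`Øystein "Blå Øyne"`))
}
```

Because both sides go through the same fold, an index entry for
"Øystein" also carries "Oystein", which is what the query side now
produces. The sketch treats everything after an unbalanced quote as
quoted; the real query builder may handle that case differently.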

Existing libraries pick up the fix as records are re-scanned; users
can trigger a manual full rescan to refresh older entries.

* fix(search): cache transliteration and add ß/quoted-phrase test coverage

Address review feedback: call sanitize.Accents once per word and reuse
the result for both the punct-stripped and accent-only paths. Add missing
test entries for ß→ss transliteration and quoted Unicode phrase preservation.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-25 20:27:38 -04:00