vale: overhaul ruleset and clean up vocab

The vale ruleset was producing ~1,600 violations, a meaningful fraction
of which were noise from a hard-to-maintain vocabulary file, sprawling
shortcode-ignore regex, and rule patterns with known false positive
modes. This commit cleans up all three.

Vocabulary fixes (accept.txt):
- Fix the case-insensitive catch-all `(?i)[A-Z]{2,}'?s` which was
  matching every plural word and suppressing other rule checks on them
- Fix wrong-cased canonical entries that caused cascading Vale.Terms
  false positives: Duckduckgo→DuckDuckGo, fluentd→Fluentd, [Rr]eadme
  removed (README is correct), VPNKit, AppArmor, OpenSSL, etc.
- Remove `Mac` (Apple brand) — conflicted with `MAC` (address) under
  Vale's case-insensitive vocab matching; both are in the base dictionary
- Resolve duplicate case entries (uncaptured/Uncaptured, etc.) by using
  narrow character-class regex like `[Uu]ncaptured` to opt out of Terms
  enforcement
- Add common tech terms missing from the vocab (Vitest, Deno, Pinecone,
  Wolfi, Streamlit, ESLint, Kustomize, HAProxy, Qdrant, etc.)
- Add common acronyms and abbreviations as canonical uppercase (URL,
  JSON, TCP, UID, SHA, CMD, VM, GPG, KVM)

Rule fixes:
- TokenIgnores/BlockIgnores collapsed from 7 patterns to 2 (one for
  inline shortcodes, one for multi-line block shortcodes)
- Docker.Capitalization regex tightened to skip `docker` preceded or
  followed by `-`/`.` (image names, URLs) and word chars after
  (`dockerd`, `Dockerfile`)
- Docker.RecommendedWords `vs: versus` now excludes "VS Code" and "vs."
  (heading abbreviation)
- Docker.We becomes case-sensitive so `US` (country) isn't flagged as
  the pronoun `us`
- Docker.VersionText requires X.Y minimum to avoid matching port numbers
  like "1025 or higher"
- Docker.Units drops KB→kB swap (KB is conventional in user-facing docs)

Section overrides:
- Brace-expansion glob `[{a,b,c}]` consolidates repeated path lists
- Release notes and previous-versions content fully disabled (each rule
  listed explicitly because Vale's BasedOnStyles= empty in a section
  doesn't actually disable rules, despite the docs)
- Reference content disables Vale.Spelling/Vale.Terms/Docker.Capitalization

Net result: 1,429 → ~1,070 violations, with spot-checks confirming the
remaining violations are real (CLI commands without backticks, first-
person plurals, `allows`→`lets`, etc.) and not artifacts of the rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
David Karlsson
2026-05-12 12:37:08 +02:00
parent dc9e5fcb14
commit f6f287b6d4
8 changed files with 130 additions and 51 deletions
+21 -32
View File
@@ -3,45 +3,34 @@ MinAlertLevel = suggestion
IgnoredScopes = text.frontmatter, code, tt, b, strong, i, a
Vocab = Docker
# Disable rules for genered content
[content/reference/**/**.md]
Vale.Spelling = NO
Vale.Terms = NO
# Generated reference content — disable rules that don't apply
[content/reference/**]
Docker.Capitalization = NO
[content/manuals/*/release-notes/*.md]
Vale.Spelling = NO
Vale.Terms = NO
Docker.Capitalization = NO
Docker.We = NO
[content/manuals/*/release-notes.md]
Vale.Spelling = NO
Vale.Terms = NO
Docker.Capitalization = NO
Docker.We = NO
[content/contribute/*.md]
Vale.Spelling = NO
Vale.Terms = NO
# Skip release notes and old desktop changelog content entirely
[{content/manuals/*/release-notes.md,content/manuals/*/release-notes/**,content/manuals/desktop/previous-versions/**}]
Docker.Avoid = NO
Docker.Capitalization = NO
Docker.Exclamation = NO
[content/manuals/desktop/previous-versions/*.md]
Docker.Forbidden = NO
Docker.GenericCTA = NO
Docker.HeadingPunctuation = NO
Docker.ListComma = NO
Docker.OxfordComma = NO
Docker.RecommendedWords = NO
Docker.Spacing = NO
Docker.Terms = NO
Docker.URLFormat = NO
Docker.Units = NO
Docker.VersionText = NO
Docker.We = NO
Vale.Spelling = NO
Vale.Repetition = NO
Vale.Terms = NO
Docker.Capitalization = NO
Docker.Exclamation = NO
[*.md]
BasedOnStyles = Vale, Docker
# Exclude `{{< ... >}}`, `{{% ... %}}`, [Who]({{< ... >}})
TokenIgnores = ({{[%<] .* [%>]}}.*?{{[%<] ?/.* [%>]}}), \
(\[.+\]\({{< .+ >}}\)), \
[^\S\r\n]({{[%<] \w+ .+ [%>]}})\s, \
[^\S\r\n]({{[%<](?:/\*) .* (?:\*/)[%>]}})\s, \
(?sm)({{[%<] .*?\s[%>]}})
# Exclude `{{< myshortcode `This is some <b>HTML</b>, ... >}}`
BlockIgnores = (?sm)^({{[%<] \w+ [^{]*?\s[%>]}})\n$, \
(?s) *({{< highlight [^>]* ?>}}.*?{{< ?/ ?highlight >}})
BasedOnStyles = Docker, Vale
TokenIgnores = ({{[%<][^}]+[%>]}})
BlockIgnores = (?m)^[ \t]*({{[%<][^}]+[%>]}})[ \t]*$
+19
View File
@@ -80,6 +80,25 @@ introduced, which is permanently true.
- Use lowercase "config" in prose — `vale.Terms` flags a capital-C "Config"
### Updating the vocabulary
If Vale flags a legitimate tech term, product name, or compound identifier
as a misspelling, add it to `_vale/config/vocabularies/Docker/accept.txt`.
This is optional — only update when a real new term is missing, not to
silence individual violations.
- Use the canonical form for case-sensitive product names (`PyTorch`,
`GitHub`, `Kubernetes`, `BuildKit`). `Vale.Terms` enforces that exact
case across the docs.
- Use `[Aa]bcd` character-class regex for words that legitimately appear
in multiple cases (e.g., sentence-starting capitalization, or a name
that's also a generic noun). This covers spelling without enforcing
a single canonical form.
- Avoid broad regex patterns — entries that match many words at once
(especially with `(?i)`) suppress other rule checks on every match.
- Don't add a wrong-cased entry to silence one false positive — it
cascades into `Vale.Terms` violations on every correct usage.
## Alpine.js patterns
Do not combine Alpine's `x-show` with the HTML `hidden` attribute on the
+1 -1
View File
@@ -7,4 +7,4 @@ action:
params:
- Docker
tokens:
- '[^\[/]docker[^/]'
- '[^\[/\-.]docker[^/\-.\w]'
+1 -1
View File
@@ -37,5 +37,5 @@ swap:
repo: repository
scroll: navigate
url: URL
vs: versus
'\bvs\b(?!\.|\s+Code)': versus
wish: want
+1 -1
View File
@@ -3,7 +3,7 @@ message: "Use '%s' instead of '%s'"
link: https://docs.docker.com/contribute/style/recommended-words/
level: error
swap:
(?:kilobytes?|KB): kB
kilobytes?: kB
gigabytes?: GB
megabytes?: MB
petabytes?: PB
+3 -3
View File
@@ -4,9 +4,9 @@ link: https://docs.docker.com/contribute/style/recommended-words/#later
scope: raw
raw:
- '\bv?'
- '(?P<major>0|[1-9]\d*)\.?'
- '(?P<minor>0|[1-9]\d*)?\.?'
- '(?P<patch>0|[1-9]\d*)?'
- '(?P<major>0|[1-9]\d*)\.'
- '(?P<minor>0|[1-9]\d*)'
- '(?:\.(?P<patch>0|[1-9]\d*))?'
- '(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?'
- '(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?'
- '\b (and|or) (higher|above)'
+6 -5
View File
@@ -1,10 +1,11 @@
extends: existence
message: "Avoid using first-person plural like '%s'."
level: warning
ignorecase: true
ignorecase: false
tokens:
- we
- we'(?:ve|re)
- ours?
- '[Ww]e'
- "[Ww]e'(?:ve|re)"
- '[Oo]urs?'
- us
- let's
- Us
- "[Ll]et's"
+78 -8
View File
@@ -81,7 +81,7 @@ dockerignore
Dockerize
Dockerized
Dockerizing
Duckduckgo
DuckDuckGo
Entra
EPERM
ESXi
@@ -92,7 +92,7 @@ Fargate
Fedora
firewalld
Flink
fluentd
Fluentd
g?libc
GeoNetwork
GGUF
@@ -145,7 +145,6 @@ logback
Loggly
Logstash
lookup
Mac
macOS
macvlan
Mail(chimp|gun)
@@ -223,7 +222,7 @@ stdin
stdout
subfolder
sudo
subvolume
[Ss]ubvolume
Syft
syntaxes
Sysbox
@@ -243,8 +242,7 @@ Ubuntu
ufw
uv
umask
uncaptured
Uncaptured
[Uu]ncaptured
unconfigured
undeterminable
Unix
@@ -253,7 +251,7 @@ unmanaged
upsert
Visual Studio Code
VMware
vpnkit
VPNKit
vSphere
Vulkan
Vue
@@ -312,7 +310,7 @@ Zsh
[Pp]roxied
[Pp]roxying
[pP]yright
[Rr]eadme
README
[Rr]eal-time
[Rr]egex(es)?
[Rr]untimes?
@@ -346,3 +344,75 @@ Zsh
[Ee]vals?
[Ll]abspaces?
[Uu]nsloth
AppArmor
[Aa]utolocking
[Bb]oolean
[Bb]reakpoint
CMD
[Dd]eno
[Dd]evicemapper
[Dd]ockerbot
[Dd]ockerd
docker_ce_version
[Ee]nv
ESLint
gzip
HAProxy
[Ii]nlined
journald
JSON
keypair
Kustomize
latest_engine_api_version
[Mm]iddleware
[Nn]onroot
param
[Pp]erformant
Pinecone
Qdrant
[Rr]outable
[Ss]emver
setuid
SHA
spaCy
src
stderr
Streamlit
[Ss]ubcommand
syslog
TCP
thinpool
[Tt]runked
UID
[Uu]nencrypted
[Uu]nmount
URL
VM
Vite
Vitest
Wolfi
[Aa]utoextend
[Bb]ackoff
[Bb]crypt
[Bb]undler
copy_up
[Dd]eclaratively
[Dd]eduplication
dnf
docker_gwbridge
[Gg]lobbing
GPG
KVM
[Mm]ulticast
[Mm]isconfigured
oldstable
OpenSSL
[Ss]tringified
SwarmKit
[Uu]ncomment
Buildkite
Graylog
Libnetwork
Nodemon
PHPUnit
PyTorch