Choosing Readability, Conciseness, and Token Budget Metrics
Trade-offs and threshold guidance for readability, structure, length, and token budgets.
# Scope and disclaimer
This guide compares existing mdsmith rules that touch readability and length with token budget awareness. Any metric scores and trade-offs below are illustrative and focus on the rules that are currently implemented.
# What the current rules measure
| Rule | Measures | Default | What it misses |
|---|---|---|---|
| MDS023 `paragraph-readability` | Complexity using ARI (characters per word, words per sentence) | `max-index: 14.0`, `min-words: 20` | Wordiness and filler; short but dense paragraphs can be skipped |
| MDS024 `paragraph-structure` | Shape and length of paragraphs (sentences per paragraph, words per sentence) | `max-sentences: 6`, `max-words-per-sentence: 40` | Verbosity that fits within limits; dense but short prose |
| MDS022 `max-file-length` | Lines per file | `max: 300` | Token load and dense paragraphs |
| MDS028 `token-budget` | Estimated token count per file (heuristic or tokenizer mode) | `max: 8000`, `mode: heuristic` | Exact model token parity; tokenizer mode is still approximate |
| MDS001 `line-length` | Characters per line | `max: 80` | Verbosity and paragraph complexity |
# Planned metrics (not implemented)
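No additional metrics are planned at this time. Conciseness scoring appears in this guide only as an illustrative, proposed heuristic, not as an implemented rule.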
# What token budget awareness is trying to measure
Token budget awareness (MDS028) focuses on file-level size in tokens rather than lines or characters. It protects LLM context windows by warning when a file exceeds a configurable budget. In `heuristic` mode, the rule multiplies the word count by a tokens-per-word factor, which is fast but approximate. In `tokenizer` mode, it uses tokenizer-aware splitting with a selected encoding for a closer estimate.
Tokenization happens before inference, so every LLM reads its input as tokens. A token estimate therefore matches the model's real count only when it uses the same tokenizer as the target model. The trade-off is performance: exact tokenization is slower and needs vocabulary assets, while heuristic estimates are fast and model-agnostic.
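As a rough illustration of the heuristic mode described above, the following sketch multiplies a whitespace word count by a tokens-per-word factor. The function names are hypothetical, the 1.33 factor is borrowed from the examples later in this guide, and the 8000 default mirrors the MDS028 default in the table above. This is not mdsmith's implementation, which may count words differently.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Estimate tokens as (whitespace-separated words) x (tokens per word)."""
    word_count = len(text.split())
    return round(word_count * tokens_per_word)


def exceeds_budget(text: str, max_tokens: int = 8000,
                   tokens_per_word: float = 1.33) -> bool:
    """Return True when the estimate is above the configured budget."""
    return estimate_tokens(text, tokens_per_word) > max_tokens


if __name__ == "__main__":
    sample = "word " * 2800  # stand-in for a 2,800-word file
    print(estimate_tokens(sample), exceeds_budget(sample, max_tokens=2000))
```

The estimate needs no vocabulary files, which is why it is fast, but it cannot tell prose from symbol-heavy content that a real tokenizer would split into many more tokens.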
# Example paragraphs (paragraph-level metrics)
# Example A
In order to make sure that we are all on the same page, it is important to note that the system is, in most cases, able to handle requests pretty well, and this is something we should keep in mind.
# Example B
The synchronization algorithm enforces linearizability via per-shard lease epochs and monotonic commit indices.
# Example C
We should update the onboarding guide so that new contributors can quickly find the build steps, understand the release checklist, and avoid common pitfalls without needing to ask in chat, which will reduce interruptions for everyone.
# Example D
The plan is straightforward. We will add a new rule. It will report issues. It will include guidance. It will ship this week. It will help teams. It will reduce noise. It will keep docs short.
# Example E
Basically, we just want to make sure the plan is pretty clear to everyone. It is really just a simple update, and we might adjust it later.
# How the rules score these examples
Notes: ARI values use mdsmith’s current formula. MDS023 skips paragraphs under `min-words`. MDS024 flags when sentences or words exceed limits. Conciseness scores below are illustrative heuristics, not an implemented rule. Token budget awareness is file-level; see the token budget examples right after this table.
| Example | Words | Sentences | ARI | MDS023 result | MDS024 result | Conciseness score (illustrative) |
|---|---|---|---|---|---|---|
| A | 40 | 1 | 16.6 | Fail (16.6 > 14.0) | Pass | 36.2 |
| B | 13 | 1 | 20.2 | Skipped (< 20 words) | Pass | 84.6 |
| C | 36 | 1 | 22.1 | Fail (22.1 > 14.0) | Pass | 63.9 |
| D | 36 | 8 | 0.3 | Pass | Fail (8 > 6 sentences) | 50.0 |
| E | 27 | 2 | 4.3 | Pass | Pass | 50.0 |
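The ARI column for Examples A, B, and D can be reproduced with the standard Automated Readability Index formula. The sketch below assumes that only letters and digits count as characters, that words are whitespace-separated, and that sentences end at `.`, `!`, or `?`. mdsmith's actual counting rules may differ, and the function names here are hypothetical.

```python
import re


def ari(text: str) -> float:
    """Standard ARI: 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    words = text.split()
    chars = sum(ch.isalnum() for ch in text)  # punctuation is not counted
    sentences = max(1, len(re.findall(r"[.!?]+(?:\s|$)", text)))
    return 4.71 * chars / len(words) + 0.5 * len(words) / sentences - 21.43


def readability_result(text: str, max_index: float = 14.0,
                       min_words: int = 20) -> str:
    """Mimic the MDS023 outcome: skip short paragraphs, else compare ARI."""
    if len(text.split()) < min_words:
        return "skipped"
    return "fail" if ari(text) > max_index else "pass"


if __name__ == "__main__":
    example_b = ("The synchronization algorithm enforces linearizability via "
                 "per-shard lease epochs and monotonic commit indices.")
    print(round(ari(example_b), 1), readability_result(example_b))
```

Example B shows why `min-words` matters: it has the second-highest ARI in the table, yet the rule never evaluates it because it is only 13 words long.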
# Token budget examples (file-level)
These examples assume a tokens-per-word of 1.33 and a budget of 2,000 tokens.
- File F: 2,800 words -> ~3,724 tokens, flagged by the token budget even if its line count is below `max-file-length`.
- File G: 1,200 words with heavy code blocks -> estimate ~1,596 tokens, but actual tokens could be higher; `tokens-per-word` tuning or code weighting may be needed.
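One way to picture the code weighting mentioned for File G is to count words inside fenced code blocks with a higher factor than prose. The sketch below is purely hypothetical: the function name, the 2.0 code factor, and the simple fence detection are assumptions for illustration, not an mdsmith feature.

```python
FENCE = "`" * 3  # a Markdown code fence marker, built to avoid nesting fences


def weighted_token_estimate(markdown: str, prose_factor: float = 1.33,
                            code_factor: float = 2.0) -> int:
    """Estimate tokens with separate factors for prose and fenced code."""
    prose_words = 0
    code_words = 0
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # fence lines themselves are not counted
            continue
        count = len(line.split())
        if in_fence:
            code_words += count
        else:
            prose_words += count
    return round(prose_words * prose_factor + code_words * code_factor)


if __name__ == "__main__":
    doc = "\n".join(["one two three", FENCE, "x = 1", FENCE])
    print(weighted_token_estimate(doc))
```

A suitable code factor would have to be calibrated against a real tokenizer on your own files; the value used here is only a placeholder.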
# Trade-offs by metric
| Metric | Strengths | Risks |
|---|---|---|
| Readability (MDS023) | Encourages simple, broadly accessible prose | Penalizes technical terms; misses wordiness; can skip short dense paragraphs |
| Structure (MDS024) | Enforces consistent paragraph shape with low false positives | Does not address filler or redundancy |
| Length (MDS022, MDS001) | Prevents runaway size and formatting drift | Poor proxy for token load or verbosity |
| Token budget (MDS028) | Directly targets context window size | Estimation is noisy; code blocks and symbols can skew counts |
| Conciseness (proposed) | Targets verbosity and token waste | Heuristic; can penalize necessary qualifiers or legal language |
# How to choose limits
- Start with defaults for MDS023 and MDS024 to establish baseline structure and readability.
- Sample a representative set of documents and collect results before tightening thresholds.
- For token budgets, pick a target based on your context window and allocate a safe share per document (for example, reserve 20 to 30 percent of a prompt budget for a single doc). Choose an initial `tokens-per-word` value and adjust for code-heavy files.
- For conciseness scoring, set an initial threshold that flags only the worst 10 to 20 percent of paragraphs, then adjust.
- Use path-based overrides to reflect different document types, such as onboarding guides vs architecture specs.
- Re-evaluate thresholds after major content changes or when onboarding new teams.
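The two numeric steps in the list above, sharing out a prompt budget and flagging only the worst paragraphs, can be sketched as follows. The function names are hypothetical, and the sketch assumes that a lower conciseness score means wordier prose, as the illustrative scores in the examples table suggest.

```python
def per_doc_budget(context_window: int, share: float = 0.25) -> int:
    """Token budget for one document as a share of the prompt budget."""
    return int(context_window * share)


def worst_fraction_threshold(scores: list[float],
                             fraction: float = 0.15) -> float:
    """Return the score at or below which roughly `fraction` of paragraphs fall.

    Assumes lower scores are worse, so paragraphs scoring at or below
    the returned value are the ones to flag.
    """
    ordered = sorted(scores)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]


if __name__ == "__main__":
    sampled = [36.2, 84.6, 63.9, 50.0, 50.0]  # illustrative scores, A to E
    print(per_doc_budget(32000), worst_fraction_threshold(sampled, 0.2))
```

With a 32,000-token prompt budget and a 25 percent share, the per-document budget comes to 8,000 tokens. A real calibration would use scores from a representative sample of documents rather than five paragraphs.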
# When to use one measure instead of many
If you need a single metric to minimize complexity, choose the one that best matches your risk:
- Choose MDS024 `paragraph-structure` when you want predictable, low-noise enforcement.
- Choose MDS023 `paragraph-readability` when broad comprehension is the highest priority.
- Choose MDS028 `token-budget` when context window limits are the dominant constraint and you want a file-level guardrail.
- Choose conciseness scoring when token budget and drift are the main risks and you accept heuristic trade-offs.
# Recommendation for mdsmith users
Start with MDS023 and MDS024 enabled. Use MDS022 and MDS001 as baseline file and line controls. Add MDS028 when context limits matter, then add conciseness scoring only after calibrating its thresholds and confirming it improves signal without harming necessary precision.