    Choosing Readability, Conciseness, and Token Budget Metrics

    Trade-offs and threshold guidance for readability, structure, length, and token budgets.

    # Scope and disclaimer

    This guide compares existing mdsmith rules that touch readability and length with token budget awareness. Any metric scores and trade-offs below are illustrative and focus on the rules that are currently implemented.

    # What the current rules measure

    | Rule | Measures | Default | What it misses |
    | --- | --- | --- | --- |
    | MDS023 paragraph-readability | Complexity using ARI (characters per word, words per sentence) | max-index: 14.0, min-words: 20 | Wordiness and filler; short but dense paragraphs can be skipped |
    | MDS024 paragraph-structure | Shape and length of paragraphs (sentences per paragraph, words per sentence) | max-sentences: 6, max-words-per-sentence: 40 | Verbosity that fits within limits; dense but short prose |
    | MDS022 max-file-length | Lines per file | max: 300 | Token load and dense paragraphs |
    | MDS028 token-budget | Estimated token count per file (heuristic or tokenizer mode) | max: 8000, mode: heuristic | Exact model token parity; tokenizer mode is still approximate |
    | MDS001 line-length | Characters per line | max: 80 | Verbosity and paragraph complexity |

    # Planned metrics (not implemented)

    No additional metrics are planned at this time.

    # What token budget awareness is trying to measure

    Token budget awareness (MDS028) measures file-level size in tokens rather than lines or characters. It protects LLM context windows by warning when a file exceeds a configurable budget. The rule has two modes: heuristic mode multiplies the word count by a tokens-per-word factor, which is fast but approximate; tokenizer mode uses tokenizer-aware splitting with a selected encoding for a closer estimate.

    Tokenization happens before inference, so every LLM reads its input as tokens. A token budget is therefore exact only when it is computed with the same tokenizer as the target model. The trade-off is performance: exact tokenization is slower and needs vocabulary assets, while heuristic estimates are fast and model-agnostic.
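The heuristic mode described above amounts to a word count scaled by a constant. The Python sketch below illustrates the idea and is not mdsmith's implementation: the function names are hypothetical, words are assumed to be whitespace-delimited, rounding to the nearest integer is an arbitrary choice, and 1.33 is used only because it is the factor in this guide's file-level examples.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Approximate a token count as word count times a fixed factor."""
    words = len(text.split())  # whitespace-delimited words (assumption)
    return round(words * tokens_per_word)


def exceeds_budget(text: str, max_tokens: int = 8000,
                   tokens_per_word: float = 1.33) -> bool:
    """True when the estimate is above the configured budget."""
    return estimate_tokens(text, tokens_per_word) > max_tokens


print(estimate_tokens("one two three"))  # 4
```

Because the factor is a single constant, the estimate cannot distinguish prose from code or symbol-heavy text, which is where tokenizer mode is expected to be closer.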

    # Example paragraphs (paragraph-level metrics)

    # Example A

    In order to make sure that we are all on the same page, it is important to note that the system is, in most cases, able to handle requests pretty well, and this is something we should keep in mind.

    # Example B

    The synchronization algorithm enforces linearizability via per-shard lease epochs and monotonic commit indices.

    # Example C

    We should update the onboarding guide so that new contributors can quickly find the build steps, understand the release checklist, and avoid common pitfalls without needing to ask in chat, which will reduce interruptions for everyone.

    # Example D

    The plan is straightforward. We will add a new rule. It will report issues. It will include guidance. It will ship this week. It will help teams. It will reduce noise. It will keep docs short.

    # Example E

    Basically, we just want to make sure the plan is pretty clear to everyone. It is really just a simple update, and we might adjust it later.

    # How the rules score these examples

    Notes: ARI values use mdsmith’s current formula. MDS023 skips paragraphs with fewer than min-words words. MDS024 flags a paragraph when its sentence count or its words per sentence exceed the limits. Conciseness scores below are illustrative heuristics, not an implemented rule. Token budget awareness is file-level; see the token budget examples in the next section.

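    | Example | Words | Sentences | ARI | MDS023 result | MDS024 result | Conciseness score (illustrative) |
    | --- | --- | --- | --- | --- | --- | --- |
    | A | 40 | 1 | 16.6 | Fail (16.6 > 14.0) | Pass | 36.2 |
    | B | 13 | 1 | 20.2 | Skipped (< 20 words) | Pass | 84.6 |
    | C | 36 | 1 | 22.1 | Fail (22.1 > 14.0) | Pass | 63.9 |
    | D | 36 | 8 | 0.3 | Pass | Fail (8 > 6 sentences) | 50.0 |
    | E | 26 | 2 | 4.6 | Pass | Pass | 50.0 |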
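The ARI values for Examples A and D can be reproduced with the standard Automated Readability Index formula, 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43, counting only letters and digits as characters. The Python sketch below is an illustration under those assumptions, not mdsmith's code: the function names are hypothetical, words are split on whitespace, sentences are split after terminal punctuation, and mdsmith's own tokenization may differ on edge cases.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split after ., !, or ? when followed by whitespace (naive)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def ari(text: str) -> float:
    """Standard ARI over one paragraph; 0.0 for empty input."""
    words = text.split()
    sentences = split_sentences(text)
    if not words or not sentences:
        return 0.0
    chars = sum(ch.isalnum() for ch in text)  # letters and digits only
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def check(text, max_index=14.0, min_words=20,
          max_sentences=6, max_words_per_sentence=40):
    """Return (readability, structure) results using the default limits."""
    words = text.split()
    sentences = split_sentences(text)
    if len(words) < min_words:
        readability = "skipped"
    else:
        readability = "fail" if ari(text) > max_index else "pass"
    longest = max((len(s.split()) for s in sentences), default=0)
    too_long = len(sentences) > max_sentences or longest > max_words_per_sentence
    return readability, "fail" if too_long else "pass"


EXAMPLE_A = (
    "In order to make sure that we are all on the same page, it is important "
    "to note that the system is, in most cases, able to handle requests "
    "pretty well, and this is something we should keep in mind."
)
EXAMPLE_D = (
    "The plan is straightforward. We will add a new rule. It will report "
    "issues. It will include guidance. It will ship this week. It will help "
    "teams. It will reduce noise. It will keep docs short."
)

print(round(ari(EXAMPLE_A), 1), check(EXAMPLE_A))  # 16.6 ('fail', 'pass')
print(round(ari(EXAMPLE_D), 1), check(EXAMPLE_D))  # 0.3 ('pass', 'fail')
```

Example D shows why the two rules complement each other: very short sentences drive the ARI close to zero, so only the sentence-count limit catches the paragraph.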

    # Token budget examples (file-level)

    These examples assume a tokens-per-word factor of 1.33 and a budget of 2,000 tokens.

    • File F: 2,800 words -> ~3,724 estimated tokens, flagged by the token budget even if the line count is below max-file-length.
    • File G: 1,200 words with heavy code blocks -> ~1,596 estimated tokens, which is under budget, but the actual count could be higher; tokens-per-word tuning or code weighting may be needed.
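The arithmetic behind these two examples is one multiplication followed by a comparison against the budget. The short Python sketch below uses the factor of 1.33 and the budget of 2,000 from the examples; the factor of 2.0 for code-heavy text is a made-up value, chosen only to show how tuning can change the outcome for File G.

```python
BUDGET = 2000  # tokens, as in the examples


def estimated_tokens(words: int, tokens_per_word: float) -> int:
    """Heuristic estimate: word count times a fixed factor."""
    return round(words * tokens_per_word)


file_f = estimated_tokens(2800, 1.33)
file_g = estimated_tokens(1200, 1.33)
file_g_tuned = estimated_tokens(1200, 2.0)  # hypothetical code-heavy factor

print(file_f, file_f > BUDGET)              # 3724 True
print(file_g, file_g > BUDGET)              # 1596 False
print(file_g_tuned, file_g_tuned > BUDGET)  # 2400 True
```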

    # Trade-offs by metric

    | Metric | Strengths | Risks |
    | --- | --- | --- |
    | Readability (MDS023) | Encourages simple, broadly accessible prose | Penalizes technical terms; misses wordiness; can skip short dense paragraphs |
    | Structure (MDS024) | Enforces consistent paragraph shape with low false positives | Does not address filler or redundancy |
    | Length (MDS022, MDS001) | Prevents runaway size and formatting drift | Poor proxy for token load or verbosity |
    | Token budget (MDS028) | Directly targets context window size | Estimation is noisy; code blocks and symbols can skew counts |
    | Conciseness (proposed) | Targets verbosity and token waste | Heuristic; can penalize necessary qualifiers or legal language |

    # How to choose limits

    1. Start with defaults for MDS023 and MDS024 to establish baseline structure and readability.
    2. Sample a representative set of documents and collect results before tightening thresholds.
    3. For token budgets, pick a target based on your context window and allocate a safe share per document (for example, reserve 20 to 30 percent of a prompt budget for a single doc). Choose an initial tokens-per-word value and adjust for code-heavy files.
    4. For conciseness scoring, set an initial threshold that flags only the worst 10 to 20 percent of paragraphs, then adjust.
    5. Use path-based overrides to reflect different document types, such as onboarding guides vs architecture specs.
    6. Re-evaluate thresholds after major content changes or when onboarding new teams.
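Steps 3 and 4 both reduce to small calculations. The Python sketch below shows one way to do them, under stated assumptions: the 8,000-token prompt budget and 25 percent share are example inputs, lower conciseness scores are taken to mean wordier paragraphs (conciseness scoring is not an implemented rule), and the function names are hypothetical.

```python
def per_document_budget(prompt_budget: int, share: float) -> int:
    """Token budget for one document as a share of the prompt budget."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return int(prompt_budget * share)


def threshold_for_worst(scores: list[float], fraction: float) -> float:
    """Score of the last paragraph inside the worst `fraction` of the sample."""
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores)  # lowest (assumed worst) first
    count = max(1, int(len(ranked) * fraction))
    return ranked[count - 1]


print(per_document_budget(8000, 0.25))  # 2000

# Illustrative conciseness scores for Examples A to E from this guide.
sample = [36.2, 84.6, 63.9, 50.0, 50.0]
print(threshold_for_worst(sample, 0.2))  # 36.2
```

Flagging paragraphs that score at or below the returned value catches roughly the requested fraction of the sample; tied scores can push the flagged share slightly higher, so a real calibration run would use a larger sample than five paragraphs.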

    # When to use one measure instead of many

    If you need a single metric to minimize complexity, choose the one that best matches your risk:

    • Choose MDS024 paragraph-structure when you want predictable, low-noise enforcement.
    • Choose MDS023 paragraph-readability when broad comprehension is the highest priority.
    • Choose MDS028 token-budget when context window limits are the dominant constraint and you want a file-level guardrail.
    • Choose conciseness scoring when token budget and drift are the main risks and you accept heuristic trade-offs.

    # Recommendation for mdsmith users

    Start with MDS023 and MDS024 enabled. Use MDS022 and MDS001 as baseline file and line controls. Add MDS028 when context limits matter, then add conciseness scoring only after calibrating its thresholds and confirming it improves signal without harming necessary precision.