Skip to content

C# modifier order matters: readonly static field (:71 requires static\s+readonly) and static public method (:94 requires visibility before modifiers) are silently dropped while canonical orderings are captured #355

@Widthdom

Description

@Widthdom

Summary

The C# spec allows type-member modifiers to appear in any order — public static readonly int X; and public readonly static int X; are both syntactically valid. SymbolExtractor's static-readonly-field regex at :71 hard-codes the sequence [(visibility) (new)? static readonly], so the moment the user writes readonly before static, the regex fails and the field is silently dropped from the symbol index. The same rigidity affects more exotic but legal orderings like readonly new static and new readonly static.

A second, related, modifier-order failure on the method regex at :94 drops methods that use the legacy static public convention (visibility after modifier) — common in older codebases, in Main declarations, and in some style guides where static is seen as the more important annotation.

public class Svc
{
    // Captured — canonical order
    public static readonly int A = 1;

    // Dropped — :71 requires static then readonly in that order
    public readonly static int B = 2;

    // Dropped — modifier order [readonly new static] is legal but :71 requires [(new)? static readonly]
    public readonly new static int D = 4;

    // Dropped — :94 requires visibility before modifiers, but `static public` puts them in the other order
    static public int F() => 0;
}

definition B, definition D, and definition F all return zero hits, even though all three declarations are perfectly legal C#. inspect Svc under-reports the static surface, and unused / hotspots --kind function falsely conclude the dropped members are absent.

This is distinct from:

Repro

CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-modifier-order
cat > /tmp/dogfood/cs-modifier-order/M.cs <<'EOF'
namespace CsModifierOrder;

public class Svc
{
    // Canonical order — captured
    public static readonly int A = 1;

    // Reversed order — DROPPED (:71 requires static then readonly)
    public readonly static int B = 2;

    // Canonical order with `new` — captured
    public new static readonly int C = 3;

    // Reversed order with `new` — DROPPED
    public readonly new static int D = 4;

    // const — captured (canonical)
    public const int E = 5;

    // Method with visibility-after-modifier — DROPPED (:94 requires visibility before modifier)
    static public int F() => 0;

    // Property with required+override — both orders captured (:100 modifier slot uses `*`)
    public required override int P { get; set; }
    public override required int Q { get; set; }
}

public abstract class Base
{
    public virtual int P { get; set; }
    public virtual int Q { get; set; }
}
EOF
"$CDIDX" index /tmp/dogfood/cs-modifier-order --rebuild
"$CDIDX" symbols --db /tmp/dogfood/cs-modifier-order/.cdidx/codeindex.db

Observed:

function   A                                        M.cs:6     ← captured
function   C                                        M.cs:12    ← captured
function   E                                        M.cs:18    ← captured (const)
property   P                                        M.cs:23    ← captured (required override)
property   Q                                        M.cs:24    ← captured (override required)
class      Svc                                      M.cs:3-25
class      Base                                     M.cs:27-31
property   P                                        M.cs:29
property   Q                                        M.cs:30
namespace  CsModifierOrder                          M.cs:1

Missing:

  • B at M.cs:9 (public readonly static int B = 2;) — :71's rigid static\s+readonly order rejects.
  • D at M.cs:15 (public readonly new static int D = 4;) — :71's (?:(?:new)\s+)?static\s+readonly rejects — new is required to be between (visibility) and static, not after readonly.
  • F at M.cs:21 (static public int F() => 0;) — :94's (?:visibility\s+)?(?:modifier\s+)* requires visibility before modifier; static matched first, then public is consumed as returnType (greedy) and int becomes name, then \s*\( expects ( but sees F( after int's capture — actually let me re-trace: visibility group matches nothing, modifier loop eats static, returnType matches public, name matches int, \s*(?:<[^>]+>\s*)?\( requires ( but sees F( after int\s* eats space, then expects ( but sees F → fails. No match.

Note the property regex :100 correctly handles both required override and override required because its modifier list (?:(?:static|virtual|override|abstract|sealed|new|required)\s+)* uses * and includes both keywords; the regex is order-free for the modifier slot. The static-readonly-field regex :71 is the lone holdout that hard-codes a specific sequence.

Suspected root cause (from reading the source)

Row A — static readonly field at src/CodeIndex/Indexer/SymbolExtractor.cs:71:

new("function",  new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:new)\s+)?"
  + @"static\s+readonly\s+"
  + @"(?<returnType>[\w?.<>\[\],:\s]+?)\s+(?<name>\w+)\s*[=;]",
    RegexOptions.Compiled),
    BodyStyle.None, "visibility", "returnType"),

The structure (?:(?:new)\s+)?static\s+readonly\s+ enforces:

  • new must come immediately before static (or be absent).
  • static must come immediately before readonly (no other words between).
  • readonly cannot precede static.

C# allows any order: static readonly, readonly static, new static readonly, new readonly static, readonly new static, static new readonly. Today only the canonical [(visibility)? (new)? static readonly] is matched.

Trace for public readonly static int B = 2;:

  1. ^\s* eats whitespace.
  2. visibility matches public + space.
  3. (?:(?:new)\s+)?new not present, zero iterations.
  4. static\s+readonly\s+ — cursor at readonly static; literal static expected, but readonly is next → fails.
  5. Backtrack: visibility could be absent. Then (?:(?:new)\s+)? — try new: not present. Then static: expected, but public is next → fails.

No other row rescues:

  • :69 (const) requires const keyword → fails.
  • :94 (method) requires \s*\( after the name → no ( on the line → fails.
  • :100/:103 (property) require { or => → neither present → fails.

Row B — method at src/CodeIndex/Indexer/SymbolExtractor.cs:94:

new("function",  new Regex(
    @"^\s*(?!(?:await|return|...|goto)\b)"
  + @"(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|sealed|partial|readonly|unsafe|extern|virtual|override|abstract|async|new|file)\s+)*"
  + @"(?<returnType>\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)"
  + @"\s+(?<name>\w+)\s*(?:<[^>]+>\s*)?\(",
    RegexOptions.Compiled),
    BodyStyle.Brace, "visibility", "returnType"),

The structure forces visibility to come before the modifier loop. C# allows the visibility token to appear anywhere among the modifier sequence. Trace for static public int F() => 0;:

  1. Negative lookahead passes (next token is static, not a statement keyword).
  2. Visibility group is optional and tries to match — at the cursor static, static is not in the visibility alternation → group matches zero (skipped).
  3. Modifier loop: static matches, eaten.
  4. Cursor at public int F() => 0;. Modifier loop tries public — not in the modifier alternation (it's a visibility keyword, not a modifier) → loop terminates.
  5. returnType greedy matches public (just an identifier per the char class).
  6. \s+ eats space.
  7. name matches int.
  8. \s*(?:<[^>]+>\s*)?\( — at F() => 0;. \s* eats space, then \( expects ( but sees Ffails.

Backtracking the returnType-vs-name split doesn't help: any split that puts int after the space requires ( next, which never appears.

Same issue would affect static internal, async public, override public, virtual public, etc. — any line where a modifier token appears before the visibility token.

Suggested direction

(A) Replace :71's hard-coded sequence with a flexible modifier slot that requires both static and readonly to appear (in any order, possibly with new interleaved):

// :71 — change
(?:(?:new)\s+)?static\s+readonly\s+
// to a slot that requires `static` and `readonly` and allows `new` in any position
(?=(?:(?:static|readonly|new)\s+)*static\s+)
(?=(?:(?:static|readonly|new)\s+)*readonly\s+)
(?:(?:static|readonly|new|volatile)\s+)+

The two leading lookaheads enforce both keywords; the trailing modifier loop consumes the run in any order. volatile is added defensively (legal on fields, mutually exclusive with readonly per spec but harmless here since the lookaheads still require readonly).

A simpler alternative is to enumerate the small set of legal pairings as alternatives:

(?:
    (?:new\s+)?static\s+readonly      // canonical
  | (?:new\s+)?readonly\s+static      // reversed
  | static\s+(?:new\s+)?readonly      // new sandwiched
  | readonly\s+(?:new\s+)?static      // ditto reversed
  | static\s+readonly\s+new           // new at end
  | readonly\s+static\s+new
)\s+

The lookahead form is more compact; the alternation form is easier to maintain. Either is fine.

(B) Allow modifier-then-visibility on :94 by promoting visibility into the modifier loop with a captured group:

Today :94 separates visibility from modifiers; the order is fixed (visibility)?(modifier)*. Change to a single loop that captures visibility wherever it appears:

// :94 — change
(?:(?<visibility>public|private|...)\s+)?(?:(?:static|...|file)\s+)*
// to
(?:
    (?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+
  | (?:static|sealed|partial|readonly|unsafe|extern|virtual|override|abstract|async|new|file)\s+
)*

This change applies the same fix to all peer regexes that need it: :97 (ctor), :100 / :103 (property), :105 (delegate), :107 (event), :118 (indexer). Each row's modifier-loop becomes a single alternation that includes visibility as one option.

For backward compatibility (preserving existing capture semantics), the (?<visibility>…) named group inside the alternation captures only when the visibility branch fires; otherwise the named group is empty, matching today's behavior on lines that omit visibility.

(C) Tests. Add fixtures to SymbolExtractorTests.cs covering:

  • public static readonly int A = 1; → captured (regression).
  • public readonly static int B = 2; → captured (this issue).
  • public new static readonly int C = 3; → captured (regression).
  • public readonly new static int D = 4; → captured (this issue).
  • public new readonly static int D2 = 4; → captured (defensive).
  • static public int F() => 0; → captured (this issue, method row).
  • static internal void G() { } → captured (defensive).
  • async public System.Threading.Tasks.Task H() { } → captured (defensive).
  • Regression: public static int I() => 0; still captured.

Why it matters

  • Modifier order is a style preference, not a correctness gate. Real codebases freely mix public static, static public, public readonly static, etc. — particularly older codebases, codebases following STL/native-style conventions, generated code, and Main-method declarations in tutorials. Silent drops in those codebases skew unused / hotspots and break definition lookups.
  • Tutorial / educational corpus is heavy on static public void Main (the first-method-students-see convention). An AI agent reviewing a learning project sees an empty symbol table for the entry point and concludes the project doesn't compile.
  • Main matters for map's entrypoint inference. The map command scores files by entrypoint heuristics (SymbolExtractorRepoMapBuilder). A static public void Main is invisible to the symbol table, so map falls back to the file-name heuristic — which works for Program.cs but not for Main.cs / custom-named files.
  • Silent. No warning, no phantom — uniform with the rest of the modifier-list-omission family.
  • Fix is bounded. (A) is a localized rewrite of one regex; (B) is a one-shot edit applied uniformly across the C# row set.

Cross-language note

  • C# only. The free modifier order for static, readonly, etc. is C#-specific grammar.
  • Java uses a similar free-order grammar (public static finalstatic public final are both legal). The Java row set in SymbolExtractor.cs should be spot-checked for the same family of bugs — out of scope for this filing.
  • Kotlin / Swift / Rust all enforce specific modifier orders syntactically; their regexes don't have this problem.

Scope

  • src/CodeIndex/Indexer/SymbolExtractor.cs:71 — static-readonly-field regex — replace hard-coded static\s+readonly with order-free modifier slot.
  • src/CodeIndex/Indexer/SymbolExtractor.cs:94 — method regex — allow visibility token to appear anywhere in the modifier sequence.
  • Optional: same fix on :97 (ctor), :100 / :103 (property), :105 (delegate), :107 (event), :118 (indexer) for uniform behavior.
  • tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures listed in (C).
  • DEVELOPER_GUIDE.md language-pattern reference table — note that modifier order is free for C# rows.

Related

Environment

  • cdidx: v1.10.0 (/root/.local/bin/cdidx).
  • Platform: linux-x64.
  • Fixture: /tmp/dogfood/cs-modifier-order/M.cs (inline in Repro).
  • Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions