C# constructors with wrapped leading modifier (static / public on its own line) are silently dropped #348

@Widthdom

Summary

C# constructors whose leading modifier keyword (static, public, private, protected, internal) sits on a separate physical line from the identifier are silently dropped by SymbolExtractor. This wrap typically appears after long attribute lists or under line-length limits. Both static constructors and instance constructors are affected. Methods with the same wrapped-modifier shape are not affected, because the method regex anchors on returnType Name(...), which can match on the identifier line alone without seeing the modifier on the previous line.

// DROPPED — static ctor, `static` on previous line
public class A
{
    static
    A() { _x = 1; }
}

// DROPPED — instance ctor, `public` on previous line
public class B
{
    public
    B() { _y = 1; }
}

// CAPTURED — method, modifiers on previous line
public class C
{
    public static
    int M() => 0;          // → function M at W.cs:25 in the repro below ✓
}

definition A --exact and definition B --exact both return "No definitions found." The class is indexed and the backing field is indexed, but the constructor rows are missing.

Repro

CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-wrapped-ctor
cat > /tmp/dogfood/cs-wrapped-ctor/W.cs <<'EOF'
namespace WrappedCtor;

// Wrapped static ctor: static on one line, name on next
public class A
{
    static
    A() { _x = 1; }

    private static int _x;
}

// Regular wrapped ctor (non-static)
public class B
{
    public
    B() { _y = 1; }

    private int _y;
}

// Wrapped method signature (control — CAPTURED)
public class C
{
    public static
    int M() => 0;
}

// Wrapped static ctor with attribute
public class D
{
    [System.Obsolete]
    static
    D() { _z = 1; }

    private static int _z;
}

// Control: everything on one line (CAPTURED)
public class E
{
    static E() { _w = 1; }
    public E() { _v = 1; }

    private static int _w;
    private int _v;
}
EOF
"$CDIDX" index /tmp/dogfood/cs-wrapped-ctor --rebuild
"$CDIDX" symbols --db /tmp/dogfood/cs-wrapped-ctor/.cdidx/codeindex.db

Observed:

class      A                                        W.cs:4-10
class      B                                        W.cs:13-19
class      C                                        W.cs:22-26
class      D                                        W.cs:29-36
class      E                                        W.cs:39-47
function   E                                        W.cs:41    ← static E() (same-line)
function   E                                        W.cs:43    ← public E() (same-line)
function   M                                        W.cs:25    ← method with wrapped modifiers: CAPTURED
namespace  WrappedCtor                              W.cs:1
(9 symbols in 1 files)

Missing: the ctors in classes A, B, and D, so 3 of 5 ctor rows are silently lost. Class C's method M shows that the per-line extractor does capture wrapped-modifier methods, so the issue is specific to ctors.

Suspected root cause

src/CodeIndex/Indexer/SymbolExtractor.cs:94-97 and :120:

// Method — line 94. `visibility` and modifier run are OPTIONAL; returnType + name + `(` on same line suffice.
new("function",  new Regex(@"^\s*(?!(?:await|...)\b)(?:(?<visibility>public|private|...)\s+)?(?:(?:static|...)\s+)*(?<returnType>\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*(?:<[^>]+>\s*)?\(", ...), BodyStyle.Brace, "visibility", "returnType"),

// Constructor — line 97. `visibility` is REQUIRED on the same line as the name.
new("function",  new Regex(@"^\s*(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+(?<name>\w+)\s*\(", ...), BodyStyle.Brace, "visibility"),

// Static constructor — line 120. `static` is REQUIRED on the same line as the name.
new("function",  new Regex(@"^\s*static\s+(?<name>\w+)\s*\(\s*\)\s*\{?", ...), BodyStyle.Brace),

The extractor feeds the patterns one physical line at a time (SymbolExtractor.cs:441-452). For a wrapped ctor such as:

    static
    A() { _x = 1; }

  • Line static: there is no identifier followed by (, so nothing matches.
  • Line A() { _x = 1; }: the static-ctor regex (:120) requires static\s+ at the start and fails; the instance-ctor regex (:97) requires (?<visibility>public|...)\s+ at the start and fails; the method regex (:94) requires returnType\s+name, i.e. two tokens before the opening parenthesis, but sees only A( and fails (A can match returnType, but no second \w+ follows before \().

No pattern claims the line. Silent drop. No warning emitted.

Methods don't drop because the method regex makes visibility+modifier run OPTIONAL — on the identifier line int M() => 0; the regex still matches int as returnType and M as name. Ctors have no returnType, so they can't lean on the same fallback.
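The per-line failure can be reproduced in isolation. The sketch below runs the two ctor patterns as quoted above against each physical line, one at a time. The regex options are elided in this issue, so they are omitted here. The method pattern is also elided above, so methodLike is a deliberately simplified stand-in and not the real pattern.

```csharp
using System;
using System.Text.RegularExpressions;

// Ctor patterns as quoted in this issue; RegexOptions are elided there and omitted here.
var instanceCtor = new Regex(@"^\s*(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+(?<name>\w+)\s*\(");
var staticCtor = new Regex(@"^\s*static\s+(?<name>\w+)\s*\(\s*\)\s*\{?");

// ASSUMPTION: simplified stand-in for the method pattern (the real one is elided in this issue).
var methodLike = new Regex(@"^\s*(?:(?:public|private|protected|internal)\s+)?(?:static\s+)*(?<returnType>[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*\(");

// True when any of the three patterns matches the physical line on its own.
bool Claimed(string line) =>
    instanceCtor.IsMatch(line) || staticCtor.IsMatch(line) || methodLike.IsMatch(line);

string[] lines =
{
    "    static",                  // wrapped static ctor, modifier line
    "    A() { _x = 1; }",         // wrapped static ctor, identifier line
    "    public",                  // wrapped instance ctor, modifier line
    "    B() { _y = 1; }",         // wrapped instance ctor, identifier line
    "    int M() => 0;",           // wrapped method, identifier line
    "    static E() { _w = 1; }",  // same-line static ctor
    "    public E() { _v = 1; }",  // same-line instance ctor
};

foreach (var line in lines)
    Console.WriteLine($"{line.Trim(),-24} claimed={Claimed(line)}");
```

With these patterns, only the int M() line and the two same-line E ctors are claimed. None of the four lines belonging to the wrapped ctors in A and B is matched by any pattern.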

Suggested direction

Two approaches, either sufficient:

(A) Teach the per-line extractor to concatenate "modifier-only" lines with the following non-empty line, then reapply all C# patterns to the concatenated candidate. The 2-line peek-ahead idea already appears in the suggested fixes for #229 and #345. Introduce a small helper that, when a line looks like ^\s*(?:public|private|protected|internal|static|partial|readonly|abstract|sealed|virtual|override|async|new|file|unsafe|extern)(?:\s+(?:public|private|...))*\s*$, joins it with the next non-empty line before matching. This fixes wrapped ctors, wrapped static ctors, and, as a side effect, any other wrapped-shape symbol that only has an identifier-line anchor.
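A minimal sketch of the join step in (A) might look like the following. JoinWrappedModifiers is a hypothetical helper name, not an existing function in the repo. The sketch assumes the elided second alternation in the regex above repeats the same keyword list, and it reports the joined candidate at the identifier line's number so that method M would still be reported at line 25; both are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Keyword list taken from the modifier-only regex suggested in this issue.
const string Keywords = "public|private|protected|internal|static|partial|readonly|abstract|sealed|virtual|override|async|new|file|unsafe|extern";

// ASSUMPTION: the elided second group uses the same keyword list as the first.
var modifierOnly = new Regex($@"^\s*(?:{Keywords})(?:\s+(?:{Keywords}))*\s*$");

// Hypothetical helper: folds modifier-only lines into the next non-empty line.
// Each result carries the 1-based line number of the line the modifiers were joined into.
List<(int Line, string Text)> JoinWrappedModifiers(IReadOnlyList<string> lines)
{
    var result = new List<(int Line, string Text)>();
    string pending = ""; // modifiers collected from preceding modifier-only lines

    for (int i = 0; i < lines.Count; i++)
    {
        string line = lines[i];
        if (modifierOnly.IsMatch(line))
        {
            pending += line.Trim() + " ";
            continue;
        }
        if (pending.Length > 0 && string.IsNullOrWhiteSpace(line))
            continue; // keep waiting for the next non-empty line

        result.Add((i + 1, pending.Length > 0 ? pending + line.TrimStart() : line));
        pending = "";
    }

    // Trailing modifiers with nothing to join: emit them unchanged in meaning.
    if (pending.Length > 0)
        result.Add((lines.Count, pending.TrimEnd()));
    return result;
}

string[] source = { "    static", "    A() { _x = 1; }", "    public static", "    int M() => 0;" };
foreach (var (line, text) in JoinWrappedModifiers(source))
    Console.WriteLine($"{line}: {text}");
```

The joined candidates here are static A() { _x = 1; } and public static int M() => 0;, which the existing same-line patterns can then be applied to unchanged.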

(B) Add name-only candidate rows for ctors that check the previous non-empty line for the required modifier. For static ctor: a row matching ^\s*(?<name>\w+)\s*\(\s*\)\s*\{? with a post-check that the previous non-empty line ends with static. For instance ctor: same shape, post-check that previous line ends with a visibility keyword. Slightly more surgical than (A) but adds two rows and a stateful back-peek that isn't used elsewhere today.
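The back-peek in (B) could be sketched as follows. ClassifyWrappedCtor and the returned labels are hypothetical names used for illustration only. The static row uses the name-only pattern given above. For the instance row, the sketch assumes the empty-parameter requirement is dropped so that ctors with parameters also match, which goes slightly beyond "same shape".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Name-only row for static ctors, as suggested in this issue.
var staticCtorName = new Regex(@"^\s*(?<name>\w+)\s*\(\s*\)\s*\{?");
// ASSUMPTION: the instance row allows parameters, so it stops at the opening parenthesis.
var instanceCtorName = new Regex(@"^\s*(?<name>\w+)\s*\(");

// Post-checks applied to the previous non-empty line.
var endsWithStatic = new Regex(@"\bstatic\s*$");
var endsWithVisibility = new Regex(@"\b(?:public|private|protected|internal)\s*$");

// Hypothetical helper: returns a label for a wrapped ctor at lines[index], or null.
string? ClassifyWrappedCtor(IReadOnlyList<string> lines, int index)
{
    // Walk back to the previous non-empty line.
    int prev = index - 1;
    while (prev >= 0 && string.IsNullOrWhiteSpace(lines[prev])) prev--;
    if (prev < 0) return null;

    if (staticCtorName.IsMatch(lines[index]) && endsWithStatic.IsMatch(lines[prev]))
        return "static ctor";
    if (instanceCtorName.IsMatch(lines[index]) && endsWithVisibility.IsMatch(lines[prev]))
        return "instance ctor";
    return null;
}

string[] source =
{
    "    static",
    "    A() { _x = 1; }",
    "    public",
    "    B(int y) { _y = y; }",
    "    public static",
    "    int M() => 0;",
};

for (int i = 0; i < source.Length; i++)
{
    string label = ClassifyWrappedCtor(source, i) ?? "(none)";
    Console.WriteLine($"{i + 1}: {label}");
}
```

The wrapped method line int M() => 0; is not matched by either name-only pattern, because a second identifier sits between int and the opening parenthesis, so it stays on the method row. One limitation of this sketch: the post-check only looks at how the previous line ends, so a preceding comment line that happens to end in static or a visibility keyword would also pass.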

Preferred: (A). It's a one-time helper change, covers ctors and every other symbol shape in the same category, and mirrors the "line-join" scheme that several adjacent issues (#229 wrapped property brace, #345 wrapped property arrow) already converge on.

Regardless of approach, add a regression guard that methods with wrapped modifiers (public static\nint M() => 0;) keep being captured via the method regex — the fix shouldn't reroute them through a new ctor path and accidentally lose the returnType.

Why it matters

  • Wrapping a modifier onto its own line after a long attribute list shows up in C# codebases that cap line length at 100-120 chars. An attributed static ctor such as [System.Obsolete]\nstatic\nD() { ... } (class D in the repro) is a representative shape.
  • Silent drop. Navigation tools (definition, outline, inspect, callers) can't find a ctor that exists in the source. An AI agent asked "where is A's static ctor?" gets zero hits.
  • unused and hotspots undercount. A file with five ctors, two of which are wrapped, shows only three in symbol counts.
  • Breaks inspect/analyze_symbol trust. Asking for symbol info on a class whose ctor was wrapped returns metadata without the ctor, which looks correct on its face.

Cross-language note

  • C#: documented here; both static and instance ctors are affected.
  • Java: uses the same extractor family, and its ctor regex shape is likely similar, so it is worth a spot-check. I did not exhaustively verify the Java symbol rows at SymbolExtractor.cs:162+ in this session, so treat Java as "suspected-also-affected" rather than confirmed.
  • Kotlin / Swift / Rust: no ctor of the same shape (Kotlin uses primary constructors or the constructor keyword, Swift uses init, and Rust has no constructors). Not expected to be affected.

Scope

  • src/CodeIndex/Indexer/SymbolExtractor.cs:97 — instance ctor regex.
  • src/CodeIndex/Indexer/SymbolExtractor.cs:120 — static ctor regex.
  • src/CodeIndex/Indexer/SymbolExtractor.cs:441-452 — per-line extraction loop; approach (A) hooks here with a 1-line look-ahead buffer.
  • tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures for wrapped-modifier static ctor, wrapped-visibility instance ctor, wrapped-static-ctor-with-attribute, and regression for wrapped-modifier method (still captures via method row, not via ctor row).

Related

Environment

  • cdidx: v1.10.0 (/root/.local/bin/cdidx).
  • Platform: linux-x64.
  • Fixture: /tmp/dogfood/cs-wrapped-ctor/W.cs.
  • Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.
