feat(evals): add C++ multi-language retrieval eval and fix namespace-dropped C++ caller qns #556

Merged: vitali87 merged 4 commits into main from eval/cpp-multilang-retrieval on Jul 1, 2026

Conversation

@vitali87

@vitali87 vitali87 commented Jun 30, 2026

Owner

Summary

Adds the C++ multi-language retrieval eval (cgr's C++ CALLS vs an independent libclang oracle) and fixes the dominant cgr bug it surfaced: the call pass dropped the enclosing namespace from C++ caller qualified names.

The eval

The eval asks, for each first-party C++ function or member function, which files call it. cgr's C++ CALLS edges, reduced to (caller_file, callee_simple_name) pairs, are graded against call sites extracted by libclang over the same first-party name universe. cgr parses C++ with tree-sitter by default (CPP_FRONTEND=libclang is off), so libclang serves as an independent oracle. No compile_commands.json is needed: each source is parsed directly with the SDK sysroot, the SDK's libc++ headers, and first-party include dirs. A TU that still errors abstains, meaning it is held out of the graded covered set on both sides. Both the C and C++ oracles now pin a system libclang whose clang version matches the SDK's libc++, because the pip wheel's older clang cannot parse current C++ standard headers.
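The grading described above can be sketched as a set comparison. This is a minimal illustration, not the eval's actual code: `score_calls` and its parameter names are hypothetical, and it assumes both sides are already reduced to (caller_file, callee_simple_name) pairs.

```python
from __future__ import annotations


def score_calls(
    cgr_pairs: set[tuple[str, str]],
    oracle_pairs: set[tuple[str, str]],
    covered_files: set[str],
    first_party_names: set[str],
) -> dict[str, float]:
    """Grade (caller_file, callee_simple_name) pairs against an oracle.

    Hypothetical helper: files whose TU abstained are simply absent from
    covered_files, so they are held out of the graded set on both sides.
    """

    def graded(pairs: set[tuple[str, str]]) -> set[tuple[str, str]]:
        return {
            (caller_file, name)
            for caller_file, name in pairs
            if caller_file in covered_files and name in first_party_names
        }

    predicted = graded(cgr_pairs)
    expected = graded(oracle_pairs)
    true_positives = len(predicted & expected)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Filtering both sides through the same `graded` step is what keeps an abstaining TU from counting as either a false positive or a false negative.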

The bug (root cause)

The definition pass binds a C++ free function or class inside a namespace to a namespaced qn (module.ns.fn, module.ns.Class), but the call pass built the enclosing caller's qn without the namespace (module.fn, module.Class.method). Every such CALLS edge's source pointed at a node that does not exist, so the call never attached. On leveldb (all in namespace leveldb), 904 of 1227 C++ call sources dangled.

Fix: route both the free-function qn (_build_nested_qualified_name ignored namespace_definition ancestors) and the class qn through the same cpp_utils.build_qualified_name the definition pass uses, so caller and node qns always agree. Same family as the Go/Java/Rust caller-qn fixes.
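To make the mismatch concrete, here is a toy sketch of the qualified-name construction. It is not the code in cpp_utils.build_qualified_name: the function below, its `keep_namespaces` switch, and the (node_type, name) ancestor representation are assumptions made for illustration. The node type strings follow tree-sitter's C++ grammar.

```python
from __future__ import annotations


def build_qualified_name(
    module: str,
    ancestors: list[tuple[str, str]],
    name: str,
    *,
    keep_namespaces: bool = True,
) -> str:
    """Join module, enclosing scopes, and name into a dotted qn.

    ancestors is a hypothetical outermost-first list of (node_type, name)
    pairs. keep_namespaces=False imitates the pre-fix call pass, which
    skipped namespace_definition ancestors.
    """
    scope_types = {"class_specifier", "struct_specifier"}
    if keep_namespaces:
        scope_types = scope_types | {"namespace_definition"}

    parts = [module]
    parts += [scope for node_type, scope in ancestors if node_type in scope_types and scope]
    parts.append(name)
    return ".".join(parts)


ancestors = [("namespace_definition", "acme"), ("class_specifier", "K")]

# What the definition pass registers as the node's qn.
defined = build_qualified_name("cpp_ns_calls.sample", ancestors, "method")
# What the pre-fix call pass would have used as the CALLS source.
dangling = build_qualified_name(
    "cpp_ns_calls.sample", ancestors, "method", keep_namespaces=False
)

print(defined)   # cpp_ns_calls.sample.acme.K.method
print(dangling)  # cpp_ns_calls.sample.K.method
```

Because the second string names no existing node, an edge keyed on it cannot attach. Using one builder for both passes removes that possibility by construction.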

RED → GREEN: test_cpp_namespace_call_caller_qn.py asserts a namespaced free function and a namespaced inline method attribute their calls to the namespaced caller node; it failed before the fix.

Result on leveldb (40/42 core sources parse cleanly)

metric      before   after
precision   0.99     0.96
recall      0.54     0.82
F1          0.70     0.88

Dangling C++ call sources: 904 → 251. The oracle grades a call only when libclang resolves its callee to a first-party declaration (child.referenced), so std:: calls whose simple name collides with a first-party method are not counted.
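The first-party filter can be sketched as a cursor walk. This is an assumption-laden illustration rather than the oracle's code: `first_party_calls` is a hypothetical name, and the function is duck-typed against the attributes that clang.cindex cursors expose (`kind.name`, `referenced`, `spelling`, `location.file.name`, `get_children()`), so it imports nothing from libclang itself.

```python
from __future__ import annotations

from pathlib import Path


def first_party_calls(root_cursor, first_party_root: str) -> set[str]:
    """Collect callee names whose resolved declaration is first-party.

    Hypothetical helper. A call is kept only when `referenced` resolves
    to a declaration located under first_party_root; unresolved calls and
    calls into system headers (for example std:: methods) are skipped.
    """
    root = Path(first_party_root).resolve()
    names: set[str] = set()
    stack = [root_cursor]
    while stack:
        cursor = stack.pop()
        stack.extend(cursor.get_children())
        if cursor.kind.name != "CALL_EXPR":
            continue
        decl = cursor.referenced
        if decl is None or decl.location.file is None:
            continue  # libclang could not resolve the callee
        decl_path = Path(decl.location.file.name).resolve()
        if decl_path.is_relative_to(root):
            names.add(decl.spelling)
    return names
```

Keying on where the declaration lives, rather than on the callee's simple name, is what keeps a `std::vector::size` call from being graded as a call to a first-party `size` method.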

Remaining tail (documented in evals/README.md, not scoped away)

  • Operator overloads (operator=, operator[]): libclang counts them as method calls; cgr models them as builtin.cpp.* — a metric difference, not a misresolution.
  • Trie-fallback misresolution (the ~30 FP: size, data, empty, clear, begin, end): cgr's name-only fallback binds an external std:: call to a same-named first-party method; the oracle correctly treats it as external.
  • Receiver-type method dispatch and out-of-line static methods (DB::Open): needs C++ receiver type inference (C++ is not yet in the typed-language set), the same deeper gap as the Go/Java/Rust tails. Follow-on.

Full suite: 4242 passed, 12 skipped. ruff/ty clean.

@vitali87

Owner Author

@greptile review

@greptile-apps

greptile-apps Bot commented Jun 30, 2026

Contributor

Greptile Summary

This PR adds a C++ retrieval eval and fixes C++ namespaced caller attribution. The main changes are:

  • New C++ CALLS retrieval eval comparing cgr output with a libclang oracle.
  • First-party callee filtering in the C++ oracle using resolved libclang references.
  • C++ caller qualified-name construction updated to preserve enclosing namespaces.
  • Tests for namespaced C++ call attribution and eval scoring.

Confidence Score: 5/5

The changes appear safe to merge based on the focused eval and caller attribution updates.

No blocking correctness issues were identified in the reviewed changes, and the PR includes targeted regression coverage for the namespace caller-name behavior.

T-Rex T-Rex Logs

What T-Rex did

  • Validated the CALLERS_OF_CALLEE change after the patch by comparing pre-change values (cpp_ns_calls.sample.free_caller and cpp_ns_calls.sample.K.method with no .acme., and FREE_ATTACHED=False, METHOD_ATTACHED=False) to post-change values (cpp_ns_calls.sample.acme.free_caller and cpp_ns_calls.sample.acme.K.method with .acme. and FREE_ATTACHED=True, METHOD_ATTACHED=True).
  • Reviewed the cpp-retrieval evaluation setup and found blockers that prevent running the eval: the before/after artifacts show the environment cannot resolve Python 3.12 type alias syntax in codebase_rag/graph_updater.py, with Python 3.11.6 used in the after state and pytest not installed.
  • Checked the oracle-filtering artifacts and confirmed libclang is unavailable, causing the oracle checks to be skipped and no oracle semantics mismatch to be reported.

T-Rex Ran code and verified through T-Rex

Reviews (5): Last reviewed commit: "fix(evals): grade C++ calls only when li..."

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a C++ call-graph retrieval evaluation harness comparing tree-sitter-based call resolution against a libclang oracle, and fixes a bug where namespaces were dropped from C++ caller qualified names. A review comment was kept which identifies an issue in the libclang initialization loop where a failure to load a candidate prevents trying subsequent candidates.
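The loop shape that the kept review comment asks for might look like the sketch below. It is a generic pattern, not the code in evals/oracles/cpp_oracle.py: `load_first_usable` and its `load` callback are hypothetical, and the choice of `OSError` as the failure signal is an assumption (it is what `ctypes.CDLL` raises when a shared library cannot be loaded).

```python
from __future__ import annotations

from collections.abc import Callable, Iterable


def load_first_usable(
    candidates: Iterable[str],
    load: Callable[[str], object],
) -> tuple[str, object] | None:
    """Return (path, handle) for the first candidate that loads.

    Hypothetical helper: a failing candidate is skipped rather than
    aborting the search, so later candidates still get a chance.
    """
    for path in candidates:
        try:
            return path, load(path)
        except OSError:
            continue  # this candidate is unusable; try the next one
    return None
```

With `ctypes`, a call such as `load_first_usable(paths, ctypes.CDLL)` would try each path in order and return `None` only when every candidate fails.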

Comment thread evals/oracles/cpp_oracle.py Outdated

@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Contributor

Greptile Summary

This PR adds a C++ retrieval eval and fixes namespaced C++ caller attribution. The main changes are:

  • Adds a libclang-backed C++ CALLS oracle and retrieval scoring CLI.
  • Filters cgr C++ CALLS eval results to first-party names and cleanly parsed oracle files.
  • Updates C++ caller qualified-name construction so namespaced free functions and inline methods match definition nodes.
  • Adds tests for namespaced C++ CALLS attribution and the C++ retrieval eval fixture.
  • Documents the C++ eval setup, observed leveldb results, and remaining known gaps.

Confidence Score: 5/5

The changes are well-scoped to C++ call attribution and evaluation tooling, with targeted tests covering the namespace caller-name fix and eval fixture behavior.

No code issues were identified in the reviewed changes, and the implementation aligns the call pass with the definition pass while keeping the new eval logic documented and tested.

T-Rex T-Rex Logs

What T-Rex did

  • The head change was validated by confirming that the namespaced CALLS sources exist and caller_node_exists is True for both cpp_ns_calls.sample.acme.free_caller and cpp_ns_calls.sample.acme.K.method, and the head pytest passed.
  • The cpp retrieval evaluation was inspected across before and after; the after state includes added files and a header-free namespaced fixture, but the test run is blocked by Python 3.11 SyntaxError requiring Python >=3.12.
  • The libclang oracle configuration was compared between before and after; the head shows that libclang is available but the expected C++ direct-call oracle and pinning/include helpers are absent.

T-Rex Ran code and verified through T-Rex

Comments Outside Diff (1)

  1. General comment

    P1 C++ direct-call oracle is not exposed or implemented on head

    • Bug
      • The PR contract expects evals.oracles.run_cpp_call_oracle to be exported and usable with an extra_defines tuple to prove first-party include directories, --define macros, and abstention on broken translation units. In the executed head validation, has_run_cpp_call_oracle=False; direct inspection also shows evals/oracles/__init__.py imports only cpp_available, run_c_call_oracle, and run_cpp_oracle from cpp_oracle.py, while cpp_oracle.py contains no run_cpp_call_oracle symbol. Because libclang is available in this environment (cpp_available()=True), this is not an environment-only absence; the expected C++ oracle entry point is missing.
    • Cause
      • The working-tree version of evals/oracles/cpp_oracle.py appears to include the C call oracle but not the C++ direct-call oracle/pinning/include helper implementation described by the PR validation objective, and evals/oracles/__init__.py does not export run_cpp_call_oracle.
    • Fix
      • Add run_cpp_call_oracle(target: Path, extra_defines: tuple[str, ...] = ...) to evals/oracles/cpp_oracle.py, route it through the shared pre-parse libclang pinning path, include C++ language/std/system/resource/first-party include arguments and -D user macros, abstain on TUs with error diagnostics, and export it from evals/oracles/__init__.py/__all__.
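The argument assembly and abstention rule named in that fix can be sketched as follows. Everything here is an assumption for illustration: `build_cpp_args` and `should_abstain` are hypothetical helpers, the `c++17` default is a guess, and the diagnostics are duck-typed on a `severity` attribute. The flags themselves are standard clang options, and severity 3 corresponds to clang.cindex's `Diagnostic.Error`.

```python
from __future__ import annotations

from collections.abc import Iterable, Sequence

ERROR_SEVERITY = 3  # matches clang.cindex Diagnostic.Error; Fatal is 4


def build_cpp_args(
    sysroot: str,
    system_includes: Sequence[str],
    first_party_includes: Sequence[str],
    defines: Sequence[str] = (),
    std: str = "c++17",  # assumed default, not taken from the PR
) -> list[str]:
    """Assemble clang arguments for parsing one C++ source directly."""
    args = ["-x", "c++", f"-std={std}", "-isysroot", sysroot]
    for directory in system_includes:
        args += ["-isystem", directory]
    args += [f"-I{directory}" for directory in first_party_includes]
    args += [f"-D{macro}" for macro in defines]
    return args


def should_abstain(diagnostics: Iterable[object]) -> bool:
    """True when any diagnostic is an error or worse."""
    return any(diag.severity >= ERROR_SEVERITY for diag in diagnostics)
```

A caller would pass the result of `build_cpp_args` as the `args` of a libclang parse and drop the TU from the graded set whenever `should_abstain` returns true for its diagnostics.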

    T-Rex Ran code and verified through T-Rex

Reviews (2): Last reviewed commit: "feat(evals): add C++ multi-language retr..."

@vitali87

vitali87 commented Jul 1, 2026

Owner Author

@greptile review

Comment thread evals/oracles/cpp_oracle.py
@vitali87

vitali87 commented Jul 1, 2026

Owner Author

@greptile review

Comment thread evals/oracles/cpp_oracle.py Outdated
@vitali87

vitali87 commented Jul 1, 2026

Owner Author

@greptile review

@vitali87 vitali87 merged commit 96d2cbd into main Jul 1, 2026
2 of 15 checks passed
@vitali87 vitali87 deleted the eval/cpp-multilang-retrieval branch July 1, 2026 00:37