patch_review: Implement LLM-based CheckpatchFixer and SparseFixer with execution ordering by ananghos-qti · Pull Request #90 · qualcomm/PatchWise

ananghos-qti · 2026-05-15T13:38:00Z

Add two new AiReview-derived agents (CheckpatchFixer, SparseFixer) that consume static analysis output and apply LLM-powered fixes. Implement priority-based execution ordering in review_commit() to ensure static analyzers run before fixers, enabling proper data flow via kwargs.

Technical Implementation:

Created patch_review/ai_fix/ module with two fixer agents
Modified review_commit() to sort reviews by priority:
- Priority 0: Static analyzers (Checkpatch, Sparse)
- Priority 1: Fixers (CheckpatchFixer, SparseFixer)
- Priority 2: All other reviews (AiCodeReview, LLMCommitAudit, DtCheck, DtbsCheck, Coccicheck)
Implemented selective kwargs filtering to pass only relevant static analysis output to corresponding fixers (checkpatch_output to CheckpatchFixer, sparse_output to SparseFixer)
Both fixers override init to accept **kwargs and call run_agent_loop(messages) for LLM interaction

CheckpatchFixer Architecture:

Two-stage fixing pipeline: script-based deterministic fixes followed by LLM-powered intelligent corrections
Stage 1 - Script Fixes:
- Trailing whitespace removal via rstrip()
- DOS line ending normalization (\r\n → \n)
- SPDX license identifier format correction (// for .c, /* */ for .h)
- Space-before-tab elimination
- Invokes checkpatch.pl --fix when available for automated fixes
Stage 2 - LLM Fixes:
- Parses checkpatch output using regex: r':(\d+):\s+(ERROR|WARNING|CHECK):'
- Extracts line numbers, severity levels, and error messages
- Constructs few-shot prompt with concrete before/after examples:
  - Whitespace fixes (trailing spaces, DOS endings, tabs vs spaces)
  - SPDX license corrections
  - K&R brace placement
  - Macro argument parenthesization
  - func usage instead of hardcoded function names
- System prompt emphasizes patch integrity preservation and minimal changes using Chain-of-Thought reasoning
- Limits to top 20 issues to prevent context overflow
- Post-processes LLM output to strip markdown code blocks

SparseFixer Architecture:

Single-stage LLM-based fixing with comprehensive issue classification
Parsing:
- Regex: r'([^:]+.c):(\d+):(\d+):\s+(error|warning):\s+(.+)'
- Captures both sparse errors and warnings (not just warnings)
- Extracts file path, line number, column, severity, and message
Classification:
- CONTEXT_IMBALANCE: Lock acquisition/release annotations
- TYPE_MISMATCH: Incompatible type assignments
- ADDRESS_SPACE: __user, __iomem, __percpu annotations
- ENDIANNESS: __be16/__be32/__le16/__le32 annotations
- STATIC: File-local scope declarations
- CAST: Type casting issues
- OTHER: Unclassified sparse issues
Prompting Strategy:
- Few-shot examples for address space annotations (__iomem placement)
- Few-shot examples for endianness corrections
- Few-shot examples for context balance (__acquires/__releases)
- System prompt uses Chain-of-Thought reasoning to distinguish between safe fixes and behavior-changing modifications
- Groups warnings by file and limits to 10 per file for context management
- Emphasizes kernel maintainer-level judgment over mechanical fixes
Post-processing:
- Strips markdown code blocks from LLM response
- Searches for patch start markers (diff --git, --- a/) to extract clean unified diff

Prompting Methodology:

Both fixers use structured few-shot prompting with concrete examples
System prompts employ Chain-of-Thought reasoning patterns:
- "Think like a Linux kernel maintainer, not a linter"
- "If a fix is obvious to a kernel developer, apply it"
- "If a fix would risk semantic change, DO NOT apply it"
Explicit safety constraints in prompts:
- Preserve diff structure (@@, +, -, headers)
- No logic or behavior changes
- Minimal, surgical modifications only
Output format strictly enforced: unified diff only, no explanations

Docker Integration:

CheckpatchFixer.Dockerfile: Extends base kernel environment
SparseFixer.Dockerfile: Extends base kernel environment
Both registered with @register_llm_review and @register_long_review decorators for proper lifecycle management

Testing:

tests/ai_fix/test_fixers.py: Unit tests for initialization, output parsing, kwargs handling, and issue classification

Execution Flow:

User runs: patchwise --reviews Checkpatch CheckpatchFixer Sparse SparseFixer
Priority sorting ensures: Checkpatch → CheckpatchFixer → Sparse → SparseFixer
Checkpatch runs, stores output in results['Checkpatch']
CheckpatchFixer receives checkpatch_output via kwargs, applies fixes
Sparse runs, stores output in results['Sparse']
SparseFixer receives sparse_output via kwargs, applies fixes
All other reviews run in priority 2

aditya-qti · 2026-05-18T04:48:32Z

+────────────────────────────────────────
+OUTPUT FORMAT (STRICT)
+────────────────────────────────────────
+1. Corrected git diff patch (valid unified diff)
+2. Corrected commit message (if modified)
+3. Safety checklist:
+   - [x] checkpatch issues addressed
+   - [x] No functional change
+   - [x] Patch structure preserved
+   - [x] Upstream-ready
+
+DO NOT INCLUDE:
+- Explanations
+- Reasoning
+- Analysis
+- checkpatch logs
+- Commentary
+
+Only final results.


AI generated patches fail to apply with git am/git apply most of the time no matter what the prompt has. They are just bad at that.

A better design that actually works would be to let the AI do targeted edits to the files using a tool like write_file and then generate the patch using git diff HEAD~1. This patch is guaranteed to be git am-able on top of the commit being reviewed. Note that the tree in docker container is reset to the commit being reviewed.

I'd put this on hold until I upstream the AiPatchFix that fixes the issues reported by AiCodeReview. You can then use that as a reference.

aditya-qti · 2026-05-18T04:51:31Z

+
+@register_llm_review
+@register_long_review
+class SparseFixer(AiReview):


AiReview has no tools!

Your fixer would be a single LLM call. There is no guarantee that the next sparse run will succeed.

AI should be given a tool to run sparse on the go so it can verify that the issues indeed have been fixed.

aditya-qti · 2026-05-18T04:52:46Z

+
+@register_llm_review
+@register_long_review
+class CheckpatchFixer(AiReview):


Same here, AI should be given a tool to run checkpatch so that it can verify it passes.

aditya-qti · 2026-05-18T04:57:35Z

+@register_llm_review
+@register_long_review


We should do something like register_fixer(Checkpatch) that automatically routes the right review to each registered fixer.

Also run the fixers only when --fix is passed, just like checkpatch.

aditya-qti · 2026-05-18T04:59:33Z


+    # Sort reviews to ensure static analysis runs before fixers
+    # This ensures CheckpatchFixer gets Checkpatch output, SparseFixer gets Sparse output
+    def review_sort_key(review_class):


All reviews should run first, and only then should fixers run. This is to make sure we don't stall any reviews in the pipeline and ensure consistency in the code.

aditya-qti · 2026-05-18T05:06:36Z

+# ============================================================================
+# Integration Tests
+# ============================================================================
+
+class TestIntegration:
+    """Integration tests for fixers with PatchWise workflow."""
+
+    def test_checkpatch_fixer_registration(self):
+        """Test that CheckpatchFixer is properly registered."""
+        from patchwise.patch_review.decorators import AVAILABLE_PATCH_REVIEWS, LLM_REVIEWS
+
+        review_names = [cls.__name__ for cls in AVAILABLE_PATCH_REVIEWS]
+        assert 'CheckpatchFixer' in review_names
+
+        llm_review_names = [cls.__name__ for cls in LLM_REVIEWS]
+        assert 'CheckpatchFixer' in llm_review_names
+
+    def test_sparse_fixer_registration(self):
+        """Test that SparseFixer is properly registered."""
+        from patchwise.patch_review.decorators import AVAILABLE_PATCH_REVIEWS, LLM_REVIEWS
+
+        review_names = [cls.__name__ for cls in AVAILABLE_PATCH_REVIEWS]
+        assert 'SparseFixer' in review_names
+
+        llm_review_names = [cls.__name__ for cls in LLM_REVIEWS]
+        assert 'SparseFixer' in llm_review_names
+
+    def test_fixers_extend_ai_review(self):
+        """Test that fixers properly extend AiReview base class."""
+        from patchwise.patch_review.ai_review.ai_review import AiReview
+
+        assert issubclass(CheckpatchFixer, AiReview)
+        assert issubclass(SparseFixer, AiReview)
+
+    def test_fixers_have_required_methods(self):
+        """Test that fixers implement required methods."""
+        # CheckpatchFixer
+        assert hasattr(CheckpatchFixer, 'setup')
+        assert hasattr(CheckpatchFixer, 'run')
+        assert hasattr(CheckpatchFixer, 'get_system_prompt')
+        assert hasattr(CheckpatchFixer, 'get_user_prompt')
+
+        # SparseFixer
+        assert hasattr(SparseFixer, 'setup')
+        assert hasattr(SparseFixer, 'run')
+        assert hasattr(SparseFixer, 'get_system_prompt')
+        assert hasattr(SparseFixer, 'get_user_prompt')


These tests are trivial and unnecessary. Please remove.

aditya-qti · 2026-05-18T05:07:56Z

+        assert hasattr(SparseFixer, 'get_user_prompt')
+
+
+if __name__ == "__main__":


main() is not needed, The tests are run as:

pytest tests -v -s

aditya-qti · 2026-05-18T05:17:13Z

+# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+# SPDX-License-Identifier: BSD-3-Clause
+"""
+Unit tests for CheckpatchFixer and SparseFixer agents.


Unit tests are a burden to maintain when the project is in active development where refactors can happen frequently. Please write system tests instead.

Add something like:

tests/sparse/patches/p1.patch
tests/sparse/patches/p2.patch
tests/sparse/test_sparse_fixer.py

tests/checkpatch/patches/p1.patch
tests/checkpatch/test_checkpatch_fixer.py

Each fixer test should:

Apply a patch

Run the review

Run the fixer

Assert the fixer output

aditya-qti · 2026-05-18T05:23:12Z

Remove since this does not add anything over base,Dockerfile.

aditya-qti · 2026-05-18T05:23:18Z

Remove since this does not add anything over base,Dockerfile.

aditya-qti · 2026-05-18T05:26:42Z

Thank you for this PR! Great idea to let patchwise fix the static analysis errors.

We need a stronger "Fixer" framework that can be used across all the fixers. I'd put this on hold until I upstream the AiPatchFix that fixes the issues reported by AiCodeReview so you can take that as a reference.

aditya-qti · 2026-05-22T16:48:39Z

Please use this as a reference:

https://github.com/qualcomm/PatchWise/blob/3ba79496e8242c52a49cacd6e0548d829d39b132/patchwise/patch_review/ai_fix/ai_patch_fix.py

aditya-qti · 2026-05-22T16:52:03Z

Please use this as a reference:

https://github.com/qualcomm/PatchWise/blob/3ba79496e8242c52a49cacd6e0548d829d39b132/patchwise/patch_review/ai_fix/ai_patch_fix.py

Probably need to move _generate_git_patch() into AiFix so you can use it.

…gned to ai_patch_fix Signed-off-by: Ananya Ghosh <ananghos@qti.qualcomm.com>

Signed-off-by: ananghos-qti <ananghos@qti.qualcomm.com>

aditya-qti · 2026-06-08T09:35:09Z

The overall design is in the right direction, but a few cleanups are necessary. Please take a look and fix these issues one-by-one:

1. Improve the commit messages

The PR description explains the design well, but the commits themselves do not carry that context.

Current commits such as:

Delete patchwise/pyproject.toml
Update pyproject.toml with chromadb

are too terse for review/history. Please either fold the dependency/config-only commits into the main feature commit, or add proper commit bodies explaining the reason for each change.

The main feature commit should include a concise explanation of:

what CheckpatchFixer and SparseFixer add
why RAG / chromadb is needed
how the fixers integrate with the existing --fix flow
any limitations or follow-up work

2. Move custom tools into the existing Agent tool model

CheckpatchFixer and SparseFixer correctly register as fixers via @register_fix(...), which addresses the earlier fixer-routing feedback.

The remaining issue is the verification tools. The earlier review asked that the AI be able to run checkpatch/sparse during fixing, but the new run_checkpatch and run_sparse tools are currently added through _add_checkpatch_tool() / _add_sparse_tool() by appending to self.agent.tools and self.agent.tool_handlers.

That does not match the current Agent architecture. Existing tools are defined globally in tool_definitions.py and dispatched centrally in agent.py.

Please move these tools into the existing model:

add run_checkpatch and run_sparse schemas to patchwise/patch_review/ai_agent/tool_definitions.py
add their handlers in patchwise/patch_review/ai_agent/agent.py
remove the per-fixer _add_*_tool() registration logic

This keeps all tool definitions and dispatch mappings in one place, consistent with the rest of PatchWise.

3. Keep AI-requested file access inside the container

For run_checkpatch / run_sparse, please follow the existing tool convention: accept only kernel-relative paths, validate them with Agent._validate_existing_kernel_path(), then convert them with Agent._container_kernel_path() before running Docker commands.

Avoid direct host-side filesystem access such as open(...), os.path.exists(...), or tempfile.NamedTemporaryFile(dir=kernel_dir) on AI-supplied or container-derived paths. Letting the AI cause host file reads/writes/checks is a security risk; LLM-triggered file operations should stay inside the reviewed kernel tree in the container.

4. Move duplicated fixer helpers to `AiFix`

There is duplicated code across multiple fixer classes, especially:

_strip_trailers()
_generate_git_patch()

These are common fixer behaviors and should live in the AiFix superclass instead of being copied into each subclass. This matches the earlier comment that _generate_git_patch() should probably move into AiFix.

This would make the new fixers smaller, easier to review, and consistent with the existing AiPatchFix flow.

5. Check container-vs-host file access

Some new code appears to use normal host filesystem APIs on paths that are container-side paths, such as /home/patchwise/kernel.

Examples to review:

os.path.exists(...)
open(...)
tempfile.NamedTemporaryFile(dir=kernel_dir)
direct RAG indexing of kernel_dir / Documentation

Please double-check that these operations happen in the intended filesystem. Code running on the host should not assume Docker container paths exist on the host. Where the target is the reviewed kernel tree inside Docker, use the existing Docker/container mechanisms.

6. Keep the PR description aligned with the diff

The PR description mentions review ordering, kwargs filtering, and tests, but I do not see all of that reflected in the diff.

Please either update the PR description to match the actual implementation, or include the missing pieces in this PR. If tests are added, please follow the earlier feedback and prefer system-style fixer tests over trivial unit tests: apply a patch, run the review, run the fixer, and assert the fixer output.

7. Remove or justify unused Dockerfile changes

The earlier comments asked to remove fixer Dockerfiles that do not add anything over the base image. The current diff no longer has CheckpatchFixer.Dockerfile / SparseFixer.Dockerfile, which is good.

However, the PR now adds KernelDocRag.Dockerfile. Please remove it if it is not actually used by the container selection flow, or explain how it is built and selected. If chromadb is already added as a Python dependency, this Dockerfile may not be needed.

ananghos-qti requested a review from aditya-qti as a code owner May 15, 2026 13:38

aditya-qti reviewed May 18, 2026

View reviewed changes

patch_review: Implement LLM-based CheckpatchFixer and SparseFixer ali…

8dec548

…gned to ai_patch_fix Signed-off-by: Ananya Ghosh <ananghos@qti.qualcomm.com>

ananghos-qti force-pushed the ai_fix branch 2 times, most recently from 20915c9 to 8dec548 Compare June 4, 2026 10:41

ananghos-qti added 2 commits June 4, 2026 16:15

Delete patchwise/pyproject.toml

4d58fb2

Signed-off-by: ananghos-qti <ananghos@qti.qualcomm.com>

Update pyproject.toml with chromadb

2215610

Signed-off-by: ananghos-qti <ananghos@qti.qualcomm.com>

		assert hasattr(SparseFixer, 'get_user_prompt')


		if __name__ == "__main__":

		@register_llm_review
		@register_long_review

Conversation

ananghos-qti commented May 15, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aditya-qti commented May 18, 2026

Uh oh!

aditya-qti commented May 22, 2026

Uh oh!

aditya-qti commented May 22, 2026

Uh oh!

aditya-qti commented Jun 8, 2026

1. Improve the commit messages

2. Move custom tools into the existing Agent tool model

3. Keep AI-requested file access inside the container

4. Move duplicated fixer helpers to AiFix

5. Check container-vs-host file access

6. Keep the PR description aligned with the diff

7. Remove or justify unused Dockerfile changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

4. Move duplicated fixer helpers to `AiFix`