feat: add LLMAISmell checker for AI-generated requirement document detection by e06084 · Pull Request #442 · MigoXLab/dingo

e06084 · 2026-06-18T10:41:07Z

Summary

Adds LLMAISmell, a new LLM-based checker that detects AI-generated writing patterns in requirement documents.

Background

Requirement docs written (or heavily assisted) by AI tend to share recognizable patterns: hollow truisms, repetitive rephrasing, inflated claims, lack of concrete detail, and buzzword overuse. This checker quantifies those patterns so reviewers can flag and push back on low-quality PRDs.

What's New

dingo/model/llm/llm_ai_smell.py — main checker, registered as LLMAISmell

Evaluates 5 dimensions (each scored 0–10):

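Dimension	Label (translated from Chinese)	Description
`correct_nonsense`	💊 Correct-nonsense index (正确的废话指数)	Hollow truisms: "In today's rapidly evolving..."
`infinite_mirror`	🪞 Infinite-mirror feel (无限镜像感)	Same point rephrased multiple times
`rainbow_fart`	🌈 Flattery density (彩虹屁密度)	Inflated claims without data: "revolutionize", "industry-leading"
`detail_vacuum`	🧩 Detail-vacuum degree (细节真空度)	Structurally complete but nothing is actionable
`adjective_violence`	✨ Adjective-violence index (形容词暴力指数)	Buzzword overload: 赋能 (empower) / 闭环 (closed loop) / 颗粒度 (granularity) / 抓手 (lever) / 降本增效 (cut costs, boost efficiency)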

Scoring:

Weighted overall score (detail_vacuum ×0.3, correct_nonsense ×0.25, adjective_violence ×0.2, infinite_mirror ×0.15, rainbow_fart ×0.1)
Score ≥ 6 → AI_SMELL_DETECTED
Score < 6 → AI_SMELL_CLEAN
Output includes per-dimension scores, evidence quotes from the document, and a one-line verdict
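
As a rough illustration of the scoring rule above, the weighted sum and the threshold can be sketched as follows. This is a minimal sketch, not the checker's code: the PR description does not say whether the overall score is computed locally or returned by the LLM, and the helper names `overall_score` and `label_for` are hypothetical.

```python
# Weights as listed in the PR description; they sum to 1.0.
WEIGHTS = {
    "detail_vacuum": 0.30,
    "correct_nonsense": 0.25,
    "adjective_violence": 0.20,
    "infinite_mirror": 0.15,
    "rainbow_fart": 0.10,
}
THRESHOLD = 6.0  # Score >= 6 -> AI_SMELL_DETECTED, otherwise AI_SMELL_CLEAN


def overall_score(dimensions: dict) -> float:
    """Weighted sum of the 0-10 dimension scores (hypothetical helper).

    Assumption: a missing dimension counts as 0.
    """
    return sum(w * float(dimensions.get(name, 0)) for name, w in WEIGHTS.items())


def label_for(score: float) -> str:
    """Map an overall score to one of the two output labels."""
    return "AI_SMELL_DETECTED" if score >= THRESHOLD else "AI_SMELL_CLEAN"


# Made-up scores for illustration only.
example = {
    "detail_vacuum": 8,
    "correct_nonsense": 7,
    "adjective_violence": 9,
    "infinite_mirror": 5,
    "rainbow_fart": 6,
}
score = overall_score(example)
print(f"{score:.2f}", label_for(score))  # prints: 7.30 AI_SMELL_DETECTED
```

Because `detail_vacuum` and `correct_nonsense` together carry 55% of the weight, a document with no actionable detail can cross the threshold even when its flattery score is low.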

Only the content field is required, so any document text can be passed in directly.

Files

File	Purpose
`dingo/model/llm/llm_ai_smell.py`	Checker implementation
`test/scripts/model/llm/test_llm_ai_smell.py`	17 unit tests (pass/fail logic, score normalization, reason content, markdown cleanup, error handling)
`examples/llm_ai_smell_example.py`	Usage example with high/low AI smell sample documents

Usage

# Register name: "LLMAISmell"
# Required field: content (document text)
# Output labels: AI_SMELL_DETECTED / AI_SMELL_CLEAN

gemini-code-assist

Code Review

This pull request introduces an AI Smell Detector (LLMAISmell) for requirement documents, evaluating them across five dimensions using an LLM, and includes an example script and unit tests. The review feedback focuses on enhancing the robustness of the LLM response parser, specifically by safely casting scores and dictionary fields to handle string or null values, stripping whitespace before markdown cleanup, validating that the parsed JSON is a dictionary, clamping progress bar scores, and ensuring the example script actually executes the evaluation.


gemini-code-assist · 2026-06-18T10:42:40Z

+        if response.startswith("```json"):
+            response = response[7:]
+        elif response.startswith("```"):
+            response = response[3:]
+        if response.endswith("```"):
+            response = response[:-3]
+        response = response.strip()


The markdown code block stripping logic is executed before response.strip(). If the LLM response contains leading or trailing whitespace/newlines (e.g., '\n```json\n...\n```\n'), the startswith and endswith checks will fail to detect and strip the markdown code fences. Stripping the response before checking resolves this issue.

Suggested change

-        if response.startswith("```json"):
-            response = response[7:]
-        elif response.startswith("```"):
-            response = response[3:]
-        if response.endswith("```"):
-            response = response[:-3]
-        response = response.strip()
+        response = response.strip()
+        if response.startswith(chr(96) * 3 + "json"):
+            response = response[7:]
+        elif response.startswith(chr(96) * 3):
+            response = response[3:]
+        if response.endswith(chr(96) * 3):
+            response = response[:-3]
+        response = response.strip()
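
The strip-first ordering discussed in this comment can be tried in isolation. The following is a minimal standalone sketch, not the checker's method: `strip_code_fence` and `FENCE` are hypothetical names, and the slice lengths are derived from the marker strings instead of the literal 7 and 3 used in the diff.

```python
FENCE = "`" * 3  # three backticks, built this way so the snippet stays fence-safe


def strip_code_fence(response: str) -> str:
    """Return the payload of an optionally fenced LLM response (hypothetical helper)."""
    text = response.strip()  # trim first so the prefix/suffix checks can match
    if text.startswith(FENCE + "json"):
        text = text[len(FENCE) + len("json"):]
    elif text.startswith(FENCE):
        text = text[len(FENCE):]
    if text.endswith(FENCE):
        text = text[:-len(FENCE)]
    return text.strip()


# A response wrapped in a fence and padded with newlines, as in the review comment.
raw = "\n" + FENCE + 'json\n{"total_score": 7}\n' + FENCE + "\n"
print(strip_code_fence(raw))  # prints: {"total_score": 7}
```

Without the initial `strip()`, the leading newline in `raw` would make both `startswith` checks fail and the opening fence would reach the JSON parser.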

gemini-code-assist · 2026-06-18T10:42:40Z

+        try:
+            data = json.loads(response)
+        except json.JSONDecodeError:
+            raise ConvertJsonError(f"Failed to parse AI smell response as JSON: {response[:200]}")


If the LLM returns a valid JSON that is not an object/dictionary (e.g., a list or a string), json.loads(response) will succeed but subsequent .get() calls on data will raise an AttributeError. It is safer to explicitly verify that the parsed JSON is a dictionary.

Suggested change

-        try:
-            data = json.loads(response)
-        except json.JSONDecodeError:
-            raise ConvertJsonError(f"Failed to parse AI smell response as JSON: {response[:200]}")
+        try:
+            data = json.loads(response)
+            if not isinstance(data, dict):
+                raise ConvertJsonError(f"Parsed JSON is not a dictionary: {type(data)}")
+        except json.JSONDecodeError:
+            raise ConvertJsonError(f"Failed to parse AI smell response as JSON: {response[:200]}")
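
To see why the dictionary check matters, the guard can be exercised on its own. This is a minimal sketch under stated assumptions: `ConvertJsonError` is redefined locally as a stand-in for the project's exception, and `parse_json_object` is a hypothetical helper name.

```python
import json


class ConvertJsonError(Exception):
    """Local stand-in for the project's ConvertJsonError (assumption)."""


def parse_json_object(response: str) -> dict:
    """Parse a response and insist that the top-level JSON value is an object."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError as exc:
        raise ConvertJsonError(
            f"Failed to parse response as JSON: {response[:200]}"
        ) from exc
    if not isinstance(data, dict):
        # Valid JSON such as a list or a string has no .get(), so reject it here.
        raise ConvertJsonError(
            f"Parsed JSON is not a dictionary: {type(data).__name__}"
        )
    return data


for sample in ('{"total_score": 7}', "[1, 2, 3]", "not json"):
    try:
        print("accepted:", parse_json_object(sample))
    except ConvertJsonError as err:
        print("rejected:", err)
```

Placing the `isinstance` check after the `try` block keeps the two failure modes separate: one for text that is not JSON at all, one for JSON of the wrong shape.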

gemini-code-assist · 2026-06-18T10:42:40Z

+    def _score_bar(cls, score: int, width: int = 10) -> str:
+        """Generate a simple ASCII progress bar for a 0-10 score."""
+        filled = round(score)
+        empty = width - filled
+        return f"[{'█' * filled}{'░' * empty}]"


If the LLM returns a score outside the expected 0-10 range (e.g., negative or greater than 10), _score_bar can produce malformed progress bars or raise errors. Clamping the score to [0, width] ensures the progress bar is always rendered correctly.

Suggested change

-    def _score_bar(cls, score: int, width: int = 10) -> str:
-        """Generate a simple ASCII progress bar for a 0-10 score."""
-        filled = round(score)
-        empty = width - filled
-        return f"[{'█' * filled}{'░' * empty}]"
+    @classmethod
+    def _score_bar(cls, score: float, width: int = 10) -> str:
+        """Generate a simple ASCII progress bar for a 0-10 score."""
+        filled = max(0, min(width, round(score)))
+        empty = width - filled
+        return f"[{'█' * filled}{'░' * empty}]"
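
The effect of the clamp is easiest to see on out-of-range inputs. Below is a minimal standalone sketch (a plain function named `score_bar`, not the class method from the PR). For numeric input the unclamped version does not raise. Multiplying a string by a negative count yields an empty string, so a score of -3 renders 13 empty cells and a score of 15 renders 15 filled cells.

```python
def score_bar(score: float, width: int = 10) -> str:
    """Render a fixed-width bar; out-of-range scores are clamped to [0, width]."""
    filled = max(0, min(width, int(round(score))))
    return "[" + "█" * filled + "░" * (width - filled) + "]"


for s in (-3, 7, 15):
    print(s, score_bar(s))
```

With the clamp in place, every bar has exactly `width` cells between the brackets, regardless of what the LLM returns.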

gemini-code-assist · 2026-06-18T10:42:40Z

+    # Use executor.eval_text for quick single-text evaluation
+    # This is a simplified example showing the checker's usage
+    print("\nDocument snippet (high AI smell):")
+    print(SAMPLE_DOC_HIGH_AI_SMELL[:200] + "...")
+    print("\nExpected: AI_SMELL_DETECTED with high scores on adjective_violence and detail_vacuum")


The example script instantiates LocalExecutor but never actually calls it to run the evaluation on the sample documents. To make the example fully functional and demonstrate how to use the checker, you should include the actual execution calls (even if commented out or wrapped in a try-except block).

Suggested change

-    # Use executor.eval_text for quick single-text evaluation
-    # This is a simplified example showing the checker's usage
-    print("\nDocument snippet (high AI smell):")
-    print(SAMPLE_DOC_HIGH_AI_SMELL[:200] + "...")
-    print("\nExpected: AI_SMELL_DETECTED with high scores on adjective_violence and detail_vacuum")
+    # Use executor.eval_text for quick single-text evaluation
+    # This is a simplified example showing the checker's usage
+    print("\nDocument snippet (high AI smell):")
+    print(SAMPLE_DOC_HIGH_AI_SMELL[:200] + "...")
+    print("\nExpected: AI_SMELL_DETECTED with high scores on adjective_violence and detail_vacuum")
+    # To run the evaluation (requires a valid API key):
+    # try:
+    #     result = executor.eval_text(SAMPLE_DOC_HIGH_AI_SMELL)
+    #     print("\nActual Evaluation Result:")
+    #     print(result.reason[0])
+    # except Exception as e:
+    #     print(f"\nCould not run evaluation: {e}")

Adopt Gemini code-assist suggestions:

- Cast total_score to float() with ValueError/TypeError fallback
- Use 'or {}' for dimensions/evidence to handle null values
- Use str() for verdict to handle null
- Cast per-dimension scores to float() before comparisons
- Clamp _score_bar input with int(round()) + max/min guard

e06084 · 2026-06-18T11:38:56Z

Thanks for the thorough review!

All suggestions have been addressed in the latest commit:

High priority (both fixed):

total_score now cast to float() with ValueError/TypeError fallback
dimensions/evidence use or {} pattern; verdict wrapped in str(... or "")
Per-dimension scores also cast to float() before _score_bar and >= 5 comparison

Medium priority (all fixed):

response.strip() now runs before the markdown code-block stripping
Added isinstance(data, dict) guard after json.loads()
_score_bar signature updated to float, clamp uses max(0, min(width, int(round(score))))
Example script now includes commented-out actual execution calls
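
The defensive casts listed in this reply can be sketched together as one normalization step. This is an illustration under stated assumptions, not the merged code: the helper names `to_float` and `normalize` are hypothetical, the fallback value of 0.0 is assumed, and the field names are treated as JSON keys based on how they are named in this thread.

```python
def to_float(value, default: float = 0.0) -> float:
    """Cast to float, falling back when the value is null or not numeric."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return default  # assumed fallback; the PR does not state the value used


def normalize(data: dict) -> dict:
    """Coerce a parsed LLM response into predictable types (hypothetical helper)."""
    dimensions = data.get("dimensions") or {}  # handles both a missing key and null
    evidence = data.get("evidence") or {}
    return {
        "total_score": to_float(data.get("total_score")),
        "dimensions": {name: to_float(raw) for name, raw in dimensions.items()},
        "evidence": evidence,
        "verdict": str(data.get("verdict") or ""),
    }


# A response with string scores and null fields, as an LLM might return.
messy = {
    "total_score": "7.5",
    "dimensions": {"detail_vacuum": "8", "rainbow_fart": None},
    "evidence": None,
    "verdict": None,
}
print(normalize(messy))
```

The `or {}` pattern matters because `data.get("evidence", {})` alone would still return `None` when the key is present with a null value.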

e06084 added 3 commits June 18, 2026 18:39

feat: add LLMAISmell checker

a58d415

test: add unit tests for LLMAISmell

dc34d93

docs: add usage example for LLMAISmell

cd56a03

gemini-code-assist Bot reviewed Jun 18, 2026


e06084 added 4 commits June 18, 2026 18:47

fix: resolve flake8 F841 and isort in example file

a661573

fix: address Gemini medium-priority suggestions

afef054

fix: address Gemini medium-priority suggestions

3edf8ba

e06084 added 2 commits June 18, 2026 19:45

fix: format scores as int in reason output, fix test assertion

4d0bf9f

fix: format scores as int in reason output, fix test assertion

1d62008

e06084 merged commit 5e84a18 into dev Jun 18, 2026
3 checks passed

e06084 merged 9 commits into dev from feature/ai-smell-checker
