feat: add content-aware first-person experiential voice mode by erhanurgun · Pull Request #103 · blader/humanizer

erhanurgun · 2026-04-25T12:17:30Z

Summary

Adds a new section FIRST-PERSON EXPERIENTIAL VOICE (content-aware) to SKILL.md, placed after PERSONALITY AND SOUL.
Extends the frontmatter description with the new mode and its triggers.
Adds two steps to the Process list: a perspective-mode decision step at the top and a first-person self-check before final output.
No version bump; deferring to coordinate with the other open PRs (feat: enforce absolute ban on em dashes and en dashes #96, feat: AI-iness density pre-check for adaptive pass strength (v2.6.0) #98) that also touch the version.

Motivation

Voice Calibration (#64) teaches voice by example. PERSONALITY AND SOUL covers tone (opinions, rhythm, soul). Neither tells the rewrite when the perspective itself should shift.

A large slice of suitable input (blogs, tutorials, retros, opinion pieces, personal guides) reads better when the rewrite speaks as the author recounting lived experience, not as a third party summarizing it. The existing "use 'I' when it fits" hint is too thin to do this consistently; it produces neutral sentences with "I" pasted on, not memory and judgment.

This PR adds an explicit, content-aware mode for that case, and an explicit list of where it should NOT run (encyclopedic, academic, technical reference, neutral journalism, legal/policy text).

Changes

SKILL.md:

- Frontmatter description: append a paragraph describing the first-person experiential mode and its triggers (explicit phrases + content-type auto-detection).
- New section ## FIRST-PERSON EXPERIENTIAL VOICE (content-aware) after PERSONALITY AND SOUL. Contents:
  - When to apply (auto + explicit) and when not to.
  - Six transformation rules with before/after examples (lived moments, path-to-claim, honest reactions, real time markers, owned judgments, visible mind-changes).
  - Anti-patterns (fake humility, padding, Reddit voice, universalizing, fabricated specifics, first-person on someone else's behalf).
  - Note on calibrating against a writing sample when one is provided.
  - Quick before/after example.
- Process list: insert perspective-mode decision as step 2, and a first-person self-check as the new step 10 (active only in that mode).
- version unchanged at 2.5.1.

Net change: additive. No existing behavior is altered for content where the mode is not triggered.

Test plan

Run humanizer on a sample blog post; verify first-person mode auto-triggers and the rewrite reads as lived experience, not a summary with "I" attached.
Run humanizer on a Wikipedia-style entry; verify mode does NOT trigger and output stays third person.
Run humanizer on a tutorial with the explicit phrase "make it personal"; verify mode triggers.
Confirm the new section contains zero em dashes and zero en dashes.
Confirm frontmatter still parses (name, version, description, license, compatibility, allowed-tools intact).
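
The last two checks in the test plan can be automated. The following is a minimal sketch, not part of the PR: it assumes SKILL.md opens with a `---` delimited frontmatter block and that top-level sections use `## ` headings. The helper names (`frontmatter_keys`, `section_body`, `count_dashes`) are hypothetical, and the frontmatter scan is a plain line match rather than a full YAML parse.

```python
import re

# Keys the test plan expects to remain in the frontmatter.
REQUIRED_KEYS = {"name", "version", "description", "license",
                 "compatibility", "allowed-tools"}
# Heading of the new section, as given in the PR description.
SECTION_HEADING = "## FIRST-PERSON EXPERIENTIAL VOICE (content-aware)"


def frontmatter_keys(text: str) -> set:
    """Return the unindented keys of a leading '---' frontmatter block.

    Returns an empty set when the block is missing or never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        match = re.match(r"^([A-Za-z0-9_-]+):", line)  # top-level 'key:' only
        if match:
            keys.add(match.group(1))
    return set()  # no closing delimiter found


def section_body(text: str, heading: str) -> str:
    """Return the text after `heading` up to the next '## ' heading or EOF."""
    start = text.find(heading)
    if start == -1:
        raise ValueError(f"heading not found: {heading!r}")
    rest = text[start + len(heading):]
    nxt = re.search(r"^## ", rest, flags=re.MULTILINE)
    return rest[:nxt.start()] if nxt else rest


def count_dashes(text: str) -> tuple:
    """Return (em dash count, en dash count) for `text`."""
    return text.count("\u2014"), text.count("\u2013")
```

A check script would read SKILL.md, assert that `REQUIRED_KEYS - frontmatter_keys(text)` is empty, and assert that `count_dashes(section_body(text, SECTION_HEADING))` equals `(0, 0)`.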

Notes

Open PRs #96 and #98 both bump version to 2.6.0. This PR intentionally leaves version untouched so they can be coordinated. Happy to rebase and bump if you'd prefer it bundled.

Adds FIRST-PERSON EXPERIENTIAL VOICE section after PERSONALITY AND SOUL with auto/explicit triggers, transformation rules, and anti-patterns. Updates Process with perspective-mode decision and first-person self-check. Frontmatter description extended; SKILL version unchanged (defer to maintainer for next coordinated bump).

blader · 2026-05-27T02:56:19Z

Closing — this is off the skill's goal. Humanizer removes AI tells while preserving meaning; rewriting text into invented first-person lived experience ('I sat there refreshing the page...') fabricates content. That's a different tool. Thanks for the thorough PR regardless.

5 OQs from the seed-catalogue extraction answered + appended to docs/de-seed-catalogue.md as a binding decisions log for Task 5 + Task 6:
- OQ1: build DE blader#7 via Opus + Wikipedia-AI-Cleanup-Editor manual curation
- OQ2: include all 3 DE-only patterns (blader#102 Konjunktiv II, blader#103 Anglizismen, blader#104 Nominalstil)
- OQ3: exclude all 6 Wikipedia-context-only entries from patterns/de.md
- OQ4: include EN-PARALLEL blader#8 with DE forms (gilt als / dient als / etc.); verify in Task 5 mining
- OQ5: defer universal pattern DE-token extensions to Task 6 follow-up

Numbering reshuffle: blader#100 + blader#101 reserved for prose-applicable DE-only patterns surfaced during Task 5; blader#102 + blader#103 + blader#104 firm assignments per OQ2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ources (Phase 2 Task 4)

All $0 sources per maintainer /goal 'minimize Phase 2 Task 4 budget'.

Source A (Wikipedia DE AI-Cleanup tagged articles, 30 docs, CC-BY-SA-3.0): Real-world DE prose flagged by humans as AI-suspected via the Vorlage:KI-generiert template. Fetched via embeddedin API (50+ tagged articles available; sampled 30 with fixed seed 42). Mix of substantial AI tells and borderline cases: a human-verified suspect baseline. Examples: Sara Noxx, Digitales Schlafmonitoring, Synthetische Daten, Verband evangelischer Pfarrerinnen und Pfarrer, Moonton, Hybridtechnik.

Source B (Claude CLI subscription generation, 90 docs, MIT): 6 domains × 5 topics × 3 models (sonnet/haiku/opus) = 90 samples via `claude -p` subscription ($0). DE prompt templates ask for stereotypical AI-style content. Cross-model variation for intra-Anthropic idiolect diversity. ANTHROPIC_API_KEY stripped from subprocess env per _shared.run_skill convention.

Source C (Opus main-thread inline synthesis, 12 docs, MIT): 2 samples per domain × 6 domains. Engineered to exercise specific DE tells: blader#7 AI vocabulary, blader#102 Konjunktiv II stacking, blader#103 Anglizismen-Leakage, blader#104 Nominalstil-Inflation, plus EN-parallels blader#22/blader#10/blader#15/blader#16/blader#23/blader#24/blader#32/blader#36/blader#37. Act as calibration anchors for per-pattern eval testing.

Total: 132 docs / 728 KB. Comfortably exceeds plan target of 75-100. Sufficient signal volume for Task 5 mine_patterns.py LLR scoring against the 46-doc human corpus (340 KB). AI/human ratio about 2.9× by docs (132/46), 2.1× by KB (728/340). All redistributable license (CC-BY-SA-3.0 for Wikipedia, MIT for Anthropic-generated + Opus inline). No fair-use research-only content needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
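
The Source B grid (6 domains × 5 topics × 3 models = 90) and the environment stripping described above could look roughly like the sketch below. It is an illustration only: the domain names are assumed to match the six DE domains named elsewhere in this thread, the topic labels are placeholders, and `build_jobs` and `subprocess_env` are hypothetical helpers rather than the contents of `_shared.run_skill`. The sketch only builds the job list and the environment; it does not invoke the CLI.

```python
import itertools
import os

# Assumption: the six generation domains match the six DE eval domains.
DOMAINS = ["casual", "academic", "legal", "technical", "marketing", "career"]
# Placeholder topic labels; the real topics are not listed in the commit message.
TOPICS = [f"topic_{i}" for i in range(1, 6)]
MODELS = ["sonnet", "haiku", "opus"]


def build_jobs():
    """Return one job dict per (domain, topic, model) combination."""
    return [
        {"domain": d, "topic": t, "model": m}
        for d, t, m in itertools.product(DOMAINS, TOPICS, MODELS)
    ]


def subprocess_env(base=None):
    """Copy the environment and drop ANTHROPIC_API_KEY so a child process
    cannot pick up an API key from the parent environment."""
    env = dict(os.environ if base is None else base)
    env.pop("ANTHROPIC_API_KEY", None)
    return env
```

Each job would then be passed to a subprocess call with `env=subprocess_env()`, so every generation goes through the subscription login rather than a metered key.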

…up consensus

Mined the DE corpus (132 AI / 46 human docs) via mine_patterns.py LLR scoring. Top 100 candidates routed through 3-voice writer persona panel (Academic + Marketing Copywriter + Journalist) for ✓/✗/◐ vote per ngram. Consolidated keep-list at docs/de-mined-patterns.md (saved this commit):

Strong consensus (unanimous ✓), into patterns/de.md Task 6:
- blader#7 DE AI Vocabulary additions: darüber hinaus, zusammenfassend, ganzheitliche, vorliegenden, der vorliegenden, umfassende (cluster only), darstellt
- blader#100 (NEW reserved DE-only): Anchor: 'im Rahmen der vorliegenden [Arbeit/Studie/Untersuchung]'. DE academic-frame boilerplate; no EN equivalent, highest LLR among DE-only candidates (rank blader#32, LLR 31.97, 31:0 ratio)
- blader#101 (NEW reserved DE-only): Anchor: '[es/zusammenfassend] lässt sich [sagen/feststellen/festhalten]'. DE impersonal-reflexive AI hedge; no EN equivalent, multiple high-LLR forms (lässt sich rank blader#5 LLR 93.73; zusammenfassend lässt sich sagen rank blader#29 LLR 33.00)
- blader#12 DE meta-commentary extensions: zusammenfassend, wichtig zu (beachten/betonen), full blader#101 family

Cluster-only (◐ ADJUST), flag with threshold logic: zentrale Rolle, umfassende, implementierung (non-tech), überzeugt (unanchored)

Skip (artifacts + common DE): hedging (metadata), queens (Wikipedia bleed), substantivketten / übergänge (Source C body refs), dass / es / sich / ich / meine / bin / mich (common DE function words / first-person genre artifact)

Mining-script bug noted for v3.6.0: YAML frontmatter strip catches headers but Opus inline synthesis demonstrably references metadata terms in body (tells_targeted leaks via prose). Workaround: drop tells_targeted from synthesis frontmatter next pass.

OQ assignments updated:
- blader#100 reassigned from 'first prose-applicable DE-only Wiki pattern surfaced during mining' (per maintainer doc) to the academic-frame boilerplate discovered as highest-LLR DE-only signal
- blader#101 reassigned to impersonal-reflexive Nominalstil sub-pattern
- blader#102/blader#103/blader#104 remain Konjunktiv II / Anglizismen / Nominalstil per maintainer OQ2 decision (still open for Task 6 implementation)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
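
For readers unfamiliar with LLR scoring, the sketch below computes a standard log-likelihood ratio (the G² statistic over a 2×2 contingency table) for one n-gram across two corpora. It is an assumption that mine_patterns.py does something along these lines; the function name, the argument names and the token totals used in the test are illustrative and are not taken from the script or the corpus.

```python
import math


def llr(count_ai: int, total_ai: int, count_human: int, total_human: int) -> float:
    """G^2 log-likelihood ratio for one n-gram in two corpora.

    count_*: occurrences of the n-gram in each corpus.
    total_*: total n-gram positions in each corpus.
    Returns 0.0 when both corpora use the n-gram at the same rate.
    """
    # Observed 2x2 table: rows = (n-gram, everything else), cols = (AI, human).
    observed = [
        [count_ai, count_human],
        [total_ai - count_ai, total_human - count_human],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [total_ai, total_human]
    grand_total = total_ai + total_human

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            k = observed[i][j]
            if k == 0:
                continue  # the limit of k * ln(k / E) as k -> 0 is 0
            expected = row_totals[i] * col_totals[j] / grand_total
            g2 += k * math.log(k / expected)
    return 2.0 * g2
```

Because G² is symmetric, a mining script would typically keep only candidates whose rate is higher in the AI corpus than in the human corpus, then rank those by score. An n-gram that never appears in the human corpus (such as a 31:0 split) still gets a finite score, since zero cells contribute nothing to the sum.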

878-line German pattern pack mirroring patterns/en.md structure + extending with 5 DE-only patterns (blader#100-blader#104) per maintainer decisions + Task 5 mining consensus.

EN-PARALLEL patterns (translated to DE with DE-specific trigger words + before/after examples): blader#1, blader#2, blader#3, blader#4, blader#5, blader#7, blader#8, blader#9, blader#10, blader#11, blader#12, blader#13, blader#16, blader#20, blader#21, blader#22, blader#23, blader#24, blader#27, blader#28, blader#30, blader#31, blader#32, blader#33, blader#34, blader#35, blader#36, blader#37 (28 patterns).

DE-only patterns (blader#100-blader#104, no EN equivalent):
- blader#100 Akademische Rahmen-Floskel: 'im Rahmen der vorliegenden [Arbeit/Studie/Untersuchung/Analyse]' bureaucratic self-reference. Mining-derived (LLR 31.97, 31:0 AI:human).
- blader#101 Impersonales Reflexiv: '[es/zusammenfassend] lässt sich [sagen/feststellen/festhalten/zeigen]' AI hedge construction. Mining-derived (LLR 93.73 bigram + 33.00 four-gram).
- blader#102 Konjunktiv II Stacking: 3+ würde/wäre/hätte/könnte forms in close proximity for vague hedging. Per maintainer OQ2.
- blader#103 Anglizismen-Leakage: denglisch business buzzwords (insight, deliver, leveragen, Pain Points, ganzheitliche Customer Journey). Per maintainer OQ2.
- blader#104 Nominalstil-Inflation: noun-heavy bureaucratic verbing ('die Durchführung der Analyse' vs 'analysieren'). Per maintainer OQ2.

DE PERSONALITY AND SOUL section mirrors EN with DE-appropriate register notes. Critical addition: domain note excludes DE career writing from soul-adding (DE Anschreiben register is formal-modest, opposite of US/UK puffery; adding soul makes them weaker).

blader#7 DE AI Vocabulary trigger list: 33 phrases combining mined tokens (darüber hinaus, zusammenfassend, ganzheitlich, vorliegenden, umfassende, darstellt) with manually curated additions (vielfältig, facettenreich, nachhaltig, innovativ, zukunftsweisend, transformativ, ganzheitlich, intuitiv, nahtlos, robust, im Hinblick auf, vor diesem Hintergrund, es ist wichtig zu betonen, zentrale Rolle spielen, etc.) per OQ1.

Excluded per OQ3: 6 Wikipedia-context-only DE-only entries flagged by DE Wiki AI-Cleanup project (productivity spikes, citation format, non-existent categories), which are not applicable to general prose. Header documents the exclusion so future contributors don't re-add them.

Tests: 207 → 211 passes (+4 DE pack tests: existence, expected pattern IDs, PERSONALITY section presence, no overlap with universal pack).

Maintainer flagged for future review:
- blader#11 Elegant Variation: DE synonym system richer than EN, less sharp
- blader#34 Trailing Emphasis Fragments: less common in DE, signal stronger when present
- blader#36 Conditional Frame Stacking: overlaps with blader#102 Konjunktiv II; cross-referenced
- blader#8 Copula Avoidance: 'gilt als' legitimate legal term of art, apply lightly in legal domain
- blader#13 Passive Voice: DE academic uses passive more heavily than EN; SKIP in academic AND legal domains (will be enforced in Task 7 domains/de_overrides.md)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…reer register (Phase 2 Task 7)

245-line DE override file mirroring domains/en_overrides.md schema + extending with 5 DE-only pattern rows (blader#100-blader#104) + DE career section (475 words) + DACH cultural-register inversion note.

Override matrix (22 rows × 6 columns: Pattern + 5 non-casual domains):
- All EN-PARALLEL pattern overrides translated to DE
- 5 maintainer-flagged DE-specific adaptations applied:
  - blader#11 Elegant Variation: light across all domains (DE richer synonym system; sharper signal lost when applied strictly)
  - blader#13 Passive Voice: SKIP in academic AND legal (DE academic uses passive MORE than EN; was SKIP only in academic for EN)
  - blader#8 Copula Avoidance: light in legal ('gilt als' legitimate legal term of art)
  - blader#34 Trailing Fragments: kept strict where EN was strict (less common in DE, signal stronger when present)
  - blader#36 / blader#102 cross-reference (Konjunktiv overlaps in academic + legal)
- 5 DE-only pattern rows:
  - blader#100 Akademische Rahmen-Floskel: strict everywhere (even academic)
  - blader#101 Impersonales Reflexiv: light in academic + legal, strict elsewhere
  - blader#102 Konjunktiv II: light in academic + legal, strict elsewhere
  - blader#103 Anglizismen-Leakage: light in technical + marketing, strict elsewhere
  - blader#104 Nominalstil-Inflation: SKIP in legal (DE Behördendeutsch), light in academic, strict elsewhere

DE-specific domain guidance paragraphs:
- academic: DE passive + Nominalstil heavier than EN; blader#101 + blader#104 softened
- legal: Konjunktiv II for indirect speech is standard; blader#102 softened; 'gilt als'/'fungiert als' can be legal terms of art
- technical: Anglizismen-Leakage softened (English tech terms unavoidable); flag denglisch verb constructions strictly
- marketing: Denglisch in marketing is register marker (softened); DE buzzword-in-phrase rule + brand-tier audit step + 5-point preserve-everything checklist with DE examples
- career: DACH formal-modest register (INVERSE of US/UK assertive self-promotion that EN career assumes). DE-specific AI tells: 'Mit großem Interesse', 'leidenschaftlich', 'ergebnisorientiert', 'ganzheitlich denkend', 'es würde mich außerordentlich freuen', etc. All 5 career preserve rules in DE (Metriken sind heilig, Eigennamen + Daten + Titel, Fachvokabular, Stellenausschreibungs-Schlüsselphrasen, konkrete Achievement-Aussagen).
- casual: 'Ich' more weighty in DE; 'Man' constructions acceptable. Critical casual constraint (concept-noun preservation) translated.

Tests: 211 -> 214 (+3 DE override tests: existence, table+guidance, pattern ID validity).

Maintainer flags for review:
- blader#11 'light' across ALL domains is broader than EN; relax to strict in career if FP eval over-softens
- blader#15 included for EN symmetry, not strictly required by Task 7 spec
- blader#36/blader#102 cross-reference is in a trailing blockquote (not inline); may want inline academic + legal mentions for visibility

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
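
The five DE-only rows of the override matrix described above can be read as a lookup with "strict" as the default. The sketch below encodes only those five rows as stated in the commit message; the data structure and the `override_level` helper are hypothetical, since domains/de_overrides.md is a Markdown table rather than code.

```python
# The five non-casual domains that form the matrix columns.
DOMAINS = ("academic", "legal", "technical", "marketing", "career")

# Per-pattern exceptions to the default "strict" level, DE-only rows only.
DE_ONLY_OVERRIDES = {
    100: {},                                          # strict everywhere
    101: {"academic": "light", "legal": "light"},     # Impersonales Reflexiv
    102: {"academic": "light", "legal": "light"},     # Konjunktiv II
    103: {"technical": "light", "marketing": "light"},  # Anglizismen-Leakage
    104: {"legal": "skip", "academic": "light"},      # Nominalstil-Inflation
}


def override_level(pattern_id: int, domain: str) -> str:
    """Return 'strict', 'light' or 'skip' for a DE-only pattern in a domain."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if pattern_id not in DE_ONLY_OVERRIDES:
        raise KeyError(f"no DE-only row for pattern {pattern_id}")
    return DE_ONLY_OVERRIDES[pattern_id].get(domain, "strict")
```

For example, `override_level(104, "legal")` yields `"skip"` while `override_level(104, "career")` falls through to `"strict"`. The casual domain is rejected here because the matrix described above covers only the five non-casual domains.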

One case per DE domain at evals/corpus/de/e2e/ai_<domain>_01.json:
- casual: KI-Coding-Tools blog opening; tests concept-noun preservation (Iterationsgeschwindigkeit, Zusammenarbeit, Kreativität, organisatorische Agilität must survive humanization)
- academic: Transformer-Architektur abstract; tests blader#100/blader#101/blader#104 in academic register (per de_overrides, blader#101 + blader#104 light, blader#100 strict everywhere); preserves multilingual corpora + cross-lingual transfer learning + low-resource fine-tuning
- legal: Datenschutzklausel; tests DE legal register (blader#13 + blader#24 + blader#104 SKIP per de_overrides); preserves DSGVO compliance, 72h Meldepflicht, Drittstaatenübermittlung, Art. 33 DSGVO
- technical: DataFlow CLI README intro; tests blader#15/blader#16 SKIP + blader#103 Anglizismen light + fabrication check (don't invent 'exponential' backoff); preserves Kubernetes/Go/PostgreSQL + transiente Fehler + Backoff-Strategie
- marketing: AuraSound One smart speaker landing copy; tests blader#4 SKIP + blader#32 light + blader#103 light; preserves product name + 360°-Surround + Smart-Home-Integration + dimming + Premium tier + 'für deinen Alltag' lifestyle hook (5-point checklist)
- career: DE Anschreiben for Senior Software Engineer; tests INVERSE register (formal-modest 'Sie', NOT US/UK puffery); preserves metrics-are-sacred (18mo migration, 40% p99 latency, 3x scale, Kubernetes/Go/PostgreSQL stack, Stellentitel, Firmenname, DSGVO contribution); strips chatbot opener + sycophancy + AI-CV-clichés (leidenschaftlich, ergebnisorientiert, ganzheitliches Verständnis)

Each case engineered to exercise specific DE patterns + maintainer-flagged register handling per docs/de-corpus-sources.md + domains/de_overrides.md. Comparable to EN E2E (6 cases at v3.4.1 incl. career). Volume target met.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

blader closed this May 27, 2026

erhanurgun wants to merge 1 commit into blader:main from erhanurgun:feat/first-person-experiential-voice
