Humanize AI Text
Comprehensive CLI for detecting and transforming AI-generated text to bypass detectors. Based on Wikipedia's Signs of AI Writing.
Quick Start
# Detect AI patterns
python scripts/detect.py text.txt
# Transform to human-like
python scripts/transform.py text.txt -o clean.txt
# Compare before/after
python scripts/compare.py text.txt -o clean.txt
Detection Categories
The analyzer checks for 16 pattern categories from Wikipedia's guide:
Critical (Immediate AI Detection)
| Category | Examples |
|---|---|
| Citation Bugs | oaicite, turn0search, contentReference |
| Knowledge Cutoff | "as of my last training", "based on available information" |
| Chatbot Artifacts | "I hope this helps", "Great question!", "As an AI" |
| Markdown | **bold**, ## headers, code blocks |
High Signal
| Category | Examples |
|---|---|
| AI Vocabulary | delve, tapestry, landscape, pivotal, underscore, foster |
| Significance Inflation | "serves as a testament", "pivotal moment", "indelible mark" |
| Promotional Language | vibrant, groundbreaking, nestled, breathtaking |
| Copula Avoidance | "serves as" instead of "is", "boasts" instead of "has" |
Medium Signal
| Category | Examples |
|---|---|
| Superficial -ing | "highlighting the importance", "fostering collaboration" |
| Filler Phrases | "in order to", "due to the fact that", "Additionally," |
| Vague Attributions | "experts believe", "industry reports suggest" |
| Challenges Formula | "Despite these challenges", "Future outlook" |
Style Signal
| Category | Examples |
|---|---|
| Curly Quotes | “ ” instead of " " (ChatGPT signature) |
| Em Dash Overuse | Excessive use of — for emphasis |
| Negative Parallelisms | "Not only... but also", "It's not just... it's" |
| Rule of Three | Forced triplets like "innovation, inspiration, and insight" |
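To make the category tables concrete, here is a minimal sketch of phrase-based counting. It is an illustration only, not the logic of detect.py: the `CATEGORIES` mapping, its key names, and the `count_hits` function are hypothetical, and the terms are a small sample taken from the tables above.

```python
import re
from collections import Counter

# Hypothetical sample of categories and terms, copied from the tables above.
CATEGORIES = {
    "citation_bugs": ["oaicite", "turn0search", "contentReference"],
    "chatbot_artifacts": ["I hope this helps", "As an AI"],
    "ai_vocabulary": ["delve", "tapestry", "pivotal"],
}

def count_hits(text: str) -> dict:
    """Count case-insensitive whole-word matches per category."""
    counts = Counter()
    for category, terms in CATEGORIES.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[category] += len(re.findall(pattern, text, re.IGNORECASE))
    # Keep only categories with at least one match.
    return {category: n for category, n in counts.items() if n}

print(count_hits("Let us delve into this pivotal tapestry. I hope this helps!"))
```

Whole-word matching means "delve" does not match "delves"; a real pattern list would need inflected forms or looser patterns.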
Scripts
detect.py — Scan for AI Patterns
python scripts/detect.py essay.txt
python scripts/detect.py essay.txt -j # JSON output
python scripts/detect.py essay.txt -s # score only
echo "text" | python scripts/detect.py
Output:
- Issue count and word count
- AI probability (low/medium/high/very high)
- Breakdown by category
- Auto-fixable patterns marked
transform.py — Rewrite Text
python scripts/transform.py essay.txt
python scripts/transform.py essay.txt -o output.txt
python scripts/transform.py essay.txt -a # aggressive
python scripts/transform.py essay.txt -q # quiet
Auto-fixes:
- Citation bugs (oaicite, turn0search)
- Markdown (**, ##, ```)
- Chatbot sentences
- Copula avoidance → "is/has"
- Filler phrases → simpler forms
- Curly → straight quotes
Aggressive (-a):
- Simplifies -ing clauses
- Reduces em dashes
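As a rough illustration of three of the auto-fixes listed above (curly quotes, copula avoidance, filler phrases), the sketch below applies them with the standard library. It does not reproduce transform.py: `QUOTES`, `REPLACEMENTS`, and `auto_fix` are hypothetical names, and the two filler replacements are assumed examples of "simpler forms".

```python
import re

# Map curly quotes to their straight equivalents.
QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})

# Phrase -> replacement. The copula pairs come from the table above;
# the filler pairs are assumptions made for this example.
REPLACEMENTS = {
    "serves as": "is",
    "boasts": "has",
    "in order to": "to",
    "due to the fact that": "because",
}

def auto_fix(text: str) -> str:
    """Straighten quotes, then swap each listed phrase for its simpler form."""
    text = text.translate(QUOTES)
    for phrase, simpler in REPLACEMENTS.items():
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        text = pattern.sub(simpler, text)
    return text

print(auto_fix("The hall \u201cboasts\u201d a dome in order to impress."))
```

The replacement is always lowercase, so a phrase at the start of a sentence would need its capitalization restored in a fuller implementation.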
compare.py — Before/After Analysis
python scripts/compare.py essay.txt
python scripts/compare.py essay.txt -a -o clean.txt
Shows side-by-side detection scores before and after transformation.
Workflow
1. Scan for detection risk:
   python scripts/detect.py document.txt
2. Transform with comparison:
   python scripts/compare.py document.txt -o document_v2.txt
3. Verify improvement:
   python scripts/detect.py document_v2.txt -s
4. Manual review for AI vocabulary and promotional language (requires judgment)
AI Probability Scoring
| Rating | Criteria |
|---|---|
| Very High | Citation bugs, knowledge cutoff, or chatbot artifacts present |
| High | >30 issues OR >5% issue density |
| Medium | >15 issues OR >2% issue density |
| Low | ≤15 issues AND ≤2% density |
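The table reads as a top-down cascade, which can be sketched as below. This is one interpretation rather than the code in detect.py: the function name and parameters are hypothetical, and "issue density" is assumed to mean issues per 100 words.

```python
def ai_probability(issue_count: int, word_count: int, has_critical: bool) -> str:
    """Return a rating by checking the table rows from top to bottom."""
    # Assumption: density is the number of issues per 100 words.
    density = 100.0 * issue_count / word_count if word_count else 0.0
    if has_critical:
        # Citation bugs, knowledge cutoff phrases, or chatbot artifacts.
        return "very high"
    if issue_count > 30 or density > 5.0:
        return "high"
    if issue_count > 15 or density > 2.0:
        return "medium"
    return "low"

print(ai_probability(issue_count=12, word_count=400, has_critical=False))
```

In this example, 12 issues in 400 words is a density of 3%, so the text rates as medium even though the raw issue count is below 15.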
Customizing Patterns
Edit scripts/patterns.json to add/modify:
- ai_vocabulary: words to flag
- significance_inflation: puffery phrases
- promotional_language: marketing speak
- copula_avoidance: phrase → replacement
- filler_replacements: phrase → simpler form
- chatbot_artifacts: phrases triggering sentence removal
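The sketch below shows one way to extend such a file programmatically. The layout is an assumption: it treats ai_vocabulary as a list of words and copula_avoidance as a phrase-to-replacement mapping, and the real scripts/patterns.json may be structured differently. The function `extend_patterns` and the `sample` data are hypothetical and work on an in-memory dictionary rather than the file itself.

```python
import json

def extend_patterns(patterns: dict, words: list, replacements: dict) -> dict:
    """Return a copy of `patterns` with extra flagged words and replacements."""
    updated = json.loads(json.dumps(patterns))  # deep copy via a JSON round trip
    vocab = updated.setdefault("ai_vocabulary", [])
    for word in words:
        if word not in vocab:  # skip words that are already flagged
            vocab.append(word)
    updated.setdefault("copula_avoidance", {}).update(replacements)
    return updated

# Hypothetical excerpt standing in for the contents of patterns.json.
sample = {
    "ai_vocabulary": ["delve", "tapestry"],
    "copula_avoidance": {"serves as": "is"},
}
updated = extend_patterns(sample, ["pivotal", "delve"], {"boasts": "has"})
print(json.dumps(updated, indent=2))
```

Writing the result back with `json.dump` would apply the change; checking the actual file layout first is advisable.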
Batch Processing
# Scan all files
for f in *.txt; do
echo "=== $f ==="
python scripts/detect.py "$f" -s
done
# Transform all markdown
for f in *.md; do
python scripts/transform.py "$f" -a -o "${f%.md}_clean.md" -q
done
Reference
Based on Wikipedia's Signs of AI Writing, maintained by WikiProject AI Cleanup. The patterns are documented from thousands of examples of AI-generated text.
Key insight: "LLMs use statistical algorithms to guess what should come next. The result tends toward the most statistically likely result that applies to the widest variety of cases."