ChatGPT vs Claude vs Gemini: Which AI Writes Most Like a Human?
We measured GPT-5.2, Claude Sonnet 4.5, and Gemini 3 Pro across 6 writing style dimensions using 320 samples. See which AI writes most like you in 2026.
Every week someone publishes a "ChatGPT vs Claude vs Gemini" comparison. Most of them test which model answers trivia questions better, writes longer code, or follows instructions more precisely.
Nobody measures which one writes most like a human.
We did. With data, not opinions.
What We Measured
We generated 320 writing samples across five AI models, eight prompt types, and four languages. For this comparison, we're focusing on the three models most professionals choose between daily: GPT-5.2 (ChatGPT), Claude Sonnet 4.5 (Claude), and Gemini 3 Pro (Gemini).
Each sample was analyzed using computational stylometry: deterministic formulas measuring six independent dimensions of writing style. They're the same formulas we apply to human writing in Writing DNA Snapshots, so the comparison is apples-to-apples.
For the full methodology: How We Measure "Average AI". For a deep dive into all five models: How Every AI Model Writes.
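To make "deterministic formulas" concrete, here's a minimal sketch of the kind of surface metrics stylometry builds on: average sentence length as a complexity proxy, type-token ratio as a vocabulary-richness proxy, and sentence-length variation as a consistency proxy. These are illustrative stand-ins, not the actual Writing DNA formulas, and every name in the snippet is ours.

```python
import re
import statistics

def style_metrics(text: str) -> dict:
    """Rough proxies for three of the six dimensions (not the actual Writing DNA formulas)."""
    # Naive sentence split on ., !, or ? followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]

    avg_len = statistics.mean(lengths) if lengths else 0
    ttr = len(set(words)) / len(words) if words else 0  # type-token ratio
    cv = statistics.stdev(lengths) / avg_len if len(lengths) > 1 and avg_len else 0

    return {
        "complexity_proxy": avg_len,                       # longer sentences read as more complex
        "vocabulary_proxy": round(ttr, 2),                 # higher ratio = richer vocabulary
        "consistency_proxy": round(1 - min(cv, 1.0), 2),   # steadier rhythm scores higher
    }

print(style_metrics("Short sentence. A somewhat longer sentence with more varied words follows it."))
```

The real formulas are more involved, but the principle is the same: every score is computed from the text itself, so the same sample always gets the same numbers.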
Head-to-Head: The Six Dimensions
Sentence Complexity
| Model | Relative Score |
|---|---|
| Gemini 3 Pro | Highest |
| Claude Sonnet 4.5 | Middle |
| GPT-5.2 | Lowest of the three |
Gemini writes the most structurally complex sentences. Its output tends toward nested clauses, qualifiers, and multi-part structures that read like well-edited reports. Claude Sonnet sits in the middle — complex enough for professional contexts, simple enough for readability. GPT-5.2 produces the most readable sentence structures of the three, favoring clarity over density.
What this means for you: If you naturally write short, direct sentences (complexity below 50), GPT-5.2's output will be closest to your style on this axis. If you write densely (above 70), Gemini is the closer match. Most professionals fall somewhere between, where Sonnet sits.
Vocabulary Richness
| Model | Relative Score |
|---|---|
| Claude Sonnet 4.5 | Highest |
| GPT-5.2 | Middle |
| Gemini 3 Pro | Lowest |
Claude Sonnet deploys the broadest vocabulary of the three, choosing specific terms over generic ones more often. GPT-5.2 falls in the middle — accessible but not repetitive. Gemini 3 Pro reuses vocabulary more heavily, favoring consistency of terminology over lexical variety.
What this means for you: Writers with specialized vocabularies — technical writers, academics, domain experts — will find the largest gap with Gemini on this axis. Generalist communicators may not notice the difference between models.
Expressiveness
| Model | Relative Score |
|---|---|
| GPT-5.2 | Highest (by a wide margin) |
| Claude Sonnet 4.5 | Middle |
| Gemini 3 Pro | Lowest |
This is where the models diverge most dramatically. GPT-5.2 is the most expressive AI writer by a significant margin. It uses more rhetorical questions, more exclamation marks, more attitude markers ("Importantly," "Fascinatingly"), and more emphatic punctuation than either competitor.
Claude Sonnet is moderately expressive — engaged without being effusive. Gemini 3 Pro is the most restrained, producing prose that's informative rather than energetic.
What this means for you: This axis often determines which model "feels" right to users before they can articulate why. If ChatGPT output feels too enthusiastic for your professional context, the data confirms your intuition: its expressiveness score significantly exceeds those of the other two models. If Claude output feels measured and balanced, that's also measurable. If Gemini output feels dry, same story.
Formality
| Model | Relative Score |
|---|---|
| Gemini 3 Pro | Highest |
| Claude Sonnet 4.5 | Middle-high |
| GPT-5.2 | Lowest |
Gemini writes the most formally — heavy function word usage, careful hedging, semicolons. Claude Sonnet maintains professional formality without stiffness. GPT-5.2 leans conversational, especially in prompt types that invite it.
What this means for you: Legal, financial, and executive communication typically requires higher formality. Gemini's defaults are closest to those registers. Marketing, sales, and team communication typically works better at lower formality — GPT-5.2 is the closer match. Claude Sonnet splits the difference.
Consistency
| Model | Relative Score |
|---|---|
| Claude Sonnet 4.5 | Highest |
| Gemini 3 Pro | Middle |
| GPT-5.2 | Lowest |
Claude Sonnet produces the most uniform sentence lengths — a steady, predictable cadence. GPT-5.2 varies the most, alternating between short punchy sentences and longer explanatory ones.
What this means for you: If your writing has a predictable rhythm (as legal and technical writing often does), Claude Sonnet's consistency is the closest match. If your writing is "bursty" — mixing short and long for effect — GPT-5.2's variability is a better starting point.
Conciseness
| Model | Relative Score |
|---|---|
| GPT-5.2 | Highest (but still below 50) |
| Claude Sonnet 4.5 | Middle |
| Gemini 3 Pro | Lowest |
No model writes concisely. This is worth stating clearly: every major AI model produces longer output than most professionals would write in the same context. The overall average is 42 out of 100, and no model significantly exceeds that.
GPT-5.2 is slightly more concise than the others — its conversational style naturally produces shorter sentences. Gemini is the least concise, matching its high complexity with high sentence length.
What this means for you: If you're a concise writer (and many professionals are), every model will need significant calibration on this axis. Model choice barely moves the needle.
The Scorecard
Let's tally the wins:
| Dimension | Winner |
|---|---|
| Sentence Complexity | Depends on your style |
| Vocabulary Richness | Claude Sonnet 4.5 |
| Expressiveness | Depends on your style |
| Formality | Depends on your style |
| Consistency | Claude Sonnet 4.5 |
| Conciseness | GPT-5.2 (barely) |
Notice the pattern: three dimensions have no universal winner because the "best" score depends entirely on where you fall on that axis. High complexity isn't better than low complexity. High expressiveness isn't better than low expressiveness. High formality isn't better than low formality.
The question isn't "which model writes best?" It's "which model's defaults are closest to my writing on each dimension?"
So Which Model Writes Most Like a Human?
The honest answer: none of them, and all of them.
None of them, because every model converges toward a statistical center that no real human occupies. The Median User Problem affects all three equally. Their outputs are more similar to each other than any of them are to a distinctive human writer — which is why AI writing sounds generic regardless of which model you pick.
All of them, because each model has dimensions where it approximates certain human styles:
- GPT-5.2 writes most like humans who are expressive, conversational, and rhythmically varied. Think marketing leaders, salespeople, community managers. (See: ChatGPT for work)
- Claude Sonnet 4.5 writes most like humans who are balanced, professional, and consistent. Think project managers, consultants, operations leaders. (See: Make Claude sound like you)
- Gemini 3 Pro writes most like humans who are formal, structured, and thorough. Think lawyers, analysts, executive communicators. (See: Make Gemini write like you)
But even these matches are partial. GPT-5.2 might match a marketer's expressiveness but completely miss their conciseness. Claude Sonnet might match a consultant's formality but overcomplicate their sentence structures. The match is never complete across all six dimensions. For full personality profiles of each model, see Writing Profiles for Every AI Model.
The Real Answer: Stop Picking and Start Calibrating
Here's the counterintuitive conclusion from the data: model choice matters less than model calibration.
The maximum difference between any two models on any axis is roughly 16 points. The typical gap between a model's default and a human writer's actual style is 20-40 points on multiple axes simultaneously.
Switching from ChatGPT to Claude might move you 10 points closer on formality. But you're still 25 points away on conciseness, 30 points away on expressiveness, and 15 points away on consistency. You've optimized one dimension while leaving four uncalibrated.
A Style Profile calibrates all six dimensions at once, regardless of which model you use. It measures your actual writing, calculates the delta from the model's defaults, and gives the AI specific targets for each dimension.
The model becomes a canvas. Your style profile becomes the instructions. Any canvas will do — what matters is the precision of the instructions.
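To make the "calculate the delta" idea concrete, here's a rough sketch. All scores below are hypothetical, and the real Style Profile uses its own scoring and wording; the point is only that calibration produces a specific per-dimension target rather than a model recommendation.

```python
DIMENSIONS = ["complexity", "vocabulary", "expressiveness", "formality", "consistency", "conciseness"]

# Hypothetical 0-100 scores; a real profile comes from measuring your actual writing.
your_style     = {"complexity": 45, "vocabulary": 70, "expressiveness": 30,
                  "formality": 62, "consistency": 55, "conciseness": 78}
model_defaults = {"complexity": 58, "vocabulary": 60, "expressiveness": 65,
                  "formality": 50, "consistency": 68, "conciseness": 42}

def calibration_targets(you: dict, model: dict) -> list:
    """Turn per-dimension deltas into explicit targets for the model."""
    targets = []
    for dim in DIMENSIONS:
        delta = you[dim] - model[dim]
        if abs(delta) < 5:  # close enough on this axis; no instruction needed
            continue
        direction = "increase" if delta > 0 else "decrease"
        targets.append(f"{direction} {dim} by ~{abs(delta)} points (target: {you[dim]}/100)")
    return targets

for target in calibration_targets(your_style, model_defaults):
    print(target)
```

With these sample numbers, the output asks the model to cut expressiveness and complexity while raising conciseness, which is exactly the kind of multi-axis correction that switching models alone can't deliver.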
What About Different Languages?
This comparison focused on English. The model dynamics shift in other languages — sometimes dramatically. Japanese AI output looks very different from English AI output, and the relative strengths of each model change with the language.
For the cross-language story, see How AI Writes Differently Across Languages. And for per-language model recommendations, see Which AI Model Writes Best in Each Language.
Find Your Closest Match
Want to see which model's defaults are closest to your writing? Try your free Writing DNA Snapshot — it maps your style across all six dimensions and shows you the gap from Average AI.
The snapshot won't tell you which model to use. It'll tell you something more useful: exactly what any model needs to change to write like you.