Extended Thinking: When Claude actually thinks before it talks

23 Apr 26
\
Benjamin Igna
\
24
 mins
 read

Extended Thinking didn't change my workflow as dramatically as Memory or Artifacts did. Those features changed what Claude is to me. Extended Thinking changed how well it performs at the things it was already doing. But "how well" matters. A lot.

When Claude Actually Thinks Before It Talks

I was reviewing a client's R&D operating model — the kind with three organizational layers, cross-functional dependencies, and a shared services team that touches everything. I asked Claude to analyze where the coordination bottlenecks were and suggest a restructured flow.

Claude answered in about two seconds. The answer was fine. Competent. The kind of response you'd get from a smart intern who skimmed the brief.

Then I turned on Extended Thinking and asked the same question.

Claude paused. Not the usual instant response. An actual pause — maybe fifteen seconds. And then it delivered an analysis that identified a dependency I hadn't seen, proposed a phased rollout I hadn't considered, and flagged a risk with the shared services queue that turned out to be exactly the problem we ran into two weeks later.

Same model. Same prompt. Completely different quality of output. The only difference was that I told Claude to think before it talked.

That's Extended Thinking. And once you understand when to use it and when not to, it changes how you work with Claude. Most of what follows comes from Anthropic's learning platform and the technical documentation, filtered through daily use in real consulting engagements.

What Extended Thinking Actually Does

Normally, Claude generates its response token by token, in one pass. It's fast. It's also how you get those answers that are technically correct but miss the nuance — the response that addresses what you said but not what you meant.

Extended Thinking adds a step before the final response. Claude gets a "thinking budget" — a dedicated space where it reasons through the problem step by step before it commits to an answer. You can see this thinking process in the interface. It shows up as an expandable block above the actual response.

Think of it like this: normal Claude is the colleague who answers your question in the hallway. Extended Thinking Claude is the colleague who says "let me think about that for a minute," goes back to their desk, works through the problem, and then comes back with a proper answer.

The thinking isn't decorative. Claude uses it to break down complex problems, consider alternatives, catch its own errors, and build a more structured argument before writing the response you actually see.

When It Matters and When It Doesn't

Extended Thinking is not a "make everything better" button. For simple questions, it adds latency without improving the answer. For complex ones, the difference can be dramatic.

Task Type Extended Thinking? Why
Simple factual questions No No benefit. "What's the capital of France" doesn't need a reasoning chain.
Drafting emails, quick copy No Speed matters more than depth. Normal mode is fine.
Multi-step analysis Yes Dependencies, trade-offs, second-order effects. This is where thinking pays off.
Strategy and planning Yes Claude catches edge cases and considers alternatives it would skip in normal mode.
Code review / debugging Yes Step-by-step reasoning through logic finds bugs that pattern matching misses.
Math and formal reasoning Yes The original use case. Dramatically better accuracy on anything requiring precise logic.
Creative writing Sometimes Useful for structure and plot. Not necessary for freeform generation.
Summarization No Standard mode handles this well. Thinking adds time without meaningful improvement.

My rule of thumb: if I'd need more than thirty seconds to think through the answer myself, I turn on Extended Thinking for Claude too.

How It Works Under the Hood

You don't need to be a developer to understand the mechanics, and knowing them helps you use the feature better.

When Extended Thinking is enabled, Claude gets a token budget for its internal reasoning. In the API, you set this explicitly. In the chat interface, it's handled automatically — Claude decides how much to think based on the complexity of the question. Anthropic calls this "adaptive thinking" on the latest models.

The thinking happens before the response. Claude works through the problem in its thinking block, then writes a final answer informed by that reasoning. On newer models, you see a summary of the thinking process rather than the raw stream of consciousness. The summary preserves the key reasoning steps without the noise.

A few things worth knowing:

Thinking takes time. A complex analysis might take 15–30 seconds of thinking before Claude starts writing. This is normal. It's the point.

The thinking is real. This isn't a UI trick. The quality difference between a response with and without thinking is measurable, especially on tasks that require multi-step reasoning, mathematical logic, or strategic analysis.

You're paying for the thinking. On the API side, thinking tokens count as output tokens. In the chat interface with a Pro or Team plan, they count toward your usage. More thinking means higher cost but better output — a trade-off worth understanding.

Extended Thinking with Tools

This is where it gets interesting for power users. When Extended Thinking is combined with tool use — Claude searching the web, reading your Google Drive, querying your Notion — something called "interleaved thinking" kicks in.

Instead of thinking once and then acting, Claude thinks between each step. It searches your Drive, thinks about what it found, decides what to search next, thinks again, and builds its analysis iteratively. Each tool result feeds back into the reasoning chain.

Mode How Claude Works Result Quality
Normal + Tools Uses tool → reads result → responds Fast but shallow. Takes tool results at face value.
Thinking + Tools Thinks → uses tool → thinks about result → uses another tool → thinks again → responds Slower but much deeper. Synthesizes across multiple sources. Catches contradictions.

In practice, this means asking Claude to "find the relevant project documents in my Drive and analyze whether our current sprint structure supports the delivery timeline" produces a genuinely different quality of answer with Extended Thinking on. Claude doesn't just retrieve documents — it reasons about what they mean in context.

Practical Tips

Don't leave it on for everything. Extended Thinking adds latency. For quick tasks — drafting a Slack message, formatting a table, simple questions — turn it off. Save it for the work that actually benefits from deeper reasoning.

Give it something to think about. A vague prompt with Extended Thinking enabled just produces a longer thinking block before a mediocre answer. The feature amplifies the quality of your prompt. A well-structured, specific question with thinking enabled produces exceptional output. A lazy question with thinking enabled produces a slow, lazy answer.

Read the thinking. The thinking block isn't just overhead — it's a window into Claude's reasoning. When the final answer seems off, expand the thinking and you'll often find exactly where it went sideways. This makes debugging your prompts much faster.

Use it for high-stakes work. Client proposals. Strategic analyses. Anything where being 80% right isn't good enough. The extra 15–30 seconds of thinking regularly catches errors, edge cases, and missed considerations that would cost you hours to fix later.

Pair it with your project context. Extended Thinking works best when Claude has rich context to reason about. A well-configured project with uploaded reference documents and clear instructions, combined with Extended Thinking, is the closest thing I've found to having a genuine senior analyst on call.

The Difference It Makes

I'll be direct about this: Extended Thinking didn't change my workflow as dramatically as Memory or Artifacts did. Those features changed what Claude is to me. Extended Thinking changed how well it performs at the things it was already doing.

But "how well" matters. A lot.

The analysis that catches a dependency I missed. The strategy document that anticipates the objection my client will raise. The code review that finds the edge case before it hits production. These aren't nice-to-haves. These are the moments where good consulting becomes great consulting.

Extended Thinking is the difference between Claude giving you the obvious answer and Claude giving you the right answer. Sometimes those are the same thing. When they're not, you'll be glad you gave it a moment to think.

Normal Claude is your colleague answering in the hallway. Extended Thinking Claude is the one who goes back to their desk, works through it properly, and comes back with something you can actually use. Know which one you need for the job in front of you.

Not Sure Where to Start?

Warp Speed Workshop

In this one-off interactive, gamified workshop, we’ll simulate real-world work scenarios at your organisation via a board game, helping you identify and eliminate bottlenecks, inefficient processes, and unhelpful feedback loops.

Close Cookie Popup
Cookie Preferences
By clicking “Accept All”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts as outlined in our privacy policy.
Strictly Necessary (Always Active)
Cookies required to enable basic website functionality.
Cookies helping us understand how this website performs, how visitors interact with the site, and whether there may be technical issues.
Cookies used to deliver advertising that is more relevant to you and your interests.
Cookies allowing the website to remember choices you make (such as your user name, language, or the region you are in).