In this one-off interactive, gamified workshop, we’ll simulate real-world work scenarios at your organisation via a board game, helping you identify and eliminate bottlenecks, inefficient processes, and unhelpful feedback loops.
Workshop Details

This is the single most useful mental model: treat Claude like a brilliant person on their first day at your company. They're sharp. They're eager. But they have zero context on how you do things.
The golden rule: if you handed your prompt to a colleague who knows nothing about your project, would they know what to do? If not, neither does Claude.
"Create an analytics dashboard" is lazy prompting. "Create an analytics dashboard with user engagement metrics, filterable by date range — go beyond basics, I want drill-downs, export functionality, responsive layout" is actually useful. The difference isn't length. It's specificity.
Underrated technique. Instead of barking "NEVER use ellipses" (which sounds like a weird corporate mandate), try: "Your response will be read aloud by a text-to-speech engine, so never use ellipses since the TTS engine won't know how to pronounce them."
Claude generalizes from context. When it understands why a rule exists, it applies the principle beyond just the letter of the instruction. This is the difference between a rule-follower and someone who actually gets it.
Every. Single. Time.
If you want Claude to produce output in a specific format, tone, or level of detail — show it. Three to five examples, wrapped in XML tags so Claude doesn't confuse them with instructions. Make them diverse enough that Claude doesn't just pattern-match on some accidental quirk of your first example.
This is called few-shot prompting and it's the most reliable lever you have. More reliable than paragraphs of instructions. More reliable than all-caps directives. Just show the model what good looks like.
If your prompt has instructions, context, examples, and variable inputs all mashed together, Claude has to figure out what's what. XML tags eliminate the ambiguity. Wrap your data in <documents>, your instructions in <instructions>, your examples in <examples>. Be consistent. Nest them when there's a natural hierarchy.
Unglamorous. Works.
One sentence in the system prompt — "You are a coding assistant specializing in Python" — changes the output meaningfully. It focuses tone, vocabulary, and the depth of domain knowledge Claude draws on. Don't overthink it, but don't skip it either.
When you're feeding Claude big documents, structure matters more than cleverness.
Put your documents at the top. Your actual question goes at the bottom. Anthropic's own tests show up to 30% better response quality when queries come after the documents, especially with complex multi-document inputs.
Tag your documents. If you're passing multiple files, wrap each one in XML tags with source metadata. Claude can then reference specific documents by name instead of guessing which blob of text you're asking about.
Ask Claude to quote before answering. For long-doc tasks, tell Claude to first pull relevant quotes from the source material, then synthesize its answer. This grounds the response in actual content instead of letting the model drift into plausible-sounding hallucination.
Good news: the latest models are more direct, more conversational, and less prone to writing you a novel when you asked for a paragraph. Claude may skip verbal summaries after tool calls and just move on. Which is usually what you want.
If you do want more visibility, just say so: "After completing a task with tool use, provide a quick summary of the work you've done."
Tell Claude what to do, not what not to do. "Your response should be composed of smoothly flowing prose paragraphs" beats "Do not use markdown" every time. Positive instructions are clearer instructions.
Subtle one: the formatting style of your prompt influences the formatting style of the response. If your prompt is full of markdown, expect markdown back. Strip it if you don't want it.
Starting with 4.6, prefilled responses on the last assistant turn are gone. The models are smart enough that you don't need to start their response for them. If you were using prefills to force JSON output, use structured outputs. If you were using them to skip preambles, just say "respond directly without preamble" in the system prompt. If you were using them to continue interrupted responses, tell Claude "your previous response was interrupted and ended with [text] — continue from where you left off."
This trips people up. "Can you suggest some changes to improve this function?" gets you suggestions. If you wanted Claude to actually make the changes, say "change this function to improve its performance."
Claude takes your words at face value. "Suggest" means suggest. "Make" means make. "Edit" means edit. Choose your verbs deliberately.
And one more thing — if your prompts from older models had aggressive all-caps language to trigger tool use ("CRITICAL: You MUST use this tool when..."), dial it way back. Opus 4.5 and 4.6 are far more responsive to the system prompt now. That energy causes overtriggering. Normal language works.
The 4.6 models are aggressive about running tool calls in parallel. Multiple file reads, concurrent searches, parallel commands. Great for speed. You can push it further ("make all independent calls in parallel") or slow it down ("execute operations sequentially") depending on your needs.
Opus 4.6 thinks a lot. At higher effort settings, it does extensive upfront exploration — reading files, exploring multiple approaches, gathering context you didn't ask for. Sometimes brilliant. Sometimes a token bonfire.
If your previous prompts encouraged thoroughness ("if in doubt, use [tool]"), strip that out. The model is naturally thorough now. Those instructions cause overtriggering.
For cases where it's still too aggressive, lower the effort parameter or tell it: "Choose an approach and commit to it. Avoid revisiting decisions unless you encounter information that directly contradicts your reasoning."
Claude 4.6 dynamically decides when and how much to think based on query complexity. This replaces the old manual thinking-budget approach. In Anthropic's internal evals, adaptive thinking consistently outperforms fixed-budget extended thinking.
You can guide it. "After receiving tool results, carefully reflect on their quality before proceeding" encourages deeper post-tool reasoning. "Extended thinking adds latency and should only be used when it will meaningfully improve answer quality" dials it back.
Pro tip: you can put example thinking patterns inside your few-shot examples. Claude generalizes the reasoning style well.
This is where 4.6 genuinely shines. The models maintain orientation across extended sessions — tracking progress, managing state, working incrementally across multiple context windows. This isn't "do one thing and stop." This is "work on a complex task for hours, save state, continue in a fresh context window."
The key insight: Claude works best when it focuses on incremental progress. A few things at a time, steady advances, tracked in structured formats.
For tasks that span multiple context windows, the pattern that works:
First window: Set up the framework. Write tests. Create setup scripts. Establish structured state files for tracking progress.
Subsequent windows: Start fresh (not compacted) and have Claude rediscover state from the filesystem. The 4.6 models are excellent at picking up where they left off from filesystem artifacts — git logs, progress notes, test results.
Always: Use git for state tracking. It gives Claude a log of what's been done and checkpoints it can restore. This is arguably the single best practice for multi-session agentic work.
And emphasize test preservation: "It is unacceptable to remove or edit tests." Claude has a tendency to make tests pass rather than make the code correct. This guardrail matters.
Without guardrails, Opus 4.6 will delete files, force-push, and post to external services if it thinks that's the right move. Proactive to a fault.
Tell it: "Take local, reversible actions freely. For destructive operations or operations visible to others, ask before proceeding." Simple, effective.
The 4.6 models spawn subagents proactively — they recognize when a task benefits from delegation and do it without being asked. Powerful, but also a bit like giving a junior engineer access to Kubernetes for the first time. They'll use it for everything.
If you're seeing excessive subagent spawning, add: "Use subagents when tasks can run in parallel or require isolated context. For simple tasks or tasks where you need to maintain context across steps, work directly."
4.5 and 4.6 love to add abstractions nobody asked for. Extra files, unnecessary error handling for scenarios that can't happen, helper utilities for one-time operations, defensive coding against threats that don't exist.
The antidote: "Only make changes that are directly requested or clearly necessary. A bug fix doesn't need surrounding code cleaned up. A simple feature doesn't need extra configurability. Don't design for hypothetical future requirements."
The models are better about this, but you can push them further: "Never speculate about code you have not opened. If the user references a specific file, read it before answering."
Forces grounded responses. Claude explores rather than confabulates.
The models have meaningfully better vision — multi-image context, UI element recognition, video analysis via frame extraction. A neat trick: give Claude a crop tool so it can zoom into relevant image regions. Anthropic's own tests show consistent uplift.
On frontend design, the models can build impressive web applications, but without guidance they default to what I'd call the "AI slop aesthetic" — Inter font, purple gradients, predictable layouts. The output equivalent of "we used AI for this and it shows."
Fight it: specify fonts that aren't generic. Commit to a cohesive color theme with dominant colors and sharp accents. Use animations. Create atmosphere with layered backgrounds. Have opinions. The model will follow them.
The big change from Sonnet 4.5 is effort levels. Sonnet 4.6 defaults to high effort, which means higher latency if you don't set it explicitly.
For most apps, medium effort is the sweet spot. For high-volume or latency-sensitive work, low effort gives you similar-or-better performance to Sonnet 4.5. For coding and agentic work, start at medium and adjust.
When to just use Opus: large-scale code migrations, deep research, extended autonomous work. The hardest, longest-horizon problems. Sonnet is optimized for speed and cost. Opus is optimized for being right.
The 4.6 models are substantially more capable, which means the prompting game has shifted from "how do I get the model to do things" to "how do I keep it focused and efficient." Less hand-holding, more guardrails. Less encouragement, more precision.
The models that needed coaxing are gone. The models that need directing are here. Prompt accordingly.
All the sources to this article are available at https://www.anthropic.com/learn
Not Sure Where to Start?
In this one-off interactive, gamified workshop, we’ll simulate real-world work scenarios at your organisation via a board game, helping you identify and eliminate bottlenecks, inefficient processes, and unhelpful feedback loops.
Workshop Details