Claude Code: Best practices for agentic coding

sqs | 614 points

The "ultrathink" thing is pretty funny:

> We recommend using the word "think" to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: "think" < "think hard" < "think harder" < "ultrathink." Each level allocates progressively more thinking budget for Claude to use.

I had a poke around and it's not a feature of the Claude model; it's specific to Claude Code. There's a "megathink" option too. It uses code that looks like this:

  let B = W.message.content.toLowerCase();
  if (
    B.includes("think harder") ||
    B.includes("think intensely") ||
    B.includes("think longer") ||
    B.includes("think really hard") ||
    B.includes("think super hard") ||
    B.includes("think very hard") ||
    B.includes("ultrathink")
  )
    return (
      l1("tengu_thinking", { tokenCount: 31999, messageId: Z, provider: G }),
      31999
    );
  if (
    B.includes("think about it") ||
    B.includes("think a lot") ||
    B.includes("think deeply") ||
    B.includes("think hard") ||
    B.includes("think more") ||
    B.includes("megathink")
  )
    return (
      l1("tengu_thinking", { tokenCount: 1e4, messageId: Z, provider: G }), 1e4
    );
Notes on how I found that here: https://simonwillison.net/2025/Apr/19/claude-code-best-pract...
simonw | 20 days ago

Surprised that "controlling cost" isn't a section in this post. Here's my attempt.

---

Once you get the hang of controlling costs, it's much cheaper. If you're exhausting the context window, I would not be surprised if you're seeing high costs.

Be aware of the "cache".
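Some rough arithmetic shows why the cache matters so much. A sketch assuming Sonnet-class rates at the time ($3/M fresh input tokens, $3.75/M cache writes, $0.30/M cache reads — check current pricing) for a 50k-token context re-sent over 10 turns:

```shell
# Compare re-sending a 50k-token context for 10 turns, uncached vs cached.
# Pricing per million tokens is an assumption (Sonnet-class rates):
#   fresh input $3.00, cache write $3.75, cache read $0.30
awk 'BEGIN {
  tokens = 50000; turns = 10
  fresh  = tokens * turns * 3.00 / 1e6                             # every turn billed as fresh input
  cached = tokens * 3.75 / 1e6 + tokens * (turns - 1) * 0.30 / 1e6 # one write, then reads
  printf "uncached: $%.2f  cached: $%.2f\n", fresh, cached
}'
# → uncached: $1.50  cached: $0.32
```

Anything that invalidates the cache (editing files mid-session, /compact, letting it idle past the TTL) pushes you back toward the uncached column.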

Tell it to read specific files (and only those!). If you don't, it'll read unnecessary files, repeatedly re-read sections of files, or even search through files.

Avoid letting it search - halt it if you have to. `find` / `rg` can produce thousands of tokens of output depending on the search.

Never edit files manually during a session (that'll bust cache). THIS INCLUDES LINT.

The cache also goes away after 5-15 minutes or so (not sure) - so avoid leaving sessions open and coming back later.

Never use /compact (that'll bust cache, if you need to, you're going back and forth too much or using too many files at once).

Don't let files get too big (it's good hygiene anyway); it keeps context window usage smaller.

Have a clear goal in mind and keep sessions to as few messages as possible.

Write / generate markdown files with needed documentation using claude.ai, save those as files in the repo, and tell it to read that file as part of a question. I'm at about $0.50-0.75 for most "tasks" I give it. I'm not a super heavy user, but it definitely helps me (it's like having a super focused smart intern that makes dumb mistakes).

If I need to feed it a ton of docs for some task, it'll cost a few dollars rather than under $1. But I really only do this to try some prototype with a library Claude doesn't know about (or has outdated knowledge of). For hobby stuff, it totally adds up.

For a company, massively worth it. Insanely cheap productivity boost (if developers are responsible / don't get lazy / don't misuse it).

jasonjmcghee | 20 days ago

So I have been using Cursor a lot more in a vibe code way lately and I have been coming across what a lot of people report: sometimes the model will rewrite perfectly working code that I didn't ask it to touch and break it.

In most cases, it is because I am asking the model to do too much at once. Which is fine, I am learning the right level of abstraction/instruction where the model is effective consistently.

But when I read these best practices, I can't help but think of the cost. The multiple CLAUDE.md files, the files of context, the urls to documentation, the planning steps, the tests. And then the iteration on the code until it passes the test, then fixing up linter errors, then running an adversarial model as a code review, then generating the PR.

It makes me want to find a way to work at Anthropic so I can learn to do all of that without spending $100 per PR. Each of the steps in that last paragraph is an expensive API call for us ISVs, and each requires experimentation to get the right level of abstraction/instruction.

I want to advocate to Anthropic for a scholarship program for devs (I'd volunteer, lol) where they give credits to Claude in exchange for public usage. This would be structured similar to creator programs for image/audio/video gen-ai companies (e.g. runway, kling, midjourney) where they bring on heavy users that also post to social media (e.g. X, TikTok, Twitch) and they get heavily discounted (or even free) usage in exchange for promoting the product.

zoogeny | 20 days ago

I've developed a new mental model of the LLM codebase automation solutions. These are effectively identical to outsourcing your product to someone like Infosys. From an information theory perspective, you need to communicate approximately the same amount of things in either case.

Tweaking claude.md files until the desired result is achieved is similar to a back and forth email chain with the contractor. The difference being that the contractor can be held accountable in our human legal system and can be made to follow their "prompt" very strictly. The LLM has its own advantages, but they seem to be a subset since the human contractor can also utilize an LLM.

Those who get a lot of uplift out of the models are almost certainly using them in a cybernetic manner wherein the model is an integral part of an expert's thinking loop regarding the program/problem. Defining a pile of policies and having the LLM apply them to a codebase automatically is a significantly less impactful use of the technology than having a skilled human developer leverage it for immediate questions and code snippets as part of their normal iterative development flow.

If you've got so much code that you need to automate eyeballs over it, you are probably in a death spiral already. The LLM doesn't care about the terrain warnings. It can't "pull up".

bob1029 | 19 days ago

So I feel like a grandpa reading this. I gave Claude Code a solid shot. Had some wins, but costs started blowing up. I switched to Gemini AI, where I only upload files I want it to work on and make sure to refactor often so modularity remains fairly high. It's an amazing experience. If this is any measure - I've been averaging about 5-6 "small features" per 10k tokens. And I totally suck at fe coding!! The other interesting aspect of doing it this way is being able to break up problems and concerns. For example, in this case I only worked on the fe without any backend and fleshed it out before starting on the backend.

flashgordon | 20 days ago

Claude Code works fairly well, but Anthropic has lost the plot on the state of market competition. OpenAI tried to buy Cursor and now Windsurf because they know they need to win market share, Gemini 2.5 Pro is better at coding than their Sonnet models, has huge context, and runs on their TPU stack, but somehow Anthropic is expecting people to pay $200 in API costs per functional PR to vibe code. Ok.

bugglebeetle | 20 days ago

The issue with many of these tips is that they require you to use claude code (or codex cli, doesn't matter) to spend way more time in it, feed it more info, generate more outputs --> pay more money to the LLM provider.

I find LLM-based tools helpful, and use them quite regularly, but not at $20+, let alone the $100+ per month that claude code would require to be used effectively.

sbszllr | 20 days ago

The most interesting part of this article for me was:

> Have multiple checkouts of your repo

I don’t know why this never occurred to me, probably because it feels wrong to have multiple checkouts, but it makes sense so that you can keep each AI instance running at full speed. While LLMs are fast, waiting for an instance of Aider or Claude Code to finish something is one of the annoying parts.

Also, I had never heard of git worktrees, that’s pretty interesting as well and seems like a good way to accomplish effectively having multiple checkouts.
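The worktree approach sketched above boils down to a few commands: each worktree is a separate working directory sharing one `.git` store, so parallel agent sessions can each sit on their own branch (the paths and branch names below are just examples):

```shell
# From inside the main checkout, add two extra working directories,
# each on its own new branch:
git worktree add ../myrepo-feature-a -b feature-a
git worktree add ../myrepo-bugfix -b bugfix

git worktree list       # shows the main checkout plus both worktrees

# When a branch is merged, clean up its worktree:
git worktree remove ../myrepo-feature-a
```

Compared to full clones, worktrees share objects and refs, so they are cheap to create and stay in sync with the repo's branches automatically.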

joshstrange | 20 days ago

What's the Gemini equivalent of Claude Code and OpenAI's Codex? I've found projects like reugn/gemini-cli, but Gemini Code Assist seems limited to VS Code?

remoquete | 20 days ago

I mostly work in neovim, but I'll open cursor to write boilerplate code. I'd love to use something cli based like Claude Code or Codex, but neither of them implement semantic indexing (vector embeddings) the way Cursor does. It should be possible to implement an MCP server which does this, but I haven't found a good one.

0x696C6961 | 20 days ago

Isn't this bad that every model company is making their own version of the IDE level tool?

Wasn't it clearly bad when Facebook would get real close to buying another company... then decide, naw, we've got developers out the ass, let's just steal the idea and put them out of business?

beefnugs | 19 days ago

I use Claude Code. I read the discussion here, and given the criticism, proceeded to try some of the other solutions that people recommended.

After spending a couple of hours trying to get aider and plandex to run (and then with Google Gemini 2.5 pro), my conclusion is that these tools have a long way to go until they are usable. The breakage is all over the place. Sure, there is promise, but today I simply can't get them to work reasonably. And my time is expensive.

Claude Code just works. I run it (even in a slightly unsupported way, in a Docker container on my mac) and it works. It does stuff.

PS: what is it with all "modern" tools asking you to "curl somewhere.com/somescript.sh | bash". Seriously? Ship it in a docker container if you can't manage your dependencies.

jwr | 20 days ago

I'm wondering how many of the techniques described in this blog post can be used in an IDE like Windsurf or Cursor with Claude Sonnet?

My 2 cents on value for money and effectiveness of Claude vs Gemini for coding:

I've been using Windsurf, VS Code and the new Firebase Studio. The Windsurf subscription allowance for $15 per month seems adequate for reasonable every day use. I find Claude Sonnet 3.7 performs better for me than Gemini 2.5 pro experimental.

I still like VS Code and its way of doing things, you can do a lot with the standard free plan.

With Firebase Studio, my take is that it should be good for building and deploying simple things that don't require much developer handholding.

fallinditch | 20 days ago

I recently wrote a big blog post on my experience spending about $200 with Claude Code to "vibecode" some major feature enhancements for my image gallery site mood.site

https://kylekukshtel.com/vibecoding-claude-code-cline-sonnet...

Would definitely recommend reading it for some insight into hands-on experience with the tool.

kkukshtel | 19 days ago

well, the best practice is to use gemini 2.5 pro instead :)

m00dy | 20 days ago

I’m too scared of the cost to use this.

andrewstuart | 20 days ago

I love Claude Code. It just gets the job done where Cursor (even with Claude Sonnet 3.7) will get lost in changing files without results.

Did anyone have equal results with the „unofficial“ fork „Anon Kode“? Or with Roo Code with Gemini Pro 2.5?

submeta | 19 days ago

>Use Claude to interact with git

Are they saying Claude needs to do the git interaction in order to work and/or will generate better code if it does?

panny | 20 days ago

What are these "subagents" this doc refers to?

ccarse | 17 days ago

> Use /clear to keep context focused

The only problem is that this loss is permanent! As far as I can tell, there's no way to go back to the old conversation after a `/clear`.

I had one session last week where Claude Code seemed to have become amazingly capable and was implementing entire new features and fixing bugs in one-shot, and then I ran `/clear` (by accident no less) and it suddenly became very dumb.

Wowfunhappy | 20 days ago

Why do people use Claude Code over e.g. Cursor or Windsurf?

imafish | 19 days ago

This is so helpful!

LADev | 20 days ago

This is a pretty desperate post imho.

babuloseo | 19 days ago

If anyone from Anthropic is reading this, your billing for Claude Code is hostile to your users.

Why doesn’t Claude Code usage count against the same plan that usage of Claude.ai and Claude Desktop are billed against?

I upgraded to the $200/month plan because I really like Claude Code but then was so annoyed to find that this upgrade didn’t even apply to my usage of Claude Code.

So now I’m not using Claude Code so much.

zomglings | 20 days ago
