You are working against LLM attention. An LLM looks at a conversation and focuses on certain attention points, usually the start and the end. Your previous work falls into the out-of-attention space and gets nuked.
If you're asking how to keep everything in attention, we currently can't.
My code (that ChatGPT writes for me) is 500 to 1,000 lines. Every 5-7 versions, it starts messing things up.
I keep the working versions in a Word file, landscape A3, 3 columns (version number, comment/changelog, the_code) (yes: cheap, scalable, easy).
So, every 5-7 versions, I start a new chat. I ask ChatGPT to read the code and write a summary/description of it, and then I proceed to ask for new changes/enhancements.
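For what it's worth, the same three-column record (version number, changelog, code) could also be kept with a tiny script instead of Word; here is a minimal sketch of that idea, where the filename and field names are just illustrative assumptions, not what I actually use:

    # Sketch: append each working version (number, changelog note, code)
    # to a single log file so it survives outside the chat context.
    import json
    from datetime import datetime, timezone
    from pathlib import Path

    LOG = Path("versions.jsonl")  # assumed filename, one JSON record per line

    def save_version(version: int, changelog: str, code: str) -> None:
        """Append a working version with a timestamp and changelog note."""
        entry = {
            "version": version,
            "saved_at": datetime.now(timezone.utc).isoformat(),
            "changelog": changelog,
            "code": code,
        }
        with LOG.open("a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")

    if __name__ == "__main__":
        save_version(7, "Fixed CSV export; refactored parsing loop.", "print('hello')")

Same principle either way: the history lives outside the chat, and only the latest summary goes back in.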
Can you explain a bit more what you mean by burning down? And what do you use .md files for? Documenting the code?
One tip I have found: start new conversation windows when changing focus, so it doesn’t refer to history and make wild assumptions.
I'm using an editor called Zed, and it has an option to create a "new thread from summary." It also shows at the top of the screen how many tokens I have used out of the total available, so combining the two, I think it is best to create a new "chat" periodically with a summary.