Show HN: Yet another memory system for LLMs

blackmanta | 165 points

How do you use this in your workflow? Please give some examples because it’s not clear to me what this is for.

retreatguru | 18 days ago

Reviewing the prompts, it looks like you are using this CAS tool as a global context data manager, primarily supporting a code use case. There are a number of extant MCP-capable code-understanding tools (Serena and others), but what I am lacking in my CLI toolchain is non-code memory. You even called this out in another thread, mentioning task management. The kind of memory I need is not scoped to a code module but to an agent session, specifically to the orchestration of many agent sessions. What we have today are techniques: a bunch of hacked-together context files for sessions (tasks.md, changes.md), for agents (roles.md), for tech (architecture.md), etc., plus the hope that our prompts guide the agent to use them. This is IMO a natural place for an abstraction over memory that can provide rigor.

I am observing in my professional (non-Claude Max) life that context is a real limiter, from both the "too much is confusing the agent" and "I'm hitting limits doing basic shit" perspectives (looking at you, Bedrock and GitHub). A tool that helps me give an agent only what it needs would be really valuable: I could do more with the tools, spend less time manually intervening, and spend less of my token budget.

threecheese | 17 days ago

>block-level deduplication (saves 30-40% on typical codebases)

How is a savings of 30-40% on a typical codebase possible with block-level deduplication? What kind of blocks are you talking about? Blocks as in the filesystem?

elpocko | 17 days ago
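For context, block-level deduplication generally means splitting files into fixed-size or content-defined blocks, hashing each block, and storing each unique block only once; files become lists of block references. Codebases with vendored dependencies, generated files, or many near-identical copies can dedupe heavily this way. A minimal sketch (not this project's implementation; block size and hashing scheme are assumptions):

```python
import hashlib

BLOCK_SIZE = 4096  # fixed-size blocks; real CAS tools often use content-defined chunking

def store(data: bytes, store_map: dict) -> list:
    """Split data into blocks, keep one copy per unique hash, return the reference list."""
    refs = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store_map.setdefault(digest, block)  # duplicate blocks are stored only once
        refs.append(digest)
    return refs

# Two files sharing a large common region (e.g. vendored or generated code)
common = b"x" * 8192
blocks = {}
store(common + b"file-a-tail", blocks)
store(common + b"file-b-tail", blocks)

raw = 2 * (8192 + 11)                          # total bytes written by the two files
stored = sum(len(b) for b in blocks.values())  # bytes actually kept after dedup
print(f"raw={raw} stored={stored}")
```

Here the shared 8 KiB region collapses into a single stored block, so `stored` ends up well under half of `raw`; the claimed 30-40% figure would depend on how much block-level redundancy a real codebase has.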

Thank you for sharing this. Sorry for a possibly noob question: how are embeddings generated? Does it use a hosted embedding model? (I was trying to understand how semantic search is implemented.)

rkunnamp | 17 days ago
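For background on the question: semantic search in general embeds the query and each document into vectors (via a local or hosted embedding model; which one this project uses is not stated) and ranks documents by vector similarity, typically cosine. A toy sketch with hand-made two-dimensional "embeddings" standing in for real model output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend these came from an embedding model
corpus = {"doc1": [0.9, 0.1], "doc2": [0.1, 0.9]}
query = [0.8, 0.2]  # query embedding, closest in direction to doc1

best = max(corpus, key=lambda d: cosine(query, corpus[d]))
print(best)  # doc1
```

A real system swaps the hand-made vectors for model output and usually an approximate nearest-neighbor index instead of a linear scan.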

I also developed yet another memory system!

https://github.com/jerpint/context-llemur

Although I developed it explicitly without search, catering it to the latest agents, which are all really good at searching and reading files. Instead, you and the LLMs shape your context to be easily searchable (folders and files). It's meant for dev workflows (i.e. a project's context, a user's context).

I made a video showing how easy it is to pull context into whatever IDE/desktop app/CLI tool you use:

https://m.youtube.com/watch?v=DgqlUpnC3uw

jerpint | 17 days ago

Wicked cool. Useful for single users. Any plans to build support for multiple users? It would be useful for an LLM project that requires per-user sandboxing.

mempko | 18 days ago

How would you use the built-in functionality to enable graph functionality? Metadata, or another document used as the link or a collection of links?

sitkack | 18 days ago
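To make the question above concrete, one generic way to layer a graph over a document store (this is an illustrative assumption, not this tool's API) is exactly the metadata option: each document carries a list of link targets, and graph operations become traversals over that metadata:

```python
# Hypothetical documents; "links" metadata holds outgoing edges by document id.
docs = {
    "doc:auth-notes": {"text": "JWT rotation policy", "links": ["doc:api-design"]},
    "doc:api-design": {"text": "REST endpoints",      "links": ["doc:auth-notes", "doc:changelog"]},
    "doc:changelog":  {"text": "v2 migration notes",  "links": []},
}

def neighbors(doc_id: str) -> list:
    """Outgoing edges, read straight from the document's metadata."""
    return docs[doc_id]["links"]

def reachable(start: str) -> set:
    """Breadth-first traversal over the link metadata: the 'graph functionality'."""
    seen, frontier = {start}, [start]
    while frontier:
        nxt = []
        for d in frontier:
            for n in neighbors(d):
                if n not in seen:
                    seen.add(n)
                    nxt.append(n)
        frontier = nxt
    return seen

print(reachable("doc:auth-notes"))
```

The alternative the commenter mentions, a separate document acting as a collection of links (an adjacency list stored as its own entry), trades per-document metadata for a single index that is easier to update atomically.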

In my RAG setup I use Qdrant w/ Redis, very successfully. I don't really see the use of "another memory system for LLMs"; perhaps I'm missing something.

huqedato | 17 days ago

I like it, and I will be perusing your code for anything that could be used in my "not yet working" variant.

A4ET8a8uTh0_v2 | 17 days ago

Cool! Any plan to support shared storage like cloud RDBs or S3?

yukukotani | 15 days ago

What about versioning of files?

marcofiocco | 18 days ago

Not trying to be a hater, but how is 100 MB/s high performance in 2025? That's about as fast as a 20-year-old HDD.

izabera | 17 days ago

I'm puzzled - where are the header files?

yard2010 | 18 days ago

>MCP server (requires Boost)

I see stuff like this, and I really have to wonder if people just write software with bloat for the sake of using a particular library.

ActorNightly | 18 days ago

Thanks, I learned a lot from this.

JSR_FDED | 18 days ago

How does this compare to Letta?

vira28 | 18 days ago

[dead]

hotelbet | 17 days ago

That sounds like a practical take on LLM memory, especially the block-level deduplication part.

Most "memory" layers I've seen for AI are either overly complex or end up ballooning storage costs over time, so a content-addressed approach makes a lot of sense.

Also curious: have you benchmarked retrieval speed against more traditional vector DB setups? That could be a big selling point for devs running local research workflows.

skyzouwdev | 17 days ago

The domain listed on the GitHub repo redirects too many times.

winterrx | 18 days ago

[dead]

bestspharma | 14 days ago

[dead]

bestspharma | 15 days ago

Hader

yawerali | 18 days ago