Show HN: HelixDB – Open-source vector-graph database for AI applications (Rust)

GeorgeCurtis | 180 points

I spent a bit of time reading up on the internals and had a question about a small design choice (I am new to DB internals, specifically as they relate to vector DBs).

I notice that in your core vector type (`HVector`), you choose to store the vector data as a `Vec<f64>`. Given what I have seen from most embedding endpoints, they return `f32`s. Is there a particular reason for picking `f64` vs `f32` here? Is the additional precision a way to avoid headaches down the line or is it something I am missing context for?
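To put rough numbers on the question, here is a minimal sketch of what is being traded off. It is not HelixDB code: `payload_bytes` and `max_roundtrip_error` are hypothetical helpers, and the 1536-dimension, one-million-vector workload is an assumed example. It compares the raw storage of `f64` and `f32` components and measures how much is lost when an `f64` value is narrowed to `f32`.

```rust
use std::mem::size_of;

/// Bytes needed for the raw components of `count` vectors with `dims`
/// dimensions each (ignores Vec headers, IDs, and index overhead).
fn payload_bytes<T>(dims: usize, count: usize) -> usize {
    dims * count * size_of::<T>()
}

/// Largest absolute error introduced by narrowing each f64 to f32 and
/// widening it back again.
fn max_roundtrip_error(values: &[f64]) -> f64 {
    values
        .iter()
        .map(|&v| (v - (v as f32) as f64).abs())
        .fold(0.0, f64::max)
}

fn main() {
    // Assumed workload, for illustration only.
    let dims = 1536;
    let count = 1_000_000;

    let as_f64 = payload_bytes::<f64>(dims, count);
    let as_f32 = payload_bytes::<f32>(dims, count);
    println!("f64 payload: {:.1} GB", as_f64 as f64 / 1e9);
    println!("f32 payload: {:.1} GB", as_f32 as f64 / 1e9);

    // Synthetic components in [-1, 1], standing in for an embedding.
    let sample: Vec<f64> = (0..dims).map(|i| (i as f64 * 0.001).sin()).collect();
    println!("max f64 -> f32 error: {:e}", max_roundtrip_error(&sample));
}
```

Under those assumptions the `f64` payload is exactly twice the `f32` one (about 12.3 GB versus 6.1 GB). If the embedding endpoint already returns `f32`, widening to `f64` does not add any information to the stored values.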

Really cool project, gonna keep reading the code.

quantike | 6 hours ago

Congrats on the launch! I'm one of the authors of that paper you cited, glad it was useful and inspiring in building this :) Let me know if we can support in any way!

rohanrao123 | 14 hours ago

I was thinking about intertwining vector and graph storage, because I have one specific use case that requires this combination. But I am not courageous or competent enough to build such a DB. So I am very excited to see this project and I am certainly going to use it. One question: what kind of hardware do you think this would require? I am asking because, from what I understand, graph database performance is directly proportional to the amount of RAM available, and vectors also need persistence and computational resources.

srameshc | 11 hours ago

Congrats! Any chance HelixDB can be run in the browser too, maybe via WASM? I'm looking for a vector DB that can be pre-populated on the server and then searched on the client, so user queries (chat) stay on-device for privacy / compliance reasons.

hbcondo714 | 17 hours ago

Congrats on the launch!

lleymrl651 | an hour ago

This is very interesting. Are there any examples of interacting with LLMs? If the queries are compiled and loaded into the database ahead of time, the pattern of asking an LLM to generate a query from a natural language request seems difficult, because current LLMs aren't going to know your query language yet, and compiling a query for each prompt would add unnecessary overhead.

tmpfs | 14 hours ago

Can I run this as an embedded DB like sqlite?

Can I sidestep the DSL? I want my LLMs to generate queries and using a new language is going to make that hard or expensive.

huevosabio | 16 hours ago

Excellent work. Very excited to test this out. What are the limits or gotchas we should be aware of, or how do you want it pushed?

What other papers did you get inspiration from?

sitkack | 9 hours ago

The fact that it's "backed by NVIDIA" and licensed under AGPL-3.0 makes me wonder about the cost(s) of using it in production.

Could you share any information on the pricing model?

rationably | 7 hours ago

What would be a typical/recommended server setup for using this for RAG? Would you typically have a separate server for the GPUs and the DB itself?

anonymousDan | 7 hours ago

Looks very interesting, but I've seen these kinds of multi-paradigm databases like Gel, Helix and Surreal and I'm not sure that any of them quite hit the graph spot.

Does Helix support much of the graph algorithm world? For things like GraphRAG.

Either way, I'd be all over it if there was a Python SDK which worked with the generated types!

youdont | 13 hours ago

Graph DB OOMing 101. Can it do Erdős/Bacon numbers?

Graph DBs have been plagued with exploding complexity of queries as doing things like allowing recursion or counting paths isn't as trivial as it may sound. Do you have benchmarks and comparisons against other engines and query languages?
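For reference, an Erdős or Bacon number is just the hop distance from one fixed node in an undirected collaboration graph, so it can be computed with a single breadth-first search. The sketch below is plain in-memory Rust with hypothetical helper names (`undirected`, `hop_distances`) and toy node IDs. It says nothing about how HelixDB or any other engine executes such a query.

```rust
use std::collections::{HashMap, VecDeque};

/// Build an undirected adjacency list from a list of edges.
fn undirected(edges: &[(u32, u32)]) -> HashMap<u32, Vec<u32>> {
    let mut adj: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in edges {
        adj.entry(a).or_default().push(b);
        adj.entry(b).or_default().push(a);
    }
    adj
}

/// Hop distance from `source` to every reachable node, via breadth-first
/// search. Each node is enqueued at most once.
fn hop_distances(adj: &HashMap<u32, Vec<u32>>, source: u32) -> HashMap<u32, u32> {
    let mut dist: HashMap<u32, u32> = HashMap::new();
    let mut queue = VecDeque::new();
    dist.insert(source, 0);
    queue.push_back(source);

    while let Some(node) = queue.pop_front() {
        let d = dist[&node];
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                // The first visit is the shortest one in an unweighted graph.
                if !dist.contains_key(&next) {
                    dist.insert(next, d + 1);
                    queue.push_back(next);
                }
            }
        }
    }
    dist
}

fn main() {
    // Toy co-authorship graph; node 0 plays the role of Erdős.
    let adj = undirected(&[(0, 1), (1, 2), (2, 3), (0, 4)]);
    let dist = hop_distances(&adj, 0);
    for node in [1, 2, 3, 4, 5] {
        match dist.get(&node) {
            Some(d) => println!("node {node}: distance {d}"),
            None => println!("node {node}: unreachable"),
        }
    }
}
```

Because every node is enqueued at most once, memory stays proportional to the number of nodes and edges. Counting or enumerating all paths is the case that blows up, since the number of paths can grow exponentially with graph size.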

dietr1ch | 13 hours ago

How does it compare with https://kuzudb.com/ ?

esafak | 17 hours ago

It sounds very intriguing indeed. However, the README makes some claims. Are there any benchmarks to support them?

> Built for performance we're currently 1000x faster than Neo4j, 100x faster than TigerGraph

Attummm | 15 hours ago

Nice "I'll have this name" when there's already the helix editor :)

carlhjerpe | 17 hours ago

How do you think about building the graph relationships? Any special approaches you use?

J_Shelby_J | 17 hours ago

Super cool!!! I'll try it this week and come back with feedback.

SchwKatze | 17 hours ago

"faster than Neo4j" How does it compare to Dgraph?

wiradikusuma | 8 hours ago

What is the max number of dimensions supported for a vector?

javierluraschi | 17 hours ago

What method/model are you using for sparse search?

elpalek | 16 hours ago

[deleted]

| 14 hours ago

How can I migrate from Neo4j to this?

raufakdemir | 14 hours ago

Can you do a compare/contrast with CozoDB?

https://github.com/cozodb/cozo

michaelsbradley | 8 hours ago

how did you get it 3 OOMs faster than neo4j?

lennertjansen | 14 hours ago

How scalable is your DB in your tests? Could it be performant on graphs with 1B/10B/100B connections?

riku_iki | 13 hours ago

Looks nice! Are you looking to compete with https://www.falkordb.com or do something a bit different?

sync | 18 hours ago

> so much easier that it’s worth a bit of a learning curve

I think you misspelled "vendor lock in"

mdaniel | 16 hours ago

why not surrealdb?

basonjourne | 15 hours ago