Show HN: Postgres as a VectorDB GUI
I have yet to find a better tool than the old TensorFlow Projector: https://projector.tensorflow.org/
Granted, it requires you to prepare your data as TSV files first.
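For anyone who hasn't done the TSV step, here's a rough sketch of it. As far as I know the projector wants one tab-separated vector per line with no header, plus an optional metadata file with one row per vector (no header row when there is only a single column). The embeddings, labels, and file names below are made up for illustration.

```python
import csv
import io

# Hypothetical data: three 4-dimensional embeddings with one label each.
embeddings = [
    [0.12, -0.53, 0.88, 0.05],
    [0.10, -0.49, 0.91, 0.00],
    [-0.75, 0.33, 0.02, 0.61],
]
labels = ["cat", "kitten", "truck"]


def to_tsv(rows):
    """Serialize rows as tab-separated text, one row per line, no header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def write_tsv(path, rows):
    """Write rows to `path` as TSV (the path is whatever you choose)."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(to_tsv(rows))


# Line i of the metadata must describe line i of the vectors.
vectors_tsv = to_tsv(embeddings)
metadata_tsv = to_tsv([[label] for label in labels])
```

Calling write_tsv("vectors.tsv", embeddings) and write_tsv("metadata.tsv", [[l] for l in labels]) would then give you the two files to load in the projector UI.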
Lmk if anyone has any thoughts... if I could go back, I might not have gone with Electron.
Doing dimensionality reduction locally posed a few challenges in terms of application size. The idea was that by analyzing just a few thousand randomly sampled points, you can get a feel for your data through a local GUI where you interact with the points and see some correlated metadata.
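To make the "few thousand randomly sampled points" idea concrete: one way to draw a uniform sample from a result set without holding every row in memory is reservoir sampling (Algorithm R). This is only a sketch of that general technique, not a claim about how the app samples. The function name and the fake rows are assumptions for illustration.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Return k rows drawn uniformly at random from an iterable of unknown length.

    Classic Algorithm R: keep the first k rows, then replace a random slot
    with probability k / (i + 1) for the row at 0-based index i.
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


# Hypothetical stream of (id, embedding) rows, e.g. from a database cursor.
rows = ((i, [float(i), float(i) * 0.5]) for i in range(100_000))
sample = reservoir_sample(rows, k=2_000, seed=0)
```

Only the 2,000 sampled rows would then go into the dimensionality reduction step, which keeps memory use bounded by k rather than by table size.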
Not sure there's much need for a dedicated GUI to go along with Postgres as a VectorDB; maybe people just do their analysis separately from a normal "GUI"? But maybe not.
What do you think?
Have folks seen https://atlas.nomic.ai/ <-- absolutely beautiful vector visualization
Why use PostgreSQL instead of columnar databases that are likely to perform way better for these types of analytical workloads?
README suggestions:
Put the animated gif at the top
Add subtitles to the gif explaining what you're doing.
Does this use pgvector?
That is an excellent visualization!
Very interesting, thanks for sharing!
As a non-native English speaker who is not very familiar with vector databases, I find the title very ambiguous. I first read it as "Postgres as a GUI for some VectorDB". Upon closer inspection, I realized that "Postgres as a VectorDB" is the full name. Maybe shorten it to something else. Just my 2 cents.
This is good, but it would also be good to mention that you're using UMAP for dimensionality reduction with the cosine metric.
https://github.com/Z-Gort/Reservoirs-Lab/blob/main/src/elect...
Dimensionality reduction from n >> 2 dimensions to 2 dimensions can be very fickle, so the hyperparameters matter. Your visualization can change significantly depending on the choice of metric.
https://umap-learn.readthedocs.io/en/latest/parameters.html
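A tiny illustration of why the metric matters, independent of UMAP itself: the same query can have a different nearest neighbor under Euclidean distance than under cosine distance, and neighbor graphs are what these reducers build on. The points and names below are made up.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine_distance(u, v):
    """1 - cosine similarity: depends only on direction, not on length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest(query, candidates, metric):
    """Name of the candidate closest to the query under the given metric."""
    return min(candidates, key=lambda name: metric(query, candidates[name]))


query = [1.0, 1.0]
candidates = {
    "same_direction_far": [10.0, 10.0],  # parallel to the query, much longer
    "nearby_off_axis": [1.0, 0.0],       # close in space, 45 degrees away
}

by_euclidean = nearest(query, candidates, euclidean)    # "nearby_off_axis"
by_cosine = nearest(query, candidates, cosine_distance)  # "same_direction_far"
```

Since the two metrics disagree about who the neighbors are, the resulting 2D layouts can disagree as well, which is why it helps to state the metric alongside the plot.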
You may want to consider projecting to more than 2 dimensions too. You may ask, how does one visualize more than two dimensions? Through a scatterplot matrix of 2 axes at a time.
https://seaborn.pydata.org/examples/scatterplot_matrix.html
These are used in PCA-type multivariate analyses to visualize latent variables in more than 2 dimensions, but 2 dimensions at a time. Some clustering behavior that cannot be seen in 2 axes might be seen in higher dimensions. We used to do this in our lab to find anomalies in high dimensions.
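Here is a toy sketch of the bookkeeping behind a scatterplot matrix, with no plotting: list every pair of axes, then check in which panels a point stands out. The five 3D points and the "stand-out" score (largest distance from the panel centroid divided by the median distance) are assumptions made up for illustration. The actual plotting would be something like the seaborn example linked above.

```python
import math
from itertools import combinations
from statistics import median

# Hypothetical 3-component projection; the last point is ordinary on
# axes 0 and 1 but extreme on axis 2.
points = [
    (0.1, 0.2, 0.0),
    (-0.2, 0.1, 0.1),
    (0.0, -0.1, -0.1),
    (0.2, 0.0, 0.2),
    (0.1, -0.2, 5.0),
]


def panels(n_dims):
    """All axis pairs (i, j) with i < j: one scatterplot panel per pair."""
    return list(combinations(range(n_dims), 2))


def stand_out_score(points, axes):
    """Largest distance from the panel centroid, relative to the median."""
    a, b = axes
    cx = sum(p[a] for p in points) / len(points)
    cy = sum(p[b] for p in points) / len(points)
    dists = [math.hypot(p[a] - cx, p[b] - cy) for p in points]
    return max(dists) / median(dists)


scores = {axes: stand_out_score(points, axes) for axes in panels(3)}
```

In this toy data the (0, 1) panel gets a low score while both panels that include axis 2 get a much higher one, which is the kind of anomaly a single 2D projection would hide.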