Show HN: Postgres as a VectorDB GUI
I have yet to find a better tool than the old TensorFlow projector: https://projector.tensorflow.org/
Granted, it requires you to prepare your data as TSV files first.
lmk if anyone has any thoughts... if I could go back I might not have gone with Electron
Doing dimensionality reduction locally posed a few challenges in terms of application size--the idea was that by analyzing just a few thousand randomly sampled points you can get an idea of your data through a local GUI where you interact with your data and see some correlated metadata.
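For anyone wondering what "a few thousand randomly sampled points" could look like in practice, here is a minimal sketch using reservoir sampling. It is an illustration only, not necessarily how the app does it: `reservoir_sample` and the fake `(row_id, vector)` rows are hypothetical names, and the generator just stands in for rows streamed from a database cursor. In Postgres itself, `ORDER BY random() LIMIT n` would be another option.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Return k rows chosen uniformly at random from an iterable of unknown length.

    Only k rows are held in memory at once, so the full table never has to
    be loaded (classic "Algorithm R").
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample


# Hypothetical stand-in for rows streamed from a table: (row_id, embedding).
rows = ((row_id, [0.0, 0.0, 0.0]) for row_id in range(100_000))
sample = reservoir_sample(rows, k=2000, seed=42)
print(len(sample))
```

The sampled rows can then go to whatever dimensionality reduction you run locally, which keeps the work bounded no matter how big the table is.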
Not sure if there's too much need for an individual GUI to go along with Postgres as a VectorDB, maybe people just do analysis separate from a normal "GUI"? But maybe not.
What do you think?
Have folks seen https://atlas.nomic.ai/ <-- absolutely beautiful vector visualization
README suggestions:
Put the animated gif at the top
Add subtitles to the gif explaining what you're doing.
Why use PostgreSQL instead of columnar databases that are likely to perform way better for these types of analytical workloads?
Does this use pgVector?
That is an excellent visualization!
Very interesting, thanks for sharing!
As a non-native English speaker who is not very familiar with vector databases, I find the title very ambiguous. I first read it as "Postgres as a GUI for some VectorDB". Upon closer inspection, I realized that "Postgres as a VectorDB" is meant as a single name. Maybe shorten that to something else. Just my 2 cents.
This is good, but it would also be good to mention that you're using UMAP for dimensionality reduction with the cosine metric.
https://github.com/Z-Gort/Reservoirs-Lab/blob/main/src/elect...
Dimensionality reduction from n >> 2 dimensions to 2 dimensions can be very fickle, so the hyperparameters matter. Your visualization can change significantly depending on the choice of metric.
https://umap-learn.readthedocs.io/en/latest/parameters.html
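To make the point about the metric concrete, here is a tiny sketch with made-up 2-D vectors (nothing here comes from the project, and it does not call UMAP). It shows that the nearest neighbor of a point can flip between Euclidean and cosine distance, and neighbor-graph methods like UMAP build on exactly those neighbor relations.

```python
import math


def euclidean(u, v):
    """Straight-line distance; sensitive to vector length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 - cosine similarity; depends only on the angle between vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest(query, points, dist):
    """Name of the point closest to query under the given distance."""
    return min(points, key=lambda name: dist(query, points[name]))


# Hypothetical toy data: one vector points the same way as the query but is
# far away, the other is close by but points in a different direction.
points = {
    "same_direction_far": (10.0, 0.0),
    "different_direction_near": (1.0, 1.0),
}
query = (1.0, 0.0)

nn_euclid = nearest(query, points, euclidean)
nn_cosine = nearest(query, points, cosine_distance)
print(nn_euclid, nn_cosine)
```

With these numbers the Euclidean neighbor is the nearby vector and the cosine neighbor is the far one pointing the same way, so the two metrics would wire up different neighbor graphs for the same data.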
You may want to consider projecting to more than 2 dimensions too. You may ask, how does one visualize more than two dimensions? Through a scatterplot matrix of 2 axes at a time.
https://seaborn.pydata.org/examples/scatterplot_matrix.html
These are used in PCA-type multivariate analyses to visualize latent variables in more than 2 dimensions, 2 dimensions at a time. Some clustering behavior that cannot be seen in 2 axes might be seen in higher dimensions. We used to do this in our lab to find anomalies in high dimensions.
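A rough sketch of the "2 axes at a time" idea, with a made-up 3-D embedding (the data and the `scatter_panels` helper are hypothetical): it splits a k-dimensional projection into one 2-D panel per pair of axes. In practice you would hand the panels to a plotting library, and seaborn's `pairplot` does the pairing and plotting for a DataFrame in one call.

```python
from itertools import combinations


def scatter_panels(embedding):
    """Split a k-dimensional embedding into one 2-D point list per axis pair.

    Returns a dict mapping (i, j) with i < j to a list of (x, y) tuples, i.e.
    the off-diagonal panels of a scatterplot matrix.
    """
    k = len(embedding[0])
    return {
        (i, j): [(row[i], row[j]) for row in embedding]
        for i, j in combinations(range(k), 2)
    }


# Hypothetical 3-D projection: the last point is an outlier only along axis 2.
embedding = [
    (0.0, 0.1, 0.0),
    (0.1, 0.0, 0.1),
    (0.1, 0.1, 0.0),
    (0.0, 0.0, 5.0),
]

panels = scatter_panels(embedding)
for axes, pts in sorted(panels.items()):
    print(axes, pts[-1])
```

In the (0, 1) panel the last point sits inside the cluster, while in the (0, 2) and (1, 2) panels it is clearly separated, which is the kind of anomaly a single 2-D plot can hide.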