The best way I have found is to set up keepalived -> pgbouncer -> Postgres. Use repmgr to manage replication and barman for backups. Set up a VIP with keepalived and a small script that checks whether the server is the primary. Have keepalived run the check about every 2 seconds and flip after 3 consecutive failures; you lose about 7-9 pings during a failover.
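To make the timing concrete, here is a minimal sketch of the "flip after 3 consecutive failures" logic. It is not keepalived itself: in a real setup this would be a `vrrp_script` block using keepalived's `interval` and `fall` settings, and the check would typically run something like `SELECT pg_is_in_recovery()` against the local server. The class and function names below are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailoverTracker:
    """Toy model of a consecutive-failure counter (names are hypothetical)."""

    fall: int = 3            # consecutive failed checks before the VIP moves
    interval_s: float = 2.0  # seconds between checks
    failures: int = 0        # current run of consecutive failures

    def observe(self, is_primary: bool) -> bool:
        """Record one check result; return True when the VIP should flip."""
        if is_primary:
            # Any successful check resets the run of failures.
            self.failures = 0
            return False
        self.failures += 1
        return self.failures >= self.fall


def detection_window_s(interval_s: float, fall: int) -> float:
    """Approximate time for `fall` failed checks spaced `interval_s` apart."""
    return interval_s * fall


if __name__ == "__main__":
    tracker = FailoverTracker()
    # One transient failure run is forgiven; three failures in a row are not.
    checks = [True, False, False, True, False, False, False]
    print([tracker.observe(ok) for ok in checks])
    print(detection_window_s(2.0, 3), "seconds to detect, before the VIP moves")
```

With a 2 second interval and 3 failures, detection alone takes about 6 seconds. Add the time for the VIP to move and that is roughly in line with losing 7-9 pings at one ping per second.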
Haven't used it yet, but seeing as both Yugabyte and Cockroach are being mentioned...
pgEdge: https://github.com/pgedge/pgedge Demo: https://youtu.be/Gpty7yNlwH4?t=1873
Not affiliated with them.
I recall that aspirationally pgEdge aims to be compatible with the latest pg version or one behind.
Great that nobody can track, or easily contribute to, the underlying postgres bug, because postgres has no issue tracker.
Keeps the number of reported bugs nice and low. The discussion of critical bugs that lose your data is left to HN and Twitter threads instead.
> If the PostgreSQL backend is cancelled while waiting to acknowledge replication (as a result of packet cancellation due to client timeout or backend failure) transaction changes become visible for other backends. Such changes are not yet replicated and may be lost in case of standby promotion.
This sounds like the two generals problem, which has no solution. But I may be misunderstanding.
> require mandatory telemetry collection for free version
Couldn't one simply define Kubernetes network policies to limit egress from the CockroachDB pods?
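For reference, a sketch of what such a policy could look like, built as a Python dict and dumped as JSON (which `kubectl apply -f` accepts). The policy name, namespace, and `app: cockroachdb` label are assumptions for illustration, not taken from any real deployment. It only allows egress to pods carrying the same labels in the same namespace.

```python
import json


def restrict_egress_policy(namespace: str, pod_labels: dict) -> dict:
    """Build a NetworkPolicy manifest that limits egress to same-labelled pods.

    The name, namespace, and labels are hypothetical placeholders.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "cockroachdb-restrict-egress",  # hypothetical name
            "namespace": namespace,
        },
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": pod_labels},
            # Only egress is restricted; ingress is left untouched.
            "policyTypes": ["Egress"],
            # The only allowed destinations: pods with the same labels
            # in the policy's own namespace.
            "egress": [
                {"to": [{"podSelector": {"matchLabels": pod_labels}}]},
            ],
        },
    }


if __name__ == "__main__":
    policy = restrict_egress_policy("db", {"app": "cockroachdb"})
    print(json.dumps(policy, indent=2))
```

Two caveats: this only has an effect if the cluster's network plugin enforces NetworkPolicy, and as written it also blocks DNS lookups from those pods, so a real policy would probably need an extra rule for that.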
I'm currently looking for similar info but for MySQL/MariaDB for an IoT side project ... any suggestions?
It would be awesome if you could do the same test with Stolon!
What an absolutely delightful little project and write up.
Anyone familiar with autobase.tech?
Is there any alternative to Jepsen that does not involve writing spaghetti Clojure code?
Is anyone here using YugabyteDB for high-availability Postgres?
It seems like a compelling option:
* Much closer to Postgres compatibility than CockroachDB.
* A more permissive license.
* Built-in connection manager [1], which should simplify deployment.
* Supports both high availability and geo-distribution, which is useful if scaling globally becomes necessary later.
That said, I don't see it mentioned around here often. I wonder if anyone here has tried it and can comment on it.
--
1: https://docs.yugabyte.com/preview/explore/going-beyond-sql/c...