Building blobd: single-machine object store with sub-ms reads and 15 GB/s upload

charlieirish | 47 points

> Direct I/O means no more fsync: no more complexity via background flushes and optimal scheduling of syncs. There's no kernel overhead from copying and coalescing. It essentially provides the performance, control, and simplicity of issuing raw 1:1 I/O requests.

Not true: you still need fsync with direct I/O to ensure durability across power loss. Some drives have write caches, which means acknowledged writes may still be sitting in volatile memory. So maybe the perf is wildly better because you’re sacrificing durability?
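To make the point concrete, here is a minimal Python sketch of a direct write followed by an explicit flush. It is not taken from blobd: the function name `durable_direct_write` and the 4096-byte block size are my own assumptions, and it falls back to buffered I/O where the filesystem rejects O_DIRECT.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed block size for alignment; real devices may differ


def durable_direct_write(path: str, payload: bytes) -> int:
    """Write payload with O_DIRECT where available, then fsync.

    Hypothetical helper for illustration. Returns the number of bytes
    written, which includes zero padding up to a whole block.
    """
    # O_DIRECT needs the length to be a multiple of the block size.
    blocks = max(1, -(-len(payload) // BLOCK))
    padded_len = blocks * BLOCK

    # Anonymous mmap memory is page-aligned and zero-filled, which
    # satisfies the buffer alignment requirement of O_DIRECT.
    buf = mmap.mmap(-1, padded_len)
    buf[: len(payload)] = payload

    flags = os.O_WRONLY | os.O_CREAT
    direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without it
    try:
        fd = os.open(path, flags | direct, 0o600)
    except OSError:
        # Some filesystems reject O_DIRECT; fall back to buffered I/O.
        fd = os.open(path, flags, 0o600)

    try:
        # A real implementation would loop on short writes.
        written = os.write(fd, buf)
        # O_DIRECT only bypasses the page cache. fsync is what asks the
        # kernel to flush file metadata and the drive's write cache.
        os.fsync(fd)
    finally:
        os.close(fd)
        buf.close()
    return written


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "obj.bin")
    print(durable_direct_write(target, b"hello"))
```

Skipping the `os.fsync` call is where the speedup and the durability risk would both come from, assuming the drive has a volatile write cache.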

rockwotj | 6 hours ago

That’s a lot of work creating a whole system that stores data on a raw block device. It would be nice to see this compared to… a filesystem. XFS, ZFS and btrfs are pretty popular.

amluto | 9 hours ago

S3's whole selling point is 11 9s of durability across the whole region, which is probably why it's slow to begin with.

grenran | an hour ago

Similar systems include Facebook's Haystack and its open source equivalent, SeaweedFS.

Scaevolus | 11 hours ago

> Despite serving from same-region datacenters 2 ms from the user, S3 would take 30-200 ms to respond to each request.

200ms seems fairly reasonable to me once we factor in all of the other aspects of S3. A lot of machines would have to die at Amazon for your data to become at risk.

bob1029 | 6 hours ago

Interesting project, but the lack of S3 protocol compatibility and the fact that it seems to YOLO your data mean it's not acceptable for many.

stackskipton | 11 hours ago