Note di Matteo


#storage

Healthchecks.io Now Uses Self-hosted Object Storage. The master of self-hosting now self-hosts S3-compatible object storage as well, built directly on the file system:

Versity S3 Gateway turns your local filesystem into an S3 server. An S3 PutObject operation creates a regular file on the filesystem, an S3 GetObject operation reads a regular file from the filesystem, and an S3 DeleteObject operation deletes a file from the filesystem. It does not need a separate database for metadata storage. You can use any backup tool to take backups. The upgrade procedure is: replace a single binary and restart a systemd service. It is written in Go, and is being actively developed. The one bug I found and reported was fixed in just a few days.
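The one-to-one mapping the quote describes can be sketched in a few lines. This is a toy illustration, not Versity's actual code: `FSObjectStore` and its method names are invented here, but the point stands that each S3 object is just a regular file under the bucket's directory.

```python
# Toy sketch: each S3 object is a regular file; prefixes become directories.
# FSObjectStore and its method names are illustrative, not Versity's API.
import tempfile
from pathlib import Path

class FSObjectStore:
    def __init__(self, root: str):
        self.root = Path(root)

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        path = self.root / bucket / key
        path.parent.mkdir(parents=True, exist_ok=True)  # key prefixes -> directories
        path.write_bytes(body)                          # PutObject = write a file

    def get_object(self, bucket: str, key: str) -> bytes:
        return (self.root / bucket / key).read_bytes()  # GetObject = read a file

    def delete_object(self, bucket: str, key: str) -> None:
        (self.root / bucket / key).unlink()             # DeleteObject = delete a file

with tempfile.TemporaryDirectory() as root:
    store = FSObjectStore(root)
    store.put_object("pings", "2026/04/18/body.txt", b"hello")
    data = store.get_object("pings", "2026/04/18/body.txt")
    store.delete_object("pings", "2026/04/18/body.txt")
```

Because the objects are plain files, any backup tool that understands a directory tree works unchanged, which is exactly the operational simplicity the quote highlights.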

The tradeoff is the absence of HA and replication:

With this setup, if both drives on the object storage server fail at the same time, the system could lose up to 2 hours of not yet backed-up ping request bodies. This can be improved, as usual, with the cost of extra complexity.

And it costs more:

The costs have increased: renting an additional dedicated server costs more than storing ~100GB at a managed object storage service. But the improved performance and reliability are worth it.

#453 /
18 aprile 2026
/
10:20
/ #cloud#storage

Improving storage efficiency in Magic Pocket, our immutable blob store. Dropbox explains how its storage system for user data works, how a bug led to storage volumes being used only in small part, and how those volumes were then "recompacted".

Magic Pocket is the core Dropbox storage system—a custom-built, exabyte-scale blob storage system designed for durability, availability, scale, and efficiency. It holds user content, which means it must be safe, fast, and cost-effective to scale with the company. For Dropbox, storage efficiency really matters. We measure it by looking at how much total disk space we use compared to how much user data we’re actually storing.

#445 /
14 aprile 2026
/
09:25
/ #storage

Firn is a high-performance, multi-tenant vector and full-text search engine backed by object storage (S3 / MinIO / R2 / GCS). It is designed as a credible open-source alternative to turbopuffer, proving that a professional-grade tiered storage architecture (RAM → NVMe → S3) is achievable entirely from open-source components: the cost efficiency of S3 with the speed of local RAM. Built with LanceDB and Foyer for microsecond-scale search latency on top of object storage.
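The RAM → NVMe → S3 read path is the classic tiered-cache pattern. A hedged sketch of the idea, with dicts standing in for the NVMe and S3 backends (`TieredReader` and its names are my own, not Firn's API):

```python
# Tiered read path: try RAM, then local NVMe, then fall back to object storage,
# promoting the value into the hotter tiers on the way back.
# TieredReader and the backend dicts are illustrative stand-ins.
from collections import OrderedDict

class TieredReader:
    def __init__(self, nvme: dict, s3: dict, ram_capacity: int = 2):
        self.ram = OrderedDict()      # hot tier: in-process LRU cache
        self.nvme = nvme              # warm tier: local disk cache
        self.s3 = s3                  # cold tier: object storage
        self.ram_capacity = ram_capacity

    def read(self, key: str) -> bytes:
        if key in self.ram:                      # 1. RAM hit: fastest path
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.nvme:                     # 2. NVMe hit
            value = self.nvme[key]
        else:                                    # 3. cold read from S3, fill NVMe
            value = self.s3[key]
            self.nvme[key] = value
        self.ram[key] = value                    # promote to RAM
        if len(self.ram) > self.ram_capacity:    # evict least recently used
            self.ram.popitem(last=False)
        return value

reader = TieredReader(nvme={}, s3={"seg-1": b"vectors"})
first = reader.read("seg-1")    # cold: served from S3, cached in NVMe and RAM
second = reader.read("seg-1")   # hot: served from RAM
```

The design choice being sold here is that only the cold tier (S3) is the source of truth; the two cache tiers can be lost and rebuilt, which is what makes the architecture cheap to operate.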

#443 /
13 aprile 2026
/
23:42
/ #ai#database#storage

More feedback from Hacker News on the alternatives to MinIO, now definitively abandoned in its open-source version.

A comparison of the main alternatives (see also my previous posts):

First off, I don't think there is anything wrong with MinIO closing down its open source. There are simply too many people globally who use open source without being willing to pay for it. I started testing various alternatives a few months ago, and I still believe RustFS will emerge as the winner after MinIO's exit. I evaluated Garage, SeaweedFS, Ceph, and RustFS. Here are my conclusions:

  1. RustFS and SeaweedFS are the fastest in the object storage field.

  2. The installation for Garage and SeaweedFS is more complex compared to RustFS.

  3. The RustFS console is the most convenient and user-friendly.

  4. Ceph is too difficult to use; I wouldn't dare deploy it without a deep understanding of the source code.

Although many people criticize RustFS, suggesting its CLA might be "bait," I don't think such a requirement is excessive for open source software, as it helps mitigate their own legal risks.

Furthermore, Milvus gave RustFS a very high official evaluation. Based on technical benchmarks and other aspects, I believe RustFS will ultimately win.

https://milvus.io/blog/evaluating-rustfs-as-a-viable-s3-compatible-object-storage-backend-for-milvus.md

Some self-promotion from the author of SeaweedFS:

I have worked on SeaweedFS since 2011, and full time since 2025.

SeaweedFS was started as a learning project and has evolved along the way, taking ideas from the papers on Facebook Haystack, Google Colossus, and Facebook Tectonic. With its distributed append-only storage, it naturally fits the object store model. Sorry to see MinIO go away. SeaweedFS learned a lot from it. Some S3 interface code was copied from MinIO when it was still under the Apache 2.0 License. The AWS S3 APIs are fairly complicated; I am trying to replicate as much as possible.

Some recent developments:

  • Run "weed mini -dir=xxx" and it will just work. Nothing else to set up.

  • Added Table Bucket and Iceberg Catalog.

  • Added admin UI
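The Haystack-style append-only design the author mentions boils down to one large volume file plus an in-memory index mapping each key to an (offset, size) pair. A toy sketch of that idea (`Volume` and its methods are illustrative, not SeaweedFS's actual code; `io.BytesIO` stands in for the on-disk volume file):

```python
# Haystack-style append-only volume: blobs are only ever appended to one large
# file, and an in-memory index maps key -> (offset, size) for O(1) reads.
# Volume is an illustrative sketch, not SeaweedFS's API.
import io

class Volume:
    def __init__(self):
        self.data = io.BytesIO()   # stands in for one large volume file on disk
        self.index = {}            # key -> (offset, size)

    def append(self, key: str, blob: bytes) -> None:
        offset = self.data.seek(0, io.SEEK_END)  # writes only ever go at the end
        self.data.write(blob)
        self.index[key] = (offset, len(blob))

    def read(self, key: str) -> bytes:
        offset, size = self.index[key]           # one seek + one read per blob
        self.data.seek(offset)
        return self.data.read(size)

    def delete(self, key: str) -> None:
        del self.index[key]        # space is reclaimed later by compaction

vol = Volume()
vol.append("a", b"first")
vol.append("b", b"second")
value = vol.read("b")
```

Append-only writes and single-read lookups are what make this layout a natural fit for immutable objects, and why deletes are cheap: they only touch the index, deferring space reclamation to compaction.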

#351 /
14 febbraio 2026
/
15:31
/ #storage

How to build a distributed queue in a single JSON file on object storage. Using object storage as a database is always fascinating, and turbopuffer has built a vector DB on top of it. Now they have also implemented their data-indexing queue in a single queue.json file in an S3/GCS bucket, exploiting object storage primitives (such as compare-and-set, or CAS) to handle conflicts (in reality it is a bit more complicated than that, and the article is very well written).
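The core CAS trick can be sketched independently of the article's full design: every writer reads the JSON blob, modifies it, and writes it back only if nothing changed in between, retrying on conflict. `FakeBucket` below simulates the object store's conditional-write primitive with a generation counter; all names are illustrative, and the real implementation handles far more (leases, compaction, partial failure).

```python
# Optimistic compare-and-set over a single JSON object: a write succeeds only
# if the object's generation still matches what the writer read.
# FakeBucket is an in-memory stand-in for an S3/GCS bucket with conditional writes.
import json

class FakeBucket:
    def __init__(self):
        self.blob, self.generation = json.dumps([]), 0

    def read(self):
        return json.loads(self.blob), self.generation

    def cas_write(self, items, expected_generation) -> bool:
        if self.generation != expected_generation:
            return False                       # someone else wrote first: caller retries
        self.blob, self.generation = json.dumps(items), self.generation + 1
        return True

def enqueue(bucket: FakeBucket, item) -> None:
    while True:                                # optimistic retry loop
        items, gen = bucket.read()
        items.append(item)
        if bucket.cas_write(items, gen):
            return

bucket = FakeBucket()
enqueue(bucket, {"op": "index", "doc": 1})
enqueue(bucket, {"op": "index", "doc": 2})
queue, _ = bucket.read()
```

Under contention every conflicting writer loses at most one round trip and retries against the fresh state, which is what lets many producers share one file without a coordination service.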

#348 /
13 febbraio 2026
/
14:18
/ #database#storage





Quickwit

Numbers from Mezmo's migration from Elasticsearch to Quickwit.

With Elasticsearch:

  • 2 PB of storage
  • 275 EC2 instances
  • 35 TB of RAM
  • 7770 cores

(800 MB to 2 GB of ingestion per second)

With Quickwit (which is astonishing!):

  • -80% storage
  • -40% EC2 instances
  • -98% RAM
  • -93% CPU
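Applying those percentage reductions to the Elasticsearch baseline gives the approximate post-migration footprint (my own arithmetic, not figures published by Mezmo):

```python
# Approximate Quickwit footprint derived from the baseline and the reported cuts.
storage_pb = 2 * (1 - 0.80)      # -80% of 2 PB   -> ~0.4 PB
instances  = 275 * (1 - 0.40)    # -40% of 275    -> ~165 EC2 instances
ram_tb     = 35 * (1 - 0.98)     # -98% of 35 TB  -> ~0.7 TB of RAM
cores      = 7770 * (1 - 0.93)   # -93% of 7770   -> ~544 cores
```

The RAM figure is the most striking one: a cluster that needed tens of terabytes of memory drops to well under a single terabyte.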

#159 /
17 novembre 2025
/
14:35
/ #database#storage#cloud



MinIO is increasingly becoming a predominantly commercial project:

  • MinIO removed web management features from its open-source community version, forcing users to command-line tools or paid upgrades
  • MinIO Community version was downgraded to basic object browser only with no account management, policy configuration, or administrative functions
  • The cost of MinIO’s paid version is substantial: software and support alone cost a MINIMUM of $96,000 per year, rising to $244,032 per year for 1 PB of usable capacity, according to MinIO’s website.

Now the community version is no longer on Docker Hub, and apparently the community documentation has disappeared as well.

#95 /
23 ottobre 2025
/
09:35
/ #open-source#storage