GitHub Pages, our static site hosting service, has always had a very simple architecture. From launch up until around the beginning of 2015, the entire service ran on a single pair of machines (in active/standby configuration) with all user data stored across 8 DRBD backed partitions. Every 30 minutes, a cron job would run generating an nginx map file mapping hostnames to on-disk paths.
There were a few problems with this approach: new Pages sites did not appear until the map was regenerated (potentially up to a 30-minute wait!); cold nginx restarts would take a long time while nginx loaded the map off disk; and our storage capacity was limited by the number of SSDs we could fit in a single machine.
Despite these problems, this simple architecture worked remarkably well for us — even as Pages grew to serve thousands of requests per second to over half a million sites.
Hailey Somerville sul blog GitHub.


