2021/06/09
Fastly had a bit of an incident on June 8th which you might’ve seen. The outage lasted around one hour, but it meant that loads of sites that rely on their CDN were impacted.
Fastly uses a fork of Varnish 2 that they maintain internally - a general HTTP Cache. This is core to a lot of how they do business, but isn’t the only piece of software they employ. However, they do give customers access to VCL, a domain specific programming language to influence the behavior of their caching solution.
Best guess is that someone had included a configuration value that created VCL with undefined behavior which caused the shared infrastructure to crash or otherwise stop serving as expected. This is all a guess, of course, because they’re being relatively hush-hush about the exact details of the problem. (Makes sense because we don’t really need to know & it’s ~24 hours since the actual problem.)
They did provide a very great blog post and short post-mortem about the incident right away, though. For such a large company, that’s quite impressive.
Let’s just hope they don’t have to do that too often.