Using a CDN with Propshaft and Rails

How Propshaft Works (Pt. 1)
How CDNs Work (Pt. 2) (This page!)
How Propshaft Works With Importmaps (Pt. 3)

Welcome back for part two of assets, static files, browser delivery, and Propshaft! In part one we talked about the local piece of the puzzle: where to put files, how Propshaft moves them around, what fingerprinting is, and where static files ultimately end up! If you don’t feel confident around any of those topics, take a few minutes to go back and read part one, “How Propshaft Works: A Rails Asset-Pipeline (Visual) Breakdown”.

What we want to talk about today is more of the zoomed-out view, particularly in a production environment. Propshaft fingerprints our assets/ files and shoves them into public/… but then what?

Let’s talk about it! Put a snug snorkel on, we’re diving deep 🤿😎

Some Initial Context

Let’s suppose we have an application we just spun up on a cloud somewhere (more on that later). And let’s suppose we’ve bound it to the domain example.com. It’s a simple app — maybe just a static-content blog (like the one you’re reading now) or some other simple Rails MVC setup. Nothing complicated!

Regardless of where and how the app is spun up on the cloud (hold that thought!), we’re talking about assets and Propshaft in this series… so we must ask the first question: how are assets fetched from said app?

Well, let’s start with the first request; just navigating in a browser to the domain. While the browser will handle all of this for us, let’s snoop in on what that request and response actually look like:

# The request your browser makes (to `/` — root):
GET / HTTP/1.1
Accept: text/html
Host: example.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15

# The response the server sends back
HTTP/1.1 200 OK
Date: Fri, 27 Sep 2024 12:34:03 GMT
Content-Type: text/html
Connection: Keep-Alive

<html>
  <head>
    <!-- a bunch of stuff -->
    <link rel="stylesheet" href="/assets/tailwind-5cf1591d.css">
    <!-- etc. -->
    <script type="text/javascript" src="/assets/application-127bf81.js"></script>
    <!-- etc. -->
  </head>
  <body>
    <!-- all the markup -->
  </body>
</html>

And this should make sense so far. Since we’re using the Rails asset helpers in our source code (<%= stylesheet_link_tag 'tailwind' %> and <%= javascript_include_tag 'application' %>), the href/src targets that get painted into the final HTML output include the fingerprints Propshaft generated. Nice!
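As a rough illustration of where a fingerprint like that comes from, here’s a minimal Ruby sketch. It is not Propshaft’s actual implementation: the fingerprinted_name helper is hypothetical, and both the digest algorithm (SHA1) and the eight-character length are assumptions picked to resemble the filenames above.

```ruby
require "digest"

# Hypothetical helper: derive a digest from the file's contents and splice it
# into the filename. Algorithm and length are assumptions for illustration.
def fingerprinted_name(logical_name, contents, length: 8)
  digest = Digest::SHA1.hexdigest(contents)[0, length]
  ext    = File.extname(logical_name)        # ".css"
  base   = File.basename(logical_name, ext)  # "tailwind"
  "#{base}-#{digest}#{ext}"
end

v1 = fingerprinted_name("tailwind.css", "body { color: black; }")
v2 = fingerprinted_name("tailwind.css", "body { color: navy; }")

puts v1
puts v2
```

The two printed names differ because the contents differ. That property is what matters for the rest of this post: a changed file always shows up under a new URL.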

The important part here is noting that as soon as the browser gets this response and parses the <head> of the document, it’s going to immediately fire off requests to the href and src targets for those assets. By having the <link> and <script> tags in the first place, your HTML is essentially proclaiming “this page needs these other resources to display and work correctly!” So the browser follows suit and grabs those linked resources before displaying the final painted page to the user.

This means that, ultimately, right after your server handles and serves a request for / (root), the browser is going to immediately request a few more paths: example.com/assets/tailwind-5cf1591d.css and example.com/assets/application-127bf81.js. What happens then?

I know we’re being ultra step-by-step here, but follow with me for a moment; it’s important to understand the details of how the pipes connect!

The MVP Approach

Okay, so we have the context of what’s going on browser-side. Let’s go back now and talk about how/where the app is hosted. We’re calling this “the MVP approach” because we’ll suppose that this app is just spun up on Heroku. Maybe just a single web process running on a basic dyno (again, just like this site!). Nothing fancy! We’ll also assume that the domain, example.com, is correctly bound to the app and that our DNS is pointing directly at Heroku for resolution.

That is, when you hit example.com, you hit the Heroku app. Again, keep it simple!

So let’s return to our browser workflow example. The browser requests root (/), gets an HTML page with a couple of linked resources, and quickly follows up with two requests for the resources. In the style of a sequence chart, that looks like this:

[Figure: a sequence chart showing the first few requests between the browser and the server, highlighting the two resource requests that hit the server just after the first response with the HTML. Caption: “Two more just after!”]

And this is all fine and well. Remember that we have some Rack middleware in place which will serve those static-asset requests with the correct content. We don’t need to do anything up in the Rails level (“our code”) to make this work.
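To make “some Rack middleware” a little more concrete, here’s a minimal sketch of the idea in plain Ruby. In a real Rails app this job is done by ActionDispatch::Static; the TinyStatic class below is a hypothetical stand-in that skips what the real thing handles (content types, caching headers, and so on), and a temp directory plays the role of public/.

```ruby
require "tmpdir"
require "fileutils"

# A stripped-down, Rack-style stand-in for static file serving. It answers
# from the public root when a matching file exists and otherwise hands the
# request on to the app.
class TinyStatic
  def initialize(app, root)
    @app  = app
    @root = File.expand_path(root)
  end

  def call(env)
    path = File.expand_path(File.join(@root, env["PATH_INFO"]))

    # Only serve real files that live inside the public root.
    if path.start_with?(@root + File::SEPARATOR) && File.file?(path)
      [200, {}, [File.binread(path)]]
    else
      @app.call(env) # everything else reaches "our code"
    end
  end
end

# Demo setup: a throwaway public/ directory with one fingerprinted asset.
public_dir = Dir.mktmpdir
FileUtils.mkdir_p(File.join(public_dir, "assets"))
File.write(File.join(public_dir, "assets", "application-127bf81.js"), "console.log('hi')")

rails_app = ->(_env) { [200, { "content-type" => "text/html" }, ["<html>from Rails</html>"]] }
stack     = TinyStatic.new(rails_app, public_dir)

asset_response = stack.call("PATH_INFO" => "/assets/application-127bf81.js")
page_response  = stack.call("PATH_INFO" => "/")

puts asset_response[2].first
puts page_response[2].first
```

The asset request never reaches rails_app, while the request for / falls straight through to it. Either way, it’s the same process doing the work.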

We’ve got no real issue here! Everything should work as designed.

Well, no issues except efficiency and capacity 😛.

There’s a long-standing principle in web development that the only things served directly from your web-server ought to be that content which changes or needs to be rendered on a per-user basis. Since, by definition, everything else your server can serve will be the same for all users hitting the server, those things should be served by another system. Something made specifically for serving up static stuff over and over, quickly!

This principle came to be realized because web servers back in the early days didn’t have very many resources (CPU, RAM, network bandwidth). So serving requests took a lot more juice, proportionally, than it does now! As such, the early web developers realized that, if you only have a pie so large, you should spend as much of that pie as possible serving requests that only you can serve! Since many of the requests are for things which don’t need to be rendered on a per-user basis, it’s wasted energy and time having the main web server handle those requests.

It’s like hiring a professional software developer to cook you a meal. Can they do it? Probably. Is it a good use of your dollars? Absolutely not! You’re using far more of your resources than you need to, to accomplish the exact same job.

So, if we want to be efficient with our web-server’s time and energy, we should only ask it to handle requests that are actually dynamic — where parts of the response will be different depending on who’s asking.

But… what handles the assets requests, then? Here we see the rise of the CDN.

👀 Note

You may be wondering, “well sure, web servers 20 years ago had far less juice than even my basic Heroku dyno… can’t I just serve my assets directly from my Rails app now?”

And the answer is, actually, yes! You definitely can. There’s really no negative to serving your assets directly from your app these days, as long as your app is fairly low-traffic or you’re trying to keep it as absolutely-simple as possible. Even Heroku’s basic dynos have a lot more horsepower than most folks give them credit for and can easily handle hundreds of requests per second (static asset requests included).

But as an application grows and takes on more traffic and users, we naturally want to start getting a bit more efficient. The reality is that it’s never been easier to push static-content-serving off to a third-party service (more on that below), so not doing it comes close to wasting money. At the very latest, you should set up a CDN the moment your app grows too large for a basic dyno. You might even find that, once your static asset requests are handled elsewhere, you can stay on a basic dyno longer!

We set up a CDN for all of our applications, by default, when we first put them into production (since it’s virtually one click). There’s really no downside to using one.

The CDN

The premise of the Content Delivery Network is that of a third-party helper. Essentially, it’s some server outside of your Rails application that serves all of your static assets instead of your Rails application doing so directly. In fact, it’d be nice if “CDN” was actually “SCDN”, or static-content delivery network. The goal of a CDN is to provide an off-site server to serve up anything that lives in public/ on behalf of your application. Since, as we previously covered, everything in assets/ ends up in public/, that means a CDN is like a middleman for assets/* and public/*. Visually, it looks like this:

[Figure: a sequence chart showing the first few requests between the browser and the server, now with the requests for CSS and JS going to a CDN in the middle rather than the Heroku dyno. Caption: “Player three has entered…”]

Or at least, it looks like that once the CDN has the assets cached and ready to serve. The CDN does have to pull each asset from the origin server (your app dynos) at least once. But once it has a copy, it can serve that copy forever without asking for a new one, because a changed file gets a new fingerprint and therefore a new URL. (See the notes on fingerprinting in the prior post.)
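That “pull once, serve forever” behavior can be sketched as a toy pull-through cache. ToyCdn is a hypothetical class, the origin is just a lambda standing in for your app, and the second fingerprint below is made up for the demo. Real CDNs add expiry rules, eviction, and many edge locations on top of this.

```ruby
# A toy pull-through cache: the first request for a path goes to the origin,
# every later request for that path is answered from the stored copy.
class ToyCdn
  attr_reader :origin_hits

  def initialize(origin)
    @origin      = origin # anything that responds to call(path)
    @cache       = {}
    @origin_hits = 0
  end

  def get(path)
    @cache.fetch(path) do
      @origin_hits += 1
      @cache[path] = @origin.call(path)
    end
  end
end

origin = ->(path) { "contents of #{path}" }
cdn    = ToyCdn.new(origin)

# Three visitors ask for the same stylesheet.
3.times { cdn.get("/assets/tailwind-5cf1591d.css") }
hits_after_three_visitors = cdn.origin_hits

# A deploy changes the stylesheet, so its fingerprint (and path) changes.
cdn.get("/assets/tailwind-9e8d7c6b.css")
hits_after_deploy = cdn.origin_hits

puts hits_after_three_visitors
puts hits_after_deploy
```

The origin is asked once per distinct path, no matter how many visitors request it. A new fingerprint is simply a new path, so the CDN never needs to be told that the old copy went stale.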

The key nuance here is understanding what goes through the CDN (to your app) vs. what gets served by the CDN (not hitting your app). And this is where the high-resolution request breakdown we did above comes in handy.

When a user first requests the root path of your site, example.com/, they’re sending a GET request for the path /, looking for a response in HTML format. A CDN, by default, isn’t going to cache/serve any HTML response. And that’s a good thing, because as we mentioned, the base HTML document is the one thing likely to change on a per-user basis.

We can generalize this: any top-level request that returns a base HTML document (like GET /books/242 or GET /blog-post/another-day-in-the-rails-life) will pass through the CDN to your app and will never be cached (by default). On the other hand, most requests for linked resources or linked assets will be cached by default. Going back to our HTML markup, we’re talking about these things:

<html>
  <head>
    <!-- linked stylesheet (cached) -->
    <link rel="stylesheet" href="/assets/tailwind-5cf1591d.css" />
    <!-- linked script (cached) -->
    <script type="text/javascript" src="/assets/application-127bf81.js"></script>
    <!-- etc. -->
  </head>
  <body>
    <!-- linked image (cached) -->
    <img src="/foo/bar.jpg" />
  </body>
</html>

Which, again, is great for the Rails app. Even in our tiny example, that’s three requests that our app doesn’t need to serve itself… for every visitor to that page! As in, if you had a thousand people visit that route, you’d save yourself three thousand requests to your app!
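Here’s a small sketch of that split. It assumes the CDN decides by file extension, which is a common default but not the only way providers do it; the served_by_cdn? helper and the extension list are illustrative, not any provider’s real rules.

```ruby
# Assumption: the CDN caches by file extension. This list is illustrative only.
STATIC_EXTENSIONS = %w[.css .js .jpg .png .svg .woff2].freeze

def served_by_cdn?(path)
  STATIC_EXTENSIONS.include?(File.extname(path).downcase)
end

# The requests from our example page, plus one more top-level route.
requests = [
  "/",
  "/books/242",
  "/assets/tailwind-5cf1591d.css",
  "/assets/application-127bf81.js",
  "/foo/bar.jpg"
]

cached, passed_through = requests.partition { |path| served_by_cdn?(path) }

puts "Served by the CDN: #{cached.inspect}"
puts "Passed through to the app: #{passed_through.inspect}"

# Requests the app never sees if a thousand people load the example page.
saved_requests = 1_000 * cached.size
puts saved_requests
```

The two HTML routes fall through to the app and the three linked resources stay at the CDN, which is where the three-thousand figure comes from.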

There are generally two styles of CDNs these days: “Pull” and “Reverse Proxy”. Let’s talk about each!

The “Pull” CDN Approach

The Pull CDN setup is essentially exactly what we described above. The CDN sits as a third-party and pulls assets from the origin server, usually just once, then serves them indefinitely thereafter. The unique detail of a traditional Pull setup is that the CDN endpoint itself is mounted on a custom subdomain! If our primary site domain is example.com, we’d expect a Pull setup to use a subdomain like cdn.example.com or static.example.com, but which uses the same path as the primary domain. So a script at example.com/assets/application-127bf81.js would be available at cdn.example.com/assets/application-127bf81.js.

Now, what we mentioned about HTML requests passing ‘through’ the CDN above doesn’t apply here. Since a Pull setup mounts the CDN to a fully separate domain, there is no middleman layer determining if a particular request is for a static asset or not. Any request sent to example.com will reach the origin server in this setup. It looks like this:

[Figure: diagram showing a browser hitting either the Heroku dyno app or a CDN server depending on the request domain. Caption: “Depends on request domain”]

Luckily, Rails can automatically integrate with a Pull CDN setup — we just need to set the asset_host config value to our subdomain:

# config/environments/production.rb

Rails.application.configure do
  # ...
  config.asset_host = "https://cdn.example.com"
  # ...
end

Once that configuration is in place, all of the asset helper methods we wrote into our markup (stylesheet_link_tag, javascript_include_tag, etc.) will now generate URLs pointing to our CDN rather than simply relative links back to the Rails app:

<html>
  <head>
    <!-- now fully-qualified URL to subdomain: -->
    <link rel="stylesheet" href="https://cdn.example.com/assets/tailwind-5cf1591d.css" />
    <script
      type="text/javascript"
      src="https://cdn.example.com/assets/application-127bf81.js"
    ></script>
  </head>
</html>

As long as you’re using the asset helper methods in your markup, everything should just work(™️) out of the box.
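Conceptually, the helpers are doing something like the sketch below. The asset_url_for method is a hypothetical stand-in written for this post, not the Rails implementation, which also handles the fingerprint lookup and other options.

```ruby
# Hypothetical stand-in for what the asset helpers do once asset_host is set:
# prefix the already-fingerprinted path with the configured host, if any.
def asset_url_for(path, asset_host: nil)
  return path if asset_host.nil? || asset_host.empty?

  "#{asset_host.chomp('/')}#{path}"
end

with_cdn    = asset_url_for("/assets/tailwind-5cf1591d.css", asset_host: "https://cdn.example.com")
without_cdn = asset_url_for("/assets/tailwind-5cf1591d.css")

puts with_cdn
puts without_cdn
```

With a host configured, the output is a fully-qualified URL on the CDN subdomain. Without one, it stays a relative path back to the app, which is exactly the MVP setup from earlier.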

Let’s compare that to the other modern CDN style…

The “Reverse Proxy” Approach

This workflow was popularized by Cloudflare several years ago and remains their bread-and-butter to this day. This setup is the middleman premise we talked about above. We essentially give the CDN control over our primary domain (example.com) and it then chooses which requests ought to pass through to our Rails app and which requests it can serve directly. Visually, it looks like this:

[Figure: diagram showing a browser requesting various content from a domain, with the reverse proxy CDN in the middle deciding what goes through to origin and what it serves itself. Caption: “Shall you not pass?”]

Fortunately, this approach requires no changes to our Rails application. In fact, the application doesn’t even know there’s a reverse-proxy CDN in front of it! The app will simply receive nearly zero requests for static assets and be perfectly happy about it!

Where this approach does require deeper changes is at the DNS layer. Traditionally, a reverse-proxy CDN requires more access to your domain’s DNS settings and routing. In order for the CDN’s servers to transparently sit “in front” of your application, the CDN needs to ensure that all requests to your application are actually routed through itself first. Typically this means that your domain’s nameservers must be hosted through the CDN company. That can be a big deal!

Some Pros and Cons

The obvious question, then, is which approach to take for your application’s static-asset delivery needs! The Pull and the Reverse Proxy strategies each have their own pros and cons.

Pull CDN Pros

First, it’s hard to refute the simplicity of the Pull setup. It’s a separate service mounted easily on a subdomain, it pulls static resources automatically when requested, and it only requires a single change to a Rails application to set up. This strategy has also been around longer than the Reverse Proxy setup and is extremely battle-tested.

And this might sound obvious, but using a Pull setup means that requests to your app will go directly to your app. Having no middleman can save a few milliseconds.

Pull CDN Cons

First, a gotcha. While Rails’ asset helpers will automatically set the target URL host to the value we put in our config, be careful with static files that we place in public/ ourselves! When those files are referenced with hand-written paths instead of through the asset helpers, there’s no automatic injection of the CDN subdomain host.

Additionally, while not having a middleman can be beneficial in some ways, it can also come with some costs. Most Reverse Proxy CDNs have lots of other features that can be activated thanks to their positioning. One of these is typically ‘attack protection’. If you’re running a Pull CDN, make sure that you’ve got great firewall and access control systems in place.

Reverse Proxy CDN Pros

While it’s not quite as simple as the Pull CDN setup can be, we’ll still go ahead and say that the Reverse Proxy setup is pretty simple. It requires you to host your nameservers at the CDN host, but once you’ve done that, it’s essentially one click to enable. Some of the Reverse Proxy CDN providers also act as domain registrars, making the process even easier if you’re open to keeping your domains with the CDN provider.

The major side-benefit of using a Reverse Proxy CDN setup is actually everything beyond simple CDN service. Since all requests to your domain are proxied through the CDN servers, the CDN provider is able to provide your app with a lot of other goodies. Code optimizations, before-your-app redirections, analytics, and on-domain image hosting are just a few of these things. And each is super powerful. It’s hard to overstate how much of a benefit this can be if you choose an RP-CDN that offers advanced features like these.

And, on the topic of the CDN asset caching itself, most RP-CDNs offer opt-in, conditional caching of HTML content too! So if you have a few pages on your Rails app that are simple, static pages (like this whole site), you can instruct the CDN to cache the entire HTML response as if it’s a static asset. That means you can have entire pages that are served and work totally normally… but never hit your Rails back-end! ✨

Lastly, it’s worth noting that when you run an RP-CDN, you get some real security benefits. While it’s not an excuse to leave basic security and firewall tools out of your Rails app itself (don’t!), proxying all of your application’s traffic through an RP-CDN gives that CDN provider a lot of latitude to have your back. They can detect bot traffic, (D)DoS attacks, and other attempts at cracking your app before those requests even hit your application. Most RP-CDNs have immediate-response features that you can flick on in case of an attack or outage. In those moments… it’s very handy to have.

Reverse Proxy CDN Cons

Finally, we need to cover a couple of the cons of Reverse Proxy CDNs.

First, we’re doubling down on our CDN dependency. Mostly meaning that, should our CDN provider have downtime, our application is likely toast. With a Pull CDN approach, only our static assets would be broken — the HTML would still be delivered. And, in a pinch, we could deploy a version of the app with the asset_host disabled — meaning that our app would temporarily serve static assets itself. Everything would work fine.
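One way to make that escape hatch cheap is to read the CDN host from the environment instead of hard-coding it. The sketch below is only one option: the CDN_HOST variable name and the asset_host_from helper are assumptions made for this example, not something Rails or Heroku defines.

```ruby
# Hypothetical helper: read the CDN host from an environment-like hash so the
# CDN can be switched off with a config change instead of a code change.
# A missing or blank value means "no CDN, serve assets from the app".
def asset_host_from(env)
  host = env["CDN_HOST"].to_s.strip
  host.empty? ? nil : host
end

# In config/environments/production.rb this could be wired up as:
#   config.asset_host = asset_host_from(ENV)

normal_operation = asset_host_from("CDN_HOST" => "https://cdn.example.com")
cdn_outage       = asset_host_from("CDN_HOST" => "")

puts normal_operation.inspect
puts cdn_outage.inspect
```

With the variable cleared, asset_host ends up nil and the helpers go back to generating relative links, so the app serves its own assets until the CDN recovers.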

Not so with a Reverse Proxy CDN. There’s no simple way out if the provider goes down. Even if you wanted to turn off the middleman proxying… the provider is down. So it’s unlikely you could get into the dashboard to disable anything.

It’s not something that happens very often with the major RP-CDN providers, but it does happen occasionally.

And second, it’s worth mentioning that while we consider the RP-CDN approach to be simple, it can be more complicated if issues arise. Coordinating a fleet of servers to look like, act like, and certify like your app, even though they’re not your app, has challenges. In particular, many folks get tripped up with how to set up their SSL certificates between Cloudflare and their origin servers (Heroku apps) so that everything plays nicely together.

Wrap Up / What We Do

There we have it! Three unique strategies for getting our static assets served to end-users after Propshaft stamps and moves them. We can do nothing (the MVP approach) and let Rails serve all static assets natively, we can hook up a Pull CDN with a subdomain and a little config tweak, or we can get into a Reverse Proxy CDN and let the platform power run! All three strategies will get our static content served, so it’s just a matter of choosing the right one for your application in its current state.

In general, putting a CDN in front of a Rails app is almost like getting the best of both the “static website” and “dynamic website” worlds. The static stuff that doesn’t need to change per-visitor can be served at light-speed by our CDN, while the dynamic, per-user stuff gets handled on a per-user basis! The perfect balance of speed and flexibility.

So what do we do here at Judoscale? Truth be told, we’re on the Cloudflare train. Cloudflare makes it incredibly simple to spin up a new site, get the nameservers patched over, and turn on CDN proxying. Nearly one-click. We also enjoy and use many of the other features that come out of the box with a Reverse Proxy service. In fact, all of the images in this post are handled directly via Cloudflare Images! Many of our routes are redirected and massaged before they hit our Rails servers, too.

We hope this walkthrough gave you some foundational basis for how Rails and CDNs work together to deliver your static content! Now you just need to make sure that your dynos can automatically scale up to meet the traffic needs of your HTML requests! We recommend Judoscale for that 😉.