
Delivering Paperclip and Pipeline Assets through Amazon CloudFront



Content Delivery Networks (CDNs) can greatly speed up asset load times. We're always looking for ways to improve the user experience, and page load time is our number one technical benchmark. Using a CDN speeds things up in a few ways. First, assets are served from purpose-built servers with plenty of network capacity. Second, the content can be mirrored to different physical locations around the world, so that the bits don't have to bounce across the globe. Third, it frees up the application server to do its job, handling business logic and talking to the database, rather than simply serving up files. And while S3 can serve images directly, CloudFront is much better suited to the task and is often cheaper.

How CloudFront Works

First, you create a distribution, which is essentially where the files will be served from. A distribution has one or more origins; for our purposes, that's either an S3 bucket or our Rails server and its assets folder. Each origin has at least one behavior, which sets how files are pulled into the distribution. For an S3 origin, the default behavior is to cache everything. Whether the origin is an S3 bucket or a generic HTTP source like a Rails project, the caching is done lazily rather than up front. Don't think that's a bad thing! As visitors' browsers request assets, CloudFront first checks its cache; if the asset is missing or expired, it grabs it from the origin, passes it on to the visitor, and keeps it for subsequent visitors.
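The lazy caching described above can be sketched as a toy pull-through cache in plain Ruby. This is only a model of the idea, not how CloudFront is implemented; the class name, the `ttl` parameter, and the lambda standing in for the origin are all made up for illustration.

```ruby
# Toy model of a pull-through cache; not CloudFront's actual implementation.
class PullThroughCache
  Entry = Struct.new(:body, :expires_at)

  attr_reader :origin_hits

  # origin: anything responding to #call(path) and returning the file body
  # ttl:    seconds an entry stays fresh (hypothetical value)
  def initialize(origin, ttl: 60)
    @origin = origin
    @ttl = ttl
    @store = {}
    @origin_hits = 0
  end

  def fetch(path, now: Time.now)
    entry = @store[path]
    return entry.body if entry && entry.expires_at > now # fresh hit

    # Missing or expired: go back to the origin and remember the result.
    @origin_hits += 1
    body = @origin.call(path)
    @store[path] = Entry.new(body, now + @ttl)
    body
  end
end

origin = ->(path) { "contents of #{path}" }
cache  = PullThroughCache.new(origin, ttl: 60)
start  = Time.at(0)

cache.fetch('/assets/app.css', now: start)      # miss, fetched from the origin
cache.fetch('/assets/app.css', now: start + 30) # hit, served from the cache
cache.fetch('/assets/app.css', now: start + 61) # expired, fetched again
```

Only the first and third requests reach the origin, which is why the application server sees a small fraction of the asset traffic.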


Set up

Create your distribution. If you're using S3 to store Paperclip uploads, add that bucket as an origin. Paperclip uses query strings for cache busting, so it's recommended to forward those; otherwise, the default settings should work. To follow 12 Factor and for simplicity, I recommend storing the distribution's hostname in an environment variable, e.g. CLOUDFRONT_ENDPOINT. It'll be referred to in a few different places in Rails' configuration, so it's good to set up a single source of truth.
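One way to keep that single source of truth honest is a small reader for the variable. This is a sketch, not part of Rails or Paperclip; the method name `cloudfront_endpoint` is hypothetical. It treats a blank value the same as an unset one, which matters because an empty string is truthy in Ruby and would pass a plain `if ENV['CLOUDFRONT_ENDPOINT']` check.

```ruby
# Hypothetical helper: returns the CloudFront hostname, or nil when the
# variable is unset or blank. Accepts any Hash-like object for easy testing.
def cloudfront_endpoint(env = ENV)
  host = env['CLOUDFRONT_ENDPOINT'].to_s.strip
  host.empty? ? nil : host
end

cloudfront_endpoint('CLOUDFRONT_ENDPOINT' => 'xxxxxxxx.cloudfront.net')
# => "xxxxxxxx.cloudfront.net"
cloudfront_endpoint('CLOUDFRONT_ENDPOINT' => '  ')
# => nil
```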

Configuring for Paperclip uploads

Assuming you've got S3 storage for Paperclip already configured, you just need two additional settings: s3_host_alias, pointing to the distribution's hostname (e.g. xxxxxxxx.cloudfront.net), and url, overridden to Paperclip's special flag :s3_alias_url. Together they make sure Paperclip::Attachment#url returns the proper CloudFront URL.

Here's a simple snippet that can be safely appended after your existing configuration.

# config/initializers/paperclip.rb

# ...

if ENV['CLOUDFRONT_ENDPOINT']
  Paperclip::Attachment.default_options.merge!(
    s3_host_alias: ENV['CLOUDFRONT_ENDPOINT'],
    url: ':s3_alias_url'
  )
end

That's it, really! Once the distribution has finished deploying, you should be good to go. For reference, here's one of our common default Paperclip configurations; note the cache control headers.

Paperclip::Attachment.default_options.merge!(
  :storage => :s3,
  :bucket => ENV.fetch('S3_BUCKET', 'project-default-bucket'),
  :path => "/system/:class/:attachment/:id/:style/:filename",
  :s3_protocol => 'https',
  :s3_region => 'us-east-1',
  :s3_headers => {
    'Cache-Control' => 'max-age=315576000',
    'Expires' => 10.years.from_now.httpdate
  }
)
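To make the result concrete, here is a rough sketch of the shape of URL to expect once both settings are in place: protocol, host alias, the attachment path without its leading slash, and a timestamp query string for cache busting. It only mimics the output in plain Ruby and is not Paperclip's internal code; `cdn_url` and its parameters are hypothetical, as are the example path and timestamp.

```ruby
# Illustration only: mimics the shape of the final URL, not Paperclip's internals.
def cdn_url(host_alias:, path:, updated_at:, protocol: 'https')
  "#{protocol}://#{host_alias}/#{path.sub(%r{\A/}, '')}?#{updated_at.to_i}"
end

cdn_url(
  host_alias: 'xxxxxxxx.cloudfront.net',
  path: '/system/users/avatars/1/thumb/me.png',
  updated_at: Time.at(1_500_000_000)
)
# => "https://xxxxxxxx.cloudfront.net/system/users/avatars/1/thumb/me.png?1500000000"
```

The trailing timestamp is why the distribution should forward query strings: a re-uploaded file gets a new query string and therefore a fresh cache entry.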

Configuring for the Asset Pipeline

Setting up CloudFront for a generic server takes a little more effort, since CloudFront has to know a little more about it. First, create an origin with the production domain. Then, create a behavior. To cache asset pipeline requests, set the path pattern to /assets/*. The other vital part is configuring which HTTP headers to forward. This will vary depending on your needs, but for the bundle of joy that is CORS, you'll at least want to forward the Origin request header, so that the server can answer with a matching Access-Control-Allow-Origin response header; otherwise, browsers may refuse to load your assets if the domain isn't whitelisted.
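To get a feel for which requests a path pattern like /assets/* would capture, the matching can be approximated in plain Ruby. This is a sketch under the assumption that `*` matches zero or more characters and `?` matches exactly one; the helper name is hypothetical and CloudFront's own matching may differ in edge cases.

```ruby
# Hypothetical helper approximating a CloudFront path pattern:
# `*` matches zero or more characters, `?` matches exactly one.
def path_pattern_to_regexp(pattern)
  source = Regexp.escape(pattern).gsub('\*', '.*').gsub('\?', '.')
  Regexp.new("\\A#{source}\\z")
end

assets = path_pattern_to_regexp('/assets/*')

assets.match?('/assets/application-0123abcd.css') # => true
assets.match?('/assets/fonts/icons.woff')         # => true
assets.match?('/users/1')                         # => false
```

Requests that don't match fall through to the distribution's default behavior, so dynamic pages are unaffected by the asset caching rules.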

Rails Configuration

First, set the asset host to our CloudFront endpoint. This will prefix the URLs generated by the asset_path, image_path, etc. helpers with the CloudFront hostname. Rails is still serving the assets, as you can see if you navigate to them manually, but visitors only see the CloudFront URLs, and CloudFront will hit the Rails server to grab an asset if needed.

Remember the CORS headers from the behavior setup? This is where they'll come from. The max-age can be cranked up because the Asset Pipeline fingerprints the filenames, so old versions of assets simply won't be linked to and will eventually expire from CloudFront, no intervention necessary.

# config/environments/production.rb

config.action_controller.asset_host = ENV['CLOUDFRONT_ENDPOINT']

config.public_file_server.enabled = true
config.public_file_server.headers = {
  'Cache-Control'                => 'public, max-age=2592000',
  'Access-Control-Allow-Origin'  => '*',
  'Access-Control-Allow-Methods' => 'GET, HEAD',
  'Access-Control-Allow-Headers' => '*',
  'Access-Control-Max-Age'       => '1728000'
}

This works for Puma on Heroku, but if you're serving static files some other way, such as through Apache, NGINX, or Passenger Standalone, you can leave the Rails public file server disabled and configure the headers at that level.
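The reason a long max-age is safe can be shown with a small plain-Ruby sketch of fingerprinting. It is an illustration of the idea only: the helper name is hypothetical, and the digest algorithm and exact file naming used by the real Asset Pipeline are assumptions here.

```ruby
require 'digest'

# The max-age used in the configuration above is 30 days, in seconds.
THIRTY_DAYS = 30 * 24 * 60 * 60 # => 2592000

# Hypothetical helper: builds a fingerprinted, CDN-hosted asset URL.
# The digest algorithm and naming scheme are assumptions for illustration.
def fingerprinted_url(host, logical_name, contents)
  digest = Digest::SHA256.hexdigest(contents)
  ext    = File.extname(logical_name)
  base   = logical_name.chomp(ext)
  "https://#{host}/assets/#{base}-#{digest}#{ext}"
end

v1 = fingerprinted_url('xxxxxxxx.cloudfront.net', 'application.css', 'body { color: red }')
v2 = fingerprinted_url('xxxxxxxx.cloudfront.net', 'application.css', 'body { color: blue }')

v1 == v2 # => false, changed contents produce a new URL
```

Because a changed file gets a new URL, the cached copy of the old one never has to be invalidated; pages simply stop linking to it.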

Deploy

Once the CloudFront distribution has finished deploying, you ought to be ready to deploy the Rails configuration. Paperclip attachments should now return CloudFront URLs, and stylesheets, JavaScript, fonts, and images from the Asset Pipeline should have the CloudFront host as well. Some things to check for when troubleshooting:

  • The views and stylesheets are using the proper helpers.
  • The distribution is ready and not InProgress.
  • The CORS headers are correct, specifically the origin.
  • The file truly exists on your production domain.
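A few of these checks can be scripted. The sketch below inspects an asset's response headers and lists likely problems. The helper names are hypothetical, and it assumes CloudFront adds an X-Cache header mentioning "cloudfront" to its responses; treat it as a starting point rather than a complete diagnostic.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: given response headers (lowercase keys),
# returns a list of likely problems. An empty list means nothing obvious.
def asset_problems(headers)
  problems = []
  problems << 'missing Access-Control-Allow-Origin' unless headers['access-control-allow-origin']
  problems << 'missing Cache-Control' unless headers['cache-control']
  unless headers['x-cache'].to_s.downcase.include?('cloudfront')
    problems << 'not served through CloudFront'
  end
  problems
end

# Fetches the headers for a URL with a HEAD request (needs network access).
def fetch_headers(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  headers = {}
  response.each_header { |name, value| headers[name] = value }
  headers
end

# Usage: ruby check_asset.rb <asset URL on the CloudFront host>
if __FILE__ == $PROGRAM_NAME && ARGV.any?
  puts asset_problems(fetch_headers(ARGV.first))
end
```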

While it does add complexity to a server setup, CloudFront largely Just Works(tm) after it's configured. And it becomes less daunting to set up once you understand its components.