    Jekyll Asset Caching Strategy for AWS S3 + CloudFront Deployment

    Software Engineer
    Frontend
    published 2025-08-11 10:52:50 +0700 · 3 mins read
    Deploying a static Jekyll site to AWS can be fast, but without the right caching strategy, users might see stale content or you might waste bandwidth re-downloading unchanged assets.
    This post explains a dual-cache policy that combines long-term caching for immutable assets with no caching for dynamic pages, using AWS S3 and CloudFront.

    Note: If you're using the github-pages gem, custom plugins are ignored by default, so this approach won't work (see https://github.com/jekyll/jekyll/issues/5265#issuecomment-241267253).

    1. Why Caching Matters

    When you push a new blog post or update CSS, you want the changes to be visible immediately. At the same time, static assets (like CSS, JS and images) that haven't changed shouldn't be re-fetched on every visit.

    The challenge:
    • Over-caching -> Users see old content.
    • Under-caching -> Site loads slowly and burns bandwidth.

    Our approach:
    • Immutable assets get a cache buster and can be stored for a year.
    • Dynamic content is always fetched fresh.
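
    As a rough sketch of this split, the decision comes down to a lookup on the file extension. The helper name `cache_control_for` and the constant names are hypothetical, chosen only for illustration; the extension list and header values are the ones used for the plugin and the S3 uploads later in this post.

```ruby
# Hypothetical helper: maps a built file's path to the Cache-Control value
# it should be uploaded with. The extension list is an assumption that
# mirrors the assets this post treats as immutable; adjust it to your site.
IMMUTABLE_EXTENSIONS = %w[.png .jpg .jpeg .gif .svg .css .js .json].freeze

LONG_TERM    = "public, max-age=31536000, immutable" # 31536000 s = 365 days
ALWAYS_FRESH = "no-cache, no-store, must-revalidate"

def cache_control_for(path)
  # Case-sensitive match on the final extension.
  IMMUTABLE_EXTENSIONS.include?(File.extname(path)) ? LONG_TERM : ALWAYS_FRESH
end

puts cache_control_for("assets/main.css") # long-term policy
puts cache_control_for("index.html")      # always-fresh policy
```

    Anything not on the list falls through to the always-fresh policy, so a forgotten extension costs some bandwidth rather than serving stale content.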

    2. Cache Busting in Jekyll

    To ensure browsers always download fresh versions of updated assets, we append a version query (?v=<build_time>) to asset URLs.
    Example:
    // language: html
    <link rel="stylesheet" href="/styles.css?v=1723366892">
    <script src="/main.js?v=1723366892"></script>
    <img src="/logo.png?v=1723366892">

    This way, when you rebuild your site, the version number changes, and browsers are forced to fetch the latest file instead of using a cached copy.

    Create a _plugins/cache_buster.rb file with the code below to add versioning to your CSS, JS, and image URLs automatically. It rewrites link, script, and img tags. For images, only img tags are currently supported; you can extend it to handle picture tags as well.
    // language: ruby
    require "uri"
    require "nokogiri"
    
    module Jekyll
      # Appends ?v=<build_time> to local asset URLs (img/src, script/src, link/href).
      # - Stable per build (uses site.time if available).
      # - Preserves existing query params and fragments.
      class CacheBuster
        CACHE_BUST_KEY   = "v"
        ASSET_EXTENSIONS = %w[.png .jpg .jpeg .gif .svg .css .js .json].freeze
    
        @build_bust_value = nil
    
        Jekyll::Hooks.register [:pages, :posts, :documents], :post_render do |item|
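          next unless item.output
          # Only rewrite HTML output; feeds, JSON, etc. would be mangled by an HTML parser.
          next unless item.output_ext == ".html"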
    
          # One stable value for the whole build
          @build_bust_value ||= (item.site.respond_to?(:time) ? item.site.time.to_i.to_s : Time.now.to_i.to_s)
          item.output = bust_assets(item.output)
        end
    
        class << self
          private
    
          def bust_assets(html)
            doc = Nokogiri::HTML::DocumentFragment.parse(html)
            asset_tags = [["img", "src"], ["script", "src"], ["link", "href"]]
    
            asset_tags.each do |tag, attr|
              doc.css("#{tag}[#{attr}]").each do |node|
                url = node[attr].to_s.strip
                next if url.empty? || skip_url?(url)
    
                uri = safe_parse_uri(url)
                next unless uri
    
                path = uri.path.to_s
                is_supported_asset = !path.empty? && path.end_with?(*ASSET_EXTENSIONS)
                next unless is_supported_asset
    
                params = URI.decode_www_form(uri.query.to_s)
                params.reject! { |k, _| k == CACHE_BUST_KEY }
                params << [CACHE_BUST_KEY, @build_bust_value]
                uri.query = URI.encode_www_form(params)
    
                node[attr] = uri.to_s
              end
            end
    
            doc.to_html
          end
    
          def skip_url?(raw)
            inline_data = raw.start_with?("data:")
            external_url = raw.start_with?("http://", "https://", "//")
            inline_data || external_url
          end
    
          def safe_parse_uri(raw)
            URI.parse(raw)
          rescue URI::InvalidURIError
            nil
          end
        end
      end
    end
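
    To see what the rewrite does to an individual URL, the query-handling step can be tried on its own in irb. This is a minimal sketch that uses only Ruby's standard `uri` library; the `bust` method name and the fixed version strings are illustrative and not part of the plugin.

```ruby
require "uri"

# Standalone version of the plugin's query-rewriting step.
# `bust` is a hypothetical name used only for this example.
def bust(url, version, key: "v")
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query.to_s)
  params.reject! { |k, _| k == key } # drop a stale version param, if any
  params << [key, version]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts bust("/styles.css", "1723366892")
# /styles.css?v=1723366892
puts bust("/logo.png?size=2x&v=1700000000#top", "1723366892")
# /logo.png?size=2x&v=1723366892#top
```

    The second call shows why the old `v` parameter is removed before the new one is appended: rebuilding never piles up duplicate version parameters, and unrelated parameters and the fragment are left in place.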

    3. AWS + CloudFront Deployment

    We use the AWS CLI to deploy our _site folder to S3 with two different cache policies: one for immutable, versioned assets and one for always-fresh content.

    Long-Term Caching: For versioned assets (safe to cache for up to 1 year)
    // language: bash
    aws s3 sync _site/ "s3://$S3_BUCKET" \
      --exclude "*" \
      --include "*.js" \
      --include "*.css" \
      --include "*.json" \
      --include "*.png" \
      --include "*.jpg" \
      --include "*.jpeg" \
      --include "*.gif" \
      --include "*.svg" \
      --cache-control "public, max-age=31536000, immutable" \
      --delete

    No-Caching: For HTML and other files that must always be fresh
    // language: bash
    aws s3 sync _site/ "s3://$S3_BUCKET" \
      --exclude "*.js" \
      --exclude "*.css" \
      --exclude "*.json" \
      --exclude "*.png" \
      --exclude "*.jpg" \
      --exclude "*.jpeg" \
      --exclude "*.gif" \
      --exclude "*.svg" \
      --cache-control "no-cache, no-store, must-revalidate" \
      --delete
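
    The two commands above repeat the same extension list, once as --include and once as --exclude, so they can drift apart when a new asset type is added. One way to avoid that is to generate both from a single list. The sketch below only builds the argument arrays; the `sync_command` name and the "my-bucket" placeholder are assumptions made for the example.

```ruby
require "shellwords"

# Assumption: the same extensions as ASSET_EXTENSIONS in the plugin.
ASSET_EXTENSIONS = %w[.js .css .json .png .jpg .jpeg .gif .svg].freeze

# Builds the argument list for one `aws s3 sync` pass.
#   immutable: true  -> only versioned assets, cached for a year
#   immutable: false -> everything else, never cached
def sync_command(bucket, immutable:)
  patterns = ASSET_EXTENSIONS.map { |ext| "*#{ext}" }
  filters =
    if immutable
      ["--exclude", "*"] + patterns.flat_map { |p| ["--include", p] }
    else
      patterns.flat_map { |p| ["--exclude", p] }
    end
  cache_control =
    immutable ? "public, max-age=31536000, immutable" : "no-cache, no-store, must-revalidate"

  ["aws", "s3", "sync", "_site/", "s3://#{bucket}", *filters,
   "--cache-control", cache_control, "--delete"]
end

# Print the commands for inspection ("my-bucket" is a placeholder).
puts Shellwords.join(sync_command("my-bucket", immutable: true))
puts Shellwords.join(sync_command("my-bucket", immutable: false))
```

    To actually run a pass, the array can be handed to `system(*sync_command(ENV.fetch("S3_BUCKET"), immutable: true))`, which avoids any shell quoting issues with the `*` patterns.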

    CloudFront Invalidation: After deployment, clear CDN edge caches so users immediately see changes
    // language: bash
    aws cloudfront create-invalidation \
      --distribution-id "$CLOUDFRONT_DISTRIBUTION_ID" \
      --paths "/*"
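
    After a deploy, one way to confirm that both policies took effect is to look at the Cache-Control header returned for an asset and for a page (for example with `curl -I`) and check its directives. The parser below is a small sketch for that check; `parse_cache_control` is a hypothetical helper, and the header strings are the two values used in the sync commands.

```ruby
# Hypothetical helper: splits a Cache-Control value into a hash of directives,
# e.g. "public, max-age=60" becomes {"public" => true, "max-age" => "60"}.
def parse_cache_control(value)
  value.to_s.split(",").each_with_object({}) do |part, directives|
    name, arg = part.strip.split("=", 2)
    next if name.nil? || name.empty?
    directives[name.downcase] = arg.nil? ? true : arg.delete('"')
  end
end

asset = parse_cache_control("public, max-age=31536000, immutable")
page  = parse_cache_control("no-cache, no-store, must-revalidate")

puts asset["max-age"].to_i == 365 * 24 * 60 * 60 # true: one year in seconds
puts page.key?("no-store")                       # true
```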

    4. Conclusion

    By combining build-time cache busting with S3 dual-cache policies and CloudFront invalidation, your Jekyll site remains:
    • Fast: Long-term caching for static assets.
    • Fresh: No-cache headers for dynamic files.
    • CDN-friendly: Automatic invalidation keeps content consistent worldwide.
