AEM Dispatcher Caching Flush Strategies

When caching is enabled for your website, the Dispatcher cache must be invalidated after pages are published so that end users see the most recent content. To reflect changes made by authors, configure a Replication Agent on the Author instance to push changes to the Publish instance, and a Dispatcher Flush Agent on the Publisher instance to invalidate the Dispatcher cache. Once the cache is invalidated, the next request is served from the publisher and the response is added to the cache; subsequent requests are served from the cache without contacting the publisher.

Configure Replication Agent on Author instance

From Tools, select Deployment, and click on Replication. On the Replication page, click on Agents on author. Configure the Default Agent (publish) Replication Agent by updating the URI, User, and Password in the Transport tab according to the Publisher instance configuration. After updating the values, use the Test Connection option to ensure the Replication Agent is working correctly.

Configure Dispatcher Flush Agent on Publisher instance

On the Replication page, navigate to Agents on Publish. Configure the Dispatcher Flush (flush) agent by updating the URI in the Transport tab to match your dispatcher settings and add Host: flush to the HTTP Headers in the Extended tab. After making these updates, use the Test Connection option to verify that the Dispatcher Flush is working correctly.
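To check the flush path independently of the agent, it can help to build the same kind of request by hand. The sketch below is a minimal illustration, assuming the Dispatcher listens at the hypothetical address http://dispatcher.example.com and accepts the commonly documented /dispatcher/invalidate.cache endpoint with CQ-Action and CQ-Handle headers; the function name build_flush_request is invented for this example and is not part of AEM.

```python
import urllib.request


def build_flush_request(dispatcher_url, handle, host="flush"):
    """Build (but do not send) a manual cache invalidation request.

    dispatcher_url and handle are caller-supplied; the host value mirrors
    the "Host: flush" header configured on the flush agent.
    """
    headers = {
        "Host": host,              # must match a /virtualhosts entry of the flush farm
        "CQ-Action": "Activate",   # the kind of replication event being signalled
        "CQ-Handle": handle,       # content path whose cached files should be invalidated
        "Content-Length": "0",
    }
    return urllib.request.Request(
        dispatcher_url.rstrip("/") + "/dispatcher/invalidate.cache",
        data=b"",
        headers=headers,
        method="POST",
    )


# Hypothetical dispatcher address, used only for illustration.
request = build_flush_request(
    "http://dispatcher.example.com", "/content/aem-demo/us/en/mobile"
)
```

Passing the request to urllib.request.urlopen from a machine whose IP address is allowed to invalidate would send it; an error response usually points to the host or allowed-client settings described next.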
In case the dispatcher flush fails, ensure that flush is added as a host in the dispatcher configuration, as shown below.
available_farms / publish_flush_farm.any
/virtualhosts { "flush" }
available_vhosts / aem_flush.vhost
ServerAlias flush
Additionally, allow the publish IP address to trigger cache invalidation.
cache / publish_invalidate_allowed.any
/0001 { /glob "${PUBLISH_IP}" /type "allow" }
With this configuration, any time the author publishes new changes, the end user will see the latest content. However, you can adjust the Dispatcher Cache settings based on your specific requirements to achieve optimal results.

Dispatcher Cache Configuration

The Dispatcher has two primary methods for updating the cache content when changes are made to the website: Content Updates and Auto-Invalidation. Content Updates remove the pages that have changed, and files that are directly associated with them, while Auto-Invalidation automatically flags sections of the cache that might be outdated following an update.
During a content update, when one or more pieces of AEM content are modified and activated, the Dispatcher receives an activation event. For instance, if /content/aem-demo/us/en/mobile is activated, the Dispatcher removes the /content/aem-demo/us/en/mobile.* files and the /content/aem-demo/us/en/mobile/_jcr_content folder from the cache.
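The removal rule above can be approximated on a local copy of the docroot. This is only a sketch of the path matching, not the Dispatcher's actual implementation; the helper name cached_paths_for_handle is hypothetical.

```python
import glob
import os


def cached_paths_for_handle(docroot, handle):
    """List cached entries a content update for `handle` would remove.

    Mirrors the rule described in the text: every file named <handle>.*
    plus the <handle>/_jcr_content folder, if it exists.
    """
    base = os.path.join(docroot, handle.lstrip("/"))
    # glob.escape keeps special characters in the path from acting as wildcards.
    matches = sorted(glob.glob(glob.escape(base) + ".*"))
    jcr_content = os.path.join(base, "_jcr_content")
    if os.path.isdir(jcr_content):
        matches.append(jcr_content)
    return matches
```

Child pages such as /content/aem-demo/us/en/mobile/plans.html are not in the returned list; in this model they are left on disk and handled by auto-invalidation instead.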
Auto-Invalidation invalidates parts of the cache without physically deleting any files. At every content update, the statfile (.stat) is touched, so its timestamp reflects the last content update. The /statfileslevel property controls how deep below the docroot .stat files are maintained. For example, if you set statfileslevel to 6 and a file is invalidated at level 5, every .stat file from the docroot down to level 5 is touched. If instead a file is invalidated at level 7, every .stat file from the docroot down to level 6 is touched (since /statfileslevel = "6").
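The level arithmetic is easier to follow with concrete paths. The sketch below assumes a simplified model in which the docroot is level 0 and a cached file's level is the depth of the directory that contains it; the function name touched_stat_files is hypothetical and the real Dispatcher may differ in detail.

```python
from pathlib import PurePosixPath


def touched_stat_files(docroot, cached_file, statfileslevel):
    """Return the .stat files touched when `cached_file` is invalidated.

    Stat files are touched from the docroot (level 0) down to the smaller
    of the file's directory depth and the configured statfileslevel.
    """
    directories = PurePosixPath(cached_file.lstrip("/")).parent.parts
    depth = min(len(directories), statfileslevel)

    current = PurePosixPath(docroot)
    touched = [str(current / ".stat")]          # level 0: the docroot itself
    for segment in directories[:depth]:
        current = current / segment
        touched.append(str(current / ".stat"))  # one more level down
    return touched


deep = touched_stat_files(
    "/mnt/var/www/html", "/content/aem-demo/us/en/mobile.html", 6
)
shallow = touched_stat_files(
    "/mnt/var/www/html", "/content/aem-demo/us/en/mobile.html", 2
)
```

With a level of 6 the file's own directory depth (4) is the limit, so deep ends at .../content/aem-demo/us/en/.stat; with a level of 2 the list stops at .../content/aem-demo/.stat, which means an update anywhere under aem-demo marks the whole site as potentially stale.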
available_farms / publish_farm.any
/cache {
  /docroot "/mnt/var/www/html"
  /statfileslevel "6"
  /rules {
    $include "/etc/httpd/conf.dispatcher.d/cache/publish_cache.any"
  }
  /invalidate {
    /0000 { /glob "*" /type "deny" }
    /0001 { /glob "*.html" /type "allow" }
  }
}
The /rules property controls which documents are cacheable according to the document path.
cache / publish_cache.any
/0000 { /glob "*" /type "allow" }
/0001 { /glob "*/private/*" /type "deny" }
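Rules like these are evaluated in order, and the last rule that matches a path decides the outcome. The following sketch approximates that behaviour with Python's fnmatch module, whose wildcard syntax is similar to, but not guaranteed identical to, Dispatcher globs; the names RULES and is_allowed are invented for this illustration.

```python
from fnmatch import fnmatchcase

# The two rules from publish_cache.any, in file order.
RULES = [
    ("*", "allow"),
    ("*/private/*", "deny"),
]


def is_allowed(path, rules, default="deny"):
    """Apply allow/deny globs in order; the last matching rule wins."""
    decision = default
    for pattern, rule_type in rules:
        if fnmatchcase(path, pattern):
            decision = rule_type
    return decision == "allow"


public_page = is_allowed("/content/aem-demo/us/en/mobile.html", RULES)
private_page = is_allowed("/content/aem-demo/private/account.html", RULES)
```

Here public_page is True because only the first rule matches, while private_page is False because the later deny rule overrides the earlier allow. Reordering the rules would change the result, which is why broad patterns usually come first.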
This Dispatcher configuration is solely for open and static pages. For gated pages requiring authentication, refer to this resource, and for dynamic content, check out this article.

How Dispatcher Returns Content

When an end user requests content, it's served from the Dispatcher cache if available, cacheable, and up-to-date; otherwise, the request is forwarded to the publish instance. The following diagram outlines the steps involved in serving content from a Dispatcher caching enabled site.
[Diagram: AEM Dispatcher Caching Strategies]
  • Request Cacheable: Whether content is cacheable is determined by the /rules defined in the /cache section. If the content is not cacheable, the request is forwarded to the AEM publish instance and the response is sent back to the user.
  • Content is Cached: The requested content path is combined with the /docroot specified in the /cache section to verify if the resulting path exists in the Dispatcher. For instance, when /content/aem-demo/us/en/mobile.html is requested and /docroot is set to /mnt/var/www/html, the Dispatcher looks for the file at /mnt/var/www/html/content/aem-demo/us/en/mobile.html.
  • Content is Up to Date: During content publishing, the .stat file's modification date is updated based on the statfileslevel value. When content is later requested, the modification date of the .stat file and the requested file are compared. If the .stat file's modification date is newer, the content is fetched from the publish instance. For debugging, you can check the modification date using date -r {filename}, adjusting the filename as needed based on your current directory.
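The three checks above can be sketched as a small decision function. This is a simplified model rather than the Dispatcher's real logic: it assumes the relevant .stat file is the nearest one found when walking up from the cached file's directory to the docroot, and the names serve_from and nearest_stat_file are hypothetical.

```python
import os


def nearest_stat_file(docroot, directory):
    """Walk up from `directory` to `docroot` and return the first .stat found."""
    docroot = os.path.abspath(docroot)
    current = os.path.abspath(directory)
    while True:
        candidate = os.path.join(current, ".stat")
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(current)
        if current == docroot or parent == current:
            return None
        current = parent


def serve_from(docroot, request_path, cacheable=True):
    """Return "cache" or "publish" for a request, following the three checks."""
    if not cacheable:                        # 1. request cacheable?
        return "publish"
    cached = os.path.join(docroot, request_path.lstrip("/"))
    if not os.path.isfile(cached):           # 2. content is cached?
        return "publish"
    stat = nearest_stat_file(docroot, os.path.dirname(cached))
    if stat is not None and os.path.getmtime(stat) > os.path.getmtime(cached):
        return "publish"                     # 3. .stat is newer, so the cached copy is stale
    return "cache"
```

The cacheable flag stands in for the /rules evaluation. Comparing os.path.getmtime values is the programmatic equivalent of checking both files with date -r.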
If you've followed along, you should now understand how to configure agents to push changes from author to publisher and dispatcher, as well as how the dispatcher invalidates and serves the latest content to users. Happy learning!