Performance improvements for an enterprise Drupal website

Recently, we worked on performance improvements for a complex, enterprise-scale Drupal website. Performance bottlenecks can appear in many different parts of such a system, and many of the possible improvements merit their own detailed discussion. In this blog post, we want to share what we learned from a more distant perspective. For some improvements, the Drupal ecosystem already offers well-working solutions in core and the contribution space. Other changes sound easy on paper, but their implementation requires much effort. That is why we need to zoom out to make good decisions on the software architecture level. Here are some of the noteworthy changes and decisions we made.

Road map, entity architecture & data quality

Improving the overall architecture in the backend is a long-running process. For many aspects, changes require migrations and careful planning to resolve dependencies and match the ongoing feature development. In this project, we faced further complications because the code base had grown organically over multiple years with different maintainers. For example, external services delivered formatted data, preprocess hooks were scattered between theme, include, and module files, templates hosted business logic, and Twig filters took on back-end responsibilities.

The most important decisions concerned the responsibilities of the different system parts. However, the overall picture of an extensive system cannot always be seen clearly, so it is good to formulate overarching ideas. Some of them were:

  • External services deliver pure data and have no control over downstream processing. Here, arriving at a strict implementation and enforcing standards is essential for future maintainability.
  • Flatten the entity architecture by reducing nesting and rogue references, and give every entity the proper type. In some cases, we introduced designated data entities, separated from the respective editorial entities, to simplify the structure.
  • Keep the integration layer between the frontend and the backend thin. In this system, it is primarily an interfacing template file with a one-to-one mapping.
  • Move logic from the preprocessing to the build process to get a better grip on caching and translation. We introduced multiple new formatters to achieve this (see the sketch after this list).
  • Expand the test coverage, especially for the integration. We introduced multiple new test traits to allow worry-free code movement in both the frontend and the backend.
  • Improve the developer experience by making the code more manageable. Changes included introducing more specific modules, improving class and method naming, strengthening services and interfaces, and typing strictly.
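
To make the formatter idea above concrete, here is a minimal sketch of such a field formatter. All names are invented for illustration; the point is that the output is assembled in viewElements(), so cacheability and the negotiated language flow through the render pipeline instead of a preprocess hook.

```php
<?php

namespace Drupal\example_display\Plugin\Field\FieldFormatter;

use Drupal\Core\Field\FieldItemListInterface;
use Drupal\Core\Field\FormatterBase;

/**
 * Hypothetical formatter that renders a highlighted text line.
 *
 * @FieldFormatter(
 *   id = "example_highlight",
 *   label = @Translation("Highlight"),
 *   field_types = {"string"}
 * )
 */
class HighlightFormatter extends FormatterBase {

  /**
   * {@inheritdoc}
   */
  public function viewElements(FieldItemListInterface $items, $langcode) {
    $elements = [];
    foreach ($items as $delta => $item) {
      // Build the output here instead of in a preprocess hook, so the
      // render pipeline handles caching and the negotiated $langcode.
      $elements[$delta] = [
        '#type' => 'inline_template',
        '#template' => '<mark>{{ value }}</mark>',
        '#context' => ['value' => $item->value],
      ];
    }
    return $elements;
  }

}
```
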
Language-country negotiation

Some time ago, we switched from a content delivery network that only supplies images, videos, and static assets to one that caches the complete origin server response. Surprisingly, this switch went relatively smoothly thanks to the powerful modules in the Drupal contribution space, which provide vendor integrations and take care of the purge mechanisms. Still, we could not fully leverage the content delivery network because some existing back-end behavior required adjustments. The implementation of the language and country negotiation especially caused difficulties. An earlier blog post of ours, listed under further reading, shares some insights about path processing and language negotiation.

A key point was establishing a bijective mapping between the content delivery network and the origin server: each URL should point to a unique set of code executions, and delivering varying responses for identical requests has to be avoided. At first glance, this sounds easy to achieve, but in many regards, we had to change our way of thinking. Each developer must be aware of the extra caching layer when constructing responses. For example, the backend cannot carelessly use user-specific cookies or location-based logic.
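
As an illustration of how such a negotiation can hook into Drupal, here is a heavily simplified sketch of a custom language negotiation plugin. The plugin ID, the language-country prefix format, and all other names are assumptions, not our production implementation.

```php
<?php

namespace Drupal\example_language\Plugin\LanguageNegotiation;

use Drupal\language\LanguageNegotiationMethodBase;
use Symfony\Component\HttpFoundation\Request;

/**
 * Hypothetical plugin negotiating from a language-country URL prefix.
 *
 * It reads prefixes such as /en-us/ or /de-ch/, so every URL maps to
 * exactly one language-country combination.
 *
 * @LanguageNegotiation(
 *   id = "language-country-url",
 *   name = @Translation("Language-country URL prefix"),
 *   weight = -10
 * )
 */
class LanguageNegotiationCountryUrl extends LanguageNegotiationMethodBase {

  /**
   * {@inheritdoc}
   */
  public function getLangcode(Request $request = NULL) {
    if (!$request || !$this->languageManager) {
      return NULL;
    }
    // Expect paths such as /en-us/about: the first segment holds the
    // language and the market.
    $segments = explode('/', ltrim($request->getPathInfo(), '/'));
    if (preg_match('/^([a-z]{2})-([a-z]{2})$/', $segments[0] ?? '', $matches)) {
      $langcode = $matches[1];
      if (isset($this->languageManager->getLanguages()[$langcode])) {
        return $langcode;
      }
    }
    return NULL;
  }

}
```

Because the negotiated language is derived purely from the URL, identical requests always produce identical responses, which is exactly what the CDN needs.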

Infrastructure changes

The primary focus for the infrastructure changes was to improve the interplay between the content delivery network and the origin server. Especially redirecting users properly before they reach the origin server can significantly improve the browsing experience. Instead of making a costly roundtrip to the origin server to calculate the request's destination, the CDN can predict the redirect location within milliseconds. In our setup, the origin server still retains sovereignty over the response since the CDN acts on heuristic logic with limited information. As described above, orchestrating the language-country negotiation as part of that is a delicate topic. That is why it is one of the most tested features of the website.

Next, we improved the HTTP cache control response headers. Here, we refer to the HTTP Cache Control module and to the overview by Lagoon, listed under further reading, on configuring them properly.

Another adjustment we made was for the Redis in-memory cache. We increased the capacity and changed the max-memory purge policy to allkeys-lfu, which means the least frequently used items are purged from the cache when the memory is full. This makes sense because the most utilized cache items, such as bootstrap, discovery, and config, should always persist. The Redis status report gives an informative overview to evaluate how different cache items behave.
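
For orientation, the Drupal side of such a Redis setup is only a few lines in settings.php. This is a minimal sketch assuming the contributed Redis module with PhpRedis and a host named "redis"; the purge policy itself is set on the Redis server.

```php
<?php

// settings.php — minimal sketch of wiring the contributed Redis module.
// The purge policy lives in the Redis server configuration, e.g.
// "maxmemory-policy allkeys-lfu" in redis.conf, so the least frequently
// used items are evicted first when memory is full.
$settings['redis.connection']['interface'] = 'PhpRedis';
$settings['redis.connection']['host'] = 'redis';

// Route Drupal's default cache backend to Redis; heavily used bins such
// as bootstrap, discovery, and config then benefit from the LFU policy.
$settings['cache']['default'] = 'cache.backend.redis';
```
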

Reduce payload

When optimizing for performance, one consideration should be the volume of data sent to the user. Accessing a website with a large DOM through a mobile data provider can impact the user experience. For this website, four notable steps reduced the payload significantly:

  • A quick win is to use Minify HTML, which strips white space and comments from the HTML response.
  • We analyzed the image processing to ensure all images are delivered with the correct source sets for responsiveness and the WebP image format. There is still room for improvement, such as using the AVIF format and optimizing the processing pipeline when creating image derivatives. The blog article by Peter Pónya, listed under further reading, gives a good introduction.
  • Ensuring that only the used subsets of fonts are attached can reduce the payload significantly.
  • We refactored the data structure delivered through HTML data attributes, as sketched below. Usually, this is not a concern, but especially on listing pages, the DOM can quickly bloat.
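
As a rough sketch of that data-attribute refactoring (the theme name, keys, and attribute format are invented), the idea is to expose one compact JSON payload per teaser instead of repeating many verbose attributes.

```php
<?php

/**
 * Implements hook_preprocess_node() for a hypothetical theme "mytheme".
 *
 * A sketch of trimming listing pages: expose one compact JSON payload
 * with only the keys the frontend actually consumes, instead of many
 * verbose data attributes on every teaser.
 */
function mytheme_preprocess_node(array &$variables) {
  if ($variables['view_mode'] !== 'teaser') {
    return;
  }
  $node = $variables['node'];
  $variables['attributes']['data-teaser'] = json_encode([
    'id' => (int) $node->id(),
    'bundle' => $node->bundle(),
  ]);
}
```
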
Service optimizations

Next, we had to put a magnifying glass on our custom implementations. For this project, visualization via flame graphs emerged as a solid tool to detect performance bottlenecks. When using DDEV with Xdebug, one can use speedscope to render traces as flame graphs. In the following, we describe some exemplary improvements.

An anti-pattern we learned about is overburdening constructors. Setting property values in the constructor to ensure availability throughout the whole class can be very tempting for a developer. However, when doing this, one also needs to consider the instantiation of the respective class. In Drupal, services may be instantiated as a dependency of another service, and plugins may be instantiated by their respective plugin manager – sometimes even on discovery. Offloading expensive executions to dedicated methods is, therefore, essential.

For example, we had a service that was instantiated as a dependency on many editorial pages and made requests to an external API in its constructor. Such errors can be tricky to resolve because performance may vary between subsequent requests and by each user's location. A flame graph analysis helped to reveal the mistake.
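
A minimal sketch of the remedy, with invented service and endpoint names: the external call moves out of the constructor into a lazily evaluated getter, so merely instantiating the service stays cheap.

```php
<?php

namespace Drupal\example_api;

use GuzzleHttp\ClientInterface;

/**
 * Hypothetical wrapper around an external API.
 */
class ExampleApiClient {

  /**
   * Lazily fetched settings; NULL until first accessed.
   */
  private ?array $settings = NULL;

  public function __construct(
    private readonly ClientInterface $httpClient,
  ) {
    // Keep the constructor cheap: this service is instantiated as a
    // dependency on many pages, so no network calls may happen here.
  }

  /**
   * Fetches the remote settings on first use and reuses them afterwards.
   */
  public function getSettings(): array {
    if ($this->settings === NULL) {
      $response = $this->httpClient->request('GET', 'https://api.example.com/settings');
      $this->settings = json_decode((string) $response->getBody(), TRUE) ?? [];
    }
    return $this->settings;
  }

}
```
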

As always, caching is our friend. Strategically utilizing static variables, class properties as static caches, or cache backends, such as the memory cache backend, can significantly reduce redundant computations. For example, we improved the static caching of the language fallback chains in a custom alias manager. This change was not apparent when looking at the implementation, but it became clear after seeing the propagation in the flame graph.

Sometimes, custom queries can be worth it. Although it is advisable to utilize entity queries and the methods on the respective entity storages as much as possible, fully loading entities can be overkill. For example, we have a heavily utilized service that determines the structure of paragraphs on a given node, calculated via the positions and bundles of the paragraphs. After introducing custom queries and proper caching, the execution time of this service went down to non-measurable.
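
To sketch the custom-query idea (the field and table names are assumptions for a "field_paragraphs" reference field), one targeted query plus a per-request static cache replaces fully loading every paragraph entity.

```php
<?php

namespace Drupal\example_structure;

use Drupal\Core\Database\Connection;

/**
 * Hypothetical service resolving paragraph structures without full loads.
 */
class ParagraphStructureResolver {

  /**
   * Per-request static cache, keyed by node ID.
   */
  private array $structures = [];

  public function __construct(private readonly Connection $database) {}

  /**
   * Returns delta => paragraph ID for a node's "field_paragraphs" field.
   */
  public function getStructure(int $nid): array {
    if (!isset($this->structures[$nid])) {
      // One targeted query instead of fully loading every paragraph;
      // joining the paragraph base table would also expose the bundles.
      $this->structures[$nid] = $this->database
        ->select('node__field_paragraphs', 'p')
        ->fields('p', ['delta', 'field_paragraphs_target_id'])
        ->condition('p.entity_id', $nid)
        ->orderBy('p.delta')
        ->execute()
        ->fetchAllKeyed();
    }
    return $this->structures[$nid];
  }

}
```
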

Another caveat has to be made about seemingly innocent methods of core services. For example, matching a path with the URL matcher service or getting a route collection for a request from the route provider can be very expensive on a large website. As these are just one-liners in the codebase, it is necessary to raise awareness about these cases among the developers.

Drupal bootstrap

Commonly, the frontend fetches some user information asynchronously after the initial response is delivered. We realized that the straightforward approach of providing REST endpoints from the Drupal CMS is not always the fastest. Inherently, the Drupal bootstrap is relatively expensive and, in some cases, not necessary – for example, when fetching data from third-party systems or lazily building content. Granted, this approach is not always feasible, but where possible, one can easily shave off 500 ms of response time and significantly reduce the server load for frequently hit endpoints. Some considerations for the server configuration are necessary. In our case, we execute the plain PHP files on the nginx server, pass the cache on the content delivery network, and deliver a cache-control response header set to private.
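
A simplified sketch of such an endpoint, with assumed paths and upstream URL: a plain PHP file that nginx routes directly to PHP-FPM answers without bootstrapping Drupal and marks itself private for the CDN.

```php
<?php

// endpoint.php — a standalone script served without a Drupal bootstrap.
// It relays a small piece of data from a hypothetical third-party system;
// the URL and payload shape are assumptions.
header('Content-Type: application/json');
// Let the CDN pass this response through instead of caching it.
header('Cache-Control: private, no-store');

$data = @file_get_contents('https://third-party.example.com/api/user-info');
if ($data === FALSE) {
  http_response_code(502);
  $data = json_encode(['error' => 'upstream unavailable']);
}

echo $data;
```
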

Frontend integration & cache variations

With different teams working on the backend and the frontend, some friction in the integration is almost unavoidable, as both inherently have diverging motivations when implementing new features. Especially for this website, we faced the issue that, on the one hand, we have complex reusable components on the frontend; on the other hand, we want to ensure high cacheability in a multi-lingual and multi-market setup. Naturally, with the challenging integration, we made some compromises, and some bad practices emerged, such as:

  • Using extensive nesting of paragraphs and placing blocks in blocks.
  • Introducing coupling of components, especially handing over parent-to-child properties.
  • Managing cache directives in the preprocessing (illustrated below).
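
To illustrate the remedy for the last bullet: cache directives belong in the render arrays produced during the build phase, where the renderer can bubble them, rather than in preprocess functions. A minimal block sketch with invented names:

```php
<?php

namespace Drupal\example_blocks\Plugin\Block;

use Drupal\Core\Block\BlockBase;

/**
 * Hypothetical block demonstrating cache directives in the build phase.
 *
 * @Block(
 *   id = "example_cache_aware",
 *   admin_label = @Translation("Cache-aware example")
 * )
 */
class CacheAwareBlock extends BlockBase {

  /**
   * {@inheritdoc}
   */
  public function build() {
    return [
      '#markup' => $this->t('Hello from a cache-aware component.'),
      '#cache' => [
        // Declare the variations here so they can bubble up to the page,
        // instead of computing variants in preprocess hooks.
        'contexts' => ['languages:language_interface', 'url.path'],
        'tags' => ['node_list'],
        'max-age' => 3600,
      ],
    ];
  }

}
```
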

Employing the patch to render blocks later, so they can be placed individually in a region template, helped us handle the integration better. It will likely take a long time until this change can land in Drupal core because it has backward-compatibility issues. However, we face no problems with the patch, and it provides more flexibility.

This change also enabled us to get better control of the caching. Blocks are built lazily, and when placed in regions, the caching information can bubble correctly through the different components. Drupal core has a great way to debug cacheable metadata while rendering; see the blog post by Matt Glaman, listed under further reading, for more details. Also, the Renderviz module allows one to visualize how the renderer sees the different components of the website. These tools make it easier to emit cacheable metadata in the right places and eliminate redundant or faulty caching information.
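
Related to this, core's lazy builder placeholders are one way to keep highly dynamic fragments out of the cached page while everything else bubbles normally. A minimal sketch with invented names:

```php
<?php

namespace Drupal\example_blocks;

use Drupal\Core\Security\TrustedCallbackInterface;

/**
 * Hypothetical lazy builder for a personalized page fragment.
 */
class GreetingBuilder implements TrustedCallbackInterface {

  /**
   * Attaches the placeholder; the surrounding page stays cacheable.
   */
  public static function attach(array &$build): void {
    $build['greeting'] = [
      '#lazy_builder' => [static::class . '::build', []],
      '#create_placeholder' => TRUE,
    ];
  }

  /**
   * Builds the deferred fragment; only this part varies per user.
   */
  public static function build(): array {
    return [
      '#markup' => t('Welcome back!'),
      '#cache' => ['contexts' => ['user']],
    ];
  }

  /**
   * {@inheritdoc}
   */
  public static function trustedCallbacks(): array {
    return ['build'];
  }

}
```
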

Cache warming

The longest response times of a Drupal website are usually caused by hitting cold caches. For the discussed website, this became relevant because we were frequently deploying new features and maintenance fixes. Although the content delivery network takes over some heavy lifting by persisting static files, computationally heavy page builds and the delivery of new JavaScript were still necessary in most cases. That is why we decided to introduce cache warming. Again, the Drupal contribution space already provides a solid framework for accomplishing this: the Warmer module and its sub-modules made it easy for us to configure entity cache warming and CDN warming, which utilizes the sitemap. Incorporating respective jobs into the deployment pipelines is also straightforward since one can use Drush commands.
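
Conceptually, the sitemap-based CDN warming boils down to requesting every listed URL after a deployment. The Warmer module does this in a robust, queue-based way; the bare-bones, standalone sketch below (assuming Guzzle and an example sitemap URL) only shows the idea.

```php
<?php

use GuzzleHttp\Client;

// Standalone sketch: fetch the sitemap and request every listed URL once,
// so the CDN and Drupal page caches are populated before real users hit
// cold caches. Assumes Composer's autoloader is available.
require_once __DIR__ . '/vendor/autoload.php';

$client = new Client(['timeout' => 10]);
$body = (string) $client->get('https://www.example.com/sitemap.xml')->getBody();
$sitemap = simplexml_load_string($body);
foreach ($sitemap->url as $entry) {
  $client->get((string) $entry->loc);
}
```
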

Dependency updates

Navigating dependency updates between feature development can be challenging. Granted, we probably should have maintained the dependencies better for this website; however, during the previous relaunch project, it would have added another layer of complexity. We always try to stay close to the original implementation and utilize the provided plugin systems for extensions, but sometimes incompatible upstream changes are unavoidable. Then, at the end of 2023, we deliberately reserved time to resolve blockers and move to Drupal core version 10. Although a little painful, the reward was a website that was faster in benchmarks but, more importantly, also felt faster. Of course, the first is what justifies investing in performance. However, the second is what ultimately makes the client happy.

When upgrading PHP and Drupal, one gets many improvements almost for free. According to benchmarks, PHP 8.3 in combination with Drupal is 1.5x faster than PHP 8.1 on the language level! Further, many recent Drupal improvements aim at performance – even in the minor versions.

Back to our project: we had to shake the dependency tree and improve our Renovate Bot configuration, which makes keeping up with updates much more manageable. However, technical tools cannot solve all problems alone. Developers and project management also renewed their commitment to stay close to the bleeding edge.

PHPStan

Static analysis may not be a performance improvement in itself, but the introduction of PHPStan was still necessary to make the code base more maneuverable. With its support, one can refactor with confidence. The initial investment is high; however, with fast-moving code, the payback comes sooner rather than later. Also, the developer conversations elevate from unimportant technicalities to meaningful decisions. The code becomes more maintainable with early detection of deprecations and enforced best practices.

Final thoughts and further reading

Also, there have been many efforts to make Drupal core performance testable. The performance test framework Gander is now part of Drupal core. It is open beyond its use in core and offers a performance test base class alongside a performance test trait. Gander integrates with OpenTelemetry and Grafana for monitoring, and everything can be set up conveniently with a DDEV add-on for local development. Excellent results have already been achieved with the StandardPerformanceTest in Drupal core, which gets a grip on all database queries in a standard installation. The test has already led to multiple follow-up improvements and is a safeguard against potential performance degradations when introducing new features.

Ultimately, we must recognize that the client invested heavily in their website. They invested in revitalizing their digital ecosystem. They invested to gain a powerful tool to present themselves in a way suitable for a company of that stature. Together, we created a system that allows editors to express their creativity rather than fight technical difficulties. And now, with the performance improved, the creation process will be even more enjoyable.

With more people using the website and having a great experience, the likelihood of this website surviving for longer increases. Its longevity is not only attributed to the work of our team but to the whole community, as everyone is working together to deliver more exciting systems.

Further reading:

  • A deep dive into language negotiation and path processing
  • WebP – The Right Way and The Wrong Way in Drupal by Peter Pónya
  • Optimal Cache Header Setup by Lagoon
  • Debugging your render cacheable metadata in Drupal by Matt Glaman
  • Gander – The Open Source Automated Performance Testing Framework by tag¹ consulting
