Author: Derek Laventure


Introduction

In a recent project for a federal government client, we needed to upgrade (rebuild, improve, and migrate content) an internal application from Drupal 7 to Drupal 10. One of the challenges we faced was to build reports that display a large set of data fields pulled from a complex data model.

We found that Views alone was insufficient to meet the performance and maintenance requirements for these reports. This led us to develop Computed Token Field as a way to streamline the Views configuration required to produce them.

The Computed Token Field module has provided us with several key benefits:

  • Performance boost: reducing the number of relationships in Views leads to faster, more efficient database queries.
  • Easier maintenance: a simpler Views configuration makes the system easier to manage and extend.
  • Scalability: the application handles a large volume of data without added complexity or degraded performance.

Using Drupal’s Token system, the Computed Token Field module dynamically populates fields based on configurable tokens. This lets site builders implement and adjust computed data fields through the admin interface, while leveraging Drupal’s caching to keep the computed values consistent.

Data model complexity

This client’s application manages the complex review processes for all publications the department produces. The system holds data about the publications being produced, rather than the publications themselves. This metadata nonetheless represents a large volume of information collected while reviewing publications.

Each stage of the workflow process is handled by a different user or role, and securing access to the data entered in other stages was critical. To address this, we developed a hierarchical data model using ECK entities to create boundaries around sets of data fields that needed to be treated distinctly in terms of access control and workflow state.

The hierarchical data structure had several levels of content entities:

  • Node types (Series): Top-level “publication” nodes, representing 4 different types or “series” of publications.
  • Publication Parts: Content entities linked from the node by Entity Reference fields, containing the data collected at different workflow stages.
  • Sub-Parts: Further bundles within the Publication Part ECK entity type, but linked from the main Part rather than the node.

Each of these was a distinct content entity, linked by Entity Reference fields. Certain fields on these entities were shared across all Publication Series, others were shared but appeared on different Parts, and others were unique to a particular Series.
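
To make the shape of this hierarchy concrete, here is a minimal sketch of walking the reference chain from a publication node down to a Sub-Part value. The field names are borrowed from the token example later in this post; the node ID and everything else are illustrative.

```php
<?php
// Illustrative only: walking the Entity Reference chain from a publication
// node down to a Sub-Part value. Field names (field_part2, field_part2c,
// field_actual_release_date) are taken from the token example below.
$node = \Drupal\node\Entity\Node::load(123);

// Node -> Publication Part (an ECK entity referenced from the node).
$part = $node->get('field_part2')->entity;

// Publication Part -> Sub-Part (referenced from the Part, not from the node).
$sub_part = $part ? $part->get('field_part2c')->entity : NULL;

// Sub-Part -> the value we actually want to report on.
$release_date = $sub_part ? $sub_part->get('field_actual_release_date')->value : NULL;
```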

Challenge: Views relationships for Reporting

From the perspective of access control, this structure served us very well. However, another key function of this application was to provide comprehensive reporting and data export features, driven by the venerable Views.

The reporting Views needed to display a large set of fields across any of the publication Series, and provide a comprehensive set of filters to narrow down the results by certain criteria. Using standard Views configuration, we would need to add a Relationship for each of a large set of entity data tables. This introduces a great deal of complexity into the SQL query, as well as the configuration of the View itself.

The immediate concern was performance. Even with a relatively moderate data set of tens of thousands of publication records, the query produced by introducing so many relationships was already becoming cumbersome.

Equally important, maintaining such a complex View configuration was a significant concern, primarily because this client needed to be able to improve and tweak this reporting aspect of the site on an ongoing basis.

We were able to reduce some of the complexity by sharing the Publication Part reference fields wherever possible, thus reducing the overall number of relationships required. This simplification helped a lot, especially when there was overlap in fields.

However, where the same field (e.g. ISBN) existed on a different Part for each Series, we’d need a relationship for each one, add the field/filter once per relationship, and then collapse them to appear as a single field in the report. This was viable in principle, but untenable in practice.

Idea: Caching field data on the node

With the idea of limiting the number of Relationships in our View configuration, we took a step back to consider alternate approaches to this problem. One idea was to use Computed Field to “mirror” some of these tricky fields on the Node entity. If we treat these as a kind of “local cache” of key field values, they become easier to add directly as Views fields and filters.

This technique turned out to work quite nicely. We recognized the risk inherent whenever data is duplicated in a system: keeping the copies in sync. However, our initial prototyping indicated that Computed Field’s feature set would ensure the computed field was updated appropriately any time the “source” field changed.

However, there was one challenge with using Computed Field as-is: there is no UI beyond “write a PHP function”. Not only would it be tedious to write a function for even a handful of fields, but that bar was too high to expect our client to maintain and extend. We needed to give staff a way to configure the “compute function” through Drupal’s admin interface.

This is where the Token module came into play. We realized that our customized Computed Fields could take a configurable token associated with the field, pulling the value from the source field and copying it up to a field on the node. That made the value trivial to add to a View without an extra relationship. All we had to write was a single function that renders the token value provided by the field configuration and uses the result to populate the field.
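
Conceptually, that function is little more than a call to core’s Token service. A minimal sketch of the idea (the function name and the way the configured token string is passed in are hypothetical):

```php
<?php
// Hypothetical sketch: render the token configured on the field against the
// host entity, and use the result as the computed value.
function example_compute_token_value(\Drupal\Core\Entity\ContentEntityInterface $entity, string $token_value) {
  // E.g. $token_value = '[formal_publication:field_part2:entity:...:value]'.
  // The data key must match the token type used in the configured token.
  return \Drupal::token()->replace(
    $token_value,
    [$entity->getEntityTypeId() => $entity],
    ['clear' => TRUE]
  );
}
```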

For example, we had a Computed Token Field on the “Formal Publication” series node type, configured with a token value like [formal_publication:field_part2:entity:field_part2c:entity:field_actual_release_date:value]. This computed field behaves like a Date field, and could thus be included as both a field in the Report and a filter to narrow down results.

Solution: Computed Token Field is born

Another key feature we needed from these “cache” fields was that they “look and feel” like fields of the same type as their source. We identified 3 key types of fields we needed to support: Date fields, Text fields, and Entity Reference fields.

We set out to extend Computed Field’s existing FieldType plugin implementations for these types, but Computed Field only supported String (short and long) fields. The Entity Reference and Date field implementations in Computed Token Field required some delving into the core FieldType plugins for those types, as well as into the details of how Computed Field manages the FieldTypes it does support. In the end, these classes came out rather clean: they extend the core FieldType classes and incorporate the ComputedFieldItemTrait to make them behave like computed fields.
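
In rough strokes, the Date variant ends up looking something like this (a simplified sketch rather than the exact code from the module; the plugin annotation and namespaces are illustrative):

```php
<?php

namespace Drupal\computed_token_field\Plugin\Field\FieldType;

use Drupal\computed_field\Plugin\Field\FieldType\ComputedFieldItemTrait;
use Drupal\datetime\Plugin\Field\FieldType\DateTimeItem;

/**
 * A token-driven, computed variant of the core datetime field type.
 *
 * @FieldType(
 *   id = "computed_token_datetime",
 *   label = @Translation("Computed Token (Date)"),
 *   default_widget = "datetime_default",
 *   default_formatter = "datetime_default"
 * )
 */
class ComputedTokenDateTimeItem extends DateTimeItem {

  // Provided by Computed Field; makes this item behave as a computed field.
  use ComputedFieldItemTrait;

}
```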

The majority of the code for these classes (even the String ones) lives in a trait that provides the field settings form as well as the executeCode() method, which does the actual rendering of the token. Notably, we don’t support multi-value scenarios well, and would welcome feedback and ideas on the d.o issue.
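
That shared trait boils down to something like the following (again a simplified sketch; apart from executeCode(), the method and setting names shown here are illustrative):

```php
<?php

namespace Drupal\computed_token_field\Plugin\Field\FieldType;

use Drupal\Core\Form\FormStateInterface;

/**
 * Shared behaviour for the token-driven field item classes (sketch).
 */
trait ComputedTokenItemTrait {

  /**
   * Exposes the token to render as a per-field setting in the admin UI.
   */
  public function fieldSettingsForm(array $form, FormStateInterface $form_state) {
    $element = parent::fieldSettingsForm($form, $form_state);
    // The 'token_value' setting would also be declared in defaultFieldSettings().
    $element['token_value'] = [
      '#type' => 'textfield',
      '#title' => t('Token used to compute this field'),
      '#default_value' => $this->getSetting('token_value'),
    ];
    return $element;
  }

  /**
   * Renders the configured token against the host entity.
   */
  protected function executeCode() {
    // The same token rendering sketched earlier, run against this item's host entity.
    $entity = $this->getEntity();
    return (string) \Drupal::token()->replace(
      $this->getSetting('token_value'),
      [$entity->getEntityTypeId() => $entity],
      ['clear' => TRUE]
    );
  }

}
```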

Finding other uses

Having established this functionality and improved the configuration and performance of our reporting Views, we noticed another use case for Computed Token Field.

The publication management application sends emails to key stakeholders as a given publication moves through different stages of its workflow. Who gets an email is determined by the value in a user reference field somewhere in the data about that publication. We were using ECA to accomplish this, setting up Events based on workflow state changes, and Actions to send out emails.

The content of the emails was shared across Series, but the field pointing to the user to email lived on a different Part for each Series. This meant duplicate ECA rules for each Series, whose only difference was which field on which Part held the user to notify.

Once again, the technique of “caching” a copy of the user reference at the node level allowed us to streamline the number of ECA rules required. By moving the value from the various Parts into a shared field on the Node, we could configure our ECA rules to look for the relevant email address in one place, regardless of which Series it was.

We see potential for the Computed Token Field module to address advanced data management needs, particularly in sectors like government, education, and healthcare where complex data structures are common. However, we’re interested in hearing from the Drupal community about other possible applications.

If you have thoughts on how the Computed Token Field might be integrated into your projects, or suggestions for further enhancements, we’d like to hear from you. Your feedback can help us refine this tool and explore new applications within the Drupal ecosystem.

We look forward to discovering how you might leverage the Computed Token Field to streamline your data management tasks!


The article Introducing Computed Token Field first appeared on the Consensus Enterprises blog.
