the floating-point divide: A Drupal migration that wraps values in paragraphs

The array_first function is new in PHP 8.5. Interestingly, you can use it in Drupal 11 even if you’re on PHP 8.4 or earlier. This is because one of Drupal core’s dependencies is symfony/polyfill-php85.
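To make the behavior concrete, here is a minimal sketch of what array_first returns. The fallback definition is only an assumption about how a polyfill might behave on PHP versions before 8.5; it is not copied from symfony/polyfill-php85.

```php
<?php
// Illustrative fallback for PHP < 8.5 (an assumption, not the real polyfill code).
if (!function_exists('array_first')) {
    function array_first(array $array): mixed
    {
        // Return the first value regardless of its key.
        foreach ($array as $value) {
            return $value;
        }
        // An empty array has no first value.
        return null;
    }
}

// A one-element array keyed by "item", like the ones discussed in this post.
$wrapped = ['item' => ['target_id' => 14023, 'target_revision_id' => 28504]];

// The result is the inner array, without the "item" key.
$first = array_first($wrapped);
$emptyResult = array_first([]);
```

Because array_first takes its argument by value and only reads it, it is safe to call from places that cannot pass a reference.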

Let’s start with the easy part of the migration code:

tmp_field_creator_list_creators:
  -
    plugin: sub_process
    source: field_old_article_creators
    process:
      item:
        plugin: create_default_paragraph_revision
        paragraph_default:
          create_paragraph_bundle: single_creator
          field_single_creator_creator: '@target_id'
  -
    plugin: multiple_values
  -
    plugin: callback
    callable: array_first

Sometimes the way that we’ve structured a site’s content originally is no longer ideal after several years. Requirements change. New tools become available. We find opportunities for refactoring.

  • old_article content type
    • field_old_article_creators — unlimited taxonomy terms
  • new_article content type
    • field_new_article_creators — one creator_list paragraph
      • field_creator_list_creators — unlimited single_creator paragraphs
        • field_single_creator_creator — one taxonomy term

(Why the new structure? In real life, the creator_list and single_creator paragraph types have some additional fields, which I’m leaving out of this example for simplicity.)

The skeleton of the migration

Today I’ll share an example of using a Drupal migration to restructure content. This is based on recent work I did on a site where content types have been added incrementally over the years. After a while, we’d ended up with various content types that describe different kinds of published works (scholarly articles, reports, videos, etc.), but each content type represented the creators of those works a little differently. We decided to unify the data structure for creators across content types so that we wouldn’t have to maintain all those variations, and so that new content types could reuse the same structure and theming.

In this example, we created New Article nodes from Old Article nodes. It would also be possible to migrate from the old-style Creators field to the new-style Creators field on Old Article nodes — modifying the existing nodes rather than creating new ones.

Creating paragraphs

The paragraphs were deleted by the entity_reference_revisions module’s Orphan Purger queue worker, which is less sinister than it sounds.

field_new_article_creators:
  plugin: create_default_paragraph_revision
  paragraph_default:
    create_paragraph_bundle: creator_list
    field_creator_list_creators: '@tmp_field_creator_list_creators'

An alternative solution would be to write a custom process plugin, maybe just extending create_default_paragraph_revision and setting handle_multiples to false.

This example demonstrated:

Wrapping taxonomy terms in paragraphs

Let’s start to fill in the “???” from the code above:

  • old_article content type
    • field_old_article_creators — unlimited taxonomy terms
  • new_article content type
    • field_new_article_creators — one creator_list paragraph
      • field_creator_list_creators — unlimited single_creator paragraphs
        • field_single_creator_creator — one taxonomy term

[
  0 => [
    "target_id" => 14023
    "target_revision_id" => 28504
  ],
  1 => [
    "target_id" => 14024
    "target_revision_id" => 28505
  ]
]

id: articles
label: 'Convert Old Article nodes to New Article nodes'
source:
  plugin: 'content_entity:node'
  bundle: old_article
process:
  created: created
  changed: changed
  uid: uid
  status: status
  title: title
  ???
destination:
  plugin: 'entity:node'
  default_bundle: new_article

If you’re not already familiar with using Drupal migrations to modify content within a site, I’d recommend you check out this article first.

In Drupal 10, the best solution I could find using core and contrib modules was to replace array_first with array_pop in the callback plugin. The trouble with this solution is that it emits a warning each time array_pop is called: Argument #1 ($array) must be passed by reference, value given. This happens because array_pop takes its argument by reference: it removes the last element from the array and resets the array’s internal pointer. Ideally we shouldn’t be calling array_pop with callback, since callback passes the array by value.
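The warning can be reproduced outside Drupal. The sketch below assumes the callback plugin invokes its callable in roughly the same way as call_user_func, which passes arguments by value; that is an assumption about the plugin’s internals, not something taken from its source.

```php
<?php
// Collect warnings instead of printing them, so they can be inspected afterwards.
$warnings = [];
set_error_handler(static function (int $severity, string $message) use (&$warnings): bool {
    $warnings[] = $message;
    return true; // Skip PHP's default error output for this demo.
});

// A one-element array keyed by "item", like a single row of sub_process output.
$row = ['item' => ['target_id' => 14023, 'target_revision_id' => 28504]];

// Assumption: this mirrors how a by-value caller would invoke array_pop.
$popped = call_user_func('array_pop', $row);

restore_error_handler();

// $popped holds the inner array, $row is left untouched,
// and $warnings should contain the by-reference warning.
```

The call still returns the value we want, which is why the workaround functions, but the warning is raised on every row.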

This example will migrate nodes from a content type called Old Article to a content type called New Article. Old Article has a Creators field that holds taxonomy terms. New Article has a Creators field that holds paragraphs.

We used the create_default_paragraph_revision migrate process plugin to create paragraphs, specifying each of the paragraph’s fields and the value that we wanted to set it to.

Nitty gritty details

Why iterate with sub_process instead of multiple_values?

[
  0 => [
    "item" => [
      "target_id" => 14023
      "target_revision_id" => 28504
    ]
  ],
  1 => [
    "item" => [
      "target_id" => 14024
      "target_revision_id" => 28505
    ]
  ]
]

The output of the sub_process plugin is an array that would look something like this for an article with 2 creators:

array_first in Drupal 11

In the code above that defines tmp_field_creator_list_creators, we introduce the item keys into the array with the sub_process plugin, then have to remove them with array_first. It would be simpler if we could do something like this:

array_pop in Drupal 10

Previously when migrating nodes with paragraphs, I’d performed the migration in two stages: paragraphs first, then nodes. This time, while looking at the list of migrate process plugins in core and contrib modules, I happened to notice one called create_default_paragraph_revision from the Migration Tools module.

I was curious if rolling back the migration would clean up the paragraphs created by create_default_paragraph_revision, or if orphaned paragraphs would remain in the database. Rolling back did not delete the paragraphs. But a subsequent cron run did.

Are paragraphs deleted when you roll back the migration?

Here’s how you’d do that:

# Doesn’t work!
tmp_field_creator_list_creators:
  -
    plugin: multiple_values
    source: field_old_article_creators
  -
    plugin: create_default_paragraph_revision
    paragraph_default:
      create_paragraph_bundle: single_creator
      field_single_creator_creator: '@target_id'

For each of the taxonomy terms in field_old_article_creators, we need to create a single_creator paragraph whose field contains that term. Here’s how I did that in Drupal 11:

In the code above, the sub_process plugin iterates through the values in field_old_article_creators. For each value (a taxonomy term), the create_default_paragraph_revision plugin creates a single_creator paragraph whose field_single_creator_creator is set to the term.
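The remaining two steps of that pipeline (multiple_values followed by callback) amount to unwrapping each row of the sub_process output. Here is a plain-PHP sketch of that transformation. It does not use the migrate API, the sample data is hypothetical, and array_values($row)[0] stands in for array_first so the snippet also runs on PHP versions before 8.5.

```php
<?php
// Hypothetical sub_process output for an article with two creators.
$subProcessOutput = [
    ['item' => ['target_id' => 14023, 'target_revision_id' => 28504]],
    ['item' => ['target_id' => 14024, 'target_revision_id' => 28505]],
];

// Stand-in for array_first: return the first value of a row, or null if it is empty.
$firstValue = static fn (array $row): mixed => array_values($row)[0] ?? null;

// Apply the unwrapping to each row separately, which is the effect of running
// multiple_values and then callback.
$paragraphReferences = array_map($firstValue, $subProcessOutput);

// Each entry is now a bare paragraph reference without the "item" wrapper.
print_r($paragraphReferences);
```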

  • Add a second Creators field to Old Article. Make it a paragraph reference field that accepts a creator_list paragraph.
  • Similar to this example, write a migration from Old Article nodes to Old Article nodes. Under overwrite_properties, put the second Creators field.
  • In the process, set up tmp_field_creator_list_creators and field_new_article_creators the same as in the above example, except change the name of field_new_article_creators to the second Creators field.
  • After running the migration, delete the original Creators field.

Conclusion

Recall the data structures that we’re migrating from and to:

  • creating nested paragraphs
  • iterating over multiple values in a field
  • transforming arrays of entity references

create_default_paragraph_revision lets you create a paragraph by saying what type of paragraph you want to create and listing out the value for each field.
