Using this information we can set up a pretty simple finished method. All we need to do is look to see if the $success variable is true or not. If it is, then we can report to the user (via the Messenger service) that the batch finished and log the fact that the batch finished. If the batch failed (for whatever reason) then we print this out as an error, passing in the operation that caused the issue.
Finally, I’d like to say thanks to Selwyn Polit and his Drupal at your Fingertips book for some of the code examples I have used here. The page on Drupal batch and queue operations is really worth a read.
For example, let’s say you wanted to run a function that would go through every page on your Drupal site and perform an action. This might be removing specific authors from pages, or removing links in text, or deleting certain taxonomy terms. You might create a small loop that just loads all pages and performs the action on those pages.
public static function batchFinished(bool $success, array $results, array $operations, string $elapsed): void {
  // Grab the messenger service, this will be needed if the batch was a
  // success or a failure.
  $messenger = \Drupal::messenger();
  if ($success) {
    // The success variable was true, which indicates that the batch process
    // was successful (i.e. no errors occurred).
    // Show success message to the user.
    $messenger->addMessage(t('@process processed @count, skipped @skipped, updated @updated, failed @failed in @elapsed.', [
      '@process' => $results['process'],
      '@count' => $results['progress'],
      '@skipped' => $results['skipped'],
      '@updated' => $results['updated'],
      '@failed' => $results['failed'],
      '@elapsed' => $elapsed,
    ]));
    // Log the batch success.
    \Drupal::logger('batch_form_example')->info(
      '@process processed @count, skipped @skipped, updated @updated, failed @failed in @elapsed.', [
        '@process' => $results['process'],
        '@count' => $results['progress'],
        '@skipped' => $results['skipped'],
        '@updated' => $results['updated'],
        '@failed' => $results['failed'],
        '@elapsed' => $elapsed,
      ]);
  }
  else {
    // An error occurred. $operations contains the operations that remained
    // unprocessed. Pick the first remaining operation and report on what
    // happened.
    $error_operation = reset($operations);
    if ($error_operation) {
      // Pass TRUE to print_r() so that it returns the output as a string
      // instead of printing it directly to the page.
      $message = t('An error occurred while processing %error_operation with arguments: @arguments', [
        '%error_operation' => print_r($error_operation[0], TRUE),
        '@arguments' => print_r($error_operation[1], TRUE),
      ]);
      $messenger->addError($message);
    }
  }
}
There’s quite a lot of information here on how to set up and use the Batch API, but what I have shown here is the simplest version of using batch. Using the above code you can create a form that will run a batch operation that takes a few seconds but should allow you to experiment with the API and see what it does.
If you want to see the source code for the examples above (or all examples in this series) then I have released them all as a GitHub project that has a number of different sub-modules that show how to use the Batch API in different situations and combinations. Feel free to take a look at the project and use the source code in your own projects. Also, please let me know if there are any improvements to the module that you can think of.
This technique can be used in a variety of different situations. Many contributed modules in Drupal make use of this feature to prevent processes taking too long.
This article is the first in a series of articles that will look at various aspects of the Batch API and how to use it. In this article we will look at the core Batch API and how to get set up with your first batch run.
The Batch Process
Rather than get the batch operation to do anything destructive to the site I decided to just loop through the items in each chunk and get the process to sleep for a few milliseconds to simulate things happening to the site. This means that you can run this batch call as many times as you like without causing lots of content to be added to (or removed from) your site. I will go into more concrete mechanisms later in this series of articles to show nodes being created.
- Initiate Step – This is where the batch is started. It’s best to start the batch from some sort of action like a controller, form, or Drush command as it means that the batch can proceed unimpeded. When the batch starts the site will redirect to the path /batch, so you need to be sure that it’s the last thing run in the action or submit handler.
- Processing Step(s) – After the batch is initialised the batch process itself is then run. The number of processing steps can be set in the initiate step, but you can also set a single step and have that step run repeatedly until the task is finished. During the processing steps you can keep track of your progress, including how many items you have processed or how many errors occurred. It’s also possible to run multiple different steps that do different actions.
- Finishing Step – The final step is a finish step. In this step you can log what happened in the batch and optionally perform a redirect to another page on the site.
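To make the three steps a little more concrete, here is a minimal sketch in plain PHP that mimics them without touching Drupal at all. The function names (initiate_batch(), process_chunk() and finish_batch()) are hypothetical and are not part of the Batch API; they only illustrate how the work is split, processed and reported on.

```php
<?php

declare(strict_types=1);

// Plain PHP stand-in for the three steps; no Drupal code is involved.

/**
 * Initiate step: split the work into operations (hypothetical helper).
 */
function initiate_batch(array $items, int $size): array {
  return array_chunk($items, $size);
}

/**
 * Processing step: handle one chunk and record progress in $context.
 */
function process_chunk(array $chunk, array &$context): void {
  $context['results']['processed'] = ($context['results']['processed'] ?? 0) + count($chunk);
}

/**
 * Finishing step: report on what happened.
 */
function finish_batch(bool $success, array $results): string {
  return $success
    ? sprintf('Processed %d items.', $results['processed'])
    : 'The batch did not complete.';
}

$operations = initiate_batch(range(1, 1000), 100);
$context = ['results' => []];
foreach ($operations as $chunk) {
  // In Drupal these calls may be spread across several page requests.
  process_chunk($chunk, $context);
}
$summary = finish_batch(TRUE, $context['results']);
echo $summary, PHP_EOL;
```

The important difference is that this sketch runs everything in a single process, whereas the Batch API carries the state between page requests for you.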
There are a number of situations in which you might want to use the Batch API. I’ve hinted at a couple in the introduction, but here’s a list of some examples.
// Create 10 chunks of 100 items.
$chunks = array_chunk(range(1, 1000), 100);
// Add each chunk in the array as an operation in the batch process.
foreach ($chunks as $id => $chunk) {
  $args = [
    $id,
    $chunk,
  ];
  $batch->addOperation([self::class, 'batchProcess'], $args);
}
The core of the Batch API in Drupal 8+ is the BatchBuilder class. Using this class we can create the needed parameters that have to be sent to the batch_set() function, which is where the batch operations are started.
When we first start the batch process the $context array will look like this.
$args = [
  $id,
  $chunk,
];
$batch->addOperation([self::class, 'batchProcess'], $args);
When we submit this form we see the following batch process running.
Once complete we will be redirected back to the form that we submitted, where a message will show us how many items were processed.
batch_set($batch->toArray());
The Batch API in Drupal is a really powerful component for processing data whilst also giving the user a decent experience. It does away with long page loads and introduces a nice progress bar that will show your user how long they have to wait for it to complete. Drupal makes use of the Batch API in a few places, and even allows certain parts of Drupal (e.g. update hooks) to integrate with the Batch API with very little extra code.
The Batch Process Method
This is normally fine on sites that have a small number of pages (i.e. fewer than 100). But what happens when the site has 10,000 pages, or a million? Your little loop will soon hit the limits of PHP execution times or memory limits and cause the script to be terminated. How do you know how far your loop progressed through the data? What happens if you tried to restart the loop?
The following three steps are involved in the batch process in Drupal.
The batch process method is where your processing will be done and is the main body of the batch run. The name and arguments that the method has depend on the arguments array you used when calling addOperation() when setting up the batch.
Setting up a batch operation in a form is pretty simple though; in the submitForm() handler of the form class we just create a new BatchBuilder object and set the batch up.
Essentially, if you return a redirect response from the finished method then this will be used and the user will be redirected, but returning a redirect response is optional.
This is pretty much it for the batch process method. The Batch API will call each of the operation methods we set up at the start, passing in the array items we set for each operation. When finished, the results will be passed to the batch finish method.
- sandbox – This is used within the batch process methods only. This is normally used to keep track of the progress of the batch run, or to figure out the max number of elements in the batch. Once the batch processing is finished this array will be thrown away.
- results – This is used by the batch processing methods to keep track of the progress of the batch run. The difference here is that this array is passed to the finish callback method, which gives us the ability to report on how the batch process went. As a result, this array is normally used to store the number of successful or failed operations that happened. What you add to this part of the array depends on what you want to print in the finished output.
- finished – This is a special value that is used by the batch system to see if the batch processing is finished. If you set this to a value of less than 1 then Drupal will call the batch method again to finish off the batch. This value is really powerful, but only comes into play when you have an open-ended batch process. If you set up your batch process with a specific number of items and a set number of operations then this flag will not be used. I will go into this setting in more detail in later posts.
- message – To communicate progress to the user you can set a message to this array variable and this will be shown on the batch processing page (along with the progress bar).
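As a rough illustration of how these four keys work together, the sketch below uses plain PHP to imitate an open-ended operation that is called repeatedly until it sets finished to 1. The do/while loop is only a simplified stand-in for the batch runner and open_ended_operation() is a hypothetical name; neither reflects how Drupal implements this internally.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical operation that handles 25 items per call.
 */
function open_ended_operation(array &$context): void {
  if (!isset($context['sandbox']['progress'])) {
    // First call: set up the sandbox and the results.
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = 100;
    $context['results']['processed'] = 0;
  }
  $context['sandbox']['progress'] += 25;
  $context['results']['processed'] += 25;
  $context['message'] = sprintf('Processed %d of %d.', $context['sandbox']['progress'], $context['sandbox']['max']);
  // A value below 1 asks the runner to call this operation again.
  $context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
}

// Simplified stand-in for the batch runner.
$context = ['sandbox' => [], 'results' => [], 'finished' => 1, 'message' => ''];
$calls = 0;
do {
  open_ended_operation($context);
  $calls++;
} while ($context['finished'] < 1);
// The sandbox is thrown away once the operation completes; results survive.
$context['sandbox'] = [];
echo $calls, ' calls, ', $context['results']['processed'], ' items', PHP_EOL;
```

Here the operation is called four times before it reports that it has finished, and only the results array is left to hand over to the finish step.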
Eating the breakfast sandwich challenge in one go is certainly difficult, but it sounds a lot easier when you consume the sandwich in 100 smaller meals over the course of a couple of days. This is just what batch processing does; it takes a large number of items and breaks them up into smaller chunks so they’re easier to handle (or digest).
Here is a typical finished method, based on the batch operations we ran in the above step.
It is quite common to initiate a batch operation from a form. Doing so means that we can accept parameters from the user about what to do in the batch, but it also gives a more definite warning to the user that performing this action will result in a (potentially) lengthy process.
For example, to set up a minimal batch process you would set up the batch operation object like this.
A good analogy I like to use is to compare the batch process to food challenges. In my home town of Congleton there is a cafe called Bear Grills that hosts a food challenge called Bear Grills’ Grizzly Breakfast Sandwich Challenge. This is a 2.7kg sandwich that contains 6 sausages, 6 slices of bacon, 4 eggs, 4 potato waffles, beans, and topped off with cheese.
The Batch Finish Method
The Batch API in Drupal solves these problems by splitting the task into parts. Rather than run a single process to change all the pages at the same time, the batch runs a series of smaller tasks (e.g. just 50 pages at a time) until the task is complete. This means that you don’t hit the memory or timeout limits of PHP and the task finishes successfully and in a predictable way. Rather than run the operation in a single page request the Batch API allows the operation to be run through lots of little page requests, each of which nibbles away at the task until it is complete.
- $success – TRUE if all Batch API tasks were completed successfully.
- $results – A results array from the batch processing operations.
- $operations – A list of the operations that had not been completed.
- $elapsed – Batch.inc kindly provides the elapsed processing time as a human-readable string.
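To show how the $operations parameter can be used when something goes wrong, here is a small plain PHP sketch. The describe_failure() helper and the ExampleForm class name are hypothetical, and I’m assuming each leftover operation is a pair of callback and arguments, in the same shape that was passed to addOperation().

```php
<?php

declare(strict_types=1);

/**
 * Builds a failure message from the operations that were left unprocessed.
 */
function describe_failure(array $operations): ?string {
  // reset() returns the first remaining operation, or FALSE if none remain.
  $error_operation = reset($operations);
  if ($error_operation === FALSE) {
    return NULL;
  }
  [$callback, $arguments] = $error_operation;
  return sprintf(
    'An error occurred while processing %s with arguments: %s',
    is_array($callback) ? implode('::', $callback) : $callback,
    json_encode($arguments)
  );
}

// Hypothetical leftover operations, each a [callback, arguments] pair.
$remaining = [
  [['ExampleForm', 'batchProcess'], [7, [701, 702]]],
  [['ExampleForm', 'batchProcess'], [8, [801, 802]]],
];
$message = describe_failure($remaining);
echo $message, PHP_EOL;
```

The first operation in the list is the one the batch stopped on, so that is the one worth reporting to the user.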
public function submitForm(array &$form, FormStateInterface $form_state): void {
  // Create and set up the batch builder object.
  $batch = new BatchBuilder();
  $batch->setTitle('Running batch process.')
    ->setFinishCallback([self::class, 'batchFinished'])
    ->setInitMessage('Commencing')
    ->setProgressMessage('Processing...')
    ->setErrorMessage('An error occurred during processing.');
  // Create 10 chunks of 100 items.
  $chunks = array_chunk(range(1, 1000), 100);
  // Add each chunk in the array as an operation in the batch process.
  foreach ($chunks as $id => $chunk) {
    $args = [
      $id,
      $chunk,
    ];
    $batch->addOperation([self::class, 'batchProcess'], $args);
  }
  batch_set($batch->toArray());
  // Set the redirect for the form submission back to the form itself.
  $form_state->setRedirectUrl(new Url($this->getFormId()));
}
public static function batchProcess(int $batchId, array $chunk, array &$context): void {
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = 1000;
  }
  if (!isset($context['results']['updated'])) {
    $context['results']['updated'] = 0;
    $context['results']['skipped'] = 0;
    $context['results']['failed'] = 0;
    $context['results']['progress'] = 0;
    $context['results']['process'] = 'Chunk batch completed';
  }
  // Message above progress bar.
  $context['message'] = t('Processing batch #@batch_id batch size @batch_size for total @count items.', [
    '@batch_id' => number_format($batchId),
    '@batch_size' => number_format(count($chunk)),
    '@count' => number_format($context['sandbox']['max']),
  ]);
  // Process the chunk.
}
public static function batchProcess(int $batchId, array $chunk, array &$context): void {
}
Create a BatchBuilder object like this.
The batch finish method is the final function that is called when the batch operations finish. This method accepts the following parameters.
Running A Batch From A Form
One final thing in the finished method is the return value, which depends on where you start the batch from. If you start the batch operation from a form then the form redirects will be taken into account and used to send the user to wherever was set in the form. If the batch operation is initiated from a controller then the return value must be a redirect response as controllers must return either a render array or a response object.
When we first start the batch process there isn’t any information in the sandbox and results array items, so we first set up these values in the processing method. We can also add to the message parameter of the $context array since we also know some things about the batch process we are currently running.
$batch = new BatchBuilder();
Core to the Batch API is the BatchBuilder class, so let’s start off looking at that.
In the batch setup code we added a number of calls to the method batchProcess(), and passed in an argument array that was 2 elements in length. Here is the call again in isolation.
When To Use The Batch API
The Batch API is a powerful feature in Drupal that allows complex or time consuming tasks to be split into smaller parts.
- Performing an operation on lots of different items of content. For example, updating every page on a site or deleting lots of taxonomy terms.
- If you are interacting with an API that requires lots of operations to complete a task then the Batch API can be useful. This allows you to show the user a progress bar whilst you perform the actions, and can often mask a slow API system or prevent the API from timing out the user’s page.
- If you want to accept a file from a user and process the results then using the Batch API can often help break down that file into smaller parts. I have successfully managed to parse a CSV file with 100,000 entries using the Batch API.
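As a rough sketch of the file processing case, the plain PHP below splits some CSV data into chunks that could each be handed to a batch operation. The CSV content, the column names and the chunk size of two are made up for illustration; a real upload would be read from a file and use a much larger chunk size.

```php
<?php

declare(strict_types=1);

// Hypothetical CSV content; a real upload would be read from a file instead.
$csv = "id,title\n1,First\n2,Second\n3,Third\n4,Fourth\n5,Fifth\n";

// Split into lines and drop the empty line after the final newline.
$lines = array_filter(explode("\n", $csv), fn(string $line): bool => $line !== '');
$rows = array_map(
  fn(string $line): array => str_getcsv($line, ',', '"', '\\'),
  array_values($lines)
);
$header = array_shift($rows);

// Turn each row into an associative array keyed by the header columns.
$records = array_map(fn(array $row): array => array_combine($header, $row), $rows);

// Two records per chunk; each chunk would become one batch operation.
$chunks = array_chunk($records, 2);
printf("%d records in %d chunks\n", count($records), count($chunks));
```

Each element of $chunks could then be passed as an argument to addOperation(), in the same way as the chunks of numbers used elsewhere in this article.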
When Not To Use The Batch API
The complexity around working with the Batch API is mostly about how you set up the processing steps. There are a couple of different flavors of initialising a batch run and the processes you create will depend on the tasks you are trying to accomplish.
public static function batchProcess(int $batchId, array $chunk, array &$context): void {
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = 1000;
  }
  if (!isset($context['results']['updated'])) {
    $context['results']['updated'] = 0;
    $context['results']['skipped'] = 0;
    $context['results']['failed'] = 0;
    $context['results']['progress'] = 0;
    $context['results']['process'] = 'Form batch completed';
  }
  // Keep track of progress.
  $context['results']['progress'] += count($chunk);
  // Message above progress bar.
  $context['message'] = t('Processing batch #@batch_id batch size @batch_size for total @count items.', [
    '@batch_id' => number_format($batchId),
    '@batch_size' => number_format(count($chunk)),
    '@count' => number_format($context['sandbox']['max']),
  ]);
  foreach ($chunk as $number) {
    // Sleep for a bit (making use of the number variable) to simulate work
    // being done. We do this so that the batch takes a noticeable amount of
    // time to complete.
    usleep(4000 + $number);
    // Decide on the result of the batch. We use the random parameter here to
    // simulate different conditions happening during the batch process.
    $result = rand(1, 4);
    // rand() returns an integer, so compare against integer cases.
    switch ($result) {
      case 1:
      case 2:
        $context['results']['updated']++;
        break;
      case 3:
        $context['results']['skipped']++;
        break;
      case 4:
        $context['results']['failed']++;
        break;
    }
  }
}
Array(
[sandbox] => Array()
[results] => Array()
[finished] => 1
[message] =>
)
Of course, the Batch API isn’t always the best thing to use in all circumstances. If you want to process a bunch of items quickly (and give user feedback whilst doing it) then the Batch API is normally the best approach.
If you don’t need to give that feedback to the user, or timescales are less important, then just using a queue processor can be a better solution. The Batch API in Drupal is built upon the Queue API so if you build the batch operation it isn’t too difficult to swap to a queue processor after the fact.
$batch = new BatchBuilder();
$batch->setTitle('Running batch process.')
  ->setFinishCallback([self::class, 'batchFinished'])
  ->setInitMessage('Commencing')
  ->setProgressMessage('Processing...')
  ->setErrorMessage('An error occurred during processing.');
In the next article we will look at setting up a batch run so that it can be run using either a form or via a Drush command.