Tuesday, April 24, 2018 04:35 am EDT

Example of Using the D7 Batch API to Process Changes to Entities From a Select Query

Bri's picture

The other day I decided that I needed to figure out how to use the Batch API. There have been a number of times when I needed to "fix" a bunch of nodes on a site. These are usually one-time fixes, such as adding a new field to the user object and needing to set a default value for all existing users. In most cases I would just create a standard menu callback and let it process. If it was too much to do in a single page request, I would change the query to limit the results and run it multiple times. I knew the Batch API was a better choice, but I hadn't taken the time to figure it out. Well, this week I did.

I had a site that was only collecting the zip code when new users signed up. Some time later we started to get into reporting and realized that having the city/state would be very helpful. While users could fill in this information later in their profile, very few did so. So my task was to fix all these user profiles by using the ziptasticapi to look up the city/state for each zip code. In this particular case I had about 4000 users, and for each one I wanted to grab the zip code, make a request to the Ziptastic API, and save the returned city/state to their profile. It was time to learn the Batch API.

I saw a few examples elsewhere of how to do this and they helped to some extent, but I still ended up looking through core files to see how it was handled in other contexts. Here is a breakdown of how I handled it, in case it is helpful to other people.

The first thing I did was set up a menu callback:

/**
 * Implements hook_menu().
 * @return
 *   array of menu information
 */
function mlp_fix_menu() {
  $items = array();
  $items['mlp/fix_city_state'] = array(
    'title' => "Setting City and State for blank users",
    'page callback' => 'fix_states',
    // 'access arguments' must be an array of arguments for user_access().
    'access arguments' => array('administer users'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}

So here I am defining the function "fix_states" to run when the path 'mlp/fix_city_state' is accessed. Great, so let's take a look at the fix_states function.

/**
 * Menu callback for the path mlp/fix_city_state.
 * Queries the database for the users that need fixing and defines a batch process to fix them.
 */
function fix_states() {

  $results = db_query("SELECT DISTINCT entity_id, field_wo_address_administrative_area, field_wo_address_locality, field_wo_address_postal_code
FROM {field_data_field_wo_address}
WHERE (field_wo_address_administrative_area = ''
OR field_wo_address_administrative_area IS NULL
OR field_wo_address_locality IS NULL
OR field_wo_address_locality = '')
AND field_wo_address_postal_code <> ''")->fetchAll();

  //sets the batch - other keys can be used - see https://api.drupal.org/api/drupal/includes!form.inc/function/batch_set/7
  // operations is the only required parameter
  $batch = array(
    'title' => t('Batch process fix city and state'),
    'operations' => array(
      array('process_city_state_batch', array($results)), //set a batch function and pass in the query results to it
    ),
    'finished' => 'process_city_state_finished_batch',
  );
  batch_set($batch);
  //defines where the page will redirect to after the batch has completed
  //('<front>' is only an example destination; use whatever path suits your site)
  batch_process('<front>');
}

This function is where we actually run the query against the database to get the items that need manipulating (in my case, the entity_ids of the profiles that need to be modified). Next we set up the batch itself. The batch_set function takes an array with several keys, but the only one you must have is the operations key. It defines the functions you wish to execute as part of the batch and lets you pass in the data to be operated on. The finished callback is the function that executes when the operations are completed. The call to "batch_process" starts the batch, and its argument is the URL to redirect to when everything is complete.
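A side note on the operations key: it is a list, so the work does not have to go through a single callback. The sketch below is plain PHP (no Drupal needed to run it) and only builds a batch definition with one operation per chunk of rows. Treat it as an illustration under assumptions rather than what I actually used: demo_build_chunked_batch and the callback name process_city_state_chunk are hypothetical, and such a callback would need to process its whole chunk in one call.

```php
<?php
/**
 * Builds a batch definition with one operation per chunk of rows.
 * 'process_city_state_chunk' is a hypothetical callback name and is not
 * defined anywhere in this post.
 */
function demo_build_chunked_batch(array $rows, $chunk_size = 10) {
  $operations = array();
  foreach (array_chunk($rows, $chunk_size) as $chunk) {
    // Each entry is array(callback name, array of arguments for that callback).
    $operations[] = array('process_city_state_chunk', array($chunk));
  }
  return array(
    'title' => 'Batch process fix city and state',
    'operations' => $operations,
    'finished' => 'process_city_state_finished_batch',
  );
}

// 25 placeholder rows in chunks of 10 give three operations (10, 10 and 5 rows).
$batch = demo_build_chunked_batch(range(1, 25), 10);
echo count($batch['operations']) . " operations\n";
```

In a real module the returned array would be handed to batch_set(). The trade-off, as far as I can tell, is that each operation then stores only its own chunk instead of one operation carrying every row.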

Now let's take a look at the process_city_state_batch function, the function that does all the manipulation to our items.

/**
 * The function that executes the processing of the batch
 * @param array $results - the results of the query, the records that need adjusting
 * @param array $context: An array of contextual key/values.
 */
function process_city_state_batch($results, &$context) {
  //these variables persist and should be set so that they remain during the batch
  if (!isset($context['sandbox']['progress'])) {
    //initialize the batch progress
    $context['sandbox']['progress'] = 0;
    //set the max to the total number of records selected by the query
    $context['sandbox']['max'] = count($results);
    //store my results in a variable
    $context['sandbox']['nodes'] = $results;
  }
  //sets the limit to the lesser of 10 or remaining nodes
  $limit = min(10, count($context['sandbox']['nodes']));
  //for loop to process this chunk of the batch
  for ($i = 0; $i < $limit; $i++) {
    //get my current record
    $current_node = array_shift($context['sandbox']['nodes']);
    //create the URL to call to ziptastic
    $url = 'http://ziptasticapi.com/' . substr($current_node->field_wo_address_postal_code, 0, 5);
    //make the request
    $city_state = drupal_http_request($url);
    //get the response as an array of values by specifying true to this function - see ziptasticapi.com/46530 to see how results are returned
    //a failed request has no data, so fall back to NULL
    $city_state_array = isset($city_state->data) ? json_decode($city_state->data, true) : NULL;
    //check for an error (a failed request or an unparseable response also counts)
    if (!is_array($city_state_array) || array_key_exists('error', $city_state_array)) {
      $context['results'][] = 'Error for ID of ' . $current_node->entity_id;
    }
    else {
      //good data
      //fix the profile with the data we had returned
      $profile = profile2_load($current_node->entity_id);
      //set the new values on the profile's address field
      $profile->field_wo_address[LANGUAGE_NONE][0]['locality'] = ucwords(strtolower($city_state_array['city']));
      $profile->field_wo_address[LANGUAGE_NONE][0]['administrative_area'] = $city_state_array['state'];
      profile2_save($profile);
      $context['results'][] = 'Successfully updated city/state information';
    }
    //increment our progress
    $context['sandbox']['progress']++;
  }
  //check if batch is finished and update progress
  if ($context['sandbox']['progress'] != $context['sandbox']['max']) {
    $context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
  }
}

I tried to comment this function to the extent that I understand it. It takes in the $results that we passed in from the batch definition given to batch_set. It also has a $context variable that is used to keep persistent information as the batch processes. So at the beginning we check whether our progress variable has been set yet, and if not, we initialize all our variables inside the context's sandbox. Then we set a limit on how many items to process before a new request is made. We loop through the items and do whatever processing is needed. In my case I was using the entity_ids from the database to load a profile2 profile. Using the zip code I already had, I made calls to ziptasticapi to get the city and state. If successful, I stored them on the profile and saved the changes. The very last part of this function compares the progress variable to the max variable and sets $context['finished'], which the Batch API uses to display the progress to the user and to decide whether to call the function again.
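To make the sandbox and 'finished' mechanics easier to see, here is a small plain-PHP simulation that runs without Drupal. It is a sketch based on my reading of core, not the actual Batch API code: demo_operation and demo_run_operation are hypothetical names, and the stand-in runner simply assumes the operation is done (finished = 1) before each call and keeps calling while the operation reports a value below 1.

```php
<?php
// Plain-PHP simulation; no Drupal required. All names here are hypothetical.

/**
 * Operation callback in the same shape as process_city_state_batch():
 * handles at most 10 items per call and reports progress via $context.
 */
function demo_operation(array $items, array &$context) {
  if (!isset($context['sandbox']['progress'])) {
    // First call only: the sandbox survives between calls.
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($items);
    $context['sandbox']['queue'] = $items;
  }

  $limit = min(10, count($context['sandbox']['queue']));
  for ($i = 0; $i < $limit; $i++) {
    $item = array_shift($context['sandbox']['queue']);
    $context['results'][] = 'Processed ' . $item;
    $context['sandbox']['progress']++;
  }

  // Anything below 1 asks the runner to call this operation again.
  if ($context['sandbox']['progress'] != $context['sandbox']['max']) {
    $context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
  }
}

/**
 * Stand-in for the Batch API runner: resets 'finished' to 1 before each
 * call and repeats until the operation leaves it at 1.
 */
function demo_run_operation($callback, array $items) {
  $context = array('sandbox' => array(), 'results' => array(), 'finished' => 1);
  $calls = 0;
  do {
    $context['finished'] = 1;
    call_user_func_array($callback, array($items, &$context));
    $calls++;
  } while ($context['finished'] < 1);
  return array('calls' => $calls, 'results' => $context['results']);
}

$outcome = demo_run_operation('demo_operation', range(1, 25));
echo $outcome['calls'] . ' calls, ' . count($outcome['results']) . " items\n";
// prints "3 calls, 25 items"
```

With 25 items and a limit of 10, the operation reports 0.4, then 0.8, and on the third call leaves 'finished' at 1, so the runner stops. In Drupal each of those calls can happen in a separate page request, which is what keeps a long job from timing out.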

The last part of it all is the finished callback, which isn't all that fancy. Mine is below.

/**
 * The batch function that is executed after the batch process has completed
 * @param bool $success: A boolean indicating whether the batch operation successfully concluded.
 * @param array $results: The messages collected in $context['results'] during processing.
 * @param array $operations: An array of function calls (not used in this function).
 */
function process_city_state_finished_batch($success, $results, $operations) {
  if ($success) {
    $message = t('The batch was successful');
  }
  else {
    drupal_set_message(t('An error occurred and processing did not complete.'), 'error');
    $message = format_plural(count($results), '1 item successfully processed:', '@count items successfully processed:');
    $message .= theme('item_list', array('items' => $results));
  }
  //display whichever message was built above
  drupal_set_message($message);
}

Here we pretty much just check whether we were successful and display the appropriate message. That about does it. Hopefully this has helped you gain a bit of an understanding of how to use the Batch API in Drupal 7.