Batch processing

In contrast to the Pacemaker Community Edition, the Pacemaker Professional Edition provides a built-in batch processing function that stacks data to be added or updated and executes a multi-value SQL statement based on a set of configurable triggers.

For exceptional cases, or when Pacemaker Professional Edition is used as a framework, it is possible to configure when batch processing is triggered:
  • Stack size

  • Dedicated attributes that trigger the execution of the multi-value SQL statement

Configure the stack size

By default, the stack size is 1000.

  • Once the stack reaches the configured size, the SQL statement is executed before the next value is added, and the stack is cleared

  • Additionally, a trigger ensures that the batch is processed when the import finishes while the stack still contains values

  • Unlike most other configuration options, the maximum batch size cannot be configured via the workflow engine configuration

  • Instead, the batch size must be configured via the DI configuration

  • The implementation that provides the batch functionality is TechDivision\Import\Batch\Actions\Processors\GenericBatchProcessor, whose constructor expects four arguments

    • The fourth argument is the maximum batch size, after which the batch gets cleaned up

Example

To change the maximum batch size of the processor that generates the product’s datetime attribute values, the DI configuration must be overridden, e.g. with

services.xml
<service
        id="import_product.action.processor.product.datetime.create"
        class="TechDivision\Import\Batch\Actions\Processors\GenericBatchProcessor">
    <argument type="service" id="connection"/>
    <argument type="service" id="import_batch.repository.sql.statement"/>
    <argument type="collection">
        <argument type="constant">
            TechDivision\Import\Batch\Utils\SqlStatementKeys::CREATE_UPDATE_PRODUCT_DATETIME
        </argument>
    </argument>
    <argument type="integer">2000</argument>
</service>
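
In general, a larger batch size means fewer multi-value SQL statements have to be executed, but more rows are buffered in memory before each cleanup; a suitable value therefore depends on the available memory and the limits of the database server, e.g. the maximum allowed packet size.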

Configure the dedicated attributes

  • Besides the url_key attribute, which triggers the stack cleanup by default, additional attributes can be registered

  • To do so, the DI configuration can be overridden by adding the corresponding attribute to the loader’s collection argument

Example

For example, if the stack should also be cleaned up when a value for the url_path attribute is added to it, the loader can be configured as follows:

services.xml
<service
        id="import_batch.loader.product.varchar.processor.attribute.id"
        class="TechDivision\Import\Batch\Loaders\GenericAttributeIdLoader">
    <argument type="service" id="configuration"/>
    <argument type="service" id="import.processor.import"/>
    <argument type="collection">
        <argument type="constant">TechDivision\Import\Product\Utils\MemberNames::URL_KEY</argument>
        <argument type="constant">TechDivision\Import\Product\Utils\MemberNames::URL_PATH</argument> (1)
    </argument>
</service>
1 The loader will be passed as the fifth argument to the processor that creates the product’s varchar attribute values, which is based on the TechDivision\Import\Batch\Actions\Processors\GenericAttributeBatchProcessor implementation.

The url_key attribute triggers the processor’s batch cleanup, and with it the creation of the product’s varchar attribute values, since URL key management requires that the actual URL keys from the database are always used.
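
For illustration only, the following sketch shows how the loader from the previous example might be wired into such a varchar processor as the fifth argument. The service id and the SQL statement key used here are assumptions by analogy with the datetime example above and may differ from the actual DI configuration.

services.xml
<!-- Hypothetical sketch: the service id and the SQL statement key are assumptions -->
<service
        id="import_product.action.processor.product.varchar.create"
        class="TechDivision\Import\Batch\Actions\Processors\GenericAttributeBatchProcessor">
    <!-- the same four arguments as the generic batch processor ... -->
    <argument type="service" id="connection"/>
    <argument type="service" id="import_batch.repository.sql.statement"/>
    <argument type="collection">
        <argument type="constant">
            TechDivision\Import\Batch\Utils\SqlStatementKeys::CREATE_UPDATE_PRODUCT_VARCHAR
        </argument>
    </argument>
    <argument type="integer">1000</argument>
    <!-- ... plus the attribute ID loader as the fifth argument -->
    <argument type="service" id="import_batch.loader.product.varchar.processor.attribute.id"/>
</service>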