Importing large amounts of data by using the data import File Listener

When you need to import large volumes of data (millions of rows), use the data import File Listener instead of the data import wizard. The data import File Listener uses multithreading for faster throughput, while the wizard uses single-thread processing.

To import your data with increased performance, see Improving data import performance by using configuration templates.

Pega Sales Automation™ includes a File Listener for the following entities:

  • Operator
  • Contact
  • Household
  • Lead (individual and business)
  • Opportunity (individual and business)
  • Organization
  • Task
  • Customer activity
  • Territory
  • Account (beginning with Pega Platform 8.2)

Import recommendations

For best performance when using the data import File Listener, follow these recommendations:

  • Before importing all of your records, import a small sample and fix any issues that you find.
  • Do not exceed 1 million records in a single file for the File Listener base upload.
  • Use the recommended batch size of 1000 records. Set the batch size in App Studio > Settings > Application Settings.
  • To improve performance and to disable creating audit history, use Add Only mode for the initial data import.
  • To ensure maximum parallel processing, provide as many input files for the File Listener as there are threads, because each thread processes one file at a time. Set the number of threads in Dev Studio, in the Listener properties section of the File Listener.
  • As of 8.3, in a PostgreSQL single-tenant system, unique ID generation is highly performant during a high-volume import process. Work item IDs are generated in batches, and you can set the batch size with the idGenerator/defaultBatchSize dynamic system setting. For more information, see Increased performance for work ID generation and <link to dss help topic>.
  • In non-PostgreSQL multi-tenant systems, do not generate unique IDs during a high-volume import. Instead, include pyID for work object records in the .csv import file, which saves time by skipping the call to the GenerateID activity. After the contact import is complete, update the unique ID that is stored in the data-uniqueID database table: set the Table name value in that table to the last imported pyID record in the contact table.
  • Database indexes improve query performance; however, when you update a large database table, the system reindexes the table, which can slow down the import. Remove non-essential indexes during the import phase, and then re-create them after the import is complete.
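The file-size and one-file-per-thread recommendations above can be applied before upload by splitting a large source file. The following Python script is a minimal sketch and is not part of Pega Sales Automation. It assumes a UTF-8 .csv file with a single header row. The function names and the output file naming pattern are hypothetical.

```python
import csv
import math
from pathlib import Path

# Per-file limit taken from the recommendation above.
MAX_RECORDS_PER_FILE = 1_000_000


def count_records(source: Path) -> int:
    """Count data records (excluding the header row) in a CSV file."""
    with source.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return sum(1 for _ in reader)


def split_csv(source: Path, out_dir: Path, threads: int) -> list[Path]:
    """Split source into roughly one file per listener thread.

    Every output file repeats the header row and holds at most
    MAX_RECORDS_PER_FILE records, so very large inputs can yield
    more files than threads.
    """
    total = count_records(source)
    if total == 0:
        return []
    chunk = min(math.ceil(total / threads), MAX_RECORDS_PER_FILE)
    out_dir.mkdir(parents=True, exist_ok=True)

    outputs: list[Path] = []
    out = None
    writer = None
    with source.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        for index, row in enumerate(reader):
            if index % chunk == 0:
                # Start a new output file (hypothetical naming pattern).
                if out is not None:
                    out.close()
                name = f"{source.stem}_{len(outputs) + 1:03d}{source.suffix}"
                path = out_dir / name
                out = path.open("w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                outputs.append(path)
            writer.writerow(row)
    if out is not None:
        out.close()
    return outputs
```

For example, calling split_csv(Path("contacts.csv"), Path("split"), threads=4) on a file with 10 records writes four files that contain 3, 3, 3, and 1 records. The script reads the source twice (once to count, once to write) so that it does not need to hold millions of rows in memory.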

To import data by using the data import File Listener, complete the following steps:

  1. Preparing the data
  2. Configuring the data import File Listener
  3. Running the data import File Listener

    Preparing the data

    The data import File Listener uses the same underlying APIs as the data import wizard to process files located in predetermined folders on the server. Importing data by using the data import File Listener requires templates. Make any template changes in the data import wizard before you use the File Listener. For more information, see Preparing data and Pega Sales Automation sample data templates.
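Because the import depends on a template, it can help to check a file's header row before placing the file in the listener folder. The following Python sketch is illustrative only. It assumes that the template is matched against the column names in the first row of the .csv file. The function name and the column names used in the example are hypothetical and do not come from the shipped templates.

```python
import csv
from pathlib import Path


def check_header(csv_path: Path, expected: list[str]) -> tuple[list[str], list[str]]:
    """Return (missing, unexpected) column names for a CSV file.

    expected is the list of column names that your template maps;
    you must supply it yourself, for example from the sample template.
    """
    # utf-8-sig drops a byte order mark if a spreadsheet tool added one.
    with csv_path.open(newline="", encoding="utf-8-sig") as f:
        header = next(csv.reader(f), [])
    actual = [name.strip() for name in header]
    missing = [name for name in expected if name not in actual]
    unexpected = [name for name in actual if name not in expected]
    return missing, unexpected
```

For example, if a file has the header FirstName, LastName, Phone and the template expects FirstName, LastName, and Email, the function returns Email as missing and Phone as unexpected.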

    Configuring the data import File Listener

    This task applies to both on-premises and cloud environments.

    1. For on-premises configuration, perform the following steps:
      1. In the navigation panel of App Studio, click Settings > Application Settings.
      2. In the File Listener Configurations section, enter the base folder in the File Listener source location and the email address to which you want to send notifications.

        The following figure shows an example of the File Listener Configurations section.

        File Listener example configuration
      3. Optional: To improve performance and to disable creating audit history, in the File Listener Configurations section, select the Initial Data Migration check box.
      4. Optional: If you disabled creating audit history in the previous step, after the import is complete, clear the Initial Data Migration check box to generate audit records.

      5. Click Save.
      6. Optional: To modify the default template and purpose configuration, in the navigation panel of Dev Studio, click App, and then search for and open the ResourceSettings data transform.

        The following figure shows an example of the Resource settings transform.

        Resource settings example configuration

        By default, the data import File Listener is configured with SA_<name of objects> as a template and Add or update as a purpose.

    2. For Pega Cloud configuration, perform the following steps:
      Using an SFTP server implementation is recommended.
      1. In the header of Dev Studio, search for and select the storage/class/defaultstore:/type dynamic system setting (DSS).
      2. In the Value field, enter filesystem.
      3. Click Save.
      4. In the header of Dev Studio, search for and select the FileListenerSourceLocation dynamic system setting. 
      5. In the Value field, enter the base folder in the File Listener source location.
      6. Click Save.

    Running the data import File Listener

    1. In the navigation panel of Dev Studio, click Records > Integration-Resources > File Listener, and then open the listener that you want to run.
    2. In the Listener nodes section, clear the Block startup check box.
    3. In the Source properties section, enter the source file format.

      Only .csv and .txt formats are supported.

    4. In the Listener properties section, set the number of threads per node to the number of CPUs on that node.

      Matching the number of threads to the number of CPUs improves the performance of the initial load.

    5. Click Save.
    6. In the header of Dev Studio, click the Switch Studio menu, and then click Admin Studio.
    7. In the navigation panel of Admin Studio, click Resources > Listeners, and then open a listener that you configured.
    8. In the Requestor login section, enter your user name and password.

    What to do next

    After an entire file is processed, an output file is created in the source file location that you specified in App Studio. The output file lists the failed records and information about each error. A summary of the data import results is emailed to the notification email addresses that you entered when you configured the File Listener.
