
Creating a Kafka data set


You can create a Kafka data set in Pega Platform, and then associate it with a topic in a Kafka cluster.

Configure Kafka data sets to read and write data from and to Kafka topics, and use this data as a source of events, such as customer calls or messages. Your application can use these events as input for rules that process data in real time and then trigger actions.

For example, when a customer who has a checking account with UPlus Bank accesses the bank's ATM, this event can initiate an associated action, such as displaying an offer for a new credit card on the ATM's screen. For more information, see Triggering a real-time event with the Event Stream service.

You can connect to an Apache Kafka cluster version 0.10.0.1 or later.

Ensure that a Kafka configuration instance for connecting to your Kafka server or cluster of servers is available in your system. For more information, see Creating a Kafka configuration instance.

  1. In Dev Studio, click Create > Data Model > Data Set.

  2. Provide the data set label and identifier.

  3. From the Type list, select Kafka.

  4. In the Context section, provide the Apply to class, and then select the ruleset in the Add to ruleset field.

  5. Click Create and open.

  6. In the Connection section, in the Kafka configuration instance field, perform one of the following actions:

    • Select a Kafka configuration instance in the Data-Admin-Kafka class.
    • Create a Kafka configuration instance (for example, when no instances are present) by clicking the Open icon.

      For more information, see Creating a Kafka configuration instance.

  7. Check whether Pega Platform is connected to the Kafka cluster by clicking Test connectivity.

  8. In the Topic section, perform one of the following actions:

    • Select Create new, and then enter the topic name to define a new topic in the Kafka cluster.
    • Select Select from list, and then connect to an existing topic in the Kafka cluster.
    • Select Use application settings with topic values to use different topics in different environments (for example, development, staging, and production) without modifying and saving the data set rule in each environment. To use this setting, first configure the application settings rule. For more information, see Configuring application settings for Kafka data set topics.
    By default, the name of the topic is the same as the name of the data set. If you enter a new topic name, that topic is created in the Kafka cluster only if the ability to automatically create topics is enabled on that Kafka cluster.
  9. Optional:

    To define the data set partitioning, in the Partition Key(s) section, perform the following actions:

    1. Click Add key.

    2. In the Key field, press the Down Arrow key to select a property to be used by the Kafka data set as a partitioning key.

      By default, the available properties to be used as keys correspond to the properties of the Applies To class of the Kafka data set.
    By configuring partitioning, you ensure that related records are sent to the same partition. If no partition keys are set, the Kafka data set assigns records to partitions randomly.
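Partition keys work because Kafka derives a record's partition from a hash of its key, so records that share a key value always land on the same partition. The following plain-Java sketch illustrates only that idea; the class and method names are illustrative, and real Kafka uses a murmur2 hash rather than the stdlib hash shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch of key-based partition assignment: hash the key
// bytes, then take the result modulo the partition count. Not Kafka's
// actual partitioner (which uses murmur2), just the underlying idea.
public class KeyPartitioner {

    // Returns the partition index for a given key.
    public static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Mask off the sign bit so the modulo result is non-negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Records keyed by the same CustomerID map to the same partition.
        System.out.println(partitionFor("customer-42", 6));
        System.out.println(partitionFor("customer-42", 6)); // same as above
        System.out.println(partitionFor("customer-7", 6));
    }
}
```

Because the assignment is deterministic, all events for one customer arrive on one partition and are therefore consumed in order.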
  10. Optional:

    To configure the Message values, perform one of the following actions, depending on the data format that you select:

    1. If you choose JSON, you can select between two Field mappings:

      Automatically map fields
      Auto-map fields from Kafka to fields with identical names in Pega.
      Use data transform
      Uses the JSON data transform rule form, which lets you map only the properties that you need. If a long JSON message has many attributes, you can skip some of them. You can also map property names that contain special characters (for example, the $ sign) to the corresponding Pega properties. For more information, see Data transform actions for JSON.
    2. If you choose Avro, you must preconfigure an Avro schema. For more information, see Configuring Avro schema for Kafka data set.

    3. If you choose a Custom configuration, you must configure the record settings:

      Serialization implementation
      In this field, enter a fully qualified Java class name for your value serialization.
      For example: com.pega.dsm.kafka.CsvPegaSerde.
      Additional configuration
      In this field, define additional configuration options for the implementation class. Click Add key value pair, and then enter properties in the Key and Value fields.
      For information about writing and configuring custom Kafka serialization, see Pega GitHub repository.
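At its core, a custom serialization class converts a record's field values to the bytes that are written to the Kafka topic, and back again when reading. The sketch below shows that round trip for a CSV layout in plain Java; the real implementation (such as the CsvPegaSerde class mentioned above) must follow the interface published in the Pega GitHub repository, so treat these class and method names as illustrative only.

```java
import java.nio.charset.StandardCharsets;

// Plain-Java sketch of what a custom CSV serializer/deserializer does:
// turn field values into bytes for Kafka, and bytes back into fields.
// Not the actual Pega serde interface; names here are illustrative.
public class CsvSerdeSketch {

    // Serialize: join field values into one CSV line, then encode as bytes.
    public static byte[] serialize(String[] fields) {
        return String.join(",", fields).getBytes(StandardCharsets.UTF_8);
    }

    // Deserialize: decode the bytes and split the CSV line back into fields.
    public static String[] deserialize(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).split(",", -1);
    }
}
```

A production implementation would also handle quoting, escaping of embedded commas, and schema drift, which is why the additional configuration key-value pairs exist.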
  11. Optional:

    To enable custom value processing, in the Add the Java class with reader implementation and Add the Java class with writer implementation fields, provide the Java classes that implement the custom serialization and deserialization logic. For a sample implementation of custom processing, see Kafka message custom processing.

  12. Optional:

    To configure the Message keys, perform one of the following actions, depending on the data format that you select:

    1. If you choose JSON, you can select between two Field mappings:

      Automatically map fields
      Auto-map fields from Kafka to fields with identical names in Pega.
      Use data transform
      Uses the JSON data transform rule form, which lets you map only the properties that you need. If a long JSON message has many attributes, you can skip some of them. You can also map property names that contain special characters (for example, the $ sign) to the corresponding Pega properties. For more information, see Data transform actions for JSON.
    2. If you choose Avro, you must preconfigure an Avro schema. For more information, see Configuring Avro schema for Kafka data set.

    3. If you choose a Custom configuration, you must configure the record settings:

      Serialization implementation
      In the Serialization implementation field, enter a fully qualified Java class name for your keys serialization.
      For example: com.pega.dsm.kafka.CsvPegaSerde.
      Additional configuration
      In the Additional configuration section, define additional configuration options for the implementation class. Click Add key value pair, and then enter properties in the Key and Value fields.
      For information about writing and configuring custom Kafka serialization, see Pega GitHub repository.
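The two JSON field-mapping styles in the steps above differ in how attribute names are matched: automatic mapping pairs JSON attributes with Pega properties of identical names, while a data transform lets you rename attributes and skip the ones you do not need. The sketch below shows only the renaming idea in plain Java with a hypothetical rename table; it is not Pega's data transform engine.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the "Use data transform" mapping style:
// an explicit rename table maps incoming JSON attribute names (which may
// contain special characters) to target property names, and any attribute
// absent from the table is skipped. Names here are hypothetical.
public class FieldMappingSketch {

    static final Map<String, String> RENAMES = new HashMap<>();
    static {
        RENAMES.put("$amount", "Amount");     // special character in source name
        RENAMES.put("cust_id", "CustomerID"); // different naming convention
    }

    public static Map<String, Object> applyTransform(Map<String, Object> message) {
        Map<String, Object> mapped = new HashMap<>();
        for (Map.Entry<String, Object> e : message.entrySet()) {
            String target = RENAMES.get(e.getKey());
            if (target != null) {
                mapped.put(target, e.getValue());
            }
        }
        return mapped;
    }
}
```

Automatic mapping is the simpler choice when the Kafka attribute names already match your property names exactly.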
  13. Optional:

    In the Message header section, specify key-value pairs to include in the Kafka message headers.

  14. Click Save.

  • Creating a Kafka configuration instance

    To manage connections to your Apache Kafka server or cluster of servers that is the source of your application stream data, configure a Kafka configuration instance in the Pega Platform Data-Admin-Kafka class.

  • Configuring application settings for Kafka data set topics

    Use the Use application settings with topic values option when you create a Kafka data set to use different topics in different environments (for example, development, staging, and production), without modifying and saving the data set rule in each environment.

  • Configuring Avro schema for Kafka data set

    When you configure a Kafka data set, you can choose Apache Avro as your data format for the Kafka message values and message keys. Avro is a lightweight binary message encoding that is at least two times smaller than regular JSON.
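Avro's size advantage comes from encoding values against a schema, so field names are never repeated inside each message, and integers use variable-length zigzag encoding. The stdlib-only sketch below shows the zigzag varint step defined in the Avro specification; it is not a full Avro encoder.

```java
import java.io.ByteArrayOutputStream;

// Sketch of Avro's zigzag varint integer encoding, one reason Avro
// messages are smaller than JSON: the zigzag step maps small positive
// and negative values to small codes, and the varint step emits only
// as many 7-bit groups as the value needs. Not a full Avro encoder.
public class ZigZagSketch {

    public static byte[] encodeLong(long n) {
        long z = (n << 1) ^ (n >> 63); // zigzag: -1 -> 1, 1 -> 2, -2 -> 3, ...
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits per byte, setting the high bit while more bytes follow.
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
        return out.toByteArray();
    }
}
```

For example, the value 150 occupies 2 bytes here, versus 3 digit characters plus a quoted, repeated field name in a JSON message.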
