Data Set rule form - Completing Data Sets
The way a data set is configured to represent data depends on the data set type. You can create the following data sets:
Note: You can create this data set when you have at least one decision data node in the cluster.
This data set stages data for fast decisioning. You can use it when you want to access data very quickly by using a particular key.
When you create an instance of this data set, you need to define the keys.
The HBase data set is designed to read data from and save data to an external Apache HBase storage. This data set can be used as a source and as a destination in Data Flow rule instances.
For configuration details, see Configuring HBase data set.
The HDFS data set is designed to read data from and save data to an external Apache Hadoop File System (HDFS). This data set can be used as a source and as a destination in Data Flow rule instances. It supports partitioning, so you can create distributed runs with data flows. Because this data set does not support the Browse by key option, you cannot use it as a joined data set.
For configuration details, see Configuring HDFS data set.
This type of data set allows you to process a continuous stream of events (records).
Stream tab
The Stream tab contains details about the exposed services (REST and WebSocket). These services expose the stream data set as a resource located at http://<HOST>:7003/stream/<DATA_SET_NAME>, for example: http://10.30.27.102:7003/stream/MyEventStream
Settings tab
The Settings tab allows you to set additional options for your stream data set. After saving the rule instance, you cannot change the settings.
Authentication
The REST and WebSocket endpoints are secured by using the Pega 7 Platform common authentication scheme. Each post to the stream requires authentication with your user name and password. By default, the Enable basic authentication check box is selected.
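As an illustration of posting to the REST endpoint with basic authentication, the following Python sketch builds an authenticated request for the resource URL shown on the Stream tab. The helper name build_stream_post, the JSON request body, and the CustomerID field are assumptions made for this example and are not taken from the product documentation. Check the details on the Stream tab for the format that your data set expects.

```python
import base64
import json
import urllib.request


def build_stream_post(host, data_set_name, record, user, password, port=7003):
    """Build (but do not send) an authenticated POST for one stream record.

    Hypothetical helper: the JSON body format is an assumption.
    """
    # Resource URL pattern: http://<HOST>:7003/stream/<DATA_SET_NAME>
    url = f"http://{host}:{port}/stream/{data_set_name}"

    # HTTP Basic authentication sends base64("user:password") in a header.
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    return urllib.request.Request(
        url,
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example values; the record field is hypothetical.
request = build_stream_post(
    "10.30.27.102", "MyEventStream", {"CustomerID": "C-1"}, "user", "secret"
)
# To send the record, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it lets you inspect the URL and headers before anything reaches the server.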
In the Retention period field, you specify how long the data set keeps the records. The default value is 1 day.
In the Log file size field, you specify the size of the log files, between 10 MB and 50 MB. The default value is 10 MB.
No configuration required. The data set instance is automatically configured with the Visual Business Director server location as defined by the Visual Business Director connection.