Configuring YARN settings
Configure the YARN Resource Manager settings to enable running external data flows (EDFs) on a Hadoop record. When an external data flow is started from Pega Platform, it triggers a YARN application directly on the Hadoop record for data processing.
Open a Hadoop record from the navigation panel.
On the Connection tab, select the Use YARN configuration check box in the YARN section.
In the User name field, provide the user name to be authenticated in the YARN Resource Manager.
In the Port field, specify the YARN Resource Manager connection port. The default port is 8032.
In the Work folder field, enter the location of the temporary work folder in the Hadoop environment where the execution data is stored.
Configure the Response timeout field.
- Select the Advanced configuration check box.
- In the Response timeout field, set the time (in milliseconds) to wait for a server response. The default value is 3000.
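Before you enter the port and response timeout values, it can be useful to confirm from the Pega Platform server that the YARN Resource Manager port is reachable within that time. The following Python sketch is only an illustration of how the two values relate. It is not how Pega Platform tests the connection, and the function name can_reach is hypothetical.

```python
import socket

DEFAULT_RM_PORT = 8032      # default YARN Resource Manager port
DEFAULT_TIMEOUT_MS = 3000   # default Response timeout value, in milliseconds


def can_reach(host, port=DEFAULT_RM_PORT, timeout_ms=DEFAULT_TIMEOUT_MS):
    """Return True if a TCP connection to host:port opens within timeout_ms."""
    try:
        # Socket timeouts are expressed in seconds, so convert from milliseconds.
        with socket.create_connection((host, port), timeout=timeout_ms / 1000):
            return True
    except OSError:
        # Covers refused connections, unresolved host names, and timeouts.
        return False
```

For example, can_reach("rm-host.example.com") checks port 8032 with a 3000 ms timeout, where rm-host.example.com is a placeholder host name. A successful TCP connection shows only that the port is open; it does not verify authentication or that the service is a YARN Resource Manager.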
Enable secure connections. To authenticate with Kerberos, you must first configure your environment. For more details, see the Kerberos documentation about the Network Authentication Protocol.
- In the Authentication section for YARN configuration, select the Use authentication check box.
- In the Master kerberos principal field, enter the Kerberos principal name of the YARN Resource Manager, typically in the following format: rm/<hostname>@<REALM>.
- In the Client kerberos principal field, enter the Kerberos principal name of a user as defined in Kerberos, typically in the following format: <username>/<hostname>@<REALM>.
- In the Keystore field, enter the name of a keystore that contains a keytab file with the keys for the user who is defined in the Client kerberos principal field. The keytab file must be in a readable location on the Pega Platform server, for example, /etc/hdfs/conf/thisUser.keytab.
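The principal formats in the preceding steps can be checked before you enter them in the form. The following Python sketch splits a principal into its primary, instance, and realm parts. The regular expression is a simplification for illustration (it does not handle escaped characters that Kerberos allows), and the names Principal and parse_principal are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # for example "rm" or a user name
    instance: Optional[str]  # usually a host name; None when omitted
    realm: str


# primary[/instance]@REALM, with no whitespace, "/" or "@" inside a component.
_PRINCIPAL_RE = re.compile(
    r"^(?P<primary>[^/@\s]+)(?:/(?P<instance>[^/@\s]+))?@(?P<realm>[^/@\s]+)$"
)


def parse_principal(text):
    """Split a Kerberos principal string into its three components."""
    match = _PRINCIPAL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in primary/instance@REALM format: {text!r}")
    return Principal(match["primary"], match["instance"], match["realm"])
```

For example, parse_principal("rm/rm-host.example.com@EXAMPLE.COM") yields the primary rm, the instance rm-host.example.com, and the realm EXAMPLE.COM. The host and realm here are placeholders. A value without a realm, such as rm/rm-host.example.com, raises ValueError.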
Click Test connectivity to verify your settings.
View the status of the applications that are managed by the YARN Resource Manager.
- Click View applications. The YARN Applications modal dialog is displayed.
- Optional: Use the Application state drop-down menu to filter applications according to their progress status:
  - All – Displays all finished and running applications.
  - Finished – Displays applications with the FINISHED, KILLED, or FAILED status.
  - Running – Displays applications with the SUBMITTED, ACCEPTED, NEW, NEW_SAVING, or RUNNING status.
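The grouping of YARN application states behind the Application state filter can be summarized in a short Python sketch. It restates the lists above and does not describe how Pega Platform implements the filter; the function name matches_filter is hypothetical.

```python
FINISHED_STATES = frozenset({"FINISHED", "KILLED", "FAILED"})
RUNNING_STATES = frozenset({"SUBMITTED", "ACCEPTED", "NEW", "NEW_SAVING", "RUNNING"})


def matches_filter(state, selected="All"):
    """Return True if an application state belongs to the selected filter option."""
    state = state.upper()
    if selected == "All":
        return state in FINISHED_STATES or state in RUNNING_STATES
    if selected == "Finished":
        return state in FINISHED_STATES
    if selected == "Running":
        return state in RUNNING_STATES
    raise ValueError(f"unknown filter option: {selected!r}")


# Hypothetical application list, for illustration only.
apps = [("app_1", "RUNNING"), ("app_2", "KILLED"), ("app_3", "ACCEPTED")]
running = [name for name, state in apps if matches_filter(state, "Running")]
```

With the sample list above, running contains app_1 and app_3, because ACCEPTED is treated as a running state and KILLED as a finished one.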
- JCA Resource Adapter form – Completing the Connection tab
Complete the Connection tab to identify the resource adapter's Connection Factory and to provide information about how the resource adapter connects to the back-end enterprise information system (EIS).
- About Hadoop host configuration (Data-Admin-Hadoop)
You can use this configuration to define all of the connection details for a Hadoop host in one place, including connection details for datasets and connectors.