Back ForwardExtracting data with BIX

Concepts and terms

With the Business Intelligence Exchange (BIX), you can specify the properties in each class of your application that you want to extract. You can extract all properties, including Value Lists, Value Groups, and multiple levels of embedded Pages, Page Lists, and Page Groups.

You can filter the data to be extracted, extracting only the data that matches the conditions you set.

You can extract the data contained in these properties to an XML file or a Comma Separated Values (CSV) file, or for import into a database.

Running a BIX extract in Designer Studio

To run an extract from within a PRPC application:

  1. From the Records Explorer select SysAdmin > Extract and select the extract rule you want to use.
  2. On the rule form, click the runrun button button. The Run Rule form displays.
  3. For CSV and XML output
    1. Provide the Number of Records to Update at a Time. To run the extract in non-batch mode, a single record at a time, set the number of records to update at a time to 1.
    2. Provide a Unique Identifier for the extract.
    3. Click Perform Extract to begin extracting data from your application. A progress bar informs you when the extract is complete.
  4. For an extract to a database
    1. Provide the Number of Records to Update at a Time. To run the extract in non-batch mode, a single record at a time, set the number of records to update at a time to 1.
    2. Provide a Unique Identifier for the extract.
    3. The first time you run the extract, check the Generate Schema DDL check box and provide a location and a name for the DDL to be saved as.
    4. Click Perform Extract to generate the DDL and save it in the specified location. Run the generated DDL manually in the required database to create the table.
    5. The next time you run the Extract, uncheck Generate Schema DDL check box. Click Perform Extract to begin extracting data from your application into the created table. A progress bar informs you when the extract is complete.

After the extract is complete, you can access the output data at the location (or locations, for XML output) you specified in the File Specifications tab of the rule.

Running a BIX extract from the command line

In addition to running a BIX extract from within the Designer Studio in PRPC, you can directly call the ExtractImpl Java class in the PRPC application libraries to define and run an extract process. Specify the data to be extracted and the output format by referencing an Extract rule you have created in PRPC. ExtractImpl can either access the Extract rule in the PRPC database or read an XML file containing the XML representation of the rule.

You can add other optional parameters that filter the data by any property, including date ranges, and specify other processing options: see the next section for details on the optional parameters that are available.

Running an extract rule from the command line works in the following way:

  1. BIX accesses the PRPC (source) database using the information defined in the BIX prconfig.xml file (see below).
  2. Using the information configured in the bix.xml file (see below) and specified in the extract rule, BIX extracts the specified data from the PRPC (source) database.
  3. The extracted data is output in the output format specified in the extract rule.
    Note: The output data includes a timestamp and a batch identifier. These help when you want to compares information from different extracts, or isolate information from a specific extract.

Configuring the BIX PRPC configuration files

To complete the command-line environment, provide information to the PRPC system in three configuration files:

In addition, you can create a pegarules.keyring file to support encryption of the database password that must be supplied with the database connection settings in prconfig.xml. Each of these files is described below.

A. Configure the prconfig.xml file

Edit the prconfig.xml file provided in the configuration directory of your BIX distribution to add database connection settings specifying the source PRPC database from which you want to extract data. If you want to use the Database Schema output format, add a second set of database connection settings specifying the target database to which the extracted data is written.

Create the source database

The prconfig.xml file contains sample database connection settings for an Oracle database:
<env name=”database/drivers” value=”com.microsoft.jdbc.sqlserver.SQLServerDriver;oracle.jdbc.OracleDriver” />
<env name=”database/databases/PegaRULES/url” value=”jdbc:icrosoft:sqlserver://localhost:1433;SelectMethod=cursor;
SendStringParametersAsUnicode=false;ProgramName=PR ${SystemName}@${NodeName} ${DBName} #${ConnectionID}” />
<env name=”database/databases/PegaRULES/icrosof” value=”a_username” />
<env name=”database/databases/PegaRULES/password" value="a_password" />

Modify the code as needed for your PRPC database, as described below.

Important: The following steps outline the procedure for specifying typical database connection settings. More detailed information is available on the PDN in the article “How to configure non-J2EE database connections in the prconfig.xml file”.
   Consult your DBA to resolve requirements specific to your installation.
  1. In the database/drivers element, change the value to the appropriate driver for your database, for example:

    Database Driver

    Value

    Oracle 9i/10g

    oracle.jdbc.OracleDriver

    IBM DB/2 Type 4

    com.ibm.db2.jcc.DB2Driver

    SQL Server 2005

    com.microsoft.sqlserver.jdbc.SQLServerDriver

    SQL Server 2000

    com.microsoft.jdbc.sqlserver.SQLServerDriver

  2. In the database/databases/PegaRULES/url element, set the value to the connection URL for your database. Here are examples of the appropriate format for some supported databases:

    Database Driver

    Value

    Oracle 9i/10g (Native OCI Client)

    jdbc:oracle:oci:@your_tns_name

    Oracle 9i/10g (Thin Client)

    jdbc:oracle:thin:@myServer:1521:myDatabase

    DB/2 (Universal Driver)

    jdbc:db2://serverName:port/dbName

    SQL Server 2005

    jdbc:sqlserver://<your_sql_server_host>:1433;
    SendStringParametersAsUnicode=false; SelectMethod=cursor;
    ProgramName=${NodeName}:${SystemName}.${ConnectionID}

    SQL Server 2000 (Type 4):

    jdbc:microsoft:sqlserver://<your_sql_server_host>:1433;
    SendStringParametersAsUnicode=false; SelectMethod=cursor;
    ProgramName=${NodeName}:${SystemName}.${ConnectionID}

  3. Set the value of database/databases/PegaRULES/userName and database/databases/PegaRULES/password to the user name and password of a user for the PRPC database.

Optionally, define the Target database connection settings

The BIX "Database Schema" output format writes the extracted data directly to a target database. To support this output, you must provide a second set of database connection settings specifying the target database to which the data will be loaded.

In the BIX prconfig.xml file, copy and paste the database connection settings you created for the source database.
In each element name, change PegaRULES to the name of the database you specified as the Output Database Name in the Extract rule, as in the following examples:


Change:

To:

database/databases/PegaRULES/url

database/databases/TargetDB/url

database/databases/PegaRULES/userName

database/databases/TargetDB/userName

database/databases/PegaRULES/password

database/databases/TargetDB/password

Repeat the procedure used for the source database connection to set the target database connection settings to the appropriate values.

Important: In addition to specifying the database connection settings, Oracle and UDB databases require some additional configuration. More detailed information is available on the PDN in the article 25008 How to configure non-J2EE database connections in the prconfig.xml file.
   Consult your DBA to resolve requirements specific to your installation.

Split schema setup

For split schema, add the following environment variables:

<env name="database/databases/PegaRULES/defaultSchema" value="rulesSchema_Name"/>

<env name="database/databases/PegaDATA/defaultSchema" value="dataSchema_Name" />

Note: See the related section under configuring prbootstrap.properties, below.

Other optional settings

You can add the following settings to the end of the prconfig.xml file to address certain issues that may occur:

Optimize XML output

This property is set to true by default to facilitate extracting a large number of class instances to XML:

<env name= “compatibility/BIXUseOptimizedClipboardXML” value= “true” />

Disable forward chaining

Use this property to enable or disable forward chaining calculations of properties when loading the clipboard for BIX extract:

<env name= “compatibility/BIXUseOptimizedClipboardXML” value= “true”>

The default is TRUE.

Disable backward chaining

Use this property to enable or disable backward chaining of properties when loading the clipboard for BIX extracts:

<env name=”compatibility/BIXdisableForwardChaining” value = “true” />

The default is TRUE.

Record inclusion
To ensure that records added to the database while the extract is running get included in the extract, add this property:
<env name=”bix/useHistoryClasses” value=”true” />

 

B. Configure the prlogging.xml file

prlogging.xml is a log4j configuration file that enables two log files for the extraction process, PegaBIX and PegaBIX-ALERT. By default, the log files are written to the directory in which the extraction process is started.
You can modify the FileNamePattern for the PegaBIX and PegaBIX-ALERT logs to change the location to which they are output.

C. Configure prbootstrap.properties

From version 6.1 onwards, PRPC stores engine code in the database as a CodeSet. Edit the prbootstrap.properties file provided in the configuration directory of your BIX distribution to add database connection settings specifying the PRPC database which holds the engine code.
Specify the database connection settings using the naming convention:
          <unique-identifier>,<dbtype>.<property>
where

Example: To connect to an Oracle database:
com.pega.pegarules.bootstrap.engineclasses.dbcpsource=example.oracle
com.pega.pegarules.bootstrap.assembledclasses.dbcpsource=example.oracle
example.oracle.url=jdbc:oracle:thin:@localhost:1521:codebase
example.oracle.username=user
example.oracle.password=pass
oracle.jdbc.class=oracle.jdbc.OracleDriver

Specify any additional properties needed to connect to the database using the connectionProperties entry, and provide a semicolon-separated list of values. For example:
example.oracle.connectionProperties=oracle.jdbc.V8Compatible=true
Note: set the system property com.pega.pegarules.bootstrap.ignorejndi to TRUE when using Pega-managed connections. This tells PRPC to ignore the JNDI data source information in the file, and to use the Pega-managed connection settings instead.
Note: In version 6.2, you can replace the following two lines:
com.pega.pegarules.bootstrap.engineclasses.dbcpsource=example.oracle
com.pega.pegarules.bootstrap.assembledclasses.dbcpsource=example.oracle

with this single entry:
com.pega.pegarules.bootstrap.allclasses.dbcpsource=example.oracle

Split schema setup

For a split schema setup, add the following properties:

com.pega.pegarules.bootstrap.allclasses.schema=" rulesSchema_Name "

com.pega.pegarules.bootstrap.datatables.schema=" dataSchema_Name "

D. Enable password encryption

BIX lets you pass in a PRPC username and password to rule-resolve the Rule-Admin-Extract rule that needs to run. If you want to enable security for the database username and password, you can implement JCE "keyring" encryption by creating a pegarules.keyring file.

You can use keyring encryption to encrypt the username and password. To do this, run the KeyringImpl java class, which accepts three parameters and generates the keyring file. The parameters are:

  1. The path where the pegarules.keyring file is to be generated (including the filename)
  2. The path to the prconfig.xml file
  3. The path to the BIX distribution directory

To encrypt BIX passwords in the same file, pass an additional argument named bix. Running the KeyringImpl Java file with this additional parameter prompts the user for the BIX username and password. The file encrypts the information you enter:

For details about how to set up keyring password encryption, see the PDN article KB-25881 How to encrypt database passwords using a JCE Keyring file .

Preparing a target database

Before extracting data to the target database, you must create a schema to match the extracted data. You can use BIX to generate a SQL file that will create the appropriate schema.

Call the ExtractImpl class, as described below, with the parameter -X <Path><FileName> to generate a DDL for the target database. Run the SQL script against the target database using your database tools.

Note: To use the -X parameter, the output format of the Rule-Admin-Extract rule you specify must be Database schema. The user must have Write access to the directory. If the output format is XML or CSV, an error message appears:

Cannot generate database schema when output type is set to CSV/XML.

Running the command-line extract process

In a properly configured environment, from the command line or in a script, make a Java call to com.pega.pegarules.data.internal.access.ExtractImpl supplying arguments to specify

Note to use this option, call the main program com.pega.pegarules.pub.PegaRULES, and provide the username and password to ExtractImpl as the first argument. The resulting statement might look like this:
java -Dpegarules.config="./config/prconfig.xml" -Dpegarules.logging.configuration="./config/prlogging.xml" com.pega.pegarules.pub.PegaRULES com.pega.pegarules.data.internal.access.ExtractImpl -a $USERNAME$ -p $PASSWD$ -i PegaSample!SampleBIX_CSV

See How to set optional command-line BIX parameters for details on all available parameters.
From PRPC 6.1+, the engine Java classes are stored in the database and not in the file system. The ExtractImpl class cannot be directly run using the Java interpreter. Instead, run the PegaRULES class and pass the ExtractImpl class as an argument. The PegaRULES class is part of the prbootstrap.jar library.
Example: to run the extract “PegaSample!SampleBIX” on a DB2 system:

java –Xms512m –Xmx768m –classpath
".;lib\prbootstrap.jar;lib\prdbcp.jar;lib\db2jcc_v95.jar;lib\jsr94-1.0.jar;%CLASSPATH%" -Dcom.pega.pegarules.bootstrap.properties.url=config\prbootstrap.properties -Dpegarules.config=config\prconfig.xml -Dpegarules.logging.configuration=config\prlogging.xml -Dcom.pega.pegarules.bootstrap.ignorejndi=true com.pega.pegarules.pub.PegaRULES com.pega.pegarules.data.internal.access.ExtractImpl -i PegaSample!SampleBIX


Note:
provide the appropriate path and JDBC driver name.

A standalone command line BIX process

In certain circumstances, you may want to establish BIX as a stand-alone command-line process. For information about using BIX in this way, see How to use a stand-alone command-line BIX process.

BIX logging and error handling

See How to work with BIX logging and error handling for information on tracing extract errors and identifying data from partial extracts.

DefinitionsBusiness Intelligence Exchange (BIX)
Related topicsAbout Extract rules
How to set optional BIX command-line parameters