This content has been archived and is no longer being updated. Links may not function; however, this content may be relevant to outdated versions of the product.

How to parse Comma-Separated-Value (CSV) files


Text files containing a series of values separated by commas are known as CSV files or Comma-Separated Values files.

You can create CSV files from a Microsoft Excel spreadsheet..

If none of the values contains a comma, you can parse CSV lines with a Parse Delimited rule, identifying the comma character as the delimiter. (For an example of this approach, see the PDN Exchange article Bulk Operator Load.)

However, if the CSV file contains a value that itself contains a comma, the value when saved from Excel will be surrounded by double quote characters. Double quote characters are are optional and often omitted for the values that do not contain a comma character. For such CSV files, you can't use Parse Delimited rules to parse the file, because neither quote characters nor comma characters are true delimiters that surround each field.

(Other files use a backslash \ or another character between every pair of values, including values containing commas. You can parse such files using a Parse Delimited rule.)

This article shows how to use two standard function rules parseCSVHeader() and parseCSVDetail() from the MapTo library to convert lines of a CSV file into property values.

Update: Parse Delimited rules in V5.5 can parse CSV files directly. See How to parse a Comma-Separated-Values (CSV) file using a file service.



Suggested Approach

In this example, each line of the CSV file (after the first line) contains 4 values that are to become the pyLabel, pyDescription, pyPrioritySample and pyUrgencyWorkAdjust values of a work object, separated by commas.

Test Data

The first line (column headers in Excel) of the CSV file contains the property names.

Because the last pyDescription value contains a comma, this value is between double quote characters. For all other values, the quotes are optional and omitted. Because spaces, commas, and quotes are significant, take care to follow the CSV "format" or use Excel to create the CSV file.

The example uses a file listener to monitor a specified directory and process any CSV file it finds, creating a work object from each line of the file (except for the first line).

Step 1: Identify the work type and initial flow rule

A new work object ordinarily starts a flow execution. By default, services do not have a "current operator", and such services, when executing, lack some of the properties associated with a interactive requestor. Make sure the initial tasks in the flow do not depend on such properties as "the current operator" or the operator's work group or organization.

In this example, the TaskSample flow routes all new work objects to a fixed workbasket ( identified explicitly in the flow.


Step 2. Create a service package

Every service requires a service package. In this example, all values in tabs other than the Context tab contain default values.

Service Package

Step 2. Create the service activity

The service activity performs the processing for the service. In this case, the Loop service activity calls the built-in standard activity Work-.svcAddWorkObject, passing as a parameter the name of a flow (TaskSample) to start. (When the Loop activity starts, the four properties extracted from the line of the CSV file are already on its primary page.) The standard activity acquires a work object ID, saves and commits the work object, and starts the flow execution.

Loop activity

Step 3. Create the File Service rule

The file service rule processes each CSV file found by the listener. (For a more general description of file services, see How to set up a file service and file listener.) The Service tab identifies a page that is constructed each time the file service rule is started.

Service File rule 1

The Method tab determines how the contents of the file are parsed. For Windows text files, the line terminator is \r\n. Since this text file is a CSV file, each line is a "record" for parsing purposes.

Service File rule

The Request tab calls two standard functions to parse the CSV file. At runtime, the parseCsvHeader() function — executed only once, for the first line of the file— extracts property names and saves them in an array named pyTagList.

The parseCSVDetail() function, executed for each row after the first row, creates a temporary primary page for the service activity containing values for each property, using property names from the the pyTagList array, and calls the service activity.

Service File 3

The flow execution routes the first assignment to a fixed workbasket.

Step 4: Create the File Listener.

This file listener monitors the directory E:\temp for any Windows file with a CSV file type. When a file is found, it calls the file service rule.

File LIstener 1

This file listener creates a RPT file and an internal record (instance of Log-Service-File) for each file processed. These are optional, but helpful in debugging and reporting.

File Listener 2

Step 5. Test

1. Use the System Management Application (Listener Management panel) to start the file listener. Alternatively, you can stop and restart the Process Commander server node to start the file listener.


2. To monitor service processing in full detail, open the service file rule and select Run > Trace Open Rule. The Tracer starts. Adjust Tracer settings to determine the desired amount of detail. For example:

Tracer output

3. Create a test file and copy it to the directory that the listener monitors. Wait.

4. After the listener finds and processes the CSV file, examine the corresponding RPT file.

5. Locate and examine the work objects created by the file service rule. In this example, the open assignments are in the workbasket:


Work object W-199 was created third but is listed first, because the pyUrgencyWorkAdjust value in the that row of input is +5, leading to an overall urgency value of 15. Assignments in this list view report are sorted by decreasing urgency.

In this example, the created work objects contain only four property values obtained from the input, and all end up in a single workbasket. In more sophisticated cases, the CSV file could provide many properties and the service activity could create more varied work objects with varying initial flows.

Have a question? Get answers now.

Visit the Collaboration Center to ask questions, engage in discussions, share ideas, and help others.