Support Article

Hive defined fields not displaying via HDFS data set

SA-37893

Summary



The Hadoop Distributed File System (HDFS) data set preview shows corrupted content for the decimal columns of a Hive table.

The Hive table is stored as a Parquet file in HDFS.

Error Messages



Not Applicable


Steps to Reproduce



1. From Hue > Hive Editor, create a table in which one of the columns is of the decimal type. Make sure that the table is stored in 'Parquet' format.
2. Insert some data into the table. Verify that the Hive Editor displays the data.
3. From Pega:
a. Create a Hadoop configuration and test the connectivity.
b. Create an HDFS data set and configure the file.
c. Run Preview File and check the file content.


Root Cause



Parquet has no primitive 'decimal' data type. The Hive Parquet code stores decimal data in its own encoding as a 'fixed_len_byte_array', so the HDFS data set preview shows the raw bytes instead of the decimal value.
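
The following is a minimal Python sketch of why such a value looks corrupted when it is read as plain content. It assumes the common Parquet convention in which a decimal is written as an unscaled integer in big-endian two's-complement form inside a fixed-width byte array. The helper names encode_decimal and decode_decimal, the scale of 2, and the width of 5 bytes are hypothetical and chosen only for illustration; this is not the Hive or Pega implementation.

```python
from decimal import Decimal


def encode_decimal(value: Decimal, scale: int, width: int) -> bytes:
    """Store a decimal as an unscaled big-endian two's-complement integer.

    Hypothetical helper: 'scale' and 'width' are illustrative parameters.
    """
    # Shift the decimal point right by 'scale' digits: 123.45 -> 12345
    unscaled = int(value.scaleb(scale).to_integral_value())
    return unscaled.to_bytes(width, byteorder="big", signed=True)


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Reverse the encoding: read the integer, then restore the scale."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


raw = encode_decimal(Decimal("123.45"), scale=2, width=5)
print(raw.hex())               # the stored bytes carry no readable "123.45"
print(decode_decimal(raw, 2))  # decoding with the right scale restores it
```

A reader that does not apply this decoding step, or that does not know the scale, shows only the byte array, which is consistent with the corrupted preview described above.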

Resolution



Use the 'double' data type instead of 'decimal' for the column when creating the Hive table, and in PRPC use the decimal property mode to hold the decimal data.
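
As an illustration of moving a value from a double into a decimal, the Python sketch below rounds a double to a fixed number of decimal places. The helper name double_to_decimal, the scale of 2, and the half-up rounding mode are assumptions made for this example; how PRPC converts the value internally is not described in this article.

```python
from decimal import Decimal, ROUND_HALF_UP


def double_to_decimal(value: float, scale: int) -> Decimal:
    """Round a double to 'scale' decimal places (hypothetical helper)."""
    # Build the rounding quantum, for example scale=2 gives 0.01
    quantum = Decimal(1).scaleb(-scale)
    # repr() gives the shortest text that round-trips the double
    return Decimal(repr(value)).quantize(quantum, rounding=ROUND_HALF_UP)


total = 0.1 + 0.2                   # binary floating point: not exactly 0.3
print(double_to_decimal(total, 2))  # rounded to two decimal places
```

Because a double is a binary floating-point value, it may be worth choosing the scale of the target decimal property deliberately when exact values matter.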

Published May 12, 2017 - Updated June 15, 2017
