The Parquet Destination Component is an SSIS Data Flow component for generating Apache Parquet files.
- The component metadata is either retrieved automatically from a sample Parquet file or specified manually in JSON format.
- The generated Parquet file can contain nested arrays of objects following the composite records pattern, where the fields for the arrays are fed via separate inputs.
In this section we will show you how to set up a Parquet Destination component.
- Double-click on the component on the canvas.
- Once the component editor opens, select the destination where the generated Parquet data will be stored. Then provide a sample Parquet file, or write the schema directly in the Schema text editor. You can also change the size of the row groups into which the Parquet file will be divided internally.
- When you click the Mapping tab, the component prepares the inputs and external columns by analyzing the schema in the Schema text editor. Note that the Parquet Destination can have multiple inputs (see the article about composite records), whose columns you can see here. The data in these inputs can be produced by upstream source and transformation components (e.g. a Query Transformation can be used to retrieve the necessary data from a SQL Server database).
- Click OK to close the component editor.
Congratulations! You have successfully configured the Parquet Destination component.
Use the parameters below to configure the component.
Select an existing File Connection Manager or create a new one.
The maximum number of rows in a Parquet row group. A row group is a logical horizontal partitioning of the data into rows; it holds serialized (and compressed) arrays of column entries.
A JSON string representing the schema of the Parquet file.
- Fixed: Incorrect lower-case headers (Thank you, Romain).
- Fixed: Missing record in each batch of records (Thank you, Romain).
- New: Introduced component.