The PDF Source Component is an SSIS Data Flow Component for consuming tabular data from PDF files. The component detects tables in the PDF file and allows processing a single table or multiple consecutive tables (across several pages), assuming they have the same structure. All output columns are with type
In this section we will show you how to set up a PDF Source component.
- In the SSIS Toolbox, locate the COZYROC PDF Source component and drag it onto the Data Flow canvas.
- Double-click it to open its editor.
- Choose the location of the PDF file and specify the following parameters to describe its table structure and which table to process (a PDF file may contain multiple tables).
Use the General page of the PDF Source dialog to specify the source PDF file and the settings that determine which table to process and how to process it.
Select a file via a standard FILE connection manager.
Specify the PDF file password, if necessary.
Specify whether the PDF table to be processed has a header row with column names.
Specify whether consecutive tables need to be treated as one. This is useful for a table that spans several pages. The tables are "merged", i.e. processed like a single table, only if they have the same number of columns.
Specify which table to process from the PDF file (using a zero-based index). This setting enables skipping one or more tables at the beginning of the PDF document.
Specify how many rows to skip at the end of the table. This is mainly useful when there are one or more summary rows at the end.
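
To make the interaction of these settings easier to picture, here is a minimal conceptual sketch in Python. It is not the component's implementation: the function `read_table` and the option names `table_index`, `has_header`, `merge_consecutive` and `skip_rows_at_end` are hypothetical, and the order in which the options are applied (skip leading tables, merge, trim trailing rows, then take the header) is an assumption made for illustration. The sketch also assumes the header row appears only once, in the first merged table.

```python
from typing import List, Tuple

Table = List[List[str]]  # a table is a list of rows; a row is a list of cell strings


def read_table(
    tables: List[Table],
    table_index: int = 0,            # hypothetical name: zero-based table to start from
    has_header: bool = True,         # hypothetical name: first row holds column names
    merge_consecutive: bool = False, # hypothetical name: treat same-width tables as one
    skip_rows_at_end: int = 0,       # hypothetical name: trailing rows to drop
) -> Tuple[List[str], Table]:
    """Return (column names, data rows) for the selected table."""
    remaining = tables[table_index:]  # skip tables at the beginning of the document
    if not remaining:
        raise IndexError("table_index is past the last detected table")

    rows = list(remaining[0])
    if merge_consecutive:
        width = len(rows[0]) if rows else 0
        for following in remaining[1:]:
            # Stop merging at the first table with a different column count.
            if not following or len(following[0]) != width:
                break
            rows.extend(following)

    if skip_rows_at_end > 0:
        rows = rows[:-skip_rows_at_end]  # drop summary rows at the end

    if has_header and rows:
        return rows[0], rows[1:]
    # Without a header row, generate placeholder column names.
    names = [f"Column{i}" for i in range(len(rows[0]))] if rows else []
    return names, rows


# Example: a title table, a data table split across two pages, and a footnote table.
tables = [
    [["Report", "2024"]],
    [["Item", "Qty", "Price"], ["A", "1", "2.00"], ["B", "2", "3.50"]],
    [["C", "5", "1.25"], ["Total", "8", "6.75"]],
    [["Note", "x"]],
]
header, data = read_table(
    tables, table_index=1, has_header=True, merge_consecutive=True, skip_rows_at_end=1
)
print(header)
print(data)
```

In this example the title table is skipped by starting at index 1, the two three-column tables are merged, the two-column footnote table is left out because its column count differs, and the trailing "Total" row is dropped.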
- New: Introduced component.