Data Prep Pipelines can be executed from OneCloud Integration Studio using the Run Pipeline Command from the OneCloud Data Prep BizApp.
To execute a Data Prep Pipeline, add the Run Pipeline Command to the Chain, then configure the Command.
Select the Pipeline to be executed.
Optionally, if Runtime Variables are assigned to the Pipeline, specify a value for each Variable. Integration Studio Variables, including Command Outputs, can be used when they match the data type of the Data Prep Variable. Static values can also be specified.
Select the Connection and Runner.
Specify the File to be processed by the Pipeline. The file can be the Output of any Command in the Chain, a Runtime File Upload, or a Workspace Resource.
Complete the Field Mappings. By default, the Field Mappings assume that the Pipeline field names match the field names in the data file. When a field name in the file differs from the Pipeline field name, update the File Column Name that maps to the corresponding Pipeline Column.
Select the delimiter of the input file.
Select the delimiter for the file created by the Pipeline. The output delimiter does not need to match the input file delimiter.
⚠️ Field names are case-sensitive!
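The field mapping and delimiter steps above can be illustrated with a small Python sketch. This is not how OneCloud implements the Run Pipeline Command; it is only a stand-alone model of the behavior described here. The function name apply_field_mappings, the sample column names, and the choice of a pipe as the output delimiter are all hypothetical.

```python
import csv
import io


def apply_field_mappings(text, mappings, in_delim=",", out_delim="|"):
    """Rename file columns to Pipeline columns and re-delimit the data.

    mappings: {pipeline_column: file_column_name} (hypothetical shape).
    Lookups are exact string matches, so they are case-sensitive.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=in_delim)
    header = reader.fieldnames or []

    # A file column that differs only by case counts as missing.
    missing = [f for f in mappings.values() if f not in header]
    if missing:
        raise KeyError(f"File columns not found (check case): {missing}")

    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    writer.writerow(list(mappings.keys()))
    for row in reader:
        writer.writerow([row[file_col] for file_col in mappings.values()])
    return out.getvalue()
```

For example, mapping the Pipeline column Account to a file column named acct succeeds, while mapping Amount to a file column spelled amount fails when the file header reads Amount, because the comparison is case-sensitive. The output delimiter is chosen independently of the input delimiter.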
📓 Parallel Processing is selected by default. In general, parallel processing optimizes the performance of the Pipeline execution. In instances of high levels of concurrency or limited system resources, parallel processing may degrade performance. Monitor this setting and adjust accordingly for your application and data processing demands.
📓 The Keep Unmapped Fields option is checked by default. This setting allows data files with more fields than are specified in a Pipeline Column Definition to be processed by a Pipeline. When this option is selected, the Pipeline Output will include the fields generated by the Pipeline as well as the additional fields in the data file that are not mapped to a Pipeline column. Be careful when enabling this setting with Pipelines that use a Group By transformation.
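As a rough illustration of the Keep Unmapped Fields option, the following Python sketch computes which columns would appear in the output. It is an assumption-laden model, not OneCloud's implementation: the function name output_columns is hypothetical, and the ordering (Pipeline columns first, then unmapped file columns) is assumed for the example rather than documented behavior. It does not model Group By transformations.

```python
def output_columns(pipeline_columns, file_columns, mappings, keep_unmapped=True):
    """Return the column names expected in the Pipeline Output.

    mappings: {pipeline_column: file_column_name} (hypothetical shape).
    """
    mapped_file_columns = set(mappings.values())
    # File columns that no Pipeline column is mapped to.
    extras = [c for c in file_columns if c not in mapped_file_columns]
    return list(pipeline_columns) + (extras if keep_unmapped else [])
```

With Pipeline columns Account and Amount and a file containing acct, Amount, and Region, the option being checked would carry Region through to the output; with the option cleared, only Account and Amount remain.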
Updating Run Pipeline Command for Data Prep Variable Changes
When Data Prep Runtime Variables are added to or removed from a Pipeline, the Run Pipeline Command must be updated to account for the change. In the Pipeline selection field, click the X to remove the Pipeline. Select the Pipeline again, and the Data Prep Runtime Variables assigned to the Pipeline will be shown in the Run Pipeline Command form. Update the configuration according to the instructions above.
❗ Removing the selected Pipeline removes the configuration of all inputs of the Command. Make a note of the configuration, especially the field mappings, before removing the Pipeline.
📓 If Runtime Variables were defined in the Pipeline but are not available on the Run Pipeline Command form, confirm that the Pipeline is published.