SINK INTO DB

Overview

This operation dumps the current pipeline data set into a database table. Writing the data to database tables can assist with troubleshooting tasks and with long-running operations that require access to the data in a previous state. You can use pipeline variables and DataZen functions to calculate the name of the database object to create. The table schema is detected automatically from the data types of the pipeline columns at the time this command is executed.

Using the ON COLUMN option changes the behavior of this component: it reads the column identified by the name (or index) provided as an XML payload, detects parent/child relationships within the data, and creates one or more tables, one for each node that has children. The names of the tables created will start with the table name provided as a prefix, and end with the name of the node. The REPLACE option will automatically join all these tables appropriately and return the fully denormalized data of the XML document.

Use this option instead of the inline APPLY TX operation for improved performance when extracting large XML payloads.

Syntax

Saves the pipeline data set into a database table, with optional XML parsing of a column; when XML parsing is engaged, multiple tables may be created

SINK INTO DB [CONNECTION] 
    TABLE '...' 
    { ON COLUMN '...'  }    -- optional XML column 
    { WITH  
        { TRUNCATE | RECREATE | TRUNCATE_APPEND | RECREATE_APPEND } 
        { REPLACE } 
        { CLEANUP }
        { CONTINUE_ON_ERROR }
        { STAGE ON [...] '...' }
        { CREDENTIALS '...' }
    }
;

TABLE

The table name to create; may contain DataZen functions and pipeline variables; when using the ON COLUMN setting, this property is used as the prefix of the table(s) created by the XML transformation

STAGE ON

Specifies the connection and folder/container used to stage the data before loading, depending on the target database

CREDENTIALS

When a staging environment is specified, this option provides the necessary credentials
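
As a minimal sketch of the staging options, the data can be written to a container first and then loaded into the target table; the [blob_storage] connection name, container path, and target connection below are hypothetical placeholders, and the credential string is left elided:

-- stage the data in a container first, then load it into the target table
SINK INTO DB [warehouse] TABLE '[stg].[workers]'
WITH
    RECREATE
    STAGE ON [blob_storage] '/staging/workers'
    CREDENTIALS '...'
;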

TRUNCATE

The target table is first truncated if it exists (ignored if ON COLUMN is specified)

RECREATE

The target table is always dropped first if it exists (ignored if ON COLUMN is specified)

TRUNCATE_APPEND

The target table is first truncated if it exists, then appended to as multiple pages of data are processed (ignored if ON COLUMN is specified)

RECREATE_APPEND

The target table is always dropped first if it exists, then appended to as multiple pages of data are processed (ignored if ON COLUMN is specified)
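
For example, when paging through a source, TRUNCATE_APPEND empties an existing table on the first page only and appends each subsequent page, preserving the existing table definition; the connection and table names below are hypothetical:

-- empty the table once, then append each page of data
SINK INTO DB [sql2017] TABLE '[stg].[workers]'
WITH TRUNCATE_APPEND;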

ON COLUMN

Indicates that an XML column needs to be parsed into one or more tables; each table will use the TABLE setting as its prefix

CLEANUP

When using a column transformation with the REPLACE option, this setting automatically drops the tables created by the Sink operation after the join operations are completed, turning these tables into transient storage for the resulting XML JOIN operation

REPLACE

When using a column transformation, this option performs a full LEFT JOIN across all the tables created to return a single output of all the XML inner tables; if the inner tables are not in a parent/child relationship, a CROSS JOIN operation is performed instead

CONTINUE_ON_ERROR

Continues processing even if errors are detected
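
As a hypothetical sketch, this option can be combined with other WITH settings so that a failed write does not stop the pipeline; the connection and table names are placeholders:

-- keep processing subsequent pages even if one sink operation fails
SINK INTO DB [sql2017] TABLE '[feed].[rss_@executionid]'
WITH
    RECREATE
    CONTINUE_ON_ERROR;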

Example 1

-- Get data from an RSS Feed
SELECT * FROM HTTP [Rss Feed] (GET /econi.xml) APPLY TX (//item);

-- dumps the data set as-is into a table; the name of the table
-- will be calculated at execution time and will be recreated
-- if it exists
SINK INTO DB [sql2017] TABLE '[feed].[rss_@executionid]'
WITH RECREATE;


Example 2

-- Get data from an ADP endpoint
-- Append all incoming records into a staging table for future processing

SELECT 
    -- the ADP HTTP connection
    * FROM HTTP [ADP]

    -- the HTTP command fetching 100 records at a time
    (GET /v2/worker-demographics?limit=100&select=workers/person/legalName)	
    
    -- page as many times as needed
    WITH PAGING 'offset' 			    -- page using an offset (count) of records retrieved
              PAGING_PARAM 'offset'		-- the url query parameter to use for next pages
              PAGING_PATH 'workers'     -- the path used to count the number of records returned
              DEBUG_MAX_PAGES 2         -- limit to 2 pages during debugging

    -- auto document transformation
    -- and add the raw JSON payload for each row returned
    APPLY TX (workers || person)
    WITH RAWCOL '_raw'

    APPLY PIPELINE (
        
        PRINT 'Adding @@rowcount records to the demographics table...';

        -- for each page returned by ADP, recreate then append the data set as-is into a table
        -- the table is created only on the first page of data retrieved from ADP
        SINK INTO DB [sql2017] TABLE '[adp].[demographics]'
        WITH RECREATE_APPEND;

    )
;




Example 3

-- Get data from an RSS Feed
SELECT * FROM HTTP [Rss Feed] (GET /econi.xml);

-- Dumps the XML payload into one or more tables
-- based on its internal parent/child nodes, 
-- returns a full LEFT JOIN of all its nodes 
-- as the new pipeline data, and deletes the staged tables 
SINK INTO DB [sql2017] TABLE '[rss_]'
	ON COLUMN 'payload'
	WITH 
		REPLACE
		CLEANUP;