SELECT DB
Overview
The SELECT DB operation allows you to select data from any supported source database and keep a high watermark to minimize the number of records read progressively.
The recommended way to manage high watermarks is to specify the @highwatermark or
@highwatermarknull variable as part of your SQL command.
Syntax
Executes a request against a database using the SQL command provided and returns rows and columns.
{ APPEND }
SELECT
{TOP <n>}
< ... | * >
FROM DB [CONNECTION]
-- SQL operation to execute
( ... )
-- Used with the @highwatermark and @highwatermarknull variables
{ WITH HWM '<field>' }
{ APPLY TX '...'
{ ON < '<field>' | <n> > }
{ WITH RAWCOL '...' }
}
-- Inner pipeline to be executed per paged results
{ APPLY PIPELINE (...) }
;APPEND |
Appends the output of this operation, after APPLY TX if any, to the current data pipeline and adjusts the schema accordingly to accommodate new fields if needed |
TOP |
Client-side operator that limits the number of records after the inner pipeline has executed |
HWM |
The column name to use as the field containing the next high watermark value if found; the @highwatermark and @highwatermarknull variables can be used in the SQL operation directly to retrieve this value |
APPLY TX |
Applies a document transformation on the payload assuming the payload is an XML or JSON document using the xpath provided |
APPLY PIPELINE |
Executes an inline pipeline consisting of one or more components for the entire data set retrieved or for each batch of data retrieved when batching is possible |
ON |
When content transformation is enabled, indicates the name of the field to use or its column index |
RAWCOL |
When content transformation is enabled, indicates the column to add to the response that contains the raw content of each row |
Example 1
-- Execute a SELECT operation against a MySQL database using the
-- last_update field as a high watermark. The @highwamark variable
-- returns an empty string if empty (first-time execution)
SELECT * FROM DB [mysqldb]
(
SELECT * FROM Film WHERE last_update > '@highwatermark'
)
WITH HWM 'last_update'
;Example 2
-- Execute a SELECT operation against a PostgreSQL database using the
-- last_update field as a high watermark. The @highwamarknull variable
-- returns null if empty (first-time execution)
SELECT * FROM DB [postgresql]
(
SELECT
extract(epoch from last_update) "epoch", *
FROM film
WHERE
last_update > COALESCE('@highwatermarknull', '01/01/2000 0:00AM')::timestamp
)
WITH HWM 'last_update'
;Example 3
-- Execute a SELECT operation against a Snowflake connection
-- While the @highwatermark variable can be used, this engine
-- supports automatic replace of a highwatermark value for trivial
-- operations.
SELECT * FROM DB [snowflakeconnection]
(
SELECT * FROM DATALAKE..ACTIVECAMPAIGNCONTACTS
)
WITH HWM 'cdate'
;