SELECT DB

Overview

The SELECT DB operation allows you to select data from any supported source database and keep a high watermark to minimize the number of records read progressively.

The recommended way to manage high watermarks is to specify the @highwatermark or @highwatermarknull variable as part of your SQL command.

Syntax

Executes a request against a database using the SQL command provided and returns rows and columns.

SELECT  
  {TOP <n>}
  < ... | * >
FROM DB [CONNECTION] 
  -- SQL operation to execute
  ( ... )
  
  { APPLY TX '...' 
    { ON < '<field>' | <n> > }
    { WITH RAWCOL '...' } 
  }  

  -- Inner pipeline to be executed per paged results
  { APPLY PIPELINE (...) }

;

APPLY TX

Applies a document transformation on the payload assuming the payload is an XML or JSON document using the xpath provided

TRACE INTO

Saves each request and response to a database table for raw data inspection and debugging purposes without the payload response unless WITH_PAYLOAD is also specified

WITH_PAYLOAD

When tracing is enabled, indicates that the response payload should also be logged

APPLY PIPELINE

Executes an inline pipeline consisting of one or more components for the entire data set retrieved or for each batch of data retrieved when batching is possible

ON

When content transformation is enabled, indicates the name of the field to use or its column index

RAWCOL

When content transformation is enabled, indicates the column to add to the response that contains the raw content of each row

Example 1


-- Execute a SELECT operation against a MySQL database using the 
-- last_update field as a high watermark. The @highwamark variable 
-- returns an empty string if empty (first-time execution)

SELECT * FROM DB [mysqldb] 
  (
    SELECT * FROM Film WHERE last_update > '@highwatermark'
  )
  WITH HWM 'last_update'
;

Example 2


-- Execute a SELECT operation against a PostgreSQL database using the 
-- last_update field as a high watermark. The @highwamarknull variable 
-- returns null if empty (first-time execution)

SELECT * FROM DB [postgresql] 
  (
    SELECT 
           extract(epoch from last_update) "epoch", * 
           FROM film 
           WHERE  
              last_update > COALESCE('@highwatermarknull', '01/01/2000 0:00AM')::timestamp
  )
  WITH HWM 'last_update'
;

Example 3


-- Execute a SELECT operation against a Snowflake connection
-- While the @highwatermark variable can be used, this engine
-- supports automatic replace of a highwatermark value for trivial
-- operations. 

SELECT * FROM DB [snowflakeconnection] 
  (
    SELECT * FROM DATALAKE..ACTIVECAMPAIGNCONTACTS
  )
  WITH HWM 'cdate'
;