SELECT HTTP
Overview
The SELECT HTTP operation is designed to call any HTTP endpoint, typically involving a GET, POST, PUT, or PATCH operation. This operation supports all the options provided by DataZen's HTTP Reader; for details on all the available options, see the HTTP/S Reader help section.
The HTTP command to send is specified in parenthesis right after the HTTP connection, and needs to include the
HTTP Verb to execute. For example, the following executes a POST operation on an Intacct endpoint:
(POST https://api.intacct.com/ia/xml/xmlgw.phtml)
Use the high watermark settings when you need DataZen to remember the last high value of a filter in order to retrieve only changed (or new) records since the last pipeline was executed. See the Watermark Pattern section for details on how to use high watermark values with HTTP requests.
If specified, the APPLY TX operation happens before the APPLY PIPELINE operation. It can be used to transform payload responses into rows and columns directly. While this can be a handy way to work on a data set immediately, you lose the ability to inspect HTTP Response headers and other properties. For debugging purposes, it is recommended to also use the TRACE INTO operation so that you have a way to inspect actual response codes, full payload, and response headers.
The APPLY PIPELINE operation declared within the SELECT operation allows you to add one or more transformations before capturing the final data set. When using this approach, consider that the pipeline will execute for every paged request. For example, if you use the PAGING option, and 5 pages of data are retrieved, this pipeline will execute five times. If you would like to execute a pipeline transformation after all pages have been retrieved, but before the capture operation, use the APPLY PIPELINE operation after the SELECT operation.
Syntax
Executes an HTTP operation and returns the HTTP result including its payload and optionally returns rows and columns by transforming the response payload.
{ USING (...) ON DB [CONNECTION] { WITH { SEPARATE_EXECUTION} { INCLUDE_FIELDS } } } SELECT {TOP <n>} < ... | * > FROM HTTP [CONNECTION] (< GET | POST | PUT | PATCH | DELETE > ... ) { WITH { HWM '<field>' { HWM_PARAM '...' } { HWM_AS_DATETIME } } { TIMEOUT <n> } { PAYLOAD (...) } { HEADERS '...' } { PAGING < 'none' | 'offset' | 'page' | 'link' | 'token' > { PAGING_PARAM '...' } { PAGING_PATH '...' } } { CONTENT_TYPE '...' } { RETURN_BYTES } } { TRACE INTO [CONNECTION] TABLE '...' { WITH_PAYLOAD } } { APPLY TX '...' { WITH RAWCOL '...' } } -- Inner pipeline to be executed per paged results { APPLY PIPELINE (...) } ;
USING |
Specifies dynamic parameters retrieved from a SQL database (making an HTTP call per record returned) to use as part of the HTTP request URL and/or Payload using {{@param.<field>}} markers |
SEPARATE_EXECUTION |
Each dynamic row will be treated as a separate execution causing a change log to be created per dynamic row |
INCLUDE_FIELDS |
Add the dynamic row as additional columns to the output returned by the HTTP operation |
HWM |
The column name to use as the field containing the next high watermark value if found; this option requires the use of the APPLY TX transformation; the high watermark value can be used as part of the URL or the PAYLOAD using the @highwatermark variable, or will automatically be added to the URL as a parameter if the HWM_PARAM is also set |
HWM_PARAM |
The query parameter to be added or updated in the URL; leave blank to use the @highwatermark variable |
HWM_AS_DATETIME |
The high watermark value will be treated as a date/time field so the next high watermark can be correctly calculated |
HEADERS |
An array of key/value pair headers to add to the HTTP request (ex: [ { "key1": "value1" }, { "key2": "value2" } } ]) |
PAYLOAD |
For POST, PUT and PATCH operations, represents the optional payload to use |
TIMEOUT |
A timeout value in seconds |
PAGING |
The paging strategy to use; must be one of: 'none', 'offset', 'page', 'link', 'token' |
PAGING_PARAM |
The paging parameter to add to the URL as a query string; if not specified, paging variables can be used in the URL or the PAYLOAD (@pagingindex, @pagingmarker, @recordcount, @recordindex) |
PAGING_PATH |
The path within the raw response to use to identify the paging token or link to use on the next call |
CONTENT_TYPE |
The HTTP Request Content-Type setting for POST, PUT, and PATCH operations |
RETURN_BYTES |
Indicates that the HTTP call returns binary data (ex: an image or PDF document) instead of a string |
APPLY TX |
Applies a document transformation on the payload assuming the payload is an XML or JSON document using the xpath provided |
TRACE INTO |
Saves each request and response to a database table for raw data inspection and debugging purposes without the payload response unless WITH_PAYLOAD is also specified |
WITH_PAYLOAD |
When tracing is enabled, indicates that the response payload should also be logged |
APPLY PIPELINE |
Executes an inline pipeline consisting of one or more components for each page of data retrieved by the HTTP call |
RAWCOL |
When content transformation is enabled, indicates the column to add to the response that contains the raw content of each row |
Example 1
-- This HTTP request uses the Rss Feed connection and calls the /econi.xml relative URL -- The Rss Feed connection points to: https://www.govinfo.gov/rss -- This returns the raw output as a single row, which includes response headers -- and the full response payload SELECT * FROM HTTP [Rss Feed] (GET /econi.xml);

Example 2
-- This HTTP request uses the Rss Feed connection and calls the /econi.xml relative URL -- The Rss Feed connection points to: https://www.govinfo.gov/rss -- The APPLY operation transforms the HTTP response body using an XPath command that -- turns the payload into rows and columns automatically SELECT * FROM HTTP [Rss Feed] (GET /econi.xml) APPLY TX (//item);
