Job Writer: Drive

To save data into files, use a Drive connection. To use a specific file type, select the desired File Format. The following file types are supported:

CSV
JSON
XML
Parquet
Raw

Date Field Identifier

The Date Field Identifier determines which date is used to resolve date tokens. Date tokens can be used in the path and/or the file name. By default (if the field is left blank), date tokens resolve to the job execution date/time. However, to partition data based on the data itself, you can choose a source field that represents a date/time to use instead. Using a source column allows you to group records into time windows, which is useful for loading Delta Lake environments. The following date tokens are available:

yyyy
YYYY
yy
YY
mm (month)
MM (month)
dd
DD
dow
DOW
doy
DOY
hh
HH
nn (minutes)
NN (minutes)
ss
SS

For YY, MM, DD, DOY, HH, NN and SS, upper-case values force leading zeroes to be added when needed.
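The leading-zero rule above can be illustrated with a small Python sketch. This is not DataZen's implementation; it only mimics the documented behavior under a few assumptions: tokens are written in square brackets (as in the Path Override example), hh is a 24-hour value, DOY is padded to three digits, and the function name expand_date_tokens is hypothetical. The dow/DOW tokens are left out because their output format is not described here.

```python
import re
from datetime import datetime


def expand_date_tokens(template: str, when: datetime) -> str:
    """Replace [token] placeholders; upper-case tokens are zero-padded."""
    values = {
        "yyyy": when.year,
        "yy": when.year % 100,
        "mm": when.month,                 # month
        "dd": when.day,
        "doy": when.timetuple().tm_yday,  # day of year
        "hh": when.hour,                  # assumption: 24-hour clock
        "nn": when.minute,                # minutes
        "ss": when.second,
    }
    # Assumed pad widths; every other token is padded to 2 digits.
    widths = {"yyyy": 4, "doy": 3}

    def substitute(match: re.Match) -> str:
        token = match.group(1)
        key = token.lower()
        # Only all-lower or all-upper spellings of known tokens are replaced.
        if key not in values or token not in (key, key.upper()):
            return match.group(0)
        number = str(values[key])
        if token.isupper():
            return number.zfill(widths.get(key, 2))
        return number

    return re.sub(r"\[([A-Za-z]+)\]", substitute, template)


when = datetime(2024, 3, 5, 7, 9, 2)
lower = expand_date_tokens("[yyyy]-[mm]-[dd]", when)  # no leading zeroes
upper = expand_date_tokens("[YYYY]-[MM]-[DD]", when)  # leading zeroes added
```

With the date shown, the lower-case template yields 2024-3-5 and the upper-case template yields 2024-03-05.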

Path and File Name

You can provide a specific path or folder in the Path Override field to write the file into. The File Name should include the file extension; it is not added automatically. Both fields accept DataZen functions to control where files will be created.

For example, the following settings create one target folder per year, based on the Date_of_Birth field, and a separate file for each value of the country field:

Date Field Identifier: Date_of_Birth
Path Override: c:\tmp\csv\[yyyy]\
File Name: customer_{{country}}.txt
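To make the effect of these settings concrete, the following Python sketch shows how records could be routed to target files. It is an illustration only, not DataZen code: the target_file helper and the sample records are hypothetical, and the sketch assumes that [yyyy] resolves to the year of the Date Field Identifier and that {{country}} resolves to the record's country value.

```python
import re
from collections import defaultdict
from datetime import date


def target_file(record: dict, date_field: str, path_override: str, file_name: str) -> str:
    """Resolve the folder and file name for a single record."""
    # [yyyy] is taken from the record's date field, not the execution date.
    folder = path_override.replace("[yyyy]", str(record[date_field].year))
    # {{field}} placeholders are replaced with the record's field value.
    name = re.sub(r"\{\{(\w+)\}\}", lambda m: str(record[m.group(1)]), file_name)
    return folder + name


# Made-up sample records, for illustration only.
records = [
    {"name": "Ann", "country": "US", "Date_of_Birth": date(1990, 5, 17)},
    {"name": "Bob", "country": "US", "Date_of_Birth": date(1990, 11, 2)},
    {"name": "Cleo", "country": "FR", "Date_of_Birth": date(1985, 1, 30)},
]

# Group the records by the file each one would be written to.
files = defaultdict(list)
for record in records:
    path = target_file(record, "Date_of_Birth", "c:\\tmp\\csv\\[yyyy]\\", "customer_{{country}}.txt")
    files[path].append(record["name"])
```

Under these assumptions, Ann and Bob land in c:\tmp\csv\1990\customer_US.txt, while Cleo lands in c:\tmp\csv\1985\customer_FR.txt.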

Example: Parquet Target

In this example, the settings use a Parquet file target in ADLS using the specified Container. The name of the file will be different for each execution since the name contains the @executionid variable.

The Parquet file will use the Snappy compression algorithm. The Date Field Identifier is left blank, so it defaults to the execution date/time of the job; however, since no date token is used in the path or file name, this setting is ignored.

Example: CSV Target

In this example, the settings use a CSV file target in ADLS using the specified Container. The name of the file will be different for each execution since the name contains the @executionid variable.

The CSV file will be a delimited file (since no fixed-length fields are specified). In addition, a header row will be added, and any date fields will be formatted using a sortable pattern.
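As a rough picture of what such a file could look like, the Python sketch below writes a comma-delimited file with a header row and date values in a sortable form. It is not DataZen's writer: the to_csv helper is hypothetical, and the sketch assumes that "sortable pattern" means an ISO 8601 style value such as 2024-03-05T07:09:02; the exact pattern DataZen emits may differ.

```python
import csv
import io
from datetime import datetime


def to_csv(rows: list, fieldnames: list) -> str:
    """Write rows as delimited text with a header row and sortable dates."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()  # header row
    for row in rows:
        writer.writerow({
            # Assumption: sortable means yyyy-MM-ddTHH:mm:ss.
            key: value.isoformat(timespec="seconds") if isinstance(value, datetime) else value
            for key, value in row.items()
        })
    return buffer.getvalue()


# Made-up sample row, for illustration only.
text = to_csv([{"id": 1, "created": datetime(2024, 3, 5, 7, 9, 2)}], ["id", "created"])
```

Because the year comes first and every part is zero-padded, sorting these values as text gives the same order as sorting them as dates.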