Data Quality

This component allows you to perform a number of data verification steps and optionally exclude invalid records or stop the pipeline from processing if an unexpected condition happened. To use this component, select the field name to inspect, and choose one or more of these options:

  • Allow NULL values
  • Allow empty values
  • Must be numeric
  • Value is (not) in

When choosing to exlude records that fail the data validation, the component simply acts as a filter. Records that violate the checks will be removed from the data set and the pipeline will proceed its execution. If the Stop processing with error option is checked, the data pipeline will stop with an error message.

When using the Value is (not) in option, list a set of values to validate against separated by a comma.

For more complex data quality processing options, including validating field values from other fact tables, use the Run SQL component and throw an error directly from the T-SQL script.

Example

In this example, the AccountKey is validated to ensure it is numeric and not empty. If this condition is not met, the data pipeline will stop with an error.