[tabby title=»Performed»]

1.1 Data cleansing requirements are defined and performed.
Refer to Data Requirements Definition for more information related to development and management of requirements.
Example Work Products

  • Data cleansing requirements
  • Data cleansing guidelines

[tabby title=»Managed»]

2.1 Data cleansing activities adhere to data cleansing requirements, which are linked to process improvements to achieve business objectives.
Data cleansing requirements are often derived from data profiling, data quality assessments, change requests, identification of report discrepancies, and regulatory audits. Requirements typically identify the following topics:

  • What problem to solve (i.e., the business objective not being met as a result of the data issue)
  • What data set to clean (data or metadata)
  • How the target state of the data will solve the problem

Describing the target state of the data involves an understanding of the business rules, data definitions, current and future usage, quality criteria, business objectives, cost and benefit analysis, and the data, software, and system architecture.

2.2 Data cleansing activities conform with data quality requirements (e.g., quality dimensions such as conformity, accuracy, uniqueness) and quality criteria.
Data cleansing processes and rules should be documented and linked to business objectives and requirements. The resulting cleansed data should be evaluated for consistency with the data definitions. The data dictionary, in turn, is updated as a result of deficiencies discovered during data cleansing efforts.

  • Data cleansing rules are used to support the achievement of data cleansing objectives and processes.
  • The development of rules typically involves:
      • Validating data cleansing rules against business definitions
      • Linking data cleansing rules to business rules
      • Conformance with quality criteria
      • Use of business SMEs to create rules
      • Documentation and approval of data cleansing rules and their validation
      • A process for adding new rules
      • Measures to assure consistent application of cleansing rules across repositories
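
As an illustration of linking cleansing rules to business rules and applying only approved ones, the following is a minimal Python sketch. The `CleansingRule` structure, the rule and business rule identifiers (`DC-001`, `BR-017`, and so on), and the sample rules are all hypothetical and are not prescribed by this practice.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CleansingRule:
    rule_id: str            # hypothetical identifier scheme
    business_rule_id: str   # link back to the business rule this rule supports
    description: str        # documented intent of the rule
    approved: bool          # set once SMEs and governance have signed off
    apply: Callable[[str], str]


def cleanse(value: str, rules: list[CleansingRule]) -> tuple[str, list[str]]:
    """Apply approved rules in order.

    Returns the cleansed value and the IDs of the rules that changed it.
    """
    applied = []
    for rule in rules:
        if not rule.approved:
            continue  # unapproved rules are never applied
        new_value = rule.apply(value)
        if new_value != value:
            applied.append(rule.rule_id)
            value = new_value
    return value, applied


# Illustrative rules only; real rules come from documented requirements.
RULES = [
    CleansingRule("DC-001", "BR-017", "Trim surrounding whitespace", True, str.strip),
    CleansingRule("DC-002", "BR-017", "Upper-case country codes", True, str.upper),
    CleansingRule("DC-003", "BR-022", "Draft rule awaiting approval", False, lambda v: v[:2]),
]

cleaned, applied_rule_ids = cleanse("  us ", RULES)
```

Keeping the rule set in one shared definition, rather than re-implementing it per repository, is one possible way to support consistent application across repositories.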

2.3 The scope of data cleansing is defined.
The scope includes the identification of the data set, data elements and rules that will define the cleansing activity, as well as the criteria for prioritizing data cleansing activities. The scope needs to be balanced with any time-bounds or resource constraints that may exist. The impact analyses produced by the data quality assessment process are input to estimating and setting the boundaries of data cleansing efforts.
Data Quality Assessment and Data Profiling provide complementary information that can be leveraged for this practice.

2.4 The process for performing data cleansing is defined by a plan.
Typical planning steps include the following:

  • Evaluate the downstream impacts of the change
  • Determine where the changes need to be made
  • Reconcile any changes with respect to the data’s original intent versus the implied meaning resulting from the cleansing requirement

This avoids unintentionally changing the meaning of the data, which results in inefficiencies and confusion.
Data cleansing plans typically include:

  • The tools and methods that will be used for correcting the data (e.g., multiple repository comparison, verification against a valid source, logic checks)
  • Defined process for resolving differences of opinion on data validity
  • Logging of corrections, audit trails, version control (configuration management)
  • Feedback on needed adjustments to the data store maintenance plans
  • Definition of how referential integrity will be maintained after data changes are implemented
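
Several of these plan elements can be sketched together in a few lines of Python: verification against a valid source, logging of corrections, routing disputed values to a resolution process, and a referential integrity check after the changes. The table layouts, the `VALID_COUNTRY_CODES` set, and the function names below are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Assumed "valid source" for verification; a real plan would name the source.
VALID_COUNTRY_CODES = {"US", "CA", "MX"}

# Hypothetical parent and child data sets.
customers = {1: {"country": "us"}, 2: {"country": "CA"}, 3: {"country": "ZZ"}}
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 3}]

audit_log = []  # every correction is recorded here


def correct_country(customers, valid_codes, log):
    """Normalize country codes, log each correction, return unresolved keys."""
    unresolved = []
    for customer_id, record in customers.items():
        old = record["country"]
        new = old.strip().upper()
        if new not in valid_codes:
            # Not verifiable against the valid source: leave unchanged and
            # hand over to the process for resolving validity disputes.
            unresolved.append(customer_id)
            continue
        if new != old:
            record["country"] = new
            log.append({
                "customer_id": customer_id,
                "field": "country",
                "old": old,
                "new": new,
                "changed_at": datetime.now(timezone.utc).isoformat(),
            })
    return unresolved


def orphaned_orders(orders, customers):
    """Return IDs of orders whose customer no longer exists."""
    return [o["order_id"] for o in orders if o["customer_id"] not in customers]


unresolved = correct_country(customers, VALID_COUNTRY_CODES, audit_log)
orphans = orphaned_orders(orders, customers)
```

In this sketch the unverifiable record is left untouched rather than deleted, so no child rows are orphaned; a plan that permits deletions would need the integrity check to run before the change is committed.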

2.7 Data cleansing issues are communicated and resolved, when possible, in the internal or external source.
The best defense against spending the time and effort to cleanse the same data set repeatedly, in multiple instances and on multiple occasions, is prevention at the point of origin.
Example Work Products

  • Data cleansing policy
  • Data cleansing processes and rules
  • Data cleansing metrics
  • Data cleansing plans
  • Data correction methodologies
  • Data cleansing issues

[tabby title=»Defined»]
3.1 Data change history is maintained through cleansing activities.
Data change history should be supported by traceability of cleansing rules from their requirements to where the changes are being made and down through their interconnected processes.
See Metadata Management for guidance on ensuring that necessary information is managed to support this activity.
3.2 Policies, processes, and procedures exist to ensure that data cleansing activities are applied at the point of origination in accordance with published rules.
3.3 Data cleansing rules are applied consistently across the organization.
Quality rules and business rules for data elements that are shared among multiple business lines should not diverge unless a considered exception has been approved.
3.4 A governance group establishes, maintains, and ensures adherence to data cleansing rules.
3.5 Standard data cleansing results report templates, at the detail and summary level, are employed.
The organization can monitor these reports to develop a picture of duplicative efforts (cleansing the same data sets in multiple data stores). Where the same shared data is used (e.g., customer data), rather than spending time on separate efforts to fix several data repositories, resources may be better spent modifying systems and processes to consume from a single standard data set that can meet the quality requirements of all users.
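
As a rough illustration of how summary-level reports could surface duplicative efforts, the Python sketch below groups report entries by data set and flags any data set cleansed in more than one data store. The report fields (`data_set`, `data_store`, `records_corrected`) and the sample values are hypothetical; an actual template would define its own fields.

```python
from collections import defaultdict

# Hypothetical summary-level entries from cleansing results reports.
reports = [
    {"data_set": "customer", "data_store": "crm", "records_corrected": 120},
    {"data_set": "customer", "data_store": "billing", "records_corrected": 95},
    {"data_set": "product", "data_store": "catalog", "records_corrected": 12},
    {"data_set": "customer", "data_store": "crm", "records_corrected": 40},
]


def duplicative_efforts(reports):
    """Map each data set cleansed in more than one store to those stores."""
    stores_by_data_set = defaultdict(set)
    for entry in reports:
        stores_by_data_set[entry["data_set"]].add(entry["data_store"])
    return {
        data_set: sorted(stores)
        for data_set, stores in stores_by_data_set.items()
        if len(stores) > 1
    }


duplicates = duplicative_efforts(reports)
```

Data sets that appear in the result are candidates for consolidation onto a single standard data set.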

Example Work Products

  • Data change history log
  • Traceability matrix
  • Data cleansing feedback
  • RACI matrix for data cleansing governance, activities, and rule development
  • Data cleansing results report templates

[tabby title=»Measured»]
4.1 Service level agreements include data quality criteria to hold data providers accountable for cleansed data.
The results of data validations should be provided to internal and external suppliers to enable the improvement of their cleansing processes.
Refer to Provider Management for more information related to these expectations.
Example Work Products

  • Service level agreements
  • Feedback documentation

[tabby title=»Optimized»]

5.1 The organization is involved in the establishment and maintenance of external or industry standards for improving the quality of data content.
This is especially applicable when the organization is involved in industry initiatives to standardize data produced or consumed by multiple organizations.
5.2 Data cleansing requirements for data providers are managed in accordance with standardized processes.
Example Work Products

  • Meeting minutes showing involvement in standards
  • SLAs that include cleansing processes and expectations for data providers

[tabbyending]