Data Quality

Stats Matter

Aim of this topic

To provide guidance to Tasmanian Government agencies on data quality considerations when accessing, using, collecting, storing and managing data.


What is data quality?

In the past, data quality was generally understood to mean accuracy. The current consensus is that data has sufficient quality when it is appropriate for its intended purpose, that is, fit for purpose. The purpose may include operational use, decision making, mandated reporting and legislative requirements. In this context, fitness implies both freedom from defects and possession of the desired features and attributes for sustainable use. Quality is therefore a multidimensional concept that covers not only the accuracy of statistics but also other aspects such as relevance.

The following resources can provide information around data quality and why it should be assessed.

Resources

When and why should data quality be assessed?

The purpose of data is to provide information to aid decision making. Ensuring that the data is of the highest possible quality is essential to effective decision making. Using data that is not fit for purpose may lead to incorrect conclusions being drawn and poor decisions being made.

Data quality should be considered both when designing a collection or product, to ensure that it is fit for purpose, and when identifying data and deciding whether to use it for a particular purpose.

Choosing the right data set

Data awareness can help users select a data set that provides accurate answers to the questions they are asking; that is, one that is fit for purpose. To select a data set that meets your needs, you should:

  • define your data need;
  • identify existing data sets;
  • assess the quality of those data sets, and consider completing a quality declaration.

Resources

How can data quality be assessed?

The Australian Bureau of Statistics has developed a Data Quality Framework which comprises seven quality dimensions that reflect a broad and inclusive approach to data quality definition and assessment. These quality dimensions are:

  • Institutional environment - refers to the institutional and organisational factors that may have a significant influence on the effectiveness and credibility of the agency producing the statistics.
  • Relevance - reflects the degree to which the data meets the current and potential future needs of users.
  • Timeliness - refers primarily to how current or up to date the data is.
  • Accuracy - refers to how well information contained within the system reflects reality.
  • Coherence - refers to the internal consistency of a statistical collection, product or release, as well as its comparability with other sources of information, within a broad analytical framework and over time.
  • Interpretability - refers to the availability of information to help provide insight into the data.
  • Accessibility - refers to the ease of access to data by users, including the ease with which the existence of information can be ascertained, as well as the suitability of the form or medium through which information can be accessed.
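The seven dimensions listed above can be used as a simple checklist when recording an assessment. The following is a minimal sketch only, not an ABS format: the class name QualityAssessment, its methods, and the example data set and notes are all hypothetical and invented for illustration. It shows one way to record a note against each dimension and to report which dimensions have not yet been assessed.

```python
from dataclasses import dataclass, field

# The seven dimensions named in the ABS Data Quality Framework, as listed above.
DIMENSIONS = (
    "Institutional environment",
    "Relevance",
    "Timeliness",
    "Accuracy",
    "Coherence",
    "Interpretability",
    "Accessibility",
)


@dataclass
class QualityAssessment:
    """Hypothetical record of one data set's assessment (an assumption, not an ABS template)."""

    data_set: str
    notes: dict[str, str] = field(default_factory=dict)

    def record(self, dimension: str, note: str) -> None:
        """Store a free-text note against one of the seven dimensions."""
        if dimension not in DIMENSIONS:
            raise ValueError(f"Unknown dimension: {dimension}")
        self.notes[dimension] = note

    def missing_dimensions(self) -> list[str]:
        """Return the dimensions that have no note, or only a blank note."""
        return [d for d in DIMENSIONS if not self.notes.get(d, "").strip()]


# Illustrative values only; the data set name and notes are invented.
assessment = QualityAssessment("Example admissions extract")
assessment.record("Timeliness", "Updated monthly.")
assessment.record("Accuracy", "Some records lack a postcode.")
print(assessment.missing_dimensions())
```

A checklist like this only confirms that each dimension has been addressed; how much weight each dimension deserves remains a judgement that depends on the data source and context.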

All seven dimensions should be included for the purpose of quality assessment and reporting. However, the seven dimensions are not necessarily equally weighted as the importance may vary depending on the data source and context. The following resources provide further information around the Data Quality Framework and how it can be used to evaluate the quality of statistical collections and products including administrative data.

Resources

Data quality statements

Data quality statements (or declarations) describe the key characteristics of data which impact on quality, so that potential users can make informed decisions about fitness for use. Data quality statements should report both the strengths and limitations of the data.
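As an illustration of reporting both strengths and limitations, the sketch below models a quality statement as a small data structure. It is an assumption made for this example rather than a prescribed template: the class QualityStatement, its field names and the is_balanced check are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QualityStatement:
    """Hypothetical structure for a data quality statement (not an official template)."""

    data_set: str
    intended_purpose: str
    strengths: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)

    def is_balanced(self) -> bool:
        """True only when at least one strength and one limitation are reported."""
        return bool(self.strengths) and bool(self.limitations)

    def render(self) -> str:
        """Produce a plain-text statement that potential users can read."""
        lines = [
            f"Data quality statement: {self.data_set}",
            f"Intended purpose: {self.intended_purpose}",
            "Strengths:",
        ]
        lines += [f"  - {item}" for item in self.strengths]
        lines.append("Limitations:")
        lines += [f"  - {item}" for item in self.limitations]
        return "\n".join(lines)
```

A statement that lists only strengths would fail the is_balanced check, which prompts the producer to document known limitations before the statement is shared.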

Data quality statements enable transparency for both the data producer and the data user. Assessments of data quality enable all parties to determine the data's fitness for purpose, i.e. whether they can use the data for the purpose they had in mind. Sometimes the fitness for purpose of the data won't be known before the data is shared, for example where the data is being shared for analytical purposes. In these instances, this should be acknowledged in the agreement. Keep in mind that data considered high quality for one purpose may not be suitable for another.

The following resources provide further information and guidance around the use and undertaking of quality statements.

Resources

Maintaining data quality and ensuring data fitness

Data quality describes how fit for purpose a data set is. Data fitness helps you understand how good data practices can keep your data healthy so that you get maximum value from your information sources. Data is healthy when it is:

  • fit for purpose;
  • measurable; and
  • comparable.

Data fitness not only includes Data Quality as a key element, but also Metadata Management and Data Sharing.

The following resource identifies the importance of keeping data in good shape and measuring data fitness.

Resources

Data quality and management policies

A number of jurisdictions have developed data quality and management policies specific to health-related data because of the importance of data quality in the health system. Data must be reliable in order to support quality improvement and an understanding of health care delivery. The resources below provide information about frameworks and policies used to assess and improve the quality of data created and managed in other jurisdictions.

Resources

