
Data Quality Essay


Data quality has been described as “an inexact science in terms of assessments and benchmarks” [93]. High-quality data, in turn, can be described as “data that is fit for use by data consumers” [94].

11.2. Origin of Bad Data

Erroneous data can originate from several sources. Data may become dirty when it is entered incorrectly, when it is received from an invalid external data source, or when good data is combined with outdated data and there is no way to distinguish between the two.

11.3. Categories and Dimensions of Data Quality

In the past, data was regarded as an organization's most valuable asset and was rarely shared. Today, businesses, governments, and research organizations rely on exchanging and sharing many forms of data. As interconnectivity between data producers and data consumers grows, interest in data quality increases steadily. Managing data quality is typically a complex job, and every data quality aspect should be considered throughout the entire data management process. The following table lists the categories and dimensions of data quality [94]:

Table 11.1 Categories and Dimensions of Data Quality [94]

Categories         Dimensions
Intrinsic          Accuracy, Objectivity, Believability, Reputation
Contextual         Completeness, Timeliness, Relevancy, Value added, Amount of data
Representational   Interpretability, Ease of understanding, Concise/consistent representation
Accessibility      Accessibility, Access security
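
To make two of the contextual dimensions concrete, the following is a minimal sketch of how completeness and timeliness might be scored for a small set of records. The metric definitions, function names, field names, and sample data are assumptions made for illustration only; they are not taken from [94].

```python
from datetime import date


def completeness(records, fields):
    """Fraction of required field slots that hold a non-empty value.

    Illustrative definition (an assumption, not a standard from the text).
    """
    total = len(records) * len(fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for record in records
        for field in fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def timeliness(records, date_field, today, max_age_days):
    """Fraction of records last updated within max_age_days of today."""
    if not records:
        return 1.0
    fresh = sum(
        1
        for record in records
        if record.get(date_field) is not None
        and (today - record[date_field]).days <= max_age_days
    )
    return fresh / len(records)


# Hypothetical sample data.
customers = [
    {"name": "Ann", "email": "ann@example.com", "updated": date(2024, 5, 1)},
    {"name": "Bob", "email": "", "updated": date(2020, 1, 15)},
]

print(completeness(customers, ["name", "email"]))               # 0.75
print(timeliness(customers, "updated", date(2024, 6, 1), 365))  # 0.5
```

Scores like these only cover dimensions that can be computed from the data itself. Dimensions such as believability or reputation depend on the judgment of data consumers and cannot be derived this way.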

11.4. Classification of data quality problems in data sources

Data quality problems fall into two main categories: single-source problems and multi-source problems [95]. The figure below gives a brief view of this classification and its sub-classifications, along with some typical problems for each case.

Figure 11.1 Classification of data quality problems in data sources [95]

11.4.1. Single-source Problems

Single-source problems can occur at the schema level or the instance level. Database systems usually enforce the restrictions of a specific data model as well as application-specific constraints. Schema-level problems therefore stem from a lack of appropriate model-specific or application-specific integrity constraints; data model limitations and poor schema design both result in data quality problems at this level. Errors and inconsistencies are also much more likely in data from sources that have no proper schema, such as files. Inaccuracies and inconsistencies that cannot be prevented at the schema level are termed instance-level problems. As shown in Figure 11.1, instance-level problems arise from data entry errors such as misspellings, redundancy, and contradictory values. Data cleaning helps to overcome these issues, but it is expensive, so an appropriate design is needed to avoid such problems in the first place. Also, the discovery of data cleaning rules during warehouse design can...
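
The distinction between schema-level and instance-level problems can be illustrated with a short sketch. It assumes a hypothetical customer record layout (`id`, `name`, `city`, `age`) and hypothetical rules: a required `id`, an `age` between 0 and 120, and a small vocabulary of known city names. None of these come from [95]. The first function stands in for the kind of integrity constraint a schema would enforce; the others look for redundancy, contradictory values, and likely misspellings in the stored values.

```python
import difflib
from collections import defaultdict


def check_constraints(record):
    """Schema-level style checks: a required field and a value range."""
    errors = []
    if not record.get("id"):
        errors.append("missing id")
    age = record.get("age")
    if age is not None and not (0 <= age <= 120):
        errors.append("age out of range")
    return errors


def find_duplicates(records, field):
    """Instance level: records repeating an earlier value of `field`
    (compared after trimming whitespace and lowercasing)."""
    seen, duplicates = set(), []
    for record in records:
        value = record[field].strip().lower()
        if value in seen:
            duplicates.append(record)
        seen.add(value)
    return duplicates


def find_contradictions(records, key, field):
    """Instance level: keys that map to more than one distinct value."""
    values = defaultdict(set)
    for record in records:
        values[record[key]].add(record[field].strip().lower())
    return sorted(k for k, v in values.items() if len(v) > 1)


def suggest_spelling(value, vocabulary):
    """Instance level: closest known term for a probable misspelling."""
    if value in vocabulary:
        return None
    matches = difflib.get_close_matches(value, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


# Hypothetical sample data containing one problem of each kind.
records = [
    {"id": "1", "name": "Ann Lee", "city": "London", "age": 34},
    {"id": "2", "name": "ann lee ", "city": "Lodnon", "age": 34},
    {"id": "2", "name": "Bob Ray", "city": "Paris", "age": 150},
]

for record in records:
    print(record["id"], check_constraints(record))
print(find_duplicates(records, "name"))
print(find_contradictions(records, "id", "name"))
print(suggest_spelling("Lodnon", ["London", "Paris"]))
```

In a database, the range and required-field rules would normally be declared as constraints so that invalid rows are rejected on entry. The instance-level checks are the sort of work that falls to data cleaning after the fact.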
