As a population, we are bombarded with percentages and statistics, but how do we know whether what we are being told is correct? The book How to Lie With Statistics by Darrell Huff was written to help readers better understand statistics, especially when they are presented in ways that can be misleading or misunderstood. The book is not meant as a guide to changing or manipulating statistical numbers. Rather, when statistics are presented improperly, or are purposely used to mislead, this book helps readers question the data and form their own opinions from it. Most people lose interest when they hear the word statistics, and many do not believe the numbers presented. This mistrust occurs most often for two reasons: the person cannot see the raw data or where and how it was collected, and the person cannot verify the credibility of the information presented. Throughout the book, Huff discusses statistical techniques that can be used improperly and how one can discern good statistics from those that may have been manipulated.
Huff starts with the sample with a built-in bias. The data collected at the beginning of a study, from which the statistics are later computed, originates from someone or something. One problem is that respondents may not answer a question honestly, so the responses are not truly unbiased. Built-in bias also occurs when the person picking the sample does not select the people or subjects truly at random, which introduces bias into the sample as well. One must be aware of how subjects are chosen and randomized.
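To see how much a non-random selection can distort a result, consider a minimal Python sketch. The population, the income figures, and the selection rule are all invented for illustration and do not come from Huff's book; the sketch only assumes that one sample is drawn at random and the other is not.

```python
import random
from statistics import mean

# Fixed seed so the random sample is the same on every run.
rng = random.Random(0)

# Hypothetical population: 1,000 people with incomes 1..1000 (arbitrary units).
population = list(range(1, 1001))

# Biased selection (an assumption for illustration): only the 100 highest
# earners end up in the sample, e.g. because only they chose to respond.
biased_sample = sorted(population)[-100:]

# Simple random sample: every person has the same chance of being picked.
random_sample = rng.sample(population, 100)

population_mean = mean(population)
biased_mean = mean(biased_sample)
random_mean = mean(random_sample)

print(f"population mean:    {population_mean}")
print(f"biased sample mean: {biased_mean}")
print(f"random sample mean: {random_mean}")
```

The biased sample reports an average far above the population's, while the random sample should land much closer to it, even though both samples contain the same number of people.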
On another topic, Huff writes about the well-chosen average. The average that one presents shapes the opinions the reader forms from the statistics. There are three types of averages: mean, median, and mode. The mean is the arithmetic average: the sum of the values divided by the number of values. The median is the middle point of the data: half of the values are higher and half are lower. The mode is the value that occurs most often in the data set. A presenter who understands the three types of averages can lead the reader by choosing which one to present. The reader always needs to know which type of average is used, since each is better or worse at representing the typical value in a given data set. A mean can be skewed by large outliers, a median tends to represent this type of skewed data better, and a mode best describes where the values occur most often. It is important to note which average is being presented so as not to be misled.
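The gap between the three averages is easy to check with a short Python sketch. The salary figures below are hypothetical, chosen only so that one large outlier is present; they are not taken from the book.

```python
from statistics import mean, median, mode

# Hypothetical annual salaries at a small firm (invented for illustration).
# Six modest salaries plus one very large one, which acts as the outlier.
salaries = [30_000, 30_000, 30_000, 35_000, 40_000, 45_000, 500_000]

mean_salary = mean(salaries)      # sum of the values / number of values
median_salary = median(salaries)  # middle value once the data is sorted
mode_salary = mode(salaries)      # most frequently occurring value

print(f"mean:   {mean_salary:,.0f}")
print(f"median: {median_salary:,.0f}")
print(f"mode:   {mode_salary:,.0f}")
```

With these numbers the mean is roughly 101,429, the median is 35,000, and the mode is 30,000. All three can truthfully be called "the average salary," yet the mean is pulled far above what six of the seven people actually earn.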
The reader also needs to understand how the sample data is selected. In particular, the sample must be large enough that its size alone does not drive the results. Many marketing campaigns are notorious for picking a sample size that gives them the results they are looking for. One...