Statistical Errors
(Photo : Ruthson Zimmerman on Unsplash)

In statistics, errors are something to watch out for and keep in check. Data collected cannot be 100% accurate. These errors could result from:

● Human error
● Sample size
● Sample collection method

While estimating sampling error for inferential statistics, one needs to check the precision of data collected from a survey. Precision is basically how closely distributed the data collected is. Broadly distributed data is therefore less precise than closely packed data. One of the best ways to assess precision is to determine the margin of error. The margin of error is largely determined by the size of the sample used for the survey.

For instance, suppose you were to study the weight of children in a kindergarten. If you weigh just 1% of the children, the estimate you obtain tends to vary far more widely than if you weigh, say, 50% of the children. Therefore, the larger the sample size, the more precise the estimate.
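The effect of sample size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the kindergarten of 100 children, the mean of 9.0 kg and the spread of 0.5 kg are hypothetical values chosen for illustration, and the helper name `spread_of_sample_means` is made up for this example.

```python
import random
import statistics

# Hypothetical kindergarten: 100 children, weights in kg (illustrative numbers only).
rng = random.Random(0)
population = [rng.gauss(9.0, 0.5) for _ in range(100)]

def spread_of_sample_means(sample_size, repeats=1000):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(1)   # weighing 1% of the children
spread_large = spread_of_sample_means(50)  # weighing 50% of the children

print(f"spread of estimates, 1 child:     {spread_small:.3f} kg")
print(f"spread of estimates, 50 children: {spread_large:.3f} kg")
```

The estimate based on 50 children should vary much less from one survey to the next than the estimate based on a single child.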

Now we get to the more complex calculation of the margin of error. This is very important when collecting sensitive data and when working at a professional level. Some of the main parameters that influence its value are:

Population: This is the total number of entries to study. In our example, this is the total number of children in the kindergarten.

Sample Size: This is the number of data entries collected for the survey. This is the number of children weighed during the survey.

Alpha level: This is the probability of rejecting the null hypothesis when it is actually true. Considering our earlier example, a null hypothesis would be, say, "that no child weighs differently from the others". Rejecting such a hypothesis would mean concluding, from the data, that the children's weights do differ.

The most commonly recommended value of alpha is 0.05, but others such as 0.1 and 0.01 are also acceptable.

Standard Deviation: This is a measure of how widely your data is spread around the mean. For cases where one does not have access to the raw data, a standard deviation of 0.5 is commonly assumed. Wherever possible, however, the standard deviation should be calculated from the data collected.
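When the raw data is available, the standard deviation can be computed directly. The sketch below uses Python's standard `statistics` module; the eight weights are hypothetical values invented for illustration, not data from the survey described here.

```python
import statistics

# Hypothetical weights in kg for eight sampled children (not from the article).
weights_kg = [8.5, 9.0, 9.5, 9.0, 8.0, 10.0, 9.0, 9.0]

sample_mean = statistics.mean(weights_kg)
# stdev() uses the n - 1 denominator, which suits data that is a sample
# rather than the whole population.
sample_sd = statistics.stdev(weights_kg)

print(f"mean = {sample_mean:.2f} kg, standard deviation = {sample_sd:.3f} kg")
```

If every child in the kindergarten had been weighed, `statistics.pstdev` (the population standard deviation) would be the appropriate call instead.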

The Z-Score: This is the deviation of a data point from the mean, expressed in units of the standard deviation. For instance, take a child weighing 10 kg, with the mean at 9.0 kg and a standard deviation of 0.5 kg.

The Z-Score can be determined by:

$$z = \frac{x - \mu}{\sigma} = \frac{10 - 9.0}{0.5} = 2$$

The Z-score can therefore be said to be 2.

A Z-score table exists where one can look up the confidence level associated with a given Z-Score.

From the table, a Z-Score of 1.96 is the standard entry closest to our calculated value. This corresponds to a confidence level of 95%.

Confidence level: This is the probability associated with the Z-score, that is, the proportion of the distribution that lies within that many standard deviations of the mean. This represents the data that 'matters', or rather the portion that represents the weights of the entire kindergarten.

Therefore, the alpha level and the confidence level add up to 1, or 100%.
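The link between the Z-score, the confidence level and alpha can be checked without a printed table. The following is a minimal sketch that assumes normally distributed data and a two-sided interval; the function names are made up for this example, while `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean, in units of the standard deviation."""
    return (x - mean) / sd

def two_sided_confidence(z):
    """Share of a normal distribution lying within z standard deviations of the mean."""
    return 2 * NormalDist().cdf(abs(z)) - 1

def z_for_confidence(confidence):
    """Z-score whose two-sided interval covers the given confidence level."""
    alpha = 1 - confidence  # alpha and the confidence level add up to 1
    return NormalDist().inv_cdf(1 - alpha / 2)

z = z_score(10, 9.0, 0.5)
print(f"z = {z}")
print(f"confidence for z = 2: {two_sided_confidence(z):.4f}")
print(f"z for 95% confidence: {z_for_confidence(0.95):.3f}")
```

A Z-score of exactly 2 covers slightly more than 95% of the distribution, which is why the table entry of 1.96 is the one conventionally quoted for a 95% confidence level.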

One might wonder why there is such great concern about the confidence level. This is because it influences the sample size needed.

With this information, it is now possible to determine the margin of error.

This can be determined using:

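$$\text{Margin of error} = z \times \frac{\sigma}{\sqrt{n}}$$

where z is the Z-Score for the chosen confidence level, σ is the standard deviation and n is the sample size.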

From our example, suppose the total population is 100 children. The margin of error can then be determined as follows.

From the Z-Score table, the Z-Score corresponding to a confidence level of 0.95 is 1.96.

Therefore, the margin of error can be determined as 10.316.
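For readers who want to experiment with their own numbers, here is a hedged sketch of the standard textbook calculation, margin of error = z × σ / √n, with an optional finite population correction for small populations such as a single kindergarten. The sample size of 25 is a hypothetical value, since the text does not state how many children were weighed, so the output is not meant to reproduce the figure quoted above. The function name and its parameters are likewise made up for this example.

```python
import math

def margin_of_error(z, sd, n, population=None):
    """Margin of error z * sd / sqrt(n), optionally corrected for a finite population."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    moe = z * sd / math.sqrt(n)
    if population is not None:
        if population <= 1 or n > population:
            raise ValueError("population must exceed 1 and be at least the sample size")
        # Finite population correction: shrinks the margin when the sample
        # makes up a large share of the whole population.
        moe *= math.sqrt((population - n) / (population - 1))
    return moe

z = 1.96   # Z-score for a 95% confidence level
sd = 0.5   # assumed standard deviation
n = 25     # hypothetical sample size (assumption, not from the article)

plain = margin_of_error(z, sd, n)
corrected = margin_of_error(z, sd, n, population=100)
print(f"margin of error, no correction:       {plain:.4f}")
print(f"margin of error, population of 100:   {corrected:.4f}")
```

The correction matters here because 25 children out of 100 is a large fraction of the population; for a sample drawn from a very large population it has almost no effect.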

If the raw data were accessible, the computation would be different: the standard deviation would be calculated from the data, and the sample size and the confidence level would be taken from the survey itself. With raw data, the evaluation might be more tedious but more accurate.

In conclusion, the margin of error can be reduced mainly by increasing the sample size. For our example, that would involve weighing more children. The margin of error is inversely proportional to the square root of the sample size, so quadrupling the number of children weighed roughly halves the margin of error. Raising the confidence level, by contrast, requires a larger Z-Score and therefore widens the margin of error for a given sample, so a larger sample is needed to keep the same margin at a higher confidence level.
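These trends can be seen numerically. The sketch below assumes the standard formula z × σ / √n and normally distributed data; the standard deviation of 0.5 follows the example, while the sample sizes are hypothetical values picked so that each is four times the previous one.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, sd, n):
    """Two-sided margin of error for a given confidence level, sd and sample size."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / math.sqrt(n)

sd = 0.5  # assumed standard deviation, as in the example

# Effect of sample size at a fixed 95% confidence level.
for n in (10, 40, 160):
    print(f"n = {n:3d}: margin of error = {margin_of_error(0.95, sd, n):.4f}")

# Effect of the confidence level at a fixed sample size of 40.
for confidence in (0.90, 0.95, 0.99):
    print(f"confidence = {confidence:.2f}: margin of error = "
          f"{margin_of_error(confidence, sd, 40):.4f}")
```

Each fourfold increase in the sample size halves the margin of error, while moving from 90% to 99% confidence widens it.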

With the margin of error minimized, the precision of the data collected is maximized, which is very important when estimating sampling error for inferential statistics.