Describe briefly how to calculate a confidence interval
What do we know
In a previous blog article, we provided a general overview of confidence intervals. We know so far that a confidence interval covers a range in which the true value of a parameter (e.g. the mean) lies with a certain probability (the confidence level) under repeated measurements. Thus, it targets the precision of the measurement.
In our example, we calculated that the true mean lies between the lower limit (396.4 mg) and the upper limit (401.2 mg) with 95% probability. This was based on a normal distribution and a randomly drawn, sufficiently large sample. In addition, we learned that the higher we raise the confidence level (for example, from 90% to 95%), the wider the confidence interval becomes and the more the precision drops. What we haven’t mentioned yet is that precision is measured by the standard error of the mean (SEM) of an estimate. The SEM takes into account the sample size n and the variation, i.e. the standard deviation σ, within the population (and thus also the sample): SEM = σ/√n. This makes it easy to understand the formula for the calculation of the confidence interval, where $\bar{x}$ is the sample mean (note: z is not the confidence level, but the corresponding z-value):
$$\bar{x} \pm z \cdot \frac{\sigma}{\sqrt{n}}$$
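To make the note about z concrete, here is a minimal sketch of how a confidence level maps to its z-value. It assumes a two-sided interval and uses `NormalDist` from Python's standard library. The helper name `z_value` is made up for this illustration.

```python
from statistics import NormalDist


def z_value(confidence_level: float) -> float:
    """Two-sided standard-normal critical value for a confidence level (hypothetical helper)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    # The two tails outside the interval share (1 - C) equally, so the
    # upper critical value sits at the (1 + C) / 2 quantile.
    return NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level -> z = {z_value(level):.3f}")
```

The z-value grows with the confidence level (roughly 1.645, 1.960 and 2.576 for 90%, 95% and 99%), which is why the interval widens when the confidence level is raised.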
What is required
According to the current ICH Q2(R1) guideline (dating from 1996 and 2005, respectively) on the validation of analytical methods for drug substances and drug products, confidence intervals have to be provided for both accuracy and all types of precision.
Regarding the validation of bioanalytical methods, confidence intervals are not required by the current documents of the EMA, the FDA, the Japanese or Brazilian Health Authority, although in bioanalytical methods, accuracy and precision are also validation parameters to be examined. However, the corresponding guidelines are more recent…
Across a large number of method validations with different clients, I encountered the explicit reporting of confidence intervals with only one client, and only because their own client requested it. Depending on the method to be validated and the year, the parameter precision was handled in every conceivable way: confidence intervals for every kind of precision, i.e. repeatability, intermediate precision and also reproducibility (since 2 QC sites were involved), but also a single confidence interval for an overall precision, all completely independent of sample size.
Which brings us to the subject. For me, from a statistical point of view, it doesn’t really make sense to calculate a confidence interval for a sample with n = 6 (which is usually the case for repeatability). How reasonable it is for n = 9 (accuracy with 3 replicates at 3 concentrations) or n = 12 (intermediate precision with 2 analysts and 6 replicates on two days) is for everyone to decide for themselves. For samples with n ≥ 15, however, the width of the confidence interval no longer changes much as n increases. For this reason, providing a confidence interval for an overall precision, which summarizes the results of all individual precision experiments, could be a good idea, at least from a statistical point of view. I have seen a confidence interval reported for accuracy only once, and then for each concentration individually, thus each time for n = 3...
Regardless of my personal experience, at least the Malaysian health authority seems to emphasize (or at least used to emphasize) the reporting of confidence intervals for accuracy and precision in analytical method validations.
Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?
In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution, then the sample mean will have the distribution $N(\mu, \sigma/\sqrt{n})$. Since the sample size is 6, the standard deviation of the sample mean is equal to 1.2/sqrt(6) = 0.49.
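As a quick check of the two numbers quoted in this example, the arithmetic can be written out, using $\bar{x}$ for the sample mean (a notational assumption), with n = 6 and σ = 1.2 taken from the example:

$$\bar{x} = \frac{102.5 + 101.7 + 103.1 + 100.9 + 100.5 + 102.2}{6} = \frac{610.9}{6} \approx 101.82$$

$$\frac{\sigma}{\sqrt{n}} = \frac{1.2}{\sqrt{6}} \approx \frac{1.2}{2.449} \approx 0.49$$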
Confidence Intervals for Unknown Mean and Known Standard Deviation
For a population with unknown mean $\mu$ and known standard deviation $\sigma$, a confidence interval for the population mean, based on a simple random sample (SRS) of size n, is $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$, where $z^*$ is the upper (1-C)/2 critical value for the standard normal distribution.
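Applied to the student's boiling-temperature readings, this rule can be sketched as follows. It is a minimal illustration that assumes the stated conditions hold (normally distributed measurements, known σ = 1.2). The function name `z_confidence_interval` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_confidence_interval(sample, sigma, confidence_level=0.95):
    """Interval for the mean when the population standard deviation is known.

    Hypothetical helper: returns (lower, upper) = mean -/+ z* * sigma / sqrt(n).
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")

    mean = fmean(sample)
    # z* is the upper (1 - C) / 2 critical value, i.e. the (1 + C) / 2 quantile.
    z_star = NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)
    margin = z_star * sigma / sqrt(n)
    return mean - margin, mean + margin


# Readings in degrees Celsius from the example; sigma = 1.2 is assumed known.
readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
lower, upper = z_confidence_interval(readings, sigma=1.2)
print(f"95% CI: {lower:.2f} to {upper:.2f} degrees Celsius")
```

For these six readings, the sketch gives an interval of about 100.86 to 102.78 degrees Celsius, i.e. 101.82 ± 1.96 × 0.49.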