Statistics: Suppose that the Internal Revenue Service (IRS) believes that the percentage of tax returns in …
Question: Suppose that the Internal Revenue Service (IRS) believes that the percentage of tax returns in error is 10%. If the IRS is correct, what is the probability that in a random audit of 1000 tax returns, between 70 and 90 tax returns will be in error?
Thanks
internal revenue audit
Best answer:
Answer by dr k
You can’t determine the probability just from the estimated average. You also have to know or assume the underlying distribution. Typically, when words like “random” are used, we assume a normal (bell-shaped) distribution. That may not always be the best assumption, but it’s what you use when you don’t have anything else.
Assuming a normal distribution with a mean of 100 (10% of 1000), you should be able to calculate the variance and standard deviation. There is a formula for the percentage of the population falling within a given number of standard deviations from the mean.
If you can figure out how many standard deviations away 70 is from the mean of 100, you can get the number expected to fall between 70 and 100. Do the same for 90, subtract, divide by 1000 and there you are.
Let Xb be the number of tax returns in error. Xb has the binomial distribution with n = 1000 trials and success probability p = 0.1.
In general, if Xb has the binomial distribution with n trials and a success probability of p, then
P[Xb = x] = n!/(x!(n-x)!) * p^x * (1-p)^(n-x)
for values of x = 0, 1, 2, …, n
P[Xb = x] = 0 for any other value of x.
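The formula above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer; the function name `binomial_pmf` is made up for the example, and `math.comb` supplies the n!/(x!(n-x)!) term.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P[Xb = x] for Xb ~ Binomial(n, p); zero outside x = 0, 1, ..., n."""
    if not isinstance(x, int) or x < 0 or x > n:
        return 0.0
    # comb(n, x) is n! / (x! (n - x)!)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Small hand-checkable case: 4 trials, p = 0.5, exactly 2 successes.
# 6 * 0.5^2 * 0.5^2 = 0.375
print(binomial_pmf(2, 4, 0.5))

# The same function applied to the audit problem, for a single value of x.
print(binomial_pmf(100, 1000, 0.1))
```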
To use the normal approximation to the binomial you must first validate that you have more than 10 expected successes and 10 expected failures. In other words, you need to have n * p > 10 and n * (1-p) > 10.
Some authors will say you only need 5 expected successes and 5 expected failures to use this approximation. If you are working towards the center of the distribution then this condition should be sufficient. However, the approximations in the tails of the distribution will be weaker, especially if the success probability is low or high. Using 10 expected successes and 10 expected failures is a more conservative approach but will allow for better approximations, especially when p is small or p is large.
In this case you have:
n * p = 1000 * 0.1 = 100 expected successes
n * (1 – p) = 1000 * 0.9 = 900 expected failures
We have checked and confirmed that there are enough expected successes and expected failures. Now we can move on to the rest of the work.
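As a rough sketch, the rule-of-thumb check can be written as a small helper. The name `normal_approx_ok` and the `threshold` parameter are assumptions for illustration; the default of 10 follows the more conservative rule described above, and passing 5 gives the looser rule some authors use.

```python
def normal_approx_ok(n: int, p: float, threshold: float = 10) -> bool:
    """True when both expected successes and expected failures exceed threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes > threshold and expected_failures > threshold

# Audit problem: 100 expected successes, 900 expected failures.
print(normal_approx_ok(1000, 0.1))

# A case that fails the conservative rule: only 5 expected successes.
print(normal_approx_ok(50, 0.1))
```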
If Xb ~ Binomial(n, p) then we can approximate probabilities using the normal distribution, where Xn is normal with mean μ = n * p, variance σ² = n * p * (1-p), and standard deviation σ = √( n * p * (1-p) ).
Xb ~ Binomial(n = 1000 , p = 0.1 )
Xn ~ Normal( μ = 100 , σ² = 90 )
Xn ~ Normal( μ = 100 , σ = 9.486833 )
I have noted two different notations for the Normal distribution, one using the variance and one using the standard deviation. In most textbooks and in most of the literature, the parameters used to denote the Normal distribution are the mean and the variance. In most software programs, the standard notation is to use the mean and the standard deviation.
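To illustrate the point about software notation, here is a minimal sketch using `statistics.NormalDist` from Python's standard library, which is parameterized by the mean and the standard deviation (not the variance). The variable names are chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.1

mu = n * p                    # mean of the approximating normal
variance = n * p * (1 - p)    # textbook parameter, sigma squared
sigma = sqrt(variance)        # software parameter, sigma

# NormalDist takes the standard deviation, so pass sigma rather than variance.
xn = NormalDist(mu=mu, sigma=sigma)

print(round(mu, 6), round(variance, 6), round(sigma, 6))
print(round(xn.variance, 6))  # the variance is recoverable from the object
```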
The probabilities are approximated using a continuity correction. We need to use a continuity correction because we are estimating discrete probabilities with a continuous distribution. The best way to make sure you use the correct continuity correction is to draw out a small histogram of the binomial distribution and shade in the values you need. The continuity correction accounts for the area of the boxes that would be missing or would be extra under the normal curve.
P( Xb < x) ≈ P( Xn < (x - 0.5) )
P( Xb > x) ≈ P( Xn > (x + 0.5) )
P( Xb ≤ x) ≈ P( Xn ≤ (x + 0.5) )
P( Xb ≥ x) ≈ P( Xn ≥ (x – 0.5) )
P( Xb = x) ≈ P( (x – 0.5) < Xn < (x + 0.5) )
P( a ≤ Xb ≤ b ) ≈ P( (a – 0.5) < Xn < (b + 0.5) )
P( a ≤ Xb < b ) ≈ P( (a – 0.5) < Xn < (b – 0.5) )
P( a < Xb ≤ b ) ≈ P( (a + 0.5) < Xn < (b + 0.5) )
P( a < Xb < b ) ≈ P( (a + 0.5) < Xn < (b – 0.5) )
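The four interval rules above follow one pattern: an included endpoint is pushed outward by 0.5 and an excluded endpoint is pulled inward by 0.5. A small sketch of that pattern, with a hypothetical helper name `corrected_interval`:

```python
def corrected_interval(a, b, include_a=True, include_b=True):
    """Map a discrete range on Xb to the continuous range used for Xn.

    include_a=True  means a <= Xb, so the lower bound becomes a - 0.5.
    include_a=False means a <  Xb, so the lower bound becomes a + 0.5.
    include_b=True  means Xb <= b, so the upper bound becomes b + 0.5.
    include_b=False means Xb <  b, so the upper bound becomes b - 0.5.
    """
    low = a - 0.5 if include_a else a + 0.5
    high = b + 0.5 if include_b else b - 0.5
    return low, high

print(corrected_interval(70, 90))                # 70 <= Xb <= 90
print(corrected_interval(70, 90, False, False))  # 70 <  Xb <  90
```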
In the work that follows Xb has the binomial distribution, Xn has the normal distribution and Z has the standard normal distribution.
Remember that for any normal random variable Xn, you can transform it into standard units via: Z = (Xn – μ ) / σ
P( 70 ≤ Xb ≤ 90 ) = ∑ (x = 70 to 90) P(Xb = x) = 0.1578600
≈ P( 69.5 < Xn < 90.5 )
= P( ( 69.5 – 100 ) / 9.486833 < Z < ( 90.5 – 100 ) / 9.486833 )
= P( -3.214982 < Z < -1.001388 )
= P( Z < -1.001388 ) – P( Z < -3.214982 )
= 0.1583196 – 0.0006522629
= 0.1576674
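For anyone who wants to reproduce these numbers, here is a minimal sketch using only Python's standard library. It computes the exact binomial sum and the continuity-corrected normal approximation side by side. The variable names are chosen for this example, and `NormalDist()` with no arguments is the standard normal Z.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1000, 0.1
a, b = 70, 90

# Exact binomial probability: sum P[Xb = x] for x = 70, ..., 90.
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(a, b + 1))

# Normal approximation with continuity correction: P(69.5 < Xn < 90.5).
mu = n * p
sigma = sqrt(n * p * (1 - p))
z_low = (a - 0.5 - mu) / sigma
z_high = (b + 0.5 - mu) / sigma

std_normal = NormalDist()  # Z ~ Normal(0, 1)
approx = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z_low  = {z_low:.6f}")
print(f"z_high = {z_high:.6f}")
print(f"exact  = {exact:.7f}")
print(f"approx = {approx:.7f}")
```

The two results should agree to about three decimal places, which is what the values 0.1578600 and 0.1576674 above show.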