For a nonlinear function of a random variable, the delta method uses a first-order Taylor approximation of the function, centered at the random variable's expected value, so that the approximate variance of the function is easy to compute.
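Let [g(\hat{\pi}) = \log\bigl(\hat{\pi}/(1-\hat{\pi})\bigr)]{.math .inline}. A linear approximation of [g]{.math .inline} is given by [\hat{g}(\hat{\pi}) = g(\pi) + g'(\pi)(\hat{\pi} - \pi),]{.math .display} the derivative of [g]{.math .inline} is given by [g'(\pi) = \frac{1}{\pi(1-\pi)},]{.math .display} and the variance of [\hat{g}(\hat{\pi})]{.math .inline} is given by [\operatorname{Var}(\hat{g}(\hat{\pi})) = (g'(\pi))^2 \operatorname{Var}(\hat{\pi}) = \frac{1}{\pi^2(1-\pi)^2} \frac{\pi(1-\pi)}{n} = \frac{1}{\pi(1-\pi)} \frac{1}{n}.]{.math .display}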
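As a rough numerical check of this approximation, the sketch below compares the delta-method variance [1/(n\pi(1-\pi))]{.math .inline} with the empirical variance of the log-odds over simulated binomial samples. The values [\pi = 0.3]{.math .inline} and [n = 200]{.math .inline}, the number of replications, and all function names are illustrative assumptions, not part of the problem.

```python
import math
import random
import statistics


def logit(p):
    """Log-odds g(p) = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def delta_method_var(pi, n):
    """Delta-method approximation to Var(logit(pi_hat)): 1 / (n * pi * (1 - pi))."""
    return 1.0 / (n * pi * (1.0 - pi))


def simulated_var(pi, n, reps, seed=0):
    """Empirical variance of logit(pi_hat) over `reps` simulated binomial samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        successes = sum(rng.random() < pi for _ in range(n))
        # logit is undefined at 0 and 1, so such (very rare) samples are skipped.
        if 0 < successes < n:
            values.append(logit(successes / n))
    return statistics.variance(values)


pi, n = 0.3, 200  # assumed values, chosen only for illustration
approx = delta_method_var(pi, n)
empirical = simulated_var(pi, n, reps=4000)
print(f"delta method: {approx:.5f}, simulation: {empirical:.5f}")
```

For moderate [n]{.math .inline} the two numbers should be close; the agreement typically degrades when [\pi]{.math .inline} is near 0 or 1, where the linear approximation is poor.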
Since we do not know [\pi]{.math .inline}, we approximate it with [\hat{\pi}]{.math .inline}, thus [\sigma^2\bigl(\log(\hat{\pi}/(1-\hat{\pi}))\bigr) = \frac{1}{\hat{\pi}(1-\hat{\pi})} \frac{1}{n}.]{.math .display}
# Problem 2
Consider data from a retrospective study on the relationship between daily alcohol consumption and the onset of esophageal cancer.
In a retrospective study, the counts in the table are sampled from the conditional distributions [P(X = i \mid Y = j)]{.math .inline}, and we denote [P(X = 1 \mid Y = j)]{.math .inline} by [p_j]{.math .inline}.
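The ML estimator of [p_j]{.math .inline} from the data in a retrospective study is given by [\hat{p}_j = \frac{n_{1j}}{n_{+j}}.]{.math .display}
In a retrospective study, an estimator of [\sigma(\log\hat{\theta})]{.math .inline} is given by [\hat{\sigma}(\log\hat{\theta}) = \left[\left(\frac{1}{\hat{p}_1} + \frac{1}{1-\hat{p}_1}\right)\frac{1}{n_{+1}} + \left(\frac{1}{\hat{p}_2} + \frac{1}{1-\hat{p}_2}\right)\frac{1}{n_{+2}}\right]^{1/2}.]{.math .display}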
Part (b)
The estimator [\hat{\sigma}(\log\hat{\theta})]{.math .inline} does not depend on the sampling scheme. When asymptotic normality holds, replacing parameters with statistics invokes the likelihood principle.
When we substitute the MLE [\hat{p}_j]{.math .inline} into the equation for [\hat{\sigma}(\log\hat{\theta})]{.math .inline}, we get the result [\hat{\sigma}(\log\hat{\theta}) = \left(\frac{1}{n_{11}} + \frac{1}{n_{21}} + \frac{1}{n_{12}} + \frac{1}{n_{22}}\right)^{1/2},]{.math .display} which is the same as for prospective studies and cross-sectional studies.
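The substitution can be spelled out for a single column. The notation [n_{2j} = n_{+j} - n_{1j}]{.math .inline} is assumed here, consistent with [n_{+j}]{.math .inline} being a column total. With [\hat{p}_j = n_{1j}/n_{+j}]{.math .inline}, [\left(\frac{1}{\hat{p}_j} + \frac{1}{1-\hat{p}_j}\right)\frac{1}{n_{+j}} = \left(\frac{n_{+j}}{n_{1j}} + \frac{n_{+j}}{n_{2j}}\right)\frac{1}{n_{+j}} = \frac{1}{n_{1j}} + \frac{1}{n_{2j}},]{.math .display} and summing over [j = 1, 2]{.math .inline} gives the four reciprocal cell counts under the square root.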
Part (c)
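Recall [\theta = \frac{p_1/(1-p_1)}{p_2/(1-p_2)}]{.math .inline}. If we replace [p_j]{.math .inline} by its ML estimator [\hat{p}_j]{.math .inline}, then [\hat{p}_1 = \frac{n_{11}}{n_{+1}} = \frac{71}{131} \approx 0.542]{.math .display} and [\hat{p}_2 = \frac{n_{12}}{n_{+2}} = \frac{82}{523} \approx 0.157.]{.math .display} Thus, by the invariance property of MLEs, [\hat{\theta} = \frac{\hat{p}_1/(1-\hat{p}_1)}{\hat{p}_2/(1-\hat{p}_2)} \approx \frac{0.542/0.458}{0.157/0.843} \approx 6.354]{.math .display} and therefore [\log\hat{\theta} \approx \log 6.354 \approx 1.850]{.math .inline}. Next, we need the estimated standard error of [\log\hat{\theta}]{.math .inline}, [\hat{\sigma}(\log\hat{\theta}) = \left[\left(\frac{1}{0.542} + \frac{1}{0.458}\right)\frac{1}{131} + \left(\frac{1}{0.157} + \frac{1}{0.843}\right)\frac{1}{523}\right]^{1/2} \approx 0.213.]{.math .display}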
Thus, a [95\%]{.math .inline} confidence interval for [\log\theta]{.math .inline} is [\log\hat{\theta} \pm 1.96\,\hat{\sigma}(\log\hat{\theta}).]{.math .display} Plugging in these computed values, we get the [95\%]{.math .inline} confidence interval [[1.850 - 0.417,\ 1.850 + 0.417] = [1.433,\ 2.267].]{.math .display}
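The calculation in this part can be reproduced with a short script. This is a minimal sketch: the function name is hypothetical, and the second-row counts are derived as column total minus first-row count, which is an assumption about the table layout rather than something stated in the problem.

```python
import math

# Counts from the worked example: n11 = 71 out of n+1 = 131, n12 = 82 out of n+2 = 523.
n11, n_col1 = 71, 131
n12, n_col2 = 82, 523
# Assumption: the remaining cells are column total minus first-row count.
n21 = n_col1 - n11
n22 = n_col2 - n12


def log_odds_ratio_ci(n11, n21, n12, n22, z=1.96):
    """Return (log odds ratio, standard error, lower bound, upper bound)."""
    # p1 / (1 - p1) = n11 / n21 and p2 / (1 - p2) = n12 / n22.
    log_theta = math.log((n11 / n21) / (n12 / n22))
    # Count form of the estimated standard error of the log odds ratio.
    se = math.sqrt(1 / n11 + 1 / n21 + 1 / n12 + 1 / n22)
    return log_theta, se, log_theta - z * se, log_theta + z * se


log_theta, se, lower, upper = log_odds_ratio_ci(n11, n21, n12, n22)
print(f"log odds ratio = {log_theta:.3f}, SE = {se:.3f}")
print(f"95% CI for log(theta): [{lower:.3f}, {upper:.3f}]")
```

Because the script works from the raw counts rather than from the rounded [\hat{p}_j]{.math .inline}, its output can differ from the hand calculation in the third decimal place.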
Part (d)
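Observe that [\hat{\gamma} = \frac{\hat{\theta} - 1}{\hat{\theta} + 1}.]{.math .display} The MLE is [\hat{\theta} \approx 6.354]{.math .inline}. Thus, [\hat{\gamma} \approx \frac{6.354 - 1}{6.354 + 1} \approx 0.728]{.math .inline}.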
We estimate that there is a strong positive association between daily alcohol consumption and the onset of esophageal cancer.
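The arithmetic in Part (d) can be checked with the following sketch, which maps the rounded odds-ratio estimate from Part (c) to the association measure. The function name is a hypothetical helper introduced only for illustration.

```python
def gamma_from_odds_ratio(theta):
    """Map an odds ratio theta > 0 to gamma = (theta - 1) / (theta + 1), a value in (-1, 1)."""
    return (theta - 1.0) / (theta + 1.0)


theta_hat = 6.354  # rounded estimate taken from Part (c)
gamma_hat = gamma_from_odds_ratio(theta_hat)
print(f"gamma_hat = {gamma_hat:.3f}")
```

An odds ratio of 1 (no association) maps to 0, and larger odds ratios move the measure toward 1, which is why a value near 0.73 is read as a strong positive association.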
# Problem 3
We populated the counts array with the values [(7, 7, 2, 3, 2, 8, 3, 7, 1, 5, 4, 9, 2, 8, 9, 14)]{.math .inline} and ran the code.