Discrete Multivariate Analysis - STAT 579 - Problem Set 7

Problem 1

Consider an experiment on chlorophyll inheritance in maize. A genetic theory predicts the ratio of green to yellow to be 3:1. In a sample of [n = 1103]{.math .inline} seedlings, [n_1 = 854]{.math .inline} were green and [n_2 = 249]{.math .inline} were yellow.

Part (a)

The statistic is defined as [X^2 = \sum_{j=1}^{c} \frac{(n_j - n \pi_{j0})^2}{n \pi_{j0}}.]{.math .display}

Under the null model, the odds are 3:1, so [\pi_{10} = 0.75]{.math .inline} and [\pi_{20} = 0.25]{.math .inline}. We are given [n = 1103]{.math .inline}, [n_1 = 854]{.math .inline}, and [n_2 = 249]{.math .inline}. Under the null model, the counts [n_1]{.math .inline} and [n_2]{.math .inline} have expectations [n \pi_{10} = 827.25]{.math .inline} and [n \pi_{20} = 275.75]{.math .inline}, respectively.

The observed statistic is thus given by [X_0^2 = \frac{(854 - 827.25)^2}{827.25} + \frac{(249 - 275.75)^2}{275.75} = 3.46.]{.math .display}
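The arithmetic above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `pearson_chi_square` is a hypothetical helper introduced here for illustration, not part of the original solution.

```python
def pearson_chi_square(observed, null_probs):
    """Pearson statistic: sum over cells of (n_j - n*pi_j0)^2 / (n*pi_j0)."""
    n = sum(observed)
    return sum(
        (n_j - n * pi_j0) ** 2 / (n * pi_j0)
        for n_j, pi_j0 in zip(observed, null_probs)
    )

# Observed counts (green, yellow) and the 3:1 null probabilities.
observed = [854, 249]
null_probs = [0.75, 0.25]

x2_obs = pearson_chi_square(observed, null_probs)
print(round(x2_obs, 2))  # 3.46
```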

Part (b)

The reference distribution is the chi-squared distribution with [1]{.math .inline} degree of freedom, denoted by [\chi^2(1)]{.math .inline}.

The upper 10th percentile given [1]{.math .inline} degree of freedom, denoted by [\chi_{0.10}^2(1)]{.math .inline}, is found by solving for [\chi_{0.10}^2(1)]{.math .inline} in the equation [\Pr(\chi^2(1) \geq \chi_{0.10}^2(1)) = 0.10]{.math .inline}, or equivalently [\Pr(\chi^2(1) \leq \chi_{0.10}^2(1)) = 1 - 0.10 = 0.90]{.math .inline}, which yields the result [\chi_{0.10}^2(1) = 2.71.]{.math .display}
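The critical value can be reproduced without a chi-squared table. The sketch below relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so the upper 10th percentile of [\chi^2(1)]{.math .inline} equals the square of the 0.95 quantile of the standard normal. The variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.10

# If Z is standard normal, Z^2 has a chi-squared(1) distribution, so
# Pr(Z^2 <= c) = Pr(|Z| <= sqrt(c)) = 1 - alpha  gives  sqrt(c) = z_{1 - alpha/2}.
z = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z ** 2

print(round(critical_value, 2))  # 2.71
```

This shortcut applies only when there is 1 degree of freedom; for other degrees of freedom a chi-squared quantile function or table is needed.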

Part (c)

Any observed statistic [X_0^2]{.math .inline} (with [\mathrm{df} = 1]{.math .inline}) greater than [\chi^2_{0.10}(1) = 2.71]{.math .inline} is not compatible with the null model at significance level [\alpha = 0.10]{.math .inline}.

Since the observed statistic [X_0^2 = 3.46 > 2.71]{.math .inline}, the null model, which is the genetic theory where the ratio of green to yellow is 3:1, is not compatible with the data.

Part (d)

Hypothesis testing yields only a dichotomous decision (compatible or not compatible with the null model at a fixed level), so it provides less information than a more quantitative measure of evidence. For instance, it does not provide information about effect size.
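As one illustration of a more quantitative measure, the sketch below computes a p-value for the observed statistic from Problem 1. The choice of the p-value here is our assumption, not something the solution prescribes, and `chi2_df1_sf` is a hypothetical helper name. It uses the identity [\Pr(\chi^2(1) \geq x) = \operatorname{erfc}(\sqrt{x/2})]{.math .inline}, which holds only for 1 degree of freedom.

```python
import math

def chi2_df1_sf(x):
    """Upper tail probability Pr(chi2(1) >= x), valid for 1 degree of freedom.

    Uses Pr(Z^2 >= x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))

x2_obs = 3.46  # observed statistic from Problem 1, Part (a)
p_value = chi2_df1_sf(x2_obs)

print(f"p-value = {p_value:.3f}")
print("below alpha = 0.10:", p_value < 0.10)
```

The p-value falls between 0.05 and 0.10, so the same data that are not compatible with the null model at [\alpha = 0.10]{.math .inline} would be judged compatible at [\alpha = 0.05]{.math .inline}, a distinction the dichotomous decision alone hides.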

Problem 2

Part (a)

The MLE of [\pi_1]{.math .inline} is given by [\hat{\pi}_1 = \frac{n_1}{n} = \frac{854}{1103} = 0.774]{.math .inline}. Letting [\alpha = 0.10]{.math .inline} and inverting the Wald test statistic, we get the [90\%]{.math .inline} confidence interval for [\pi_1]{.math .inline}, [\hat{\pi}_1 \pm z_{1-\alpha/2} \hat{\sigma}_{\hat{\pi}_1} = 0.774 \pm 1.645 \sqrt{\frac{0.774(1-0.774)}{1103}},]{.math .display} which may be rewritten as [[0.754, 0.795].]{.math .display}
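The interval can be verified numerically. This is a minimal sketch assuming the usual Wald standard error [\sqrt{\hat{\pi}_1 (1 - \hat{\pi}_1)/n}]{.math .inline}; the function name `wald_interval` is hypothetical and introduced only for this example.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, alpha):
    """Wald confidence interval for a binomial proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.645 for alpha = 0.10
    return p_hat - z * se, p_hat + z * se

lower, upper = wald_interval(854, 1103, alpha=0.10)
print(round(lower, 3), round(upper, 3))  # 0.754 0.795

# The null value 0.75 lies just below the lower endpoint.
print(lower <= 0.75 <= upper)  # False
```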

Part (b)

Based on the observed data, we estimate with [90\%]{.math .inline} confidence that the probability [\pi_1]{.math .inline} that a seedling is green is between [0.754]{.math .inline} and [0.795]{.math .inline}.

As expected given the test result in Problem 1, the null model specifies a value for [\pi_1]{.math .inline} ([0.75]{.math .inline}) that is not contained in this confidence interval.