CHI-SQUARE TEST OF INDEPENDENCE OF STUDENTS’ PERFORMANCE IN UME AND POST-UME

ABSTRACT

In chapter one, the general introduction and important definitions were stated. In chapter two, the chi-square distribution and the chi-square goodness-of-fit test were treated. In chapter three, the chi-square test of independence was highlighted.

Finally, in chapter four, we considered the application of the chi-square test of independence to real-life data, and a conclusion was drawn.

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

The Federal Government in the year 2005, through the then Minister of Education, Mrs. Chinwe Obaji, introduced the policy of post-UME (University Matriculation Examination) screening by universities. This policy mandated all tertiary institutions to carry out further screening of candidates, after their UME results, before offering admission. According to Obaji, candidates with a score of 200 and above would be short-listed by JAMB (the Joint Admissions and Matriculation Board) and their names and scores sent to their universities of choice, which would then undertake another screening in the form of aptitude tests, an oral interview, or even another examination. Obaji measured the success of her policy by appearing on national television to show cases of students who had scored 280 and above in the UME but could not score 20% in the post-UME screening. According to her, these students must have cheated in the JAMB examination, and so could not pass the post-UME screening, where there was no room for them to cheat or be impersonated.
Based on this policy of the then Minister of Education, the former Vice-Chancellor, Prof. E.A.C. Nwanze, implemented it by introducing the university's post-UME screening. Since then, the policy has been highly effective in the university. A score of 200 and above remains the benchmark for sitting the screening exercise.

The chi-square test provides a basis for judging whether more than two population proportions may be considered equal. It is discussed under two aspects: the chi-square goodness-of-fit test and the chi-square test of independence. The goodness-of-fit test provides a means of deciding whether a particular theoretical probability distribution, such as the binomial distribution, is a close approximation to a sample frequency distribution, while the test of independence is a method for deciding whether the hypothesis of independence between different variables is tenable; this procedure also provides a test for the equality of more than two population proportions. Both χ² tests furnish a conclusion on whether a set of observed frequencies differs so greatly from a set of theoretical frequencies that the hypothesis under which the theoretical frequencies were derived should be rejected.
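
As a sketch of how the test-of-independence statistic Σ(O − E)²/E is computed, the following Python fragment works through a small contingency table. The table itself is hypothetical, for illustration only, and is not the UME/post-UME data analysed later in this project.

```python
# Sketch of the chi-square test-of-independence statistic for a
# contingency table. The observed counts below are hypothetical,
# not the UME/post-UME data analysed in chapter four.

def chi_square_statistic(observed):
    """Compute sum((O - E)^2 / E) over all cells of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            stat += (o - e) ** 2 / e
    return stat

# Hypothetical table: rows = UME score band, columns = post-UME outcome.
table = [[30, 20],
         [20, 30]]
print(chi_square_statistic(table))  # 4.0 here; df = (2-1)*(2-1) = 1
```

The computed value would then be compared with the χ² critical value at (rows − 1)(columns − 1) degrees of freedom.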

The theorem of the chi-square distribution is given as follows. Let X1, X2, …, Xv be independent normally distributed random variables with mean zero and variance σ² = 1. Then X1² + X2² + … + Xv² = ∑ Xj² (summing over j = 1, …, v) is χ²-distributed with v degrees of freedom.
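
The theorem can be checked empirically. The following sketch (the sample size, seed, and v = 3 are my own arbitrary choices) sums squares of standard normal variates and shows the sample mean of the simulated values approaching v, which is the mean of the χ² distribution with v degrees of freedom.

```python
import random

# Empirical check of the theorem: if X1, ..., Xv are independent N(0, 1),
# then X1^2 + ... + Xv^2 is chi-square distributed with v degrees of
# freedom, whose mean is v. Sample size and seed are arbitrary choices.
random.seed(42)
v = 3                      # degrees of freedom
n = 20000                  # number of simulated chi-square variates
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(v)) for _ in range(n)]
mean = sum(samples) / n
print(round(mean, 2))      # close to v = 3
```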

For any decision to be made in statistics, there is a need to carry out hypothesis testing. Kreyszig (1988) defined a hypothesis as any reasonable assumption about the parameters of a distribution. In statistics it is usually impracticable to test a hypothesis on the whole population; in that case, a sample is drawn from the population and used to make inferences concerning the population. If the inference made is not in agreement with the stated assumptions, the hypothesis is rejected; otherwise it is not rejected.

Moreover, the procedure for testing hypotheses in parametric statistics helps us decide whether or not to reject a hypothesis, or to determine whether an observed sample differs significantly from the expected result. Hypothesis testing is therefore the process of making a decision based on the sample.
According to Rao (1952, 1970), various attempts have been made to build up a consistent theory from which all tests of significance can be deduced as solutions to precisely stated mathematical problems. It is difficult to argue whether such a theory exists or not, but formal theories leading to a clear understanding of the problems are nonetheless important. One such theory, contributed by Neyman and Pearson (1928, 1933), is an important development because it unfolded the various complex problems in the testing of hypotheses and led to the construction of general theories in problems of discrimination, sequential tests, etc.

There are many questions that arise in the course of hypothesis testing, such as: when should a hypothesis be rejected or not rejected? What is the probability that we will make a wrong decision, which could lead to a consequential loss? We may also ask whether two variables are independent, or whether a distribution follows a specific pattern. All these are likely questions that arise in decision making. However, with the chi-square statistical test, it is possible to provide answers to the above questions.

1.2 Objectives of the Study

1. To test students’ ability in carrying out a study independently.

2. To show students the use of chi-square tests and its application to real life problems.

3. To help students understand the processes involved in decision making.

4. To assist students in understanding hypothesis testing on the basis of sample data and in making statistical inferences.

1.3 Significance of the Study

This study is highly significant to students in mathematics, the social and management sciences, and all other managerial disciplines, as it aids understanding of the nitty-gritty of the process involved in decision making using the chi-square test of independence.

1.4 Scope of the Study

The scope of this study covers the University of Benin post-UME, the chi-square (χ²) distribution, the χ² test, and the procedure for implementing the χ² test.

1.5 Important Definitions

The following are some of the important concepts in statistical analysis:

Population: Ibrahim (2009) defined a population as a set of existing units (usually people, objects, transactions, or events) that we are interested in studying. A population may be finite or infinite. For example, the population consisting of the students in a mathematics department is finite, whereas the population consisting of all possible outcomes (heads, tails) in successive tosses of a coin is infinite.

Population Parameter: According to Dass (1988), the statistical constants of the population, such as the mean (µ) and standard deviation (σ), are called population parameters. Parameters are denoted by Greek letters.

Sample: Ibrahim (2009) defined a sample as a subset of the units of a population, that is, a portion or part of the population of interest. The method of selecting the sample is called the sampling procedure or sampling plan.

Statistical Hypothesis: According to Spiegel (1961), a statistical hypothesis is a statement about the value of a population parameter which may or may not be true. Such hypotheses are generally statements about the probability distribution of the population.

Null Hypothesis: According to Clark and Schkade (1969), the word null comes from the fact that this hypothesis assumes that there is “no significant difference” between the value of the universe parameter being tested and the value of the statistic computed from a sample drawn from that universe. Stated another way, the null hypothesis assumes that the difference between the parameter designated in the hypothesis and the statistic is merely a sampling difference. It is usually denoted by Ho.

Alternative Hypothesis: This is the hypothesis that will be accepted if statistical testing leads to rejection of the null hypothesis (Clark and Schkade, 1969). It is denoted by H1 or HA.

Variable: Ibrahim (2009) defined a variable as any quantity or attribute whose value varies from one unit of investigation to another, while a random variable is defined as a variable that takes on different numerical values because of chance. A random variable is best regarded as a number (standing for some event) which has not yet been observed, but which will be chosen by means of a chance mechanism.

One-tailed Test of Population Variance: According to Frank and Althoen (1994), the tail of a test is determined easily by the statement of the alternative hypothesis. The alternative hypothesis may be directional, and if it is directional, there are two possible test situations: the right-tailed and the left-tailed test. For the right-tailed test the hypothesis is given as:

Ho: σ² = σo²

vs

H1: σ² > σo²

And for the left-tailed test the hypothesis is given as:

Ho: σ² = σo²

vs

H1: σ² < σo²

Two-tailed Test of Population Variance: According to Frank and Althoen (1994), this tail is also determined by the alternative hypothesis. This is the case where the alternative hypothesis is non-directional, and there is only one possible test situation; such a hypothesis is given as:

Ho: σ² = σo²
vs
H1: σ² ≠ σo²
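
For these hypotheses, the usual test statistic is χ² = (n − 1)s²/σo², compared against a χ² critical value with n − 1 degrees of freedom. A minimal sketch in Python follows; the sample data and the hypothesised variance σo² = 4 are hypothetical values chosen for illustration.

```python
# Test statistic for a chi-square test of a population variance:
# chi2 = (n - 1) * s^2 / sigma0^2, compared against a chi-square
# critical value with n - 1 degrees of freedom. The data and
# sigma0^2 below are hypothetical, for illustration only.

def variance_test_statistic(data, sigma0_sq):
    n = len(data)
    mean = sum(data) / n
    s_sq = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    return (n - 1) * s_sq / sigma0_sq

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical sample, n = 8
print(variance_test_statistic(data, 4))   # 8.0, with df = n - 1 = 7
```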

Type I and Type II Errors: According to Spiegel (1961), if we reject a hypothesis when it should be accepted, we say that a type I error has been made. If on the other hand, we accept a hypothesis when it should be rejected, we say that a type II error has been made. In either case, a wrong decision or error in judgment has occurred.

Level of Significance: According to Bluman (1992), this is the probability of committing a type I error, that is, the probability of rejecting a true null hypothesis instead of accepting it. It is usually denoted by α, or the 100α% level.

Degree of Freedom: According to Dass (1988), the degree of freedom refers to the number of independent constraints “in a set of data”.

Rejection and Acceptance Region: According to Devore (2000), the rejection region is the region such that, if the calculated statistic falls in it, the null hypothesis is rejected. In the case of the χ² distribution, it is represented by the diagram below:

[Figure: χ² density curve with the rejection region in the upper tail, beyond the critical value χ²(n−1) at level α]

The acceptance region is the region such that, if the calculated statistic falls in it, the null hypothesis is accepted. It is also represented by the diagram below:

[Figure: χ² density curve with the acceptance region below the critical value χ²(n−1) at level α]

Probability Distribution: According to Ibrahim (2009), this is very similar to a frequency distribution, which shows how many times a given value in a range of values occurs. A probability distribution is therefore defined as a listing of all the outcomes of an experiment and the probability associated with each outcome.

Moment Generating Function: Guttman (1980) defined the moment generating function Mx(t) of a random variable X, for all values of t, by

Mx(t) = E(e^tX) = ∫ e^tx f(x) dx (integrating from −∞ to ∞), if X is continuous

Mx(t) = E(e^tX) = ∑ e^tx p(x) (summing over all x), if X is discrete

provided the integral or series converges.
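
As a small illustration of the discrete case (the fair six-sided die is my own example, not taken from this project), the sum ∑ e^tx p(x) can be evaluated directly; note that Mx(0) = 1 for any random variable, and that the derivative at t = 0 gives E(X).

```python
import math

# Moment generating function of a discrete random variable, evaluated
# as the sum of e^(t*x) * p(x). The fair six-sided die below is a
# hypothetical example; any MGF satisfies M(0) = 1.

def mgf_discrete(pmf, t):
    """pmf: dict mapping values x to probabilities p(x)."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}    # fair die
print(round(mgf_discrete(die, 0.0), 10))  # 1.0: probabilities sum to one

# M'(0) gives E(X); a small central difference approximates it:
h = 1e-6
print(round((mgf_discrete(die, h) - mgf_discrete(die, -h)) / (2 * h), 3))  # 3.5 = E(X)
```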

1.6 Limitation of the Study

This project work is limited to University of Benin UME and post-UME data from the Faculty of Physical and Life Sciences. Two sessions, 2008/2009 and 2009/2010, were considered.