Introduction to Statistics:

Common Terminology:

Statistics definition: Statistics is a scientific discipline devoted to the study of data.

Statistics is the study of how to collect, organize, analyze, and interpret numerical information from data.

Data: Collection of numbers assigned as values to quantitative variables or characters assigned as values to qualitative variables.

Data can be quantitative (numerical) or qualitative (non-numerical).

Data -> Information -> Knowledge


Statistic: A quantity calculated from a sample of data; a numerical measure that describes a characteristic of a sample.

Parameter: A numerical measure that describes a characteristic of a population.

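Eg: Average age of students, average Math grade, standard deviation of Math grade. Each of these is a parameter when computed over the whole population and a statistic when computed from a sample.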

Population: The entire collection of cases to which we want to generalize; all measurements or observations of interest.

Sample: a subset of a population.
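
The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is made-up data (hypothetical student ages), and the names `population` and `sample` are chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 students in a school (made-up data).
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
population = [rng.randint(17, 30) for _ in range(1000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample, i.e. a subset of the population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean age): {population_mean:.2f}")
print(f"statistic (sample mean age):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, which is why a statistic is treated as an estimate of the parameter.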

Descriptive statistics: Procedures used to summarize, organize, and simplify data.

Inferential statistics: Procedures that allow generalizations about population parameters based on sample statistics.
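
As a rough illustration of the two kinds of procedure, the sketch below first summarizes a small sample (descriptive) and then uses it to estimate a range for the population mean (inferential). The grades are made-up, and the 1.96 multiplier assumes a normal approximation, which is only a rough choice for a sample this small.

```python
import math
import statistics

# Made-up sample of Math grades for 10 students (illustrative only).
grades = [62, 71, 75, 78, 80, 83, 85, 88, 91, 97]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(grades)
median = statistics.median(grades)
stdev = statistics.stdev(grades)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: use the sample to say something about the population.
# Rough 95% interval for the population mean using the normal value 1.96
# (an approximation; a t-based interval would be somewhat wider for n = 10).
standard_error = stdev / math.sqrt(len(grades))
low = mean - 1.96 * standard_error
high = mean + 1.96 * standard_error

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"approximate 95% interval for the population mean: ({low:.1f}, {high:.1f})")
```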

Research methods:

Descriptive: Organize and summarize the data.

Correlational: Examine relationships among variables.

Experimental: Randomly assign subjects to conditions and compare the outcomes. For example, randomly assign students to different schedules:

- Year round

- Summer break

Then ask: is achievement affected by schedule?
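
Random assignment in the schedule example could look like the sketch below. The student IDs and the group names `year_round` and `summer_break` are hypothetical; a real study would go on to measure achievement in each group and compare them.

```python
import random

# Hypothetical student IDs; the seed is fixed only to make the sketch repeatable.
students = [f"student_{i}" for i in range(1, 21)]
rng = random.Random(42)

# Shuffle a copy so chance alone decides who lands in which group.
shuffled = students[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
groups = {
    "year_round": shuffled[:half],
    "summer_break": shuffled[half:],
}

for schedule, members in groups.items():
    print(schedule, len(members), members[:3], "...")
```

Because the split depends only on the shuffle, any later difference in achievement between the groups is less likely to be explained by how the students were chosen.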

Sampling: Selecting a part of the population to observe.

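Random Sample: A sample chosen by chance, so that (in a simple random sample) every member of the population has an equal chance of being selected.

- Samples are used to infer conclusions about populations

- Such conclusions are uncertain

- The uncertainty is measurable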
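
One common way to measure that uncertainty is the standard error of the sample mean. The sketch below assumes a made-up, roughly bell-shaped population and compares the standard error estimated from a single sample with the spread actually observed across many repeated random samples.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical population of 10,000 measurements (made-up data).
population = [rng.gauss(70, 10) for _ in range(10_000)]

# One random sample of size 100 and its mean.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

# Standard error: estimated spread of the sample mean across repeated samples.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Check the estimate by actually drawing many samples and looking at their means.
means = [statistics.mean(rng.sample(population, 100)) for _ in range(500)]
observed_spread = statistics.stdev(means)

print(f"sample mean: {sample_mean:.2f}")
print(f"estimated standard error: {standard_error:.2f}")
print(f"observed spread of 500 sample means: {observed_spread:.2f}")
```

The two spread figures should be of similar size, which is the sense in which the uncertainty of a conclusion drawn from one sample can be quantified.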


Levels of measurement:

1. Nominal Level

The nominal type, sometimes also called the qualitative type, differentiates between items or subjects based only on their names or (meta-)categories and other qualitative classifications they belong to. Examples include gender, nationality, ethnicity, language, genre, style, biological species, and form.

Nominal means “in name only”: the values are labels and are not intended for numerical calculation.

Central tendency for nominal level:
The mode, i.e. the most common item, is allowed as the measure of central tendency for the nominal type. On the other hand, the median, i.e. the middle-ranked item, makes no sense for the nominal type of data since ranking is not allowed for the nominal type.

Eg: Names of states
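
For nominal data such as state names, the mode is just the most frequent label. A minimal sketch with a made-up list of states:

```python
from collections import Counter
import statistics

# Made-up nominal data: home states of six survey respondents.
states = ["Texas", "Ohio", "Texas", "Utah", "Ohio", "Texas"]

# The mode works on labels because it only needs counting, not ordering.
mode = statistics.mode(states)
counts = Counter(states)

print(mode)                   # the most common state
print(counts.most_common())   # every state with its frequency
```

Sorting these labels alphabetically would be possible in code, but that order carries no statistical meaning, so a median of the names would not be a valid summary.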

2. Ordinal level

The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted, but still does not allow for a relative degree of difference between items.

Examples:

Dichotomous data (divided into two parts or classifications), such as ‘sick’ vs. ‘healthy’ when measuring health, ‘guilty’ vs. ‘innocent’ when making judgments in courts, or ‘wrong/false’ vs. ‘right/true’ when measuring truth value.

Non-dichotomous data consisting of a spectrum of values, such as ‘completely agree’, ‘mostly agree’, ‘mostly disagree’, ‘completely disagree’ when measuring opinion.

Data may be arranged in order, but differences between values are meaningless.

Central tendency for ordinal level:

The median, i.e. the middle-ranked item, is allowed as the measure of central tendency; however, the mean (or average) is not allowed. The mode is also allowed.

Eg: ranks of students in a class
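
A median for ordinal data can be found by ranking the categories and picking the middle-ranked response. The sketch below reuses the opinion scale from the examples above; the responses are made-up, and `median_low` is used so the result is always one of the actual categories rather than an average of two ranks.

```python
import statistics

# Ordered categories, from lowest to highest agreement.
levels = ["completely disagree", "mostly disagree", "mostly agree", "completely agree"]
rank = {label: i for i, label in enumerate(levels)}

# Made-up survey responses.
responses = [
    "mostly agree",
    "completely agree",
    "mostly disagree",
    "mostly agree",
    "completely disagree",
]

# Convert each response to its rank, then take the middle-ranked one.
ranks = [rank[r] for r in responses]
median_label = levels[statistics.median_low(ranks)]

print(median_label)
```

The ranks are used only for ordering. Averaging them would treat the gaps between categories as equal, which the ordinal level does not justify.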

3. Interval level

The interval type allows for the degree of difference between items, but not the ratio between them.

Data may be arranged in order, and differences are meaningful.

There may be no true zero starting point.

Ratios are meaningless.

Example: Temperature on the Celsius scale, years of inventions.

The mode, median, and arithmetic mean are allowed to measure central tendency of interval variables, while measures of statistical dispersion include range and standard deviation.
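
A short numeric check shows why differences are meaningful on an interval scale while ratios are not. The two temperatures are arbitrary illustrative values, converted from Celsius to Fahrenheit with the standard formula.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Two arbitrary temperatures in Celsius and their Fahrenheit equivalents.
a_c, b_c = 10.0, 20.0
a_f, b_f = celsius_to_fahrenheit(a_c), celsius_to_fahrenheit(b_c)

# Differences convert consistently: the Fahrenheit gap is the Celsius gap times 9/5.
diff_c = b_c - a_c
diff_f = b_f - a_f

# Ratios do not survive the change of scale, because zero is arbitrary.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

print(diff_c, diff_f)
print(ratio_c, ratio_f)
```

So 20 °C is 10 degrees warmer than 10 °C on either scale (after unit conversion), but it is not "twice as hot": the ratio changes as soon as the zero point moves.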

4. Ratio level

The ratio type takes its name from the fact that measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind.

It includes a true “zero” starting point, so both differences and ratios are meaningful.

Eg: Elapsed time, money, distance, mass, length, duration, plane angle, energy, and electric charge.
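
Because a ratio scale has a true zero, a ratio between two measurements stays the same when the unit changes. A minimal sketch with two arbitrary lengths, converted from meters to feet:

```python
METERS_PER_FOOT = 0.3048  # exact by definition of the international foot

# Two arbitrary lengths in meters and the same lengths in feet.
a_m, b_m = 5.0, 10.0
a_ft, b_ft = a_m / METERS_PER_FOOT, b_m / METERS_PER_FOOT

# The ratio is the same in both units (up to floating-point rounding).
ratio_m = b_m / a_m
ratio_ft = b_ft / a_ft

print(ratio_m, ratio_ft)
```

Here "twice as long" is a meaningful statement, unlike "twice as hot" on the Celsius scale.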