Tech Notes

My notes on Statistics, Big Data, Cloud Computing, Cyber Security


Basic Terms in Statistics – Glossary

A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is used to give information about the overall average in the population from which that sample was drawn.

A population is any entire collection of people, animals, plants or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about.

A sample is a group of units selected from a larger group (the population). By studying the sample, we hope to draw valid conclusions about the larger group.
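To make the three terms above concrete, here is a minimal sketch in Python. The population is simulated (10,000 made-up heights), so its mean can be computed directly; in practice the population value is usually unknown and only the sample value is available. The numbers, the seed and the variable names are all assumptions for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm (made-up numbers).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# A sample: 100 units drawn at random, without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)  # population value (normally unknown)
sample_mean = statistics.mean(sample)          # statistic computed from the sample

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two printed values should be close but not identical, which is the gap that the rest of this glossary is about.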

Statistical Inference
Statistical Inference makes use of information from a sample to draw conclusions (inferences) about the population from which the sample was taken.

A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. For example, the population mean is a parameter that is often used to indicate the average value of a quantity.

An estimator is any quantity calculated from the sample data which is used to give information about an unknown quantity in the population. For example, the sample mean is an estimator of the population mean.

Sampling Distribution
The sampling distribution describes probabilities associated with a statistic when a random sample is drawn from a population.
The sampling distribution is the probability distribution or probability density function of the statistic.
Derivation of the sampling distribution is the first step in calculating a confidence interval or carrying out a hypothesis test for a parameter.
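A sampling distribution can be approximated by simulation: draw many samples, compute the statistic for each, and look at the resulting collection of values. The sketch below does this for the sample mean. The skewed population, the sample size of 50 and the 2,000 repetitions are arbitrary choices for illustration, not part of the definition.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: exponential values with mean 1 / 0.1 = 10.
population = [rng.expovariate(0.1) for _ in range(20_000)]


def sample_means(n, repeats):
    """Draw `repeats` random samples of size n and return the mean of each."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


means = sample_means(n=50, repeats=2_000)

# The collection of means approximates the sampling distribution of the mean.
centre = statistics.mean(means)
spread = statistics.stdev(means)
print(f"mean of the sample means: {centre:.2f}")
print(f"sd of the sample means:   {spread:.2f}")
```

The centre of the simulated distribution should sit near the population mean, and its spread is what the Standard Error entry further down refers to.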

An estimate is an indication of the value of an unknown quantity based on observed data.

Measure of Error
An estimate is most useful when it comes with a margin of error, which gives us some sense of how good the estimate is.
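One common way to attach a margin of error to a sample mean is to multiply its standard error by a constant. The sketch below assumes a 95% level and the normal multiplier 1.96; neither comes from these notes, and for a sample this small a t multiplier would give a somewhat wider margin. The data values are made up.

```python
import math
import statistics

# Made-up sample of 10 measurements (illustrative values only).
sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.0]

n = len(sample)
estimate = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Assumption: 95% level with the normal multiplier 1.96.
margin_of_error = 1.96 * standard_error

print(f"estimate: {estimate:.2f} +/- {margin_of_error:.2f}")
```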

An experiment is any process or study which results in the collection of data, the outcome of which is unknown. In statistics, the term is usually restricted to situations in which the researcher has control over some of the conditions under which the experiment takes place.

Experimental (or Sampling) Unit
A unit is a person, animal, plant or thing which is actually studied by a researcher; the basic objects upon which the study or experiment is carried out. For example, a person; a monkey; a sample of soil; a pot of seedlings; a postcode area; a doctor’s practice.

Estimation is the process by which sample data are used to indicate the value of an unknown quantity in a population.

Sampling Variability
Sampling variability refers to the different values which a given function of the data takes when it is computed for two or more samples drawn from the same population.

Standard Error
Standard error is the standard deviation of the values of a given function of the data (a statistic), over all possible samples of the same size.
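Enumerating all possible samples is rarely feasible, but the standard error can be approximated by simulating many samples. The sketch below does this for the sample mean at two sample sizes and compares the result with the textbook value sigma / sqrt(n). The normal population with mean 100 and standard deviation 15 is a hypothetical choice.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is repeatable
SIGMA = 15.0            # hypothetical population standard deviation


def simulated_standard_error(n, repeats=3_000):
    """Standard deviation of the sample mean over many simulated samples of size n."""
    means = [
        statistics.fmean([rng.gauss(100.0, SIGMA) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


for n in (25, 100):
    simulated = simulated_standard_error(n)
    theoretical = SIGMA / math.sqrt(n)
    print(f"n={n:3d}  simulated SE={simulated:.2f}  sigma/sqrt(n)={theoretical:.2f}")
```

Quadrupling the sample size should roughly halve the standard error of the mean.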

Bias refers to how far the average value of a statistic, taken over many samples, lies from the parameter it is estimating; that is, the systematic error which arises when estimating a quantity. Errors from chance tend to cancel each other out in the long run; those from bias do not.
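A standard illustration of bias, offered here as an example rather than something taken from these notes, is the variance estimator that divides by n instead of n - 1. Averaged over many samples it stays below the true variance, while the n - 1 version averages out close to it. The population (normal with standard deviation 2) and the sample size of 5 are assumptions for the sketch.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed so the sketch is repeatable
TRUE_VARIANCE = 4.0      # hypothetical population: normal with sd 2
N, REPEATS = 5, 5_000

biased, unbiased = [], []
for _ in range(REPEATS):
    sample = [rng.gauss(0.0, 2.0) for _ in range(N)]
    biased.append(statistics.pvariance(sample))   # divides by n
    unbiased.append(statistics.variance(sample))  # divides by n - 1

avg_biased = statistics.fmean(biased)
avg_unbiased = statistics.fmean(unbiased)

print(f"true variance:              {TRUE_VARIANCE:.2f}")
print(f"average of /n estimates:    {avg_biased:.2f}")
print(f"average of /(n-1) estimates: {avg_unbiased:.2f}")
```

Taking more samples does not close the gap for the divide-by-n estimator, which is the sense in which errors from bias do not cancel out.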

An outcome is the result of an experiment or other situation involving uncertainty.

An event is any collection of outcomes of an experiment.

Sample Space
The set of all possible outcomes of a probability experiment is called a sample space.
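For a small experiment, the outcome, event and sample space definitions above can be written out directly. The sketch below uses two six-sided dice as a hypothetical experiment; the probability at the end additionally assumes the dice are fair, so that every outcome is equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of rolling two six-sided dice.
sample_space = set(product(range(1, 7), repeat=2))

# An event is a collection of outcomes; here, "the two dice sum to 7".
event = {outcome for outcome in sample_space if sum(outcome) == 7}

# Assumption: fair dice, so every outcome is equally likely.
probability = Fraction(len(event), len(sample_space))

print(len(sample_space))  # 36
print(sorted(event))      # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(probability)        # 1/6
```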

Disclaimer: These are my study notes, kept online instead of on paper so that others can benefit. In the process I have used some pictures and content from other original authors. All sources and original content publishers are listed below, and they deserve credit for their work. No copyright violation intended.

References for these notes: