Tech Notes

My notes on Statistics, Big Data, Cloud Computing, Cyber Security


Distributions, Variables, Relationship between Variables

Data Types :

  • Quantitative (Discrete or Continuous) – Numerical values for which arithmetic makes sense, such as counts (discrete) or measurements (continuous).


  • Categorical (Qualitative) – Each record falls into one of several categories, and the data can be grouped by category. For example, shoes in a cupboard can be grouped by colour. Categorical data can be illustrated using
  1. Bar charts
  2. Relative frequencies
  3. Pie chart
  • Nominal – Categories with no natural order; any codes are labels only. In a data set males could be coded as 0 and females as 1; marital status of an individual could be coded as Y if married and N if single.
  • Ordinal – Categories that can be ranked (put in order) or have a rating scale attached.
    For example, a taster could classify each variety of biscuit on a rating scale of 1 to 5, representing strongly dislike, dislike, neutral, like, strongly like.
  • Interval Scale
    An interval scale is a scale of measurement where the distance between any two adjacent units of measurement (or ‘intervals’) is the same, but the zero point is arbitrary.
    The time interval between the starts of years 1981 and 1982 is the same as that between 1983 and 1984, namely 365 days.
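
The data types above can be sketched in a few lines of Python. This is only an illustration using the standard library; the shoe colours, the biscuit ratings and the names `relative_frequencies`, `RATING_LABELS`, `gap_1981` and `gap_1983` are made up for the example and are not part of the original notes.

```python
from collections import Counter
from datetime import date

# Categorical (nominal) data: hypothetical shoe colours in a cupboard.
shoes = ["black", "brown", "black", "white", "black", "brown"]

def relative_frequencies(values):
    """Return {category: share of all observations}."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

shoe_shares = relative_frequencies(shoes)

# Ordinal data: the codes 1..5 carry an order, so sorting and the median make sense.
RATING_LABELS = {1: "strongly dislike", 2: "dislike", 3: "neutral",
                 4: "like", 5: "strongly like"}
biscuit_ratings = [4, 2, 5, 3, 4]  # hypothetical ratings, odd number of values
median_rating = sorted(biscuit_ratings)[len(biscuit_ratings) // 2]

# Interval data: differences are meaningful even though the zero point is arbitrary.
gap_1981 = (date(1982, 1, 1) - date(1981, 1, 1)).days
gap_1983 = (date(1984, 1, 1) - date(1983, 1, 1)).days

print(shoe_shares)
print(median_rating, RATING_LABELS[median_rating])
print(gap_1981, gap_1983)
```

The relative frequencies always sum to 1, which is what makes them suitable for a pie chart. Averaging the nominal codes (for example 0 and 1 for male and female) would not give a meaningful number, whereas the median of the ordinal ratings does.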

Ways to Visualize Data

  • Frequency Table
  • Pie Chart
  • Bar Chart
  • Dot Plot
  • Histogram
  • Stem and Leaf Plot (or Stemplot)
  • Box and Whisker Plot (or Boxplot)
  • Scatter Plot
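
As one example from the list above, a stem and leaf plot can be built as plain text. The sketch below assumes non-negative integers, uses the tens digit as the stem and the units digit as the leaf, and runs on a made-up sample; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the plot as a list of text lines, one per stem.

    Assumes non-negative integers: stem = value // 10, leaf = value % 10.
    """
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)

    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

scores = [12, 15, 21, 24, 24, 38, 40]  # hypothetical sample
for line in stem_and_leaf(scores):
    print(line)
```

Read sideways, the rows look like a histogram with a bin width of 10, but the individual values can still be recovered from the leaves.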

Distributions : The pattern of values in the data

Characteristics of Data

  • Outlier
    An outlier is an observation in a data set which is far removed in value from the others in the data set. It is an unusually large or an unusually small value compared to the others.
  • Symmetry
    Symmetry is implied when data values are distributed in the same way above and below the middle of the sample.
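
Both characteristics can be checked numerically. The sketch below is one possible approach and is not taken from the original notes: it flags outliers with the common 1.5 × IQR rule of thumb (an assumption, since the definition above does not fix a cutoff) and uses the difference between mean and median as a rough symmetry check. The function names and the sample are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3.

    The 1.5 * IQR cutoff is a convention, not part of the definition of an outlier.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

def mean_minus_median(values):
    """Rough symmetry check: close to 0 for symmetric data,
    positive when a long upper tail pulls the mean up."""
    return statistics.mean(values) - statistics.median(values)

sample = [10, 12, 13, 14, 15, 16, 18, 95]  # hypothetical data with one large value

print(iqr_outliers(sample))
print(mean_minus_median(sample))
print(mean_minus_median([1, 2, 3, 4, 5]))
```

Here the single large value is flagged as an outlier, and it also pulls the mean well above the median, so the sample is not symmetric about its middle.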
