Tech Notes

My notes on Statistics, Big Data, Cloud Computing, Cyber Security

Supervised and Unsupervised Learning, Machine Learning

Machine learning is a class of data-driven algorithms: unlike “normal” algorithms, it is the data that “tells” the algorithm what the “good answer” is. Example: a hypothetical non-machine-learning algorithm for face recognition in images would try to define what a face is (a round, skin-colored disk with dark areas where you expect the eyes, etc.). A machine learning algorithm has no such hard-coded definition; it learns by example. You show it several images of faces and non-faces, and a good algorithm will eventually learn to predict whether or not an unseen image is a face.

This particular example of face recognition is supervised, which means that your examples must be labeled: you explicitly say which ones are faces and which ones aren’t.

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.

Eg: It’s like teaching a child by holding out models of a house and a bike. The child works out what a house looks like and what a bike looks like and, based on that learning, can identify other houses and bikes from the features learnt earlier.
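To make the "labeled pairs in, inferred function out" idea concrete, here is a minimal sketch of one very simple supervised method, a nearest-centroid classifier. The two-number feature vectors and the labels are made up for illustration; a real face recognizer would use far richer features and a stronger model.

```python
import math

def train_nearest_centroid(examples):
    """Learn one centroid (mean feature vector) per label.

    examples: list of (feature_vector, label) pairs, i.e. labeled training data.
    """
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, features):
    """The inferred function: map a new input to the label of the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical labeled examples: (input vector, desired output).
training_data = [
    ([0.9, 0.8], "face"),
    ([0.8, 0.9], "face"),
    ([0.1, 0.2], "not-face"),
    ([0.2, 0.1], "not-face"),
]

model = train_nearest_centroid(training_data)
print(predict(model, [0.85, 0.75]))  # an unseen example close to the "face" group
```

The labels do all the work here: without the second element of each pair, the algorithm would have no way to know what the two groups are called.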

In an unsupervised algorithm your examples are not labeled, i.e. you don’t say anything. Of course, in such a case the algorithm cannot “invent” what a face is, but it may be able to cluster the data into different classes, e.g. it could distinguish that faces are very different from panoramas, which are very different from horses.

Eg: Following the example above, if there are lots of things on the kindergarten floor, the child has to look at them and work out the patterns among the objects for himself. The objective is fuzzy: nobody tells him what the groups are. He may say that these three things are similar, based on their features.
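Clustering without labels can be sketched with k-means, one common clustering algorithm (the notes above do not name a specific one, so this choice is an assumption). The points and the starting centers below are invented for illustration, and the starting centers are passed in by hand to keep the sketch deterministic.

```python
import math

def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate between assigning points and moving centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            [sum(coord) / len(cluster) for coord in zip(*cluster)] if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    assignments = [nearest_center(p, centers) for p in points]
    return centers, assignments

# Hypothetical unlabeled data: note there are no labels anywhere.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
centers, assignments = kmeans(points, centers=[points[0], points[3]])
print(assignments)  # group index for each point
```

The output only says which points belong together. The algorithm never learns that group 0 is "faces" or "horses"; naming the groups still needs a human or some labels.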

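Unsupervised learning is often used as a preprocessing step for supervised learning, for example to find groups or features in the data before any labels are used.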

There are “intermediate” forms of supervision, i.e. semi-supervised and active learning techniques. Technically, these are supervised methods in which there is some “smart” way to avoid needing a large number of labeled examples. In active learning, the algorithm itself decides which examples you should label (e.g. it can be pretty sure about a panorama and a horse, but it might ask you to confirm whether a picture of a gorilla is indeed a face). In one semi-supervised approach (co-training), there are two different algorithms, which start with the labeled examples and then “tell” each other what they think about a large number of unlabeled examples. From this “discussion” they learn.
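The "algorithm decides what you should label" step of active learning is often done by uncertainty sampling: ask about the example the current model is least sure of. The sketch below shows only that query step; the probabilities are invented for illustration and stand in for the output of some already-trained model.

```python
def most_uncertain(face_probability):
    """Pick the item whose estimated P(face) is closest to 0.5, i.e. least certain.

    face_probability: dict mapping an item name to the model's estimated
    probability that the item is a face (hypothetical values here).
    """
    return min(face_probability, key=lambda item: abs(face_probability[item] - 0.5))

# Made-up model outputs for three unlabeled pictures.
face_probability = {"panorama": 0.02, "horse": 0.10, "gorilla": 0.55}

query = most_uncertain(face_probability)
print(query)  # the picture the algorithm would ask a human to label
```

In a full active learning loop, the human's answer would be added to the labeled set, the model retrained, and the query step repeated.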

Disclaimer: These are my study notes, kept online instead of on paper so that others can benefit. In the process I have used some pictures / content from other original authors. All sources / original content publishers are listed below and they deserve credit for their work. No copyright violation intended.

References for these notes :

A discussion thread on

