What Are the Types of Machine Learning?
In the previous chapter you learned that machine learning means a system that improves at a task by learning patterns from data instead of following hand-written rules. But "learning from data" is not one single thing. How the system learns depends entirely on what kind of data you give it and what question you are trying to answer.
Think of it like different ways a new employee at a company can be trained:
- Supervised learning is like a senior analyst sitting beside a trainee, showing hundreds of past loan applications each already stamped "approved" or "rejected". The trainee learns the pattern from the labelled answers.
- Unsupervised learning is like handing that same trainee a pile of customer records with no labels and saying "find the natural groups". Nobody tells them the right answer — they must discover the structure themselves.
- Reinforcement learning is like a delivery-route trainee who is not told the best route, but gets a reward (faster delivery) or a penalty (a traffic jam) after each trip, and gradually learns a good strategy through trial and error.
These three — plus two useful hybrids, semi-supervised and self-supervised learning — cover almost everything you will meet in practice. Choosing the right paradigm is the single most important early decision in any ML project, because it determines which algorithms, which evaluation metrics, and which data you will need.
The one question that decides your paradigm:
→ Do I have labelled examples of the answer I want to predict?
Yes, and the label is a category → Supervised: Classification
Yes, and the label is a number → Supervised: Regression
No labels, want to find groups → Unsupervised: Clustering
No labels, want fewer features → Unsupervised: Dimensionality Reduction
No labels, want item associations → Unsupervised: Association
Learn by acting and getting reward → Reinforcement Learning
A few labels + lots of unlabelled → Semi-supervised Learning
Supervised Learning
Supervised learning trains a model on data where every example already has the correct answer attached — called the label or target. The model's job is to learn the mapping from input features (X) to the output label (y), so it can predict y for new, unseen inputs.
Given training pairs: (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ)
Learn a function: ŷ = f(X) that predicts y from X
Goal: minimise the error between predicted ŷ and true y
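To make "minimise the error" concrete, here is a minimal sketch that measures the gap between ŷ and y as mean squared error. This is one common error measure, chosen here as an assumption for illustration, and the values are made up.

```python
import numpy as np

# Hypothetical true targets y and model predictions y_hat (illustrative values)
y = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.5, 5.0, 8.0])

# Mean squared error: the average of the squared differences between y_hat and y
mse = np.mean((y_hat - y) ** 2)
print(mse)
```

Training a supervised model amounts to adjusting f so that a quantity like this gets smaller on the training pairs.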
The word "supervised" comes from the idea that the labels act as a supervisor or teacher, correcting the model during training. Supervised learning splits into two sub-types depending on what the label looks like.
Classification: Predicting a Category
In classification the target is a discrete category — a label from a fixed set of classes. The model outputs which class an example belongs to (and often a probability for each class).
- Binary classification: two classes, e.g. spam vs not-spam, fraud vs genuine.
- Multi-class classification: three or more classes, e.g. classifying a product image as {shirt, shoe, watch, bag}.
Real-world example (India): A private bank in Mumbai wants to predict whether a customer will default on a loan. Each past customer is labelled default or no-default. Features include monthly income (₹), credit score, existing EMIs, and age. The model learns the pattern and flags risky new applicants.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd
# Features (X) and category label (y): 1 = default, 0 = no default
df = pd.read_csv("loan_history.csv")
X = df[["income", "credit_score", "existing_emis", "age"]]
y = df["defaulted"] # discrete categories -> classification
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test) # e.g. array([0, 1, 0, 0, 1, ...])
Representative classification algorithms:
Logistic Regression, K-Nearest Neighbors, Decision Trees,
Random Forests, Naive Bayes, Support Vector Machines
You will study each of these in later chapters — for example Logistic Regression, Decision Trees, and Support Vector Machines (SVM).
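As noted above, a classifier often returns a probability for each class as well as a hard label. The sketch below shows both outputs side by side. It uses synthetic data from make_classification as a stand-in (an assumption, not the loan dataset), so it runs without any CSV file.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a labelled dataset (not real loan data)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X, y)

labels = model.predict(X[:5])        # hard class labels, one per row
proba = model.predict_proba(X[:5])   # one column per class, in model.classes_ order

print(labels)
print(proba.round(3))                # each row sums to 1
```

The probabilities are useful when the cost of a mistake differs by class, for example when a bank prefers to review any applicant whose predicted default probability is above a chosen threshold.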
Regression: Predicting a Number
In regression the target is a continuous number — a quantity on a scale. Instead of "which class?", the model answers "how much?" or "how many?".
Real-world example (India): A real-estate portal predicts the price (₹ lakhs) of a flat in Pune from its area (sq ft), number of bedrooms, floor, and distance to the metro. The label is a number, so this is regression.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
df = pd.read_csv("pune_flats.csv")
X = df[["area_sqft", "bedrooms", "floor", "metro_km"]]
y = df["price_lakhs"] # continuous number -> regression
# Fix random_state so the train/test split is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test) # e.g. array([62.4, 88.1, 41.7, ...]) in ₹ lakhs
Representative regression algorithms:
Linear Regression, Ridge / Lasso, K-Nearest Neighbors (regressor),
Decision Tree Regressor, Random Forest Regressor, SVR
A quick test to tell the two apart: if the answer to your question can be sensibly averaged, it is a number and you have regression (average price = ₹64 lakh makes sense). If averaging is nonsense (average of spam and not-spam?), you have classification.
Unsupervised Learning
Unsupervised learning works with data that has no labels — only features X, with no target y. There is no teacher and no "correct answer" to imitate. Instead, the goal is to discover hidden structure in the data itself. It answers open-ended questions like "what natural groups exist here?" or "how can I describe this data more simply?".
Because there is no ground-truth label, unsupervised results need human interpretation, and evaluation is trickier than in supervised learning. Unsupervised learning has three main jobs.
Clustering: Finding Natural Groups
Clustering partitions examples into groups (clusters) so that items in the same cluster are similar to each other and different from items in other clusters. Nobody defines the groups in advance — the algorithm finds them.
Real-world example (India): A large retail chain wants to run targeted offers. It clusters its customers by purchase frequency, average basket value (₹), and product categories bought. The algorithm discovers segments that the marketing team might then name "young value-seekers", "premium loyalists", and "occasional bulk buyers" before targeting each one. The algorithm itself only outputs cluster numbers; the names come from people interpreting the groups.
import pandas as pd
from sklearn.cluster import KMeans
df = pd.read_csv("customers.csv")
X = df[["visits_per_month", "avg_basket_value", "num_categories"]]
# No y! We ask for 3 groups; the algorithm finds them.
model = KMeans(n_clusters=3, random_state=42, n_init=10)
labels = model.fit_predict(X) # e.g. array([2, 0, 1, 2, 0, ...])
df["segment"] = labels # cluster number per customer; the numbering itself is arbitrary
You will cover K-Means Clustering and Hierarchical Clustering & DBSCAN in dedicated chapters.
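Because nobody supplies the number of groups, it helps to compare a few candidate values. The sketch below does this with the silhouette score on synthetic blobs. The data, the range of k, and the blob centres are all assumptions chosen for illustration, not customer records.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with three well-separated groups (illustrative, not customer data)
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=42,
)

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # between -1 and 1; higher is better

best_k = max(scores, key=scores.get)
print(scores)
print("best k by silhouette:", best_k)
```

On real customer data the scores are rarely this clear-cut, so treat the silhouette score as one input alongside whether the resulting segments are useful to the business.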
Dimensionality Reduction: Simplifying the Data
Dimensionality reduction compresses data that has many features (high-dimensional data) into far fewer features while keeping as much of the important information as possible. This speeds up models, removes noise and redundancy, and makes it possible to visualise data in 2-D or 3-D.
Real-world example: A genomics lab has patient records with 20,000 gene-expression columns. Most are redundant. Reducing them to, say, 30 combined components makes downstream models faster and less prone to overfitting.
from sklearn.decomposition import PCA
# X is assumed to be a numeric feature matrix with many columns
# Reduce many features down to 2 principal components (great for plotting)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X) # shape goes from (n, many) -> (n, 2)
print(pca.explained_variance_ratio_) # e.g. [0.61, 0.18] -> about 79% of the variance retained
The Dimensionality Reduction & PCA chapter goes deep on this.
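PCA looks for directions of largest variance, so the units of each column matter. The sketch below compares PCA on raw columns with PCA after standardisation. The two columns (income in rupees and age in years) are made-up, independent random data, assumed here only to show the effect of scale.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two independent, made-up columns on very different scales
income_rupees = rng.normal(600_000, 200_000, size=200)
age_years = rng.normal(35, 10, size=200)
X = np.column_stack([income_rupees, age_years])

# PCA on raw data: the large-scale column dominates the first component
raw_ratio = PCA(n_components=2).fit(X).explained_variance_ratio_

# Standardise first, then apply PCA
pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
pipeline.fit(X)
scaled_ratio = pipeline.named_steps["pca"].explained_variance_ratio_

print("raw:   ", raw_ratio.round(4))
print("scaled:", scaled_ratio.round(4))
```

On the raw data almost all the variance is attributed to the income column simply because its numbers are bigger. After scaling, the two independent columns share the variance far more evenly.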
Association: Finding Items That Go Together
Association rule learning finds items that frequently occur together. The classic case is market-basket analysis — discovering rules like "customers who buy atta and ghee also tend to buy sugar". These rules power "frequently bought together" recommendations.
Example association rule (illustrative):
{bread, butter} → {jam} support = 4%, confidence = 68%
Read as: baskets with bread and butter contain jam 68% of the time,
and this combination appears in 4% of all baskets.
Association is typically handled by algorithms like Apriori or FP-Growth (available in libraries such as mlxtend), rather than in core scikit-learn.
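Support and confidence are simple counts, so they can be computed by hand for a single rule. The sketch below does that in plain Python on five made-up baskets. The function name rule_metrics and the basket contents are hypothetical, and this is not how Apriori or FP-Growth search for rules; it only evaluates one rule that is already given.

```python
# Five made-up shopping baskets (illustrative data)
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "butter", "jam", "milk"},
    {"milk", "sugar"},
    {"bread", "jam"},
]

def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n_total = len(baskets)
    # Baskets that contain every item on the left-hand side
    n_antecedent = sum(1 for b in baskets if antecedent <= b)
    # Baskets that contain every item on both sides of the rule
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= b)
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

support, confidence = rule_metrics(baskets, {"bread", "butter"}, {"jam"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here two of the five baskets contain bread, butter, and jam together (support 0.40), and two of the three baskets with bread and butter also contain jam (confidence about 0.67).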
Reinforcement Learning
Reinforcement learning (RL) is a different beast. There is no fixed dataset of labelled examples. Instead an agent interacts with an environment, takes actions, and receives a reward (or penalty) signal. Over many trials the agent learns a policy — a strategy for choosing actions — that maximises its total reward over time.
The reinforcement learning loop:
Agent observes State → chooses Action → Environment returns Reward + new State
Repeat, updating the policy to maximise long-term cumulative reward.
Key terms:
Agent: the learner / decision-maker (e.g. a warehouse robot)
Environment: the world it acts in (the warehouse)
State: the current situation (robot position, shelf status)
Action: what the agent can do (move, pick, drop)
Reward: feedback signal (+ for a completed order, − for a collision)
Policy: the learned mapping from state to action
The crucial idea is delayed reward and trial-and-error: a good move now may only pay off many steps later, and the agent must balance exploration (trying new actions) against exploitation (using what already works).
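The exploration versus exploitation trade-off can be seen in a very small setting. The sketch below is an epsilon-greedy agent on a three-armed bandit, written in plain Python. It is a deliberate simplification: there is no state and no delayed reward, so it illustrates only the trial-and-error part of the loop. The function name run_bandit, the reward probabilities, and the value of epsilon are assumptions for illustration.

```python
import random

def run_bandit(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: returns its reward estimates and pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random action
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit best so far
        # The environment returns a reward of 1 with the arm's hidden probability
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental average: move the estimate towards the observed reward
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimates:", [round(e, 2) for e in estimates])
print("pulls per arm:", counts)
print("best arm:", best_arm)
```

The agent is never told which arm pays best. It spends a small share of its pulls exploring and the rest exploiting its current estimate, and over time most pulls go to the arm with the highest hidden reward probability.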
Real-world examples: A food-delivery app learning to assign riders to orders to minimise total delivery time; a game-playing agent (chess, Go); a robot in a Bengaluru warehouse learning the fastest picking routes; dynamic pricing that adjusts fares based on demand.
Reinforcement learning uses specialised libraries such as Gymnasium and Stable-Baselines3 rather than scikit-learn, and it is an advanced topic beyond this series. Just remember the shape of the problem: learn by doing and being rewarded, not from a labelled dataset.
Semi-Supervised and Self-Supervised Learning
Two hybrids sit between the pure paradigms and are increasingly important in modern ML.
Semi-supervised learning uses a small amount of labelled data together with a large amount of unlabelled data. Labelling is often expensive — imagine paying doctors to annotate thousands of X-rays. Semi-supervised methods label a few examples by hand, then use the abundant unlabelled data to improve the model. scikit-learn provides tools like SelfTrainingClassifier and LabelPropagation for this.
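A minimal sketch of this idea with SelfTrainingClassifier follows. It uses synthetic two-class data and pretends that only five examples per class were labelled by hand; both choices are assumptions for illustration. In scikit-learn's semi-supervised tools, unlabelled examples are marked with the label -1.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Synthetic two-class data (illustrative); y_true is kept only to check the result
X, y_true = make_blobs(
    n_samples=200, centers=[[-3, -3], [3, 3]], cluster_std=1.0, random_state=0
)

# Pretend only 5 examples per class were labelled by hand; -1 means "unlabelled"
y_partial = np.full_like(y_true, -1)
for cls in (0, 1):
    labelled_idx = np.flatnonzero(y_true == cls)[:5]
    y_partial[labelled_idx] = cls

# The wrapped classifier is retrained as confident predictions become pseudo-labels
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)

accuracy = (model.predict(X) == y_true).mean()
print("labelled examples used:", int((y_partial != -1).sum()))
print("accuracy against the hidden labels:", accuracy)
```

The two groups here are easy to separate, so ten labels go a long way. On harder data the benefit depends on whether the unlabelled examples really follow the same structure as the labelled ones.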
Self-supervised learning creates its own labels automatically from the raw data, with no human annotation, then learns from those. For example, a language model is trained to predict the next word in a sentence — the "label" is simply the actual next word already present in the text. This is the technique behind large language models and many modern vision models: the data supervises itself.
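The next-word idea can be sketched in a few lines of plain Python. The snippet turns one raw sentence into (context, next word) training pairs with no human annotation. Splitting on whitespace is a simplifying assumption; real language models use subword tokenisers and far more text.

```python
sentence = "the data supervises itself"
tokens = sentence.split()  # naive whitespace tokenisation (an assumption)

# Each pair is (words seen so far, the word that actually comes next)
pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in pairs:
    print(context, "->", target)
```

Every "label" on the right-hand side was already present in the text, which is why no annotator is needed.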
Quick contrast:
Semi-supervised : few human labels + many unlabelled examples
Self-supervised : zero human labels ; labels invented from the data itself
Comparison of the Main Paradigms
| Paradigm | Has labels? | Goal | Example algorithms | Example use case |
|---|---|---|---|---|
| Supervised — Classification | Yes (categorical y) | Predict a class | LogisticRegression, RandomForestClassifier, SVC | Loan default: yes or no |
| Supervised — Regression | Yes (numeric y) | Predict a number | LinearRegression, Ridge, RandomForestRegressor | Flat price in ₹ lakhs |
| Unsupervised — Clustering | No | Group similar items | KMeans, DBSCAN, AgglomerativeClustering | Customer segmentation |
| Unsupervised — Dimensionality Reduction | No | Fewer features, keep info | PCA, TruncatedSVD, TSNE | Compress gene-expression data |
| Unsupervised — Association | No | Find items bought together | Apriori, FP-Growth (mlxtend) | Market-basket "frequently bought together" |
| Reinforcement | No (reward signal) | Learn a policy to maximise reward | Q-Learning, DQN, PPO | Warehouse-robot routing |
| Semi-supervised | Few labels + unlabelled | Use unlabelled data to boost accuracy | SelfTrainingClassifier, LabelPropagation | Medical images with few annotations |
| Self-supervised | Labels created from data | Learn representations without human labels | Masked / next-token prediction | Pretraining language models |
How to Decide Which Type Your Problem Is
Work through these questions in order — they narrow the paradigm quickly.
1. Do you have a target/answer column already in your data?
- No → you are in unsupervised (or possibly reinforcement) territory.
- Yes → you are in supervised territory; go to question 2.
2. What does that target column look like?
- A category from a fixed set (fraud/genuine, or a product type) → classification.
- A number on a continuous scale (price, temperature, demand) → regression.
3. If you have no labels, what do you want to achieve?
- Group similar records → clustering.
- Reduce the number of features or visualise the data → dimensionality reduction.
- Find which items co-occur → association.
4. Are you learning by taking actions and receiving feedback over time, rather than from a fixed dataset?
- Yes → reinforcement learning.
5. Do you have only a handful of labels but lots of unlabelled data?
- Consider semi-supervised learning before spending money on more labels.
Worked decision — three business problems:
A) "Predict which customers will churn next month (churn / stay)."
Has a label? Yes. Category? Yes -> SUPERVISED CLASSIFICATION.
B) "Estimate next quarter's electricity demand in MW."
Has a label? Yes. A number? Yes -> SUPERVISED REGRESSION.
C) "Group 50,000 support tickets into themes (we don't know the themes)."
Has a label? No. Want groups? Yes -> UNSUPERVISED CLUSTERING.
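The questions above can be written down as a small helper, shown here as a sketch. The function name choose_paradigm, its parameter names, and the order in which it checks the questions are hypothetical simplifications, not a standard API. It is applied to the three business problems A, B, and C.

```python
from typing import Optional

def choose_paradigm(
    has_labels: bool,
    label_kind: Optional[str] = None,   # "category" or "number"
    goal: Optional[str] = None,         # "groups", "fewer_features", "associations"
    learns_by_acting: bool = False,
    mostly_unlabelled: bool = False,
) -> str:
    """Map answers to the decision questions onto a paradigm name."""
    if learns_by_acting:                        # question 4
        return "reinforcement learning"
    if has_labels and mostly_unlabelled:        # question 5
        return "semi-supervised learning"
    if has_labels:                              # questions 1 and 2
        if label_kind == "category":
            return "supervised classification"
        if label_kind == "number":
            return "supervised regression"
        raise ValueError("label_kind must be 'category' or 'number'")
    unsupervised = {                            # question 3
        "groups": "unsupervised clustering",
        "fewer_features": "unsupervised dimensionality reduction",
        "associations": "unsupervised association",
    }
    if goal in unsupervised:
        return unsupervised[goal]
    raise ValueError("without labels, goal must be one of: " + ", ".join(unsupervised))

problem_a = choose_paradigm(has_labels=True, label_kind="category")   # churn / stay
problem_b = choose_paradigm(has_labels=True, label_kind="number")     # demand in MW
problem_c = choose_paradigm(has_labels=False, goal="groups")          # ticket themes
print(problem_a, problem_b, problem_c, sep="\n")
```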
A note on feature vs label: the same field can be either, depending on the question. income is a feature when predicting loan default, but the label if your goal is to predict a person's income. Always define your target first — it decides everything else.
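The sketch below shows the same point with a tiny made-up table: the income column sits in X for one question and becomes y for the other. The column names echo the loan example, and the values are invented for illustration.

```python
import pandas as pd

# Made-up records (illustrative values only)
df = pd.DataFrame({
    "income": [30000, 55000, 42000, 80000],
    "credit_score": [640, 720, 580, 790],
    "defaulted": [1, 0, 1, 0],
})

# Question 1: "Will this customer default?" -> income is a feature
X_default = df[["income", "credit_score"]]
y_default = df["defaulted"]

# Question 2: "What is this person's income?" -> income is the label
X_income = df[["credit_score", "defaulted"]]
y_income = df["income"]

print(list(X_default.columns), "->", y_default.name)
print(list(X_income.columns), "->", y_income.name)
```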
Common Pitfalls
1. Forcing a label onto an unsupervised problem
If you do not actually have trustworthy labels, do not fabricate them just so you can run a classifier. Clustering may be the honest choice. Fake labels lead to a model that looks accurate but learned nothing real.
2. Confusing classification with regression
Predicting a rating from 1 to 5 looks numeric, but if only whole stars are allowed and the order matters little, it may be better treated as classification (or ordinal). Ask whether averaging the target makes sense before committing.
3. Judging clusters as if they were "correct"
Unsupervised results have no ground truth. Two valid clusterings of the same customers can both be useful. Do not chase a single "right" answer — evaluate whether the groups are actionable, using metrics like the silhouette score plus domain judgement.
4. Reaching for reinforcement learning too early
RL is powerful but data-hungry and hard to stabilise. Most business problems that people label "RL" are actually supervised prediction problems. Use RL only when the task genuinely involves sequential actions with delayed rewards.
5. Doing dimensionality reduction and forgetting to scale first
Techniques like PCA are sensitive to feature scale: a column with a large numeric range (income in ₹) will dominate one with a small range (age in years). Always standardise features (covered in Feature Engineering & Scaling) before applying PCA or distance-based clustering.
6. Leaking the target into the features
In supervised learning, make sure no feature secretly encodes the answer (for example, using loan_status = closed_due_to_default to predict default). This inflates accuracy in testing and collapses in production.
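Target leakage is easy to reproduce on synthetic data. The sketch below trains the same model twice: once on honest features, and once with an extra column that is simply a copy of the target. The dataset, the helper name holdout_accuracy, and the choice of a decision tree are assumptions for illustration. It shows only the inflated test score, not the later failure in production.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic data: flip_y injects label noise, so no honest model is perfect
X, y = make_classification(n_samples=1000, n_features=5, flip_y=0.2, random_state=0)

# A leaky feature: an extra column that is just a copy of the target
X_leaky = np.column_stack([X, y])

def holdout_accuracy(features, target):
    """Train on 75% of the rows and report accuracy on the remaining 25%."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, target, test_size=0.25, random_state=0
    )
    model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    return model.score(X_te, y_te)

honest_acc = holdout_accuracy(X, y)
leaky_acc = holdout_accuracy(X_leaky, y)
print("honest features:", honest_acc)
print("with leaked target:", leaky_acc)
```

The leaky model scores perfectly because it has learned to read the answer from the extra column. In production that column would not exist at prediction time, so a score that looks too good is a reason to audit the features.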
Practice Exercises
- For each task, name the paradigm and sub-type: (a) predict tomorrow's Nifty 50 closing value; (b) sort 10,000 news articles into unknown topics; (c) flag a UPI transaction as fraud or genuine; (d) recommend products often bought together on an e-commerce site.
- A hospital wants to predict a patient's length of stay in days. Is this classification or regression? What is the target, and what are two possible features?
- You have 1,000 labelled resumes (shortlist/reject) and 100,000 unlabelled resumes. Which paradigm would you consider to make the most of all this data, and why?
- Explain, in one sentence each, the difference between the label in supervised learning and the reward in reinforcement learning.
- Given a dataset of 200 features and only 500 rows, which unsupervised technique might you apply before modelling, and what problem does it help with?
- Write two lines of scikit-learn pseudocode: one that fits a classifier on (X_train, y_train), and one that fits a KMeans clustering model on X with 4 clusters. Note which one uses a label and which does not.
Summary
In this chapter you learned:
- Supervised learning uses labelled data (X, y) to learn ŷ = f(X); it splits into classification (predict a category) and regression (predict a number).
- Unsupervised learning uses unlabelled data X to discover structure via clustering (find groups), dimensionality reduction (fewer features), and association (items that co-occur).
- Reinforcement learning has an agent take actions in an environment to maximise a long-term reward, learning a policy by trial and error rather than from a fixed dataset.
- Semi-supervised learning combines a few labels with lots of unlabelled data; self-supervised learning generates its own labels from the raw data (the engine behind modern language models).
- The fastest way to identify your paradigm is to ask "Do I have labels, and is the label a category or a number?" — then follow the decision tree.
- The comparison table maps each paradigm to its goal, representative algorithms, and a real use case, and the Common Pitfalls section flags traps like forcing labels, target leakage, and skipping feature scaling.
Getting the paradigm right up front tells you which algorithms to reach for, which metrics to trust, and how to prepare your data — everything downstream depends on this choice.
Next up: The Machine Learning Workflow — the end-to-end process, from framing the problem and preparing data through training, evaluating, and deploying a model, that ties all these paradigms together.