Machine Learning

1. Introduction

Machine learning is a field of computer science that focuses on how computers can learn patterns from data and improve their performance over time. Instead of following fixed rules, machine learning systems study examples and use them to make predictions or decisions. A common example is teaching a computer to recognize handwritten postal codes by showing it many images of handwritten digits and their correct labels.

Machine learning is now one of the fastest‑growing areas in computing. This document explains four major types of machine learning: supervised learning, unsupervised learning, semi‑supervised learning, and active learning.

2. Overview Table

Learning Type	Uses Labels?	Main Goal	Example Application
Supervised Learning	Yes	Predict known categories	Handwritten digit recognition
Unsupervised Learning	No	Discover hidden patterns	Customer segmentation
Semi‑Supervised Learning	Partially	Improve accuracy using unlabeled data	Web page classification
Active Learning	Yes (selected by model)	Reduce labeling cost	Medical image labeling

3. Figure: Machine Learning Categories

                     +-----------------------------+
                     |       Machine Learning      |
                     +-----------------------------+
                    /         |               |     \
                 /            |               |         \
              /               |               |             \
         Supervised     Unsupervised   Semi-supervised    Active
          Learning        Learning        Learning       Learning

Figure Explanation:
This diagram shows the four major categories of machine learning. Supervised learning uses labeled data, unsupervised learning uses unlabeled data, semi‑supervised learning uses both, and active learning lets the model choose which examples a human expert should label.

4. Supervised Learning

Supervised learning uses labeled data to train a model. Each training example includes both the input and the correct output. The model learns by comparing its predictions with the true labels and adjusting itself to reduce errors.

Example

In postal code recognition, the training dataset contains thousands of handwritten digit images and their correct labels. These labeled examples guide the model to learn how to classify new handwritten digits accurately.

Supervised learning is widely used in classification tasks such as spam detection, medical diagnosis, and speech recognition.
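
The compare-and-adjust loop described above can be sketched with a perceptron, one of the simplest supervised classifiers. This is a minimal illustration rather than a digit recognizer: the two-feature toy data, the function names, and the learning rate are all assumptions made for the example.

```python
def train_perceptron(examples, labels, epochs=20, lr=1.0):
    """Learn weights and a bias from labeled examples (labels are +1 or -1)."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation >= 0 else -1
            if predicted != y:
                # Wrong prediction: shift the boundary toward the true label.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


def predict(weights, bias, x):
    """Classify one input with the learned linear boundary."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else -1


# Hypothetical toy data: each input has two features and a label of +1 or -1.
train_x = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0),
           (-2.0, -1.0), (-3.0, -2.5), (-1.5, -3.0)]
train_y = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(train_x, train_y)
print(predict(weights, bias, (3.0, 3.0)))    # 1
print(predict(weights, bias, (-2.5, -2.0)))  # -1
```

The model only changes when a prediction disagrees with its label, which is the sense in which the labels supervise the training.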

5. Unsupervised Learning

Unsupervised learning uses data that has no labels. The goal is to discover hidden structures or patterns. A common method is clustering, where the algorithm groups similar items together.

Example

If we give a clustering algorithm many handwritten digit images without labels, it may group them into 10 clusters. These clusters often correspond roughly to the digits 0–9, even though the algorithm does not know what each cluster means.

Unsupervised learning is useful for tasks such as customer segmentation and anomaly detection.
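
Clustering as described above can be illustrated with k-means, one common clustering method. The sketch below assumes a handful of made-up 2-D points and, to stay deterministic, seeds the centroids with the first k points; practical implementations usually choose the starting centroids more carefully.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (assignments, centroids)."""
    centroids = list(points[:k])  # simplification: first k points as seeds
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the clustering has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return assignments, centroids


# Hypothetical unlabeled data: two loose groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
assignments, centroids = k_means(points, k=2)
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

The cluster numbers 0 and 1 are arbitrary identifiers. As with the digit clusters, the algorithm finds the groups but attaches no meaning to them.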

6. Semi‑Supervised Learning

Semi‑supervised learning uses both labeled and unlabeled data. This method is especially useful when labeled data is expensive or difficult to obtain.

Figure: Decision Boundary Refinement

Before using unlabeled data:     After using unlabeled data:

   Positive ● ●                     Positive ● ●
   Negative ○ ○                     Negative ○ ○
   Dashed line = rough boundary     Solid line = refined boundary

Explanation

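Using only labeled data produces a rough decision boundary, because a few labeled points say little about where each class begins and ends. When unlabeled data is added, the model can see where examples cluster and move the boundary into the sparse region between the clusters. It can also help flag labeled examples that look mislabeled or noisy.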

Semi‑supervised learning is widely used in text classification, speech recognition, and web content analysis.
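
One simple way to use unlabeled data is self-training: the model labels the unlabeled point it is most confident about, adds it to the training set, and repeats. The source does not name a specific method, so the sketch below is only an illustration of the boundary refinement described above. It assumes 1-D data, a nearest-centroid classifier with exactly two classes, and made-up values.

```python
def centroids_of(labeled):
    """Mean position of each class in a list of (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in labeled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, value):
    """Return (label of the nearest centroid, confidence margin)."""
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    nearest, runner_up = ranked[0], ranked[1]
    margin = abs(value - centroids[runner_up]) - abs(value - centroids[nearest])
    return nearest, margin


def self_train(labeled, unlabeled):
    """Pseudo-label the pool one point at a time, most confident first."""
    labeled, pool = list(labeled), list(unlabeled)
    while pool:
        centroids = centroids_of(labeled)
        best = max(pool, key=lambda v: classify(centroids, v)[1])
        labeled.append((best, classify(centroids, best)[0]))
        pool.remove(best)
    return centroids_of(labeled)


# Hypothetical data: only two labeled points, plus nine unlabeled ones.
labeled = [(0.0, "neg"), (4.0, "pos")]
unlabeled = [0.5, 1.0, 1.5, 2.5, 3.0, 6.0, 6.5, 7.0, 7.5]

before = centroids_of(labeled)
after = self_train(labeled, unlabeled)
print(classify(before, 3.0)[0])  # pos
print(classify(after, 3.0)[0])   # neg
```

With labels alone the boundary sits at 2.0, so the point 3.0 is assigned to the positive class. After the unlabeled points are absorbed, the boundary moves into the gap between the two groups and 3.0 joins the negative group around it. Self-training can also reinforce an early mistake, so the pseudo-labels should be treated as guesses, not ground truth.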

7. Active Learning

Active learning allows human experts to participate in the learning process. Instead of passively receiving labeled data, the model selects the examples it expects to be most informative and asks an expert to label them.

Example

A medical diagnosis system may ask a doctor to label only the patient cases it is most uncertain about. The aim is to reach good accuracy with fewer labels than labeling cases at random would require.

Active learning is especially useful in fields where labeling requires expert knowledge, such as medical imaging.
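
The query loop described above is often implemented as uncertainty sampling: the model asks for a label on the example it is least sure about. The sketch below is a toy version under stated assumptions. The `ask_expert` function is a hypothetical stand-in for a human annotator, the threshold 5.0 is a made-up ground truth, and the model is a nearest-centroid classifier on 1-D values with two classes.

```python
def nearest_centroid_model(labeled):
    """Per-class mean of 1-D (value, label) pairs."""
    by_label = {}
    for value, label in labeled:
        by_label.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in by_label.items()}


def predict(model, value):
    return min(model, key=lambda label: abs(value - model[label]))


def uncertainty(model, value):
    """Higher when the two nearest class means are almost equally close."""
    distances = sorted(abs(value - mean) for mean in model.values())
    return -(distances[1] - distances[0])


def active_learn(labeled, pool, ask_expert, budget):
    """Spend `budget` label requests on the most uncertain pool items."""
    labeled, pool = list(labeled), list(pool)
    queried = []
    for _ in range(budget):
        model = nearest_centroid_model(labeled)
        query = max(pool, key=lambda v: uncertainty(model, v))
        pool.remove(query)
        labeled.append((query, ask_expert(query)))  # the expert supplies the label
        queried.append(query)
    return nearest_centroid_model(labeled), queried


def ask_expert(value):
    # Hypothetical annotator; a real system would ask a person here.
    return "pos" if value >= 5.0 else "neg"


seed = [(0.0, "neg"), (10.0, "pos")]
pool = [1.0, 2.0, 3.0, 4.0, 4.5, 5.5, 6.0, 7.0, 8.0, 9.0]
model, queried = active_learn(seed, pool, ask_expert, budget=3)
print(queried)  # [4.5, 6.0, 5.5]
```

All three queries fall near the true threshold, where a label tells the model the most. In this toy run the final model agrees with the expert on every pool value after only three requested labels.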

8. Conclusion

Machine learning provides essential tools for analyzing data and making intelligent decisions.

Supervised learning uses labeled data to train accurate models.
Unsupervised learning discovers hidden structures without labels.
Semi‑supervised learning improves performance by combining labeled and unlabeled data.
Active learning reduces labeling cost by involving human experts.

Together, these methods form the foundation of modern data mining and intelligent decision‑making systems.
