Machine learning, Classification and Data Mining - MAT4102
Engineering Master’s program (~90 tutorial-equivalent hours), Télécom SudParis, 2025
This is a short description of the lessons given in MAT4102. The aim of this course is to cover the fundamentals of machine learning, with a particular emphasis on how learning algorithms are constructed. The course is divided into two parts: lectures, where the statistical models and algorithms are described, and practical sessions, where the algorithms' performance is assessed on real data. All practical sessions are done using Python notebooks.
Course 1: AI & Machine Learning Introduction
Course objectives:
- AI and machine learning: definitions and presentation of some important concepts that have contributed to their development.
- Provide an overview of the interface between physics and mathematics
- Explain supervised and unsupervised learning
- Explain the data processing and analysis chain
- Give the “basic” methodology of data processing and analysis
- Present evaluation tools for automatic classification systems based on supervised learning
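To make the last point concrete, below is a minimal sketch of such evaluation tools using scikit-learn (the labels are hypothetical; in the labs they come from a trained classifier):

```python
# Minimal sketch: evaluating a supervised classifier with scikit-learn.
# The labels below are hypothetical placeholders for real model outputs.
from sklearn.metrics import confusion_matrix, classification_report

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # ground-truth classes
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]   # classifier predictions

# Confusion matrix: rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred))

# Per-class precision, recall and F1-score, plus overall accuracy.
print(classification_report(y_true, y_pred))
```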
Course 2: Unsupervised learning (Clustering)
Course objectives:
- Introduction to clustering and evaluation of clustering quality.
- Presentation of unsupervised learning methods: partitioning methods (K-Means, K-Medoids), hierarchical methods, density-based methods (DBSCAN), and probabilistic methods based on Gaussian Mixture Models (GMM).
- Presentation of the evaluation methodology in the context of unsupervised learning.
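A minimal sketch of these two objectives, assuming scikit-learn and synthetic data: K-Means clustering evaluated with the silhouette score, one common internal quality measure.

```python
# Minimal sketch: K-Means on synthetic 2-D data, with the silhouette
# score as an internal measure of clustering quality.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic individuals grouped around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Silhouette in [-1, 1]: higher means tighter, better-separated clusters.
print("silhouette:", silhouette_score(X, km.labels_))
```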
Course 3: Unsupervised learning (Lab session):
The aim of this tutorial is to study and compare the unsupervised learning algorithms covered in the course (K-Means, K-Medoids, DBSCAN and Gaussian Mixture Models), techniques widely used for grouping individuals, i.e. to partition $N$ input individuals $X = \{X_1, \ldots, X_N\}$ into $K$ groups or "clusters". The individuals are described in a $p$-dimensional space ($X_i \in \mathbb{R}^p$). This lab session is split into 3 parts:
- Application of clustering algorithms (K-Means, K-Medoids, GMM, DBSCAN) to MNIST data and analysis of the strengths and weaknesses of these algorithms.
- Use of clustering algorithms for image compression and segmentation.
- Natural Language Processing introduction (Word2Vec) and semantic analysis of words using unsupervised algorithms.
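As an illustration of the first part, here is a minimal sketch assuming scikit-learn, with its small digits dataset standing in for MNIST (the lab's data loading may differ; K-Medoids is omitted because it lives in the separate scikit-learn-extra package):

```python
# Minimal sketch: comparing three clustering algorithms on digit images.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

X, y = load_digits(return_X_y=True)  # 1797 images, p = 64 features

labelings = {
    "K-Means": KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X),
    "GMM": GaussianMixture(n_components=10, random_state=0).fit_predict(X),
    # eps is data-dependent and usually tuned (e.g. with a k-distance plot).
    "DBSCAN": DBSCAN(eps=20, min_samples=5).fit_predict(X),
}

# Adjusted Rand index compares each clustering to the true digit labels.
for name, labels in labelings.items():
    print(name, adjusted_rand_score(y, labels))
```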
Course 4: Decision Trees, Random Forests and Ensemble Methods:
Course objectives:
- Introduction to decision trees: fundamental concepts, tree construction and interpretation (the usual splitting criteria are recalled after this list).
- Presentation of ensemble methods:
- Bagging: general principles and application with Random Forests.
- Boosting: general principles and application with AdaBoost and Gradient Boosting.
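For reference, the two impurity criteria most commonly used during tree construction (with $p_k$ denoting the proportion of class $k$ among the $K$ classes in a node $t$; notation assumed here) are the Gini index and the entropy:

$$\mathrm{Gini}(t) = 1 - \sum_{k=1}^{K} p_k^2, \qquad H(t) = -\sum_{k=1}^{K} p_k \log_2 p_k$$

A split is chosen so as to maximize the decrease in impurity between the parent node and the weighted average of its children.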
Course 5: Decision Trees, Random Forests and Ensemble Methods (Lab session):
The aim of this tutorial is to apply the theoretical concepts covered in the course on decision trees and aggregation methods, by comparing them in the context of supervised classification. Students work with a database containing $N$ individuals, represented by input vectors $X = \{X_1, \ldots, X_N\}$, each described in a $p$-dimensional space ($X_i \in \mathbb{R}^p$). Since the objective is supervised classification, the students also have the labels $Y = [y_1, \ldots, y_N]^T \in \mathbb{R}^N$, corresponding to the values to be predicted. Thus, each individual $i$ in the database is characterized by the pair $(X_i, y_i)$. This lab session is split into 3 parts:
- Application of a decision tree and a random forest to Fisher's Iris database and to a database designed to predict diabetes from clinical data.
- Application of decision trees, random forests and AdaBoost to the MNIST database, and comparative analysis of the performance of these algorithms.
- Application of ensemble methods on a database used for breast cancer prediction from clinical data. Comparative performance analysis of bagging vs. boosting algorithms.
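A minimal sketch of the third part, assuming scikit-learn and using its built-in breast cancer dataset as a stand-in for the clinical database used in the lab:

```python
# Minimal sketch: single tree vs. bagging vs. boosting on held-out data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random Forest (bagging)": RandomForestClassifier(n_estimators=200, random_state=0),
    "AdaBoost (boosting)": AdaBoostClassifier(n_estimators=200, random_state=0),
}

# Compare held-out accuracy of a single estimator vs. the two ensembles.
for name, model in models.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))
```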
Course 6: Artificial Neural Networks (ANNs)
Course objectives:
- Introduction to artificial neurons: historical background and mathematical modeling
- Presentation of neural networks:
- Multi-Layer Perceptron
- Activation functions: linear, sigmoid, softmax, ReLU, Tanh, …
- Cost functions: MSE, MAE, cross-entropy, categorical cross-entropy, …
- Optimizers: gradient descent (stochastic), Adam.
- Neural network optimization by example:
- Linear regression
- Logistic regression (a minimal sketch of this example follows the list)
- Introduction to autoencoders
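To illustrate the "optimization by example" item above, here is a minimal NumPy sketch (synthetic data; names and notation assumed) of logistic regression viewed as a one-neuron network with a sigmoid activation, trained by batch gradient descent on the cross-entropy cost:

```python
# Minimal sketch: logistic regression as a one-neuron network,
# trained by batch gradient descent on the binary cross-entropy cost.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                # N = 200 individuals, p = 2
y = (X[:, 0] + X[:, 1] > 0).astype(float)    # synthetic binary labels

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))       # sigmoid activation
    grad_w = X.T @ (p - y) / len(y)          # cross-entropy gradient w.r.t. w
    grad_b = np.mean(p - y)                  # cross-entropy gradient w.r.t. b
    w -= lr * grad_w                         # gradient descent step
    b -= lr * grad_b

# p holds the predictions from the last iteration.
print("training accuracy:", np.mean((p > 0.5) == y))
```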
Course 7: Perceptrons and Artificial Neural Networks (Lab session):
The aim of this tutorial is to apply the theoretical concepts covered in the course on perceptrons and artificial neural networks (ANNs). This lab session is composed of 3 parts:
- Implementation and application of the first perceptrons and ANNs on synthetic data.
- Application of the previously implemented models to stroke prediction from clinical data.
- Application of autoencoders to ECG anomaly detection.
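A minimal sketch of the third part's idea, assuming TensorFlow/Keras and synthetic signals standing in for the ECG data: an autoencoder is trained on normal examples only, and a large reconstruction error flags an anomaly.

```python
# Minimal sketch: reconstruction-based anomaly detection with a
# dense autoencoder (synthetic signals stand in for real ECG data).
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(1000, 140)).astype("float32")  # 140-sample signals

# Dense autoencoder: compress to a small latent code, then reconstruct.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(140,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),   # latent code
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(140),                    # reconstruction
])
model.compile(optimizer="adam", loss="mse")
model.fit(normal, normal, epochs=5, verbose=0)     # train on normal data only

# Signals with a high reconstruction error are flagged as anomalies.
recon = model.predict(normal, verbose=0)
errors = np.mean((recon - normal) ** 2, axis=1)
threshold = errors.mean() + 2 * errors.std()
print("flagged:", np.sum(errors > threshold))
```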
Course 8: Variational autoencoders and convolutional neural networks (CNNs)
Course objectives:
- Introduction to variational autoencoders
- Presentation of convolutional neural networks:
- Convolution filter
- Convolution parameters (padding, strides, size)
- Pooling
- Gradient backpropagation
- Introduction to prediction interpretation techniques: example of Grad-CAM
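A minimal Keras sketch tying the listed building blocks together (the 28x28 grayscale input size is an assumption for illustration):

```python
# Minimal sketch: a small CNN showing filters, padding, strides and pooling.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    # 16 convolution filters of size 3x3; "same" padding keeps 28x28.
    tf.keras.layers.Conv2D(16, kernel_size=3, padding="same", activation="relu"),
    # Pooling halves the spatial resolution: 28x28 -> 14x14.
    tf.keras.layers.MaxPooling2D(pool_size=2),
    # Strides of 2 also downsample: 14x14 -> 7x7.
    tf.keras.layers.Conv2D(32, kernel_size=3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()  # shows how each layer changes the tensor shape
```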
Course 9: Convolutional Neural Networks (Lab session):
The aim of this tutorial is to apply the theoretical concepts covered in the course on CNNs. It is composed of 3 parts:
- Design and implementation of a CNN for pneumonia detection on chest X-ray images.
- Management of class imbalance (a sketch follows this list) and analysis of evaluation metrics.
- Interpretation of model predictions using the Grad-CAM algorithm.
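As announced in the list above, a minimal sketch of class-imbalance management via class weights, assuming scikit-learn and Keras (the label counts are hypothetical):

```python
# Minimal sketch: weighting the loss to compensate for class imbalance.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 1200 + [1] * 300)   # hypothetical imbalanced labels

# Inverse-frequency weights: the minority class gets a larger weight.
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_train)
class_weight = dict(enumerate(weights))
print(class_weight)  # {0: 0.625, 1: 2.5}

# Passed to Keras so the loss penalizes minority-class errors more:
# model.fit(X_train, y_train, class_weight=class_weight, ...)
```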