Deep Learning

Feb 24, 2024 · 1 min read

All the different classes can be found here!

This is the CC66204 course from the Universidad de Chile. Based on Ivan Sipiran's class, I added details to each lecture to show how we get to the recent large multimodal models. The GitHub repository is here.
Here’s a summary:

  1. General introduction: Overview of the class, a refresher on Machine Learning,…
  2. [TODO] Basics: Perceptron, Vanilla Gradient Descent, MLP, Backprop
  3. [TODO] Losses and Activations: General losses, Softmax, Cross-Entropy (CE), Activation functions (a softmax/cross-entropy sketch follows this list)
  4. [TODO] Initialization and Optimization: Weight initialization, Advanced gradient-descent variants
  5. Regularization: Penalization, Dropout, Data augmentation
  6. [TODO] Convolutional Layer: Convolution, Padding, Pooling, LeNet
  7. Computer Vision Architectures: ImageNet, Revolution of depth, Classical classifier architectures
  8. Transfer Learning: Motivation, Principle, Types of TL, Weight unfreezing, Pre-training datasets, SoTA
  9. Object Detection: Principle, IoU and mAP (see the IoU sketch after this list), Classical Object Detection, Segmentation and Mask-RCNN, SoTA
  10. [TODO] Recurrent Layer: Sequential Modeling, RNN, LSTM, GRU,…
  11. [TODO] Attention:
  12. [TODO] Transformers:
  13. Generative Large Language Models: Language Modeling and Temperature (see the sampling sketch after this list), Abilities and In-Context Learning, Tokenization, Instructions, Alignment, Reasoning, Training and Evaluating in Practice, LLMs as Agents
  14. Large Multimodal Models: Multimodality, Fusion, Original tasks and datasets, Early multimodal transformers, CLIP and text2image Diffusion, Frozen encoders, BLIP 1/2/3 and LMM Assistants, Open-source training datasets, LMM evaluation, Video, Multimodal Tokenization
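
To make the loss material of class 3 concrete, here is a minimal numpy sketch of the softmax / cross-entropy pair; the logit values and target label are made up for illustration and are not taken from the course notes.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    z = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(z)
    return exp / exp.sum(axis=-1, keepdims=True)

def cross_entropy(logits, target):
    # Negative log-probability assigned to the correct class
    probs = softmax(logits)
    return -np.log(probs[target])

logits = np.array([2.0, 0.5, -1.0])   # made-up scores for 3 classes
print(softmax(logits))                # probabilities summing to 1
print(cross_entropy(logits, target=0))
```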
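
For class 9, a minimal sketch of IoU between two axis-aligned boxes, assuming the usual (x1, y1, x2, y2) corner format; the box coordinates are illustrative only.

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ≈ 0.143
```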
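
And for the temperature part of class 13, a minimal sketch of temperature-scaled sampling from next-token logits; the logit values are made up, and the helper name is just for this example.

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    # Divide logits by the temperature: T < 1 sharpens the distribution
    # (closer to greedy decoding), T > 1 flattens it (more random)
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                      # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)      # sampled token index

logits = [4.0, 2.0, 0.5]                        # made-up next-token scores
for t in (0.2, 1.0, 2.0):
    print(t, sample_with_temperature(logits, temperature=t))
```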