Deep Learning
Feb 24, 2024 · 1 min read
All the different classes can be found here!
This is the CC66204 course from the Universidad de Chile. Based on Ivan Sipiran's class, I added details to each of the classes, making it possible to understand how we get to the recent large multimodal models. The GitHub repository is here.
Here’s a summary:
- General introduction: Overview of the class, reminders from Machine Learning, …
- [TODO] Basics: Perceptron, Vanilla Gradient Descent, MLP, Backprop
- [TODO] Losses and Activations: General losses, Softmax, CE, Activation functions
- [TODO] Initialization and Optimization: Weights initialization, Complex gradient descents
- Regularization: Penalization, Dropout, Data augmentation
- [TODO] Convolutional Layer: Convolution, Padding, Pooling, LeNet
- Computer Vision Architectures: ImageNet, Revolution of depth, Classical classifiers architectures
- Transfer Learning: Motivation, Principle, Types of TL, Weights unfreezing, Pre-training datasets, SoTA
- Object Detection: Principle, IoU and mAP, Classical Object Detection, Segmentation and Mask-RCNN, SoTA
- [TODO] Recurrent Layer: Sequential Modeling, RNN, LSTM, GRU, …
- [TODO] Attention:
- [TODO] Transformers:
- Generative Large Language Models: Language Modeling and Temperature, Abilities and In-Context-Learning, Tokenization, Instructions, Alignments, Reasonings, Training and Evaluating in Practice, LLMs as Agents
- Large Multimodal Models: Multimodality, Fusion, Original tasks and datasets, Early multimodal transformers, CLIP and text2image Diffusion, Frozen encoders, BLIP 1/2/3 and LMM Assistants, Open-source training datasets, LMM evaluation, Video, Multimodal Tokenization