Mineria de Datos

Mar 24, 2024 · 1 min read

All the different classes can be found here!

This is the CC5205 course from the Universidad de Chile. I restructured it to reflect current techniques and to be more machine-learning oriented. It relies heavily on scikit-learn!

Here’s a summary:

  1. General introduction: Definitions of Data Mining, Data Science, and content of the class
  2. Data I: (Un)structured data, Representation, Normalization, Noise removal, …
  3. [TODO] Data II: Basic statistics for data exploration.
  4. Intro to Supervised Learning: Basics of Machine Learning and supervised learning.
  5. Intro to Fairness and Biases: How to avoid making bad models.
  6. Linear Models: A very simple model, which is the basis of deep neural networks!
  7. [TODO] Classifiers: KNN, Naive Bayes, Decision Tree, Boosting, Bagging, Random Forests.
  8. [TODO] Dimensionality Reduction: Principal Component Analysis, Independent Component Analysis, t-SNE, UMAP, …
  9. [TODO] Clustering methods: Clustering algorithms and associated metrics.
  10. SVM, SVR: Hinge loss, Lagrangian, KKT conditions, non-linear SVM, Kernel trick, SV Regressor
  11. Introduction to Neural Nets: Basics of Deep Learning
  12. Introduction to NLP (Invited Speaker: Juan Jose Alegria): How to deal with natural language.