
Machine Learning Algorithm Categories

Category | Explanation | Scenarios
Supervised Learning | Training data is labeled | Autonomous driving; image recognition
Unsupervised Learning | Training data is not labeled | Clustering; anomaly detection; dimensionality reduction
Semi-Supervised Learning | Trained on both labeled and unlabeled data | Self-training
Reinforcement Learning | Adjusts its actions according to feedback (a reward) to maximize the total reward | Game playing; finance; natural language processing

Machine Learning Algorithms

Name | Category | Detail | Scenarios
Linear Regression | Supervised | Continuous output | Regression line
Logistic Regression | Supervised | Yes/no output | Binary classification; medical diagnosis; political forecasting
Naive Bayes | Supervised | Based on Bayes' theorem | -
Decision Tree | Supervised | - | -
Random Forest | Supervised | Uses a set of decision trees that vote on the result | -
KNN (K-Nearest Neighbors) | Supervised | Uses the K nearest neighbors to decide which class a sample belongs to | Classification
K-means | Unsupervised | Uses a "center" to define each cluster | Clustering
SVM (Support Vector Machine) | Supervised | - | Classification and regression
XGBoost | Supervised | Large datasets, complex problems | Classification; feature selection; anomaly detection; natural language processing
CNN (Convolutional Neural Network) | Supervised, Deep Learning | - | Image classification; object detection; image segmentation
RNN (Recurrent Neural Network) | Supervised, Deep Learning | - | Sequential data processing; time series prediction; speech recognition
GAN (Generative Adversarial Network) | Supervised, Unsupervised, Deep Learning | - | Image generation; image-to-image translation; anomaly detection
Deep Belief Network | Unsupervised, Deep Learning | - | -
Autoencoders | Unsupervised, Deep Learning | - | Data denoising; dimensionality reduction; anomaly detection
DRL (Deep Reinforcement Learning) | Reinforcement, Deep Learning | Combines deep learning with reinforcement learning | Feature learning
Transformer Network | Semi-supervised, Deep Learning | - | Natural language processing, including BERT and GPT
YOLO (You Only Look Once) | Supervised, Deep Learning | - | Real-time object detection; traffic monitoring; retail analysis
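
Two rows of the table describe their mechanism directly: KNN decides by the K nearest neighbors, and K-means defines each cluster by a center. The following is a minimal plain-Python sketch of both ideas, not the implementation of any particular library. The function names `knn_predict` and `kmeans`, the fixed starting centers, and the sample points are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training samples by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most common label among the k neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]


def kmeans(points, initial_centers, iterations=10):
    """Cluster `points` around centers, starting from `initial_centers`."""
    centers = [tuple(c) for c in initial_centers]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical sample data: two well-separated groups in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["a", "a", "a", "b", "b", "b"]

predicted = knn_predict(points, labels, query=(1.2, 1.4), k=3)
centers, clusters = kmeans(points, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
```

Note the difference in inputs: `knn_predict` needs the labels (supervised), while `kmeans` only sees the points (unsupervised). Real K-means implementations usually pick the starting centers randomly or with a seeding heuristic; they are fixed here to keep the sketch deterministic.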

Popular Machine Learning Libraries

Library | Short Description
Scikit-Learn | Traditional machine learning algorithms, such as linear regression, decision trees, and random forests (XGBoost is a separate library with a scikit-learn-compatible API)
TensorFlow | Deep learning framework, backed by Google
PyTorch | Deep learning framework, originally developed by Facebook (now Meta)
Keras | Deep learning library; a popular choice that supports multiple platforms
NumPy | Numerical computing
Matplotlib | Visualizes 2-D data
Pandas | Reads and analyzes structured data

Linear Regression
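
Linear regression fits a straight line to continuous data. As a minimal sketch, assuming a single input variable and ordinary least squares, the line can be computed in plain Python as shown here. The function name `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted y for x = 5
```

With noisy real data the points will not lie exactly on the line; the fit then minimizes the sum of squared vertical distances between the points and the line.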

Decision Tree

Random Forest

Importance of data versus algorithms

Popular Encoders - from text to numeric

Name | Detail | Scenarios
LabelEncoder | Converts each category into an integer, such as 2 | Quick; the integer codes imply an order, so it suits ordinal data
OneHotEncoder | Converts each category into a binary vector such as [0,0,1] | Does not assume an order; can handle unseen labels (in scikit-learn, with handle_unknown="ignore")
Binary Encoder | Converts each category into a binary code such as [0,0], [0,1], [1,1] | Lower dimensionality than one-hot encoding
Target Encoder (Mean Encoder) | Encodes each category by the mean target value of that category | Useful when there is a strong relationship between the category and the target variable
Frequency Encoder | Encodes each category by its frequency | Useful when the frequency is a valuable feature
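
To make the five encodings in the table concrete, here is a plain-Python sketch that applies each of them to the same small column. It illustrates the ideas only and is not the code of scikit-learn or any other encoder library. The `colors` feature, the `target` values, and all variable names are hypothetical.

```python
from collections import Counter, defaultdict

colors = ["red", "green", "blue", "green", "red", "green"]  # hypothetical feature
target = [1, 0, 1, 0, 1, 1]                                  # hypothetical labels

# Label encoding: one integer per category (sorted so the mapping is stable).
categories = sorted(set(colors))  # ['blue', 'green', 'red']
label_map = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one column per category, exactly one 1 per row.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

# Binary encoding: write the integer code in base 2, one column per bit.
n_bits = max(1, (len(categories) - 1).bit_length())
binary = [
    [(label_map[c] >> bit) & 1 for bit in reversed(range(n_bits))]
    for c in colors
]

# Frequency encoding: relative frequency of each category in the column.
counts = Counter(colors)
frequency = [counts[c] / len(colors) for c in colors]

# Target (mean) encoding: mean of the target within each category.
sums = defaultdict(float)
for c, t in zip(colors, target):
    sums[c] += t
target_mean = {cat: sums[cat] / counts[cat] for cat in categories}
target_encoded = [target_mean[c] for c in colors]
```

Three categories need three one-hot columns but only two binary columns, which is the dimensionality saving the table mentions. In practice, target encoding is usually computed on training data only (often with smoothing), because it uses the target values and can otherwise leak them into the features.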

Real Datasets

Keras Model

Sequential Model

Functional Model

