JasonTechnology

Machine Learning Algorithm Categories

| Category | Explanation | Scenarios |
| --- | --- | --- |
| Supervised Learning | Training data is labeled | Autonomous driving, image recognition |
| Unsupervised Learning | Training data is not labeled | Clustering, anomaly detection, dimensionality reduction |
| Semi-Supervised Learning | Trained on both labeled and unlabeled data | Self-training, co-training |
| Reinforcement Learning | Adjusts behavior according to feedback to maximize reward | Game playing, reward-driven domains such as finance, natural language processing |

Machine Learning Algorithms

| Name | Category | Detail | Scenarios |
| --- | --- | --- | --- |
| Linear Regression | Supervised | Fits a regression line to predict a continuous value | |
| Logistic Regression | Supervised | Yes/No output | Binary classification, medical diagnosis, political forecasting |
| Naive Bayes | Supervised | Based on Bayes' theorem | |
| Decision Tree | Supervised | In a binary (1/0) scenario, the decision boundary lies where both classes are equally probable. Split quality is measured by classification error, Gini impurity, or entropy. Tree growth needs a stopping condition. | |
| Random Forest | Supervised | Uses a set of decision trees that vote on the result | Robust, a default for many scenarios |
| KNN - K-Nearest Neighbors | Supervised | Uses the K nearest neighbors to decide which class a point belongs to | Classification |
| K-means | Unsupervised | Uses a "center" (centroid) to define each cluster | Clustering |
| SVM - Support Vector Machine | Supervised | | Classification and regression |
| XGBoost | Supervised | Large datasets, complex problems | Classification, regression, feature selection, anomaly detection, natural language processing |
| ANN - Artificial Neural Network | Supervised, Deep Learning | An ANN is made up of layers of nodes (neurons) that process data and “learn” patterns from it | |
| CNN - Convolutional Neural Network | Supervised, Deep Learning | | Image classification, object detection, image segmentation |
| RNN - Recurrent Neural Network | Supervised, Deep Learning | | Sequential data processing, time series prediction, speech recognition |
| SOM - Self-Organizing Map | Unsupervised, Deep Learning | | Dimensionality reduction and visualization of high-dimensional data |
| GAN - Generative Adversarial Network | Supervised, Unsupervised, Deep Learning | | Image generation, image-to-image translation, anomaly detection |
| Deep Belief Network | Unsupervised, Deep Learning | | |
| Autoencoders | Unsupervised, Deep Learning | | Data denoising, dimensionality reduction, anomaly detection |
| DRL - Deep Reinforcement Learning | Reinforcement, Deep Learning | Combines deep learning with reinforcement learning | Feature learning |
| Transformer Network | Semi-supervised, Deep Learning | | Natural language processing, including BERT and GPT |
| YOLO - You Only Look Once | Supervised, Deep Learning | | Real-time object detection, traffic monitoring, retail analysis |
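
The Logistic Regression row describes a Yes/No output, and the Decision Tree row notes that the decision boundary sits where both outcomes are equally likely. A minimal sketch of that idea for a one-feature logistic model follows. The weight and bias are hand-picked, hypothetical values rather than fitted ones, and the function names are illustrative only.

```python
import math

def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weight, bias):
    """Probability of the 'yes' class for a single feature x."""
    return sigmoid(weight * x + bias)

def predict(x, weight, bias, threshold=0.5):
    """Yes/No decision: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0

# Hypothetical, hand-picked parameters (not learned from data)
weight, bias = 2.0, -4.0

# The decision boundary is where weight * x + bias == 0, i.e. probability 0.5
boundary = -bias / weight

print(boundary, predict_proba(boundary, weight, bias))
print(predict(1.0, weight, bias), predict(3.0, weight, bias))
```

Inputs below the boundary are classified as 0 and inputs above it as 1; in practice the weight and bias would be learned from labeled data.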

Popular Machine Learning Libraries

| Library | Short Description |
| --- | --- |
| Scikit-Learn | Traditional machine learning algorithms, such as linear models, SVM, and Random Forest (XGBoost is a separate library) |
| TensorFlow | Deep learning framework, Google-backed |
| PyTorch | Deep learning framework, Facebook (Meta)-backed |
| Keras | Deep learning library; a popular choice that supports multiple backends |
| NumPy | Numeric computing |
| Matplotlib | Visualize 2-D data |
| Pandas | Read and analyze structured data |
| dtreeviz | A Python library for visualizing decision trees in detail |

Machine Learning in Charts

Linear Regression

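
As a rough illustration of how a regression line is fitted to continuous data, here is a minimal pure-Python sketch of one-feature ordinary least squares. The data points are made up for the example, and `fit_line` is a hypothetical helper name, not a library function.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

For real work, a library such as Scikit-Learn would normally be used instead; the sketch only shows the underlying arithmetic.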

SVM


KNN

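
The KNN idea of letting the K nearest neighbors decide where a point belongs can be sketched in a few lines of plain Python. The training points and labels below are invented for illustration, and `knn_predict` is a hypothetical helper.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; returns the majority label
    among the k training points closest to query (Euclidean distance)."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled points forming two well-separated groups
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 4.8)))
```

Note that the method needs labeled training points to vote with, which is why KNN classification counts as supervised learning.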

K-means

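
K-means defines each cluster by its center. A minimal sketch of the usual alternating procedure (assign each point to the nearest center, then move each center to the mean of its points) is shown here. The points and starting centers are assumptions chosen for the example; real implementations typically pick starting centers more carefully.

```python
import math

def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center
    and moving each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

# Hypothetical unlabeled points and hand-picked starting centers
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(1.0, 1.0), (8.0, 8.0)])

print(centers)
print(clusters)
```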

Decision Tree

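
Decision trees are commonly evaluated with classification error, Gini impurity, and entropy. The three measures can be sketched for a list of class labels as follows; the function names are illustrative and the label lists are made-up examples.

```python
import math
from collections import Counter

def class_probabilities(labels):
    """Fraction of the labels belonging to each class."""
    counts = Counter(labels)
    return [n / len(labels) for n in counts.values()]

def gini(labels):
    """Gini impurity: 1 - sum(p^2). Zero for a pure node."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)). Zero for a pure node."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))

def classification_error(labels):
    """1 - (share of the majority class)."""
    return 1.0 - max(class_probabilities(labels))

mixed = [1, 1, 0, 0]   # evenly split node: maximum impurity for two classes
pure = [1, 1, 1, 1]    # pure node: no impurity

for name, node in (("mixed", mixed), ("pure", pure)):
    print(name, gini(node), entropy(node), classification_error(node))
```

All three measures peak when the two classes are equally likely and drop to zero for a pure node, which is why a tree prefers splits that lower them.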

Random Forest

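
The voting idea behind Random Forest can be sketched with very small trees. The example below is a simplification under stated assumptions: each "tree" is a one-split stump on a single feature, each stump is trained on a bootstrap sample, and the forest predicts by majority vote. The random feature subsampling used by real random forests is omitted, and all names and data are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(sample):
    """Best single-threshold split on 1-D data.
    Returns (threshold, left_label, right_label)."""
    best, best_errors = None, None
    for threshold in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= threshold]
        right = [y for x, y in sample if x > threshold]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best_errors is None or errors < best_errors:
            best, best_errors = (threshold, left_label, right_label), errors
    return best

def fit_forest(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]

def forest_predict(forest, x):
    """Every stump votes; the majority label wins."""
    votes = [left if x <= threshold else right
             for threshold, left, right in forest]
    return majority(votes)

# Hypothetical 1-D data: class 0 at x = 1..5, class 1 at x = 11..15
data = ([(float(x), 0) for x in range(1, 6)]
        + [(float(x), 1) for x in range(11, 16)])
forest = fit_forest(data)

print(forest_predict(forest, 0.5), forest_predict(forest, 20.0))
```

Because each stump sees a slightly different sample, individual stumps can disagree near the class boundary; the vote smooths those differences out.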

XGBoost


Gradient Boost Algorithms

| Name | Detail | Scenarios | Developer |
| --- | --- | --- | --- |
| XGBoost | Optimized distributed gradient boosting library | Many scenarios, default choice | XGBoost.ai |
| LightGBM (LGBM) | Tree-based learning algorithms | Large datasets, high performance | Microsoft |
| CatBoost | Decision-tree-based boosting framework | Search, recommendation systems, personal assistants, self-driving cars, weather prediction | Yandex |
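
All three libraries above build on gradient boosting with decision trees. The core loop can be sketched in plain Python under simplifying assumptions: squared-error loss, one feature, and one-split regression stumps as the trees. This is an illustration of the general technique, not how XGBoost, LightGBM, or CatBoost are implemented, and every name and number below is hypothetical.

```python
def fit_stump(xs, residuals):
    """Regression stump: the threshold and per-side mean residual
    that minimize the squared error."""
    best = None
    # Skip the largest x so the right side is never empty
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals
    (the negative gradient of squared error) and add a damped correction."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps

def predict(model, x):
    """Sum the base value and every stump's damped correction."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )

# Hypothetical step-shaped data
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)

print(predict(model, 2.0), predict(model, 5.0))
```

Each round only corrects part of the remaining error, so the prediction approaches the targets gradually; the learning rate controls how large each step is.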

Deep Learning - Neural Network

ANN

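
To make the "layers of nodes" idea concrete, here is a minimal forward pass through a two-layer network in plain Python. The weights are hand-picked so that the network computes XOR; they are an assumption for illustration and are not learned, so the sketch shows only how data flows through layers, not training.

```python
def relu(v):
    """Rectified linear unit: negative inputs become zero."""
    return max(0.0, v)

def identity(v):
    return v

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron takes a weighted sum
    of the inputs, adds its bias, and applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x):
    """Hidden layer with two ReLU neurons, then one linear output neuron.
    Hand-picked (not learned) weights that reproduce XOR."""
    hidden = dense(x, [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    return dense(hidden, [[1.0, -2.0]], [0.0], identity)[0]

for pair in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(pair, forward(pair))
```

A framework such as Keras, TensorFlow, or PyTorch would find weights like these automatically by training on labeled examples.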

CNN


RNN

SOM - Self-Organizing Map

Importance of data versus algorithms


Popular Encoders - from text to numeric

| Name | Detail | Scenarios |
| --- | --- | --- |
| LabelEncoder | Converts category data into a number, such as 2 | Quick; the integer codes imply an order among the categories |
| OneHotEncoder | Converts category data into a binary vector, such as [0,0,1] | Doesn't assume an order; can handle unseen labels (for example with scikit-learn's `handle_unknown="ignore"`) |
| Binary Encoder | Converts category data into a binary vector, such as [0,0], [0,1], [1,1] | Has lower dimensionality than one-hot encoding |
| Target Encoder (Mean Encoder) | Encodes category data by the mean target value of each category | Useful when there is a strong relation between the category and the target variable |
| Frequency Encoder | Encodes each category by its frequency | Useful when frequency is a valuable feature |
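
The encoders above can be compared side by side with a small pure-Python sketch. It mimics what each encoder produces rather than calling any library; the `colors` and `target` data are invented, and the binary encoding here starts its codes at 0, which may differ from specific library implementations.

```python
from collections import Counter, defaultdict

# Hypothetical categorical column and binary target
colors = ["red", "green", "blue", "green", "red", "green"]
target = [1, 0, 1, 0, 1, 1]

# Label encoding: one integer per category, assigned in sorted order
categories = sorted(set(colors))            # ['blue', 'green', 'red']
label_map = {c: i for i, c in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one column per category, exactly one 1 per row
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

# Binary encoding: the integer code written in base 2, one column per bit
width = max(1, (len(categories) - 1).bit_length())
binary_encoded = [
    [int(bit) for bit in format(label_map[c], f"0{width}b")] for c in colors
]

# Frequency encoding: share of rows that belong to the category
counts = Counter(colors)
frequency_encoded = [counts[c] / len(colors) for c in colors]

# Target (mean) encoding: average target value within the category
target_sums = defaultdict(float)
for c, t in zip(colors, target):
    target_sums[c] += t
target_encoded = [target_sums[c] / counts[c] for c in colors]

print(label_encoded)
print(one_hot[0], binary_encoded[0])
print(frequency_encoded[1], target_encoded[1])
```

With three categories, one-hot encoding needs three columns while binary encoding needs two, which is the dimensionality saving mentioned in the table. Target encoding uses the target values themselves, so in practice it should be computed on training data only to avoid leakage.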

Real Datasets

Keras Model

Sequential Model


Functional Model


Visualization Tool

ChatGPT

Install Jupyter Notebook

Best Python IDE