| Category | Explanation | Scenarios |
|---|---|---|
| Supervised Learning | Training data is labeled | Autonomous driving, image recognition |
| Unsupervised Learning | Training data is not labeled | Clustering, anomaly detection, dimensionality reduction |
| Semi-Supervised Learning | Trained on both labeled and unlabeled data | Self-training, co-training |
| Reinforcement Learning | Adjusts behavior according to feedback to maximize reward | Game playing; reward-driven domains such as finance and natural language processing |
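
To make the labeled versus unlabeled distinction concrete, here is a minimal sketch in plain Python. The function names `fit_line` and `two_means` and the sample numbers are illustrative assumptions, not part of any library: the first function learns from labeled pairs, the second groups points that carry no labels at all.

```python
def fit_line(xs, ys):
    """Supervised: learn slope and intercept from labeled pairs (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def two_means(points, n_iter=10):
    """Unsupervised: split unlabeled 1-D points into two clusters."""
    centers = [min(points), max(points)]  # simple deterministic start
    for _ in range(n_iter):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Labeled data: every x comes with a known answer y (here y = 2x + 1).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Unlabeled data: only the points themselves are available.
centers = two_means([1.0, 1.2, 0.8, 9.0, 9.5, 8.5])
```

The supervised function needs `ys` to learn anything, while the unsupervised one discovers structure from the points alone.
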

| Name | Category | Detail | Scenarios |
|---|---|---|---|
| Linear Regression | Supervised | Fits a continuous regression line | |
| Logistic Regression | Supervised | Predicts a binary (yes/no) outcome | Binary classification, medical diagnosis, political forecasting |
| Naive Bayes | Supervised | Based on Bayes' theorem | |
| Decision Tree | Supervised | The decision boundary is where both classes are equally probable in a binary (1/0) scenario. Classification error, Gini impurity, and entropy measure node impurity and drive the stopping condition | |
| Random Forest | Supervised | Uses an ensemble of decision trees that vote on the result | Robust; a default choice for many scenarios |
| KNN - K-Nearest Neighbors | Supervised | Uses the K nearest neighbors to decide which class a point belongs to | Classification |
| K-means | Unsupervised | Uses a "center" (centroid) to define each cluster | Clustering |
| SVM - Support Vector Machine | Supervised | Classification and regression | |
| XGBoost | Supervised | Large datasets, complex problems | Classification, regression, feature selection, anomaly detection, natural language processing |
| ANN - Artificial Neural Network | Supervised, Deep Learning | An ANN is made up of layers of nodes (neurons) that process data and “learn” patterns from it | |
| CNN - Convolutional Neural Network | Supervised, Deep Learning | Image classification, object detection, image segmentation | |
| RNN - Recurrent Neural Network | Supervised, Deep Learning | Sequential data processing, time series prediction, speech recognition | |
| SOM - Self-Organizing Map | Unsupervised, Deep Learning | Dimensionality reduction and visualization of high-dimensional data | |
| GAN - Generative Adversarial Network | Supervised, Unsupervised, Deep Learning | Image generation, image-to-image translation, anomaly detection | |
| Deep Belief Network | Unsupervised, Deep Learning | | |
| Autoencoders | Unsupervised, Deep Learning | Data denoising, dimensionality reduction, anomaly detection | |
| DRL - Deep Reinforcement Learning | Reinforcement, Deep Learning | Combining deep learning with reinforcement learning | Feature Learning |
| Transformer Network | Semi-supervised, Deep Learning | Natural Language Processing, including BERT, GPT | |
| YOLO - You Only Look Once | Supervised, Deep Learning | Real-time object detection, traffic monitoring, retail analysis | |
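
The Decision Tree row names three impurity measures: classification error, Gini, and entropy. A minimal sketch of how each can be computed for the labels that reach one node is shown below. The helper names are illustrative assumptions, and real libraries compute these internally in optimized form.

```python
from collections import Counter
from math import log2


def class_proportions(labels):
    """Fraction of samples belonging to each class present in the node."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def classification_error(labels):
    """1 minus the share of the majority class."""
    return 1.0 - max(class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits."""
    return sum(-p * log2(p) for p in class_proportions(labels))


mixed = [0, 0, 1, 1]  # evenly split node: the most impure binary case
pure = [1, 1, 1, 1]   # single-class node: nothing left to split

print(classification_error(mixed), gini(mixed), entropy(mixed))
print(classification_error(pure), gini(pure), entropy(pure))
```

All three measures are zero for a pure node, which is one natural condition for a tree to stop splitting.
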

| Library | Short Description |
|---|---|
| Scikit-Learn | Traditional machine learning algorithms, such as Random Forest and K-means |
| TensorFlow | Deep learning framework, backed by Google |
| PyTorch | Deep learning framework, originally developed by Facebook (now Meta) |
| Keras | Deep learning library; a popular choice that supports multiple backends |
| NumPy | Numerical computing |
| Matplotlib | Visualizes 2-D data |
| Pandas | Reads and analyzes structured data |
| dtreeviz | A Python library that visualizes decision trees in detail |







| Name | Detail | Scenarios | Developer |
|---|---|---|---|
| XGBoost | Optimized distributed gradient boosting library | Many scenarios; a default choice | XGBoost.ai |
| LightGBM | Tree-based learning algorithms | Large datasets, high performance | Microsoft |
| CatBoost | Decision-tree-based boosting framework | Search, recommendation systems, personal assistants, self-driving cars, weather prediction | Yandex |
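
All three libraries build on gradient boosting over decision trees. The sketch below shows only the core idea for squared-error regression on a single feature, using one-split trees (stumps). The names `fit_stump`, `fit_boosting`, and `predict` and the toy data are illustrative assumptions. It is not how XGBoost, LightGBM, or CatBoost are implemented; those add regularization, second-order gradient information, and many optimizations.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on x that best splits the residuals (least squared error).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # both sides stay non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the remaining errors."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (left if x <= threshold else right)
                      for threshold, left, right in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = fit_boosting(xs, ys)
print(predict(model, 2), predict(model, 5))  # close to 1.0 and 5.0
```

Each round corrects a fraction (`learning_rate`) of the remaining error, which is why many small trees together can fit a pattern that a single stump cannot.
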



| Name | Detail | Scenarios |
|---|---|---|
| LabelEncoder | Converts each category into a number, such as 2 | Quick; suits categories with a natural order, since the integer codes imply one |
| OneHotEncoder | Converts each category into a binary vector, such as [0,0,1] | Does not assume an order; can handle unseen labels |
| Binary Encoder | Converts each category into a binary code, such as [0,0], [0,1], [1,1] | Lower dimensionality than one-hot encoding |
| Target Encoder (Mean Encoder) | Encodes each category by the mean target value of that category | Useful when there is a strong relationship between the category and the target variable |
| Frequency Encoder | Encodes each category by its frequency | Useful when frequency is a valuable feature |
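
The encoders in the table can be sketched in plain Python to show what each one produces. The `colors` and `target` lists are made-up sample data, and the variable names are illustrative. Libraries such as scikit-learn provide production implementations whose exact code assignments may differ from this sketch.

```python
from collections import Counter

colors = ["red", "green", "blue", "green", "red", "green"]
target = [1, 0, 1, 0, 1, 1]  # hypothetical binary target, one value per row

categories = sorted(set(colors))  # ['blue', 'green', 'red']

# Label encoding: one integer per category.
label_map = {category: index for index, category in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one column per category, exactly one 1 per row.
one_hot = [[1 if c == category else 0 for category in categories] for c in colors]

# Binary encoding: the label index written in base 2, padded to a fixed width.
width = max(1, (len(categories) - 1).bit_length())
binary_encoded = [[int(bit) for bit in format(label_map[c], f"0{width}b")]
                  for c in colors]

# Frequency encoding: share of rows that carry each category.
counts = Counter(colors)
frequency_encoded = [counts[c] / len(colors) for c in colors]

# Target (mean) encoding: average target value per category.
target_means = {
    category: sum(t for c, t in zip(colors, target) if c == category) / counts[category]
    for category in categories
}
target_encoded = [target_means[c] for c in colors]

print(label_encoded)      # [2, 1, 0, 1, 2, 1]
print(one_hot[0])         # [0, 0, 1]
print(binary_encoded[0])  # [1, 0]
```

With three categories, one-hot encoding needs three columns while binary encoding needs only two, and the gap widens as the number of categories grows. Target encoding computed on the same rows it is applied to can leak the target into the features, so in practice it is usually fitted on training data only.
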
