AI Stack Layer · 7 of 8

Classic ML

Pre-LLM machine learning — and still the right tool for huge swaths of real problems. Tabular prediction, computer vision, recommendations, fraud, forecasting. Cheap, fast, interpretable.

Quick Facts

Basic Concepts

  • Supervised learning: predict a label / value from features (regression, classification).
  • Unsupervised: find structure without labels (clustering, dimensionality reduction).
  • Reinforcement learning: an agent learns by taking actions and getting rewards.
  • Train / validation / test split — never evaluate on data the model has seen (see the sketch after this list).
  • Overfitting = great on training, bad on new data. Regularization fights it.
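
A quick scikit-learn sketch of the split-and-evaluate discipline above; the data here is synthetic and purely illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for real features / labels
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# An unconstrained tree memorizes the training set...
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(model.score(X_train, y_train))   # ~1.0 on data it has seen
print(model.score(X_test, y_test))     # noticeably lower on unseen data: overfitting

# ...and capping depth (one form of regularization) narrows that gap
regularized = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(regularized.score(X_test, y_test))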
Landscape

The Major Libraries

  • scikit-learn: The Swiss army knife — regression, classification, clustering, pipelines, all in clean Python (see the sketch after this list).
  • XGBoost: Gradient-boosted trees — wins most tabular Kaggle competitions.
  • LightGBM: Microsoft's GBT — faster training, similar accuracy.
  • CatBoost: Yandex's GBT — best-in-class with categorical features.
  • PyTorch: The default deep-learning framework today; flexible, research-friendly.
  • TensorFlow / Keras: Google's DL framework; strong production tooling (TFX, TF Serving).
  • JAX: Composable function transforms for high-performance ML / research.
  • Hugging Face Transformers: Pre-trained model hub for NLP, vision, audio.
  • statsmodels: Classical statistics — linear models, hypothesis tests, time series.
  • Prophet / NeuralProphet: Forecasting; Prophet is from Meta, NeuralProphet is a community reimplementation built on PyTorch.
  • OpenCV: Computer vision toolkit (faces, edges, tracking).
  • Surprise / implicit / RecBole: Recommender systems.
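
To ground the scikit-learn row, a minimal pipeline sketch; the column names and toy data are hypothetical:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression

# Hypothetical toy frame: one numeric and one categorical feature
df = pd.DataFrame({"age": [25, 40, 31, 52], "plan": ["free", "pro", "free", "pro"],
                   "churned": [1, 0, 1, 0]})

# Preprocessing + model in one object: fit once, deploy once
pre = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
])
clf = Pipeline([("pre", pre), ("model", LogisticRegression())])
clf.fit(df[["age", "plan"]], df["churned"])
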
Mechanics

Common Tasks

Tabular Prediction

The bread & butter — predicting churn, fraud, prices, conversion from a row of features. Gradient-boosted trees (XGBoost / LightGBM / CatBoost) almost always win, with neural nets close behind on very large datasets.

import xgboost as xgb

# X_train / y_train / X_test come from a prior train/test split
model = xgb.XGBClassifier(n_estimators=500, max_depth=6, learning_rate=0.05)
model.fit(X_train, y_train)
pred = model.predict_proba(X_test)[:, 1]   # probability of the positive class

Computer Vision

  • Classification: "what's in this image?" — ResNet, EfficientNet, ViT (loaded in the sketch after this list).
  • Detection: "where are the things?" — YOLO, DETR, Mask R-CNN.
  • Segmentation: "label every pixel" — U-Net, SAM.
  • Generation: Stable Diffusion, Flux, Imagen.
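
A minimal classification sketch with a pre-trained torchvision ResNet; the image path is hypothetical:

from PIL import Image
import torch
from torchvision import models
from torchvision.models import ResNet50_Weights

weights = ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()

preprocess = weights.transforms()                     # resize / crop / normalize preset
batch = preprocess(Image.open("photo.jpg")).unsqueeze(0)   # hypothetical image file

with torch.no_grad():
    logits = model(batch)
idx = logits.softmax(dim=-1).argmax().item()
print(weights.meta["categories"][idx])                # human-readable ImageNet label
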
NLP (pre-LLM)

Tokenization, TF-IDF, word2vec / GloVe, named-entity recognition, topic modeling. Most of these still matter as building blocks — and BERT-family encoders remain the best choice for cheap, fast classification & embeddings.
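
A minimal TF-IDF + logistic regression sketch in scikit-learn; the toy reviews and labels are made up for illustration:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts  = ["great product, fast shipping", "terrible support, never again",
          "love it", "waste of money"]      # hypothetical toy reviews
labels = [1, 0, 1, 0]                       # 1 = positive sentiment

# Unigrams + bigrams in, class probabilities out — fast and cheap to serve
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["support was terrible"]))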

Forecasting & Time Series

ARIMA, Prophet, neural forecasting (N-BEATS, Temporal Fusion Transformer). Most production forecasting still combines a tree model on lagged features with seasonal decomposition.
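
A minimal sketch of that tree-on-lagged-features pattern, assuming a synthetic daily series and LightGBM:

import numpy as np
import pandas as pd
from lightgbm import LGBMRegressor

# Hypothetical daily series: trend + weekly seasonality + noise
t = np.arange(365)
y = pd.Series(0.1 * t + 5 * np.sin(2 * np.pi * t / 7) + np.random.randn(365),
              index=pd.date_range("2024-01-01", periods=365, freq="D"))

# Lagged copies of the target become the feature matrix
df = pd.DataFrame({"y": y})
for lag in (1, 7, 28):                     # yesterday, last week, four weeks back
    df[f"lag_{lag}"] = df["y"].shift(lag)
df = df.dropna()

model = LGBMRegressor(n_estimators=300)
model.fit(df.drop(columns="y"), df["y"])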

Evaluation Metrics

  • Regression: RMSE, MAE, R²
  • Binary classification: AUC-ROC, precision, recall, F1, log-loss
  • Multi-class: accuracy, macro F1, confusion matrix
  • Ranking / RecSys: NDCG, MRR, MAP, hit-rate
  • Forecasting: MAPE, sMAPE, MASE
  • Detection: mAP @ IoU thresholds
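
A quick scikit-learn sketch of a few of these, on hypothetical binary-classification outputs:

from sklearn.metrics import roc_auc_score, log_loss, f1_score

y_true  = [0, 1, 1, 0]            # hypothetical ground-truth labels
y_score = [0.2, 0.8, 0.6, 0.4]    # predicted P(positive) per example

print(roc_auc_score(y_true, y_score))                 # 1.0: every positive outranks every negative
print(log_loss(y_true, y_score))                      # penalizes confident wrong probabilities
print(f1_score(y_true, [s > 0.5 for s in y_score]))   # 1.0 after thresholding at 0.5
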
Classic ML vs LLMs — When to Use Which
Use Classic ML when…
  • You have structured / tabular data.
  • You need millisecond latency & pennies-per-prediction costs.
  • You need interpretability & explainability.
  • Compliance demands deterministic models.
Reach for LLMs when…
  • Inputs are free-form natural language.
  • You need reasoning, summarization, or generation.
  • You want zero-shot — no labeled training data.
  • Latency & cost budgets can absorb a few hundred milliseconds and a few cents per call.