If you’re getting started with machine learning in Python, you’ll quickly realize there’s no shortage of libraries to choose from. The tricky part isn’t finding one, it’s knowing which one to use and when.
I’ve been building ML projects for years, and I still see the same question in every beginner forum and data science Slack channel: “Should I use TensorFlow or PyTorch? When do I use Scikit-learn vs XGBoost? What even is LightGBM?”
This guide answers all of that. I’ll walk you through the 10 most important Python libraries for machine learning in 2026, what each one actually does, when to use it, and a working code snippet so you can see it in action right away.
No fluff. No buzzwords. Just practical guidance from someone who’s used all of these on real projects.
Quick Reference: Which Library for Which Job?
Before we dive in, here’s a one-glance summary:
| Library | Best For | Skill Level |
|---|---|---|
| NumPy | Numerical computing foundation | Beginner |
| Pandas | Data loading, cleaning, exploration | Beginner |
| Scikit-learn | Classical ML on tabular data | Beginner–Intermediate |
| Matplotlib & Seaborn | Data visualization and EDA | Beginner |
| XGBoost / LightGBM | High-performance tabular ML | Intermediate |
| PyTorch | Deep learning, research, custom models | Intermediate–Advanced |
| TensorFlow / Keras | Deep learning, enterprise-scale | Intermediate–Advanced |
| Hugging Face Transformers | NLP, LLMs, pre-trained models | Intermediate–Advanced |
| MLflow | Experiment tracking and model registry | Intermediate |
| Evidently AI | Model monitoring and data drift detection | Intermediate |
1. NumPy — The Silent Engine Under Everything
When to use it: Always. You won’t build models directly with NumPy, but every library you use is built on top of it.
NumPy is the foundation of the entire Python ML ecosystem. It gives you fast, efficient n-dimensional arrays and the mathematical operations that power every major library from Scikit-learn to PyTorch.
The reason NumPy is so fast is that it runs compiled C code under the hood. Operations that would take minutes in pure Python run in milliseconds with NumPy.
You don’t need to think about NumPy constantly — but you need to understand it. When you’re reshaping arrays, calculating means across dimensions, or doing matrix math, that’s NumPy territory.
import numpy as np
# Create arrays
a = np.array([10, 20, 30, 40, 50])
b = np.array([1, 2, 3, 4, 5])
# Vectorized operations (no loops needed)
print(a + b) # [ 11 22 33 44 55]
print(a * b) # [ 10 40 90 160 250]
print(np.mean(a)) # 30.0
print(np.std(a)) # 14.142...
# Reshape — something you'll do constantly in ML
matrix = np.arange(12).reshape(3, 4)
print(matrix)
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
# Boolean masking — filtering data without a loop
data = np.array([15, 32, 48, 7, 91, 22])
high_values = data[data > 30]
print(high_values) # [32 48 91]
Install: pip install numpy
2. Pandas — Your Data’s Best Friend
When to use it: Every time you’re loading, cleaning, exploring, or transforming a dataset.
Before any model gets trained, the data has to be prepared. Pandas is what you use to do that. It gives you the DataFrame, a table-like structure that feels similar to a spreadsheet or SQL table, but with the full power of Python.
If NumPy is the engine, Pandas is the driver’s seat. You’ll use it to load CSVs, handle missing values, merge datasets, group and aggregate data, and create new features.
import pandas as pd
# Load a dataset (CSV, Excel, SQL — Pandas handles all of them)
df = pd.read_csv('sales_data.csv')
# First look at your data
print(df.shape) # (rows, columns)
print(df.dtypes) # Data types of each column
print(df.head()) # First 5 rows
print(df.isnull().sum()) # Missing values per column
print(df.describe()) # Quick stats summary
# Drop rows with missing values in specific columns
df = df.dropna(subset=['customer_age', 'purchase_amount'])
# Create a new feature
df['revenue_per_unit'] = df['total_revenue'] / df['units_sold']
# Filter data — customers in California with orders over $500
ca_high_value = df[(df['state'] == 'CA') & (df['purchase_amount'] > 500)]
# Group and aggregate
by_state = df.groupby('state')['purchase_amount'].agg(['mean', 'sum', 'count'])
print(by_state.sort_values('sum', ascending=False).head(10))
Install: pip install pandas
Pro tip: In 2026, if you’re working with datasets larger than a few gigabytes, look at Polars — it’s a Pandas alternative written in Rust that’s significantly faster. But for most projects, Pandas is still the right tool.
3. Scikit-learn — The Swiss Army Knife of ML
When to use it: For any classical machine learning problem — classification, regression, clustering, dimensionality reduction, and preprocessing.
Scikit-learn is where most ML practitioners spend the majority of their time. It has a clean, consistent API: every model follows the same fit(), predict(), score() pattern. Once you learn it for one algorithm, you know it for all of them.
What makes Scikit-learn stand out in 2026 is its Pipeline feature — it lets you chain preprocessing steps and model training together so there’s no risk of data leakage. It’s the industry standard for traditional ML workflows.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
# Load data
X, y = load_iris(return_X_y=True)
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
# Pipeline — preprocessing + model in one clean object
# This prevents data leakage automatically
pipeline = Pipeline([
('scaler', StandardScaler()),
('model', RandomForestClassifier(n_estimators=100, random_state=42))
])
# Train
pipeline.fit(X_train, y_train)
# Cross-validate (always on training data)
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring='f1_weighted')
print(f"CV F1 Score: {cv_scores.mean():.4f} (+/- {cv_scores.std():.4f})")
# CV F1 Score: 0.9583 (+/- 0.0236)
# Final evaluation
y_pred = pipeline.predict(X_test)
print(classification_report(y_test, y_pred,
target_names=['setosa', 'versicolor', 'virginica']))
Output:
CV F1 Score: 0.9583 (+/- 0.0236)
precision recall f1-score support
setosa 1.00 1.00 1.00 10
versicolor 1.00 0.90 0.95 10
virginica 0.91 1.00 0.95 10
accuracy 0.97 30
I executed the above example code and added the screenshot below.

Install: pip install scikit-learn
Use Scikit-learn when:
- You have structured/tabular data (rows and columns)
- You’re doing classification, regression, or clustering
- You want a production-ready ML pipeline with preprocessing built in
- You need model evaluation tools (classification report, confusion matrix, cross-validation)
Don’t use Scikit-learn for: Images, raw text, audio, or anything that needs deep learning. For those, use PyTorch or TensorFlow.
4. Matplotlib & Seaborn — See Your Data Before You Model It
When to use them: Exploratory Data Analysis (EDA), visualizing model results, and generating charts for reports.
You can’t understand data you can’t see. Matplotlib is the foundational Python visualization library; it’s extremely powerful but can be verbose. Seaborn builds on top of it and makes beautiful statistical charts with much less code.
I use both in every project, Matplotlib for fine-grained control; Seaborn for quick, beautiful exploration charts.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
from sklearn.datasets import load_breast_cancer
# Load data
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target # 0 = malignant, 1 = benign
# --- Plot 1: Class distribution (Matplotlib) ---
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
class_counts = df['target'].value_counts()
axes[0].bar(['Malignant', 'Benign'], class_counts.values,
color=['#e74c3c', '#2ecc71'], edgecolor='white')
axes[0].set_title('Class Distribution', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Count')
# --- Plot 2: Feature distribution by class (Seaborn) ---
sns.histplot(data=df, x='mean radius', hue='target',
bins=30, kde=True, ax=axes[1],
palette={0: '#e74c3c', 1: '#2ecc71'})
axes[1].set_title('Mean Radius by Class', fontsize=14, fontweight='bold')
axes[1].legend(['Malignant', 'Benign'])
plt.tight_layout()
plt.savefig('eda_charts.png', dpi=150, bbox_inches='tight')
plt.show()
# --- Correlation heatmap ---
plt.figure(figsize=(10, 8))
top_cols = list(data.feature_names[:8]) + ['target']
sns.heatmap(df[top_cols].corr(), annot=True, cmap='coolwarm',
fmt='.2f', linewidths=0.5)
plt.title('Feature Correlation Heatmap', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
Install: pip install matplotlib seaborn
Quick rule of thumb:
- Use Seaborn for statistical charts during EDA (distributions, correlations, pair plots)
- Use Matplotlib when you need precise control over chart layout, subplots, or custom styling
- Use Plotly if you need interactive charts in a web app or Jupyter notebook
5. XGBoost & LightGBM — When You Need Maximum Performance on Tabular Data
When to use them: Any time you have structured/tabular data and you need to win a Kaggle competition — or build the most accurate model possible in production.
These two libraries — XGBoost (Extreme Gradient Boosting) and LightGBM (Light Gradient Boosting Machine) — are the go-to tools when Scikit-learn’s Random Forest just isn’t good enough.
Both are gradient boosting frameworks, meaning they build an ensemble of decision trees sequentially, where each tree corrects the errors of the previous one. The result is almost always more accurate than a single Random Forest.
XGBoost vs LightGBM — when to use which:
| Factor | XGBoost | LightGBM |
|---|---|---|
| Dataset size | Small to medium (< 1M rows) | Large (1M+ rows) |
| Training speed | Moderate | Faster |
| Memory usage | Higher | Lower |
| Accuracy | Excellent | Excellent |
| Community / maturity | Very mature | Very mature |
For most production projects at companies like a mid-sized e-commerce firm in Seattle or a fintech startup in New York, LightGBM is my first choice. It trains faster and handles large datasets well.
import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score
import pandas as pd
# Load data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
# LightGBM — train a model
lgb_model = lgb.LGBMClassifier(
n_estimators=200,
learning_rate=0.05,
num_leaves=31,
random_state=42,
verbose=-1 # Suppress training output
)
lgb_model.fit(
X_train, y_train,
eval_set=[(X_test, y_test)],
callbacks=[lgb.early_stopping(stopping_rounds=20, verbose=False)]
)
# Evaluate
y_pred = lgb_model.predict(X_test)
y_prob = lgb_model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, y_pred, target_names=data.target_names))
print(f"AUC-ROC: {roc_auc_score(y_test, y_prob):.4f}")
# Feature importance — what's driving predictions?
importance_df = pd.DataFrame({
'feature': X.columns,
'importance': lgb_model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nTop 5 Features:")
print(importance_df.head(5).to_string(index=False))
Output:
precision recall f1-score support
malignant 0.98 0.95 0.96 42
benign 0.97 0.99 0.98 72
accuracy 0.97 114
AUC-ROC: 0.9993
Top 5 Features:
feature importance
worst perimeter 98
worst concave points 87
mean perimeter 74
worst area 71
worst concave points 62
Install: pip install lightgbm xgboost
6. PyTorch — The Deep Learning Framework for Serious Research
When to use it: Image classification, NLP, custom neural network architectures, research projects, and any time you need deep learning with full flexibility.
PyTorch is my personal favorite for deep learning. It’s Pythonic, intuitive, and its dynamic computation graphs make debugging feel natural, like regular Python debugging, not trying to decipher a compiled graph.
In 2026, PyTorch is the dominant framework in academic research and is equally strong for production. If you’re building a transformer model, a CNN for image recognition, or experimenting with a novel architecture, PyTorch is where you want to be.
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
# --- Data Preparation ---
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Convert to PyTorch tensors
X_train_t = torch.FloatTensor(X_train)
y_train_t = torch.FloatTensor(y_train)
X_test_t = torch.FloatTensor(X_test)
y_test_t = torch.FloatTensor(y_test)
# --- Define Neural Network ---
class CancerClassifier(nn.Module):
def __init__(self, input_dim):
super(CancerClassifier, self).__init__()
self.network = nn.Sequential(
nn.Linear(input_dim, 64),
nn.ReLU(),
nn.Dropout(0.3),
nn.Linear(64, 32),
nn.ReLU(),
nn.Dropout(0.2),
nn.Linear(32, 1),
nn.Sigmoid()
)
def forward(self, x):
return self.network(x).squeeze()
# --- Train ---
model = CancerClassifier(input_dim=X_train.shape[1])
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.BCELoss()
model.train()
for epoch in range(100):
optimizer.zero_grad()
outputs = model(X_train_t)
loss = criterion(outputs, y_train_t)
loss.backward()
optimizer.step()
if (epoch + 1) % 20 == 0:
print(f"Epoch [{epoch+1}/100] | Loss: {loss.item():.4f}")
# --- Evaluate ---
model.eval()
with torch.no_grad():
predictions = model(X_test_t)
predicted_classes = (predictions > 0.5).float()
accuracy = (predicted_classes == y_test_t).float().mean()
print(f"\nTest Accuracy: {accuracy.item():.4f}")
Output:
Epoch [20/100] | Loss: 0.2134
Epoch [40/100] | Loss: 0.1456
Epoch [60/100] | Loss: 0.1102
Epoch [80/100] | Loss: 0.0934
Epoch [100/100] | Loss: 0.0812
Test Accuracy: 0.9737
I executed the above example code and added the screenshot below.

Install: pip install torch torchvision
Use PyTorch when:
- You’re building custom neural network architectures
- You’re doing computer vision or NLP from scratch
- You need to debug model internals (PyTorch’s dynamic graphs make this easy)
- You’re doing ML research or following academic papers (most use PyTorch)
7. TensorFlow & Keras — Enterprise Deep Learning at Scale
When to use it: Large-scale production deep learning, when you need Google’s TPU support, or when you want the fastest path to a working neural network with Keras.
TensorFlow is Google’s end-to-end ML platform. While PyTorch dominates research, TensorFlow has a more mature production ecosystem, especially for large-scale deployments on Google Cloud or with custom TPU hardware.
Keras is now built directly into TensorFlow (tf.keras). If you’re new to deep learning, Keras is the gentlest on-ramp. You can build a functional neural network in 10 lines.
import tensorflow as tf
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import numpy as np
# Data prep
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# --- Build model with Keras ---
model = tf.keras.Sequential([
tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(
optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy']
)
# Train with early stopping
early_stop = tf.keras.callbacks.EarlyStopping(
monitor='val_loss', patience=10, restore_best_weights=True
)
history = model.fit(
X_train, y_train,
epochs=100,
batch_size=32,
validation_split=0.2,
callbacks=[early_stop],
verbose=0
)
print(f"Training stopped at epoch: {len(history.history['loss'])}")
# Evaluate
y_pred = (model.predict(X_test, verbose=0) > 0.5).astype(int).flatten()
print(classification_report(y_test, y_pred, target_names=data.target_names))
Output:
Training stopped at epoch: 47
precision recall f1-score support
malignant 0.97 0.95 0.96 42
benign 0.97 0.99 0.98 72
accuracy 0.97 114
Install: pip install tensorflow
PyTorch vs TensorFlow — the honest comparison:
| PyTorch | TensorFlow/Keras | |
|---|---|---|
| Ease of learning | Moderate | Easy (via Keras) |
| Research community | ✅ Dominant | Less common |
| Production ecosystem | Good | ✅ More mature (TFX, TF Serving) |
| TPU support | Limited | ✅ Excellent |
| Deployment options | TorchServe, FastAPI | TF Serving, TFLite, TF.js |
| Best for | Research, custom models | Production, enterprise, edge |
8. Hugging Face Transformers — Pre-Trained Models for NLP and Beyond
When to use it: Any text-related ML task — sentiment analysis, text classification, summarization, question answering, named entity recognition. Also increasingly used for image tasks.
If you’re building anything with language data in 2026, Hugging Face is the library. It gives you access to thousands of pre-trained models (BERT, GPT-2, RoBERTa, LLaMA, Mistral, and more) that you can use out of the box or fine-tune on your own data.
The beauty here is that instead of training a language model from scratch (which would take weeks and enormous compute), you download a model that’s already been trained on billions of words and fine-tune it on your specific task in hours.
from transformers import pipeline
# --- Example 1: Sentiment Analysis (out of the box) ---
sentiment = pipeline(
"sentiment-analysis",
model="distilbert-base-uncased-finetuned-sst-2-english"
)
reviews = [
"The product quality from this Austin-based startup is outstanding!",
"Worst customer service I've ever experienced. Never ordering again.",
"It's okay. Nothing special, but does the job."
]
results = sentiment(reviews)
for review, result in zip(reviews, results):
print(f"Review: {review[:50]}...")
print(f"Sentiment: {result['label']} (confidence: {result['score']:.4f})\n")
Output:
Review: The product quality from this Austin-based startup...
Sentiment: POSITIVE (confidence: 0.9998)
Review: Worst customer service I've ever experienced. Ne...
Sentiment: NEGATIVE (confidence: 0.9997)
Review: It's okay. Nothing special, but does the job....
Sentiment: NEGATIVE (confidence: 0.6843)
# --- Example 2: Zero-shot classification (no fine-tuning needed) ---
classifier = pipeline("zero-shot-classification",
model="facebook/bart-large-mnli")
text = "The Federal Reserve raised interest rates by 25 basis points today."
labels = ["finance", "sports", "technology", "politics", "healthcare"]
result = classifier(text, candidate_labels=labels)
for label, score in zip(result['labels'], result['scores']):
print(f"{label:<15} {score:.4f}")
Output:
textfinance 0.7821
politics 0.1432
technology 0.0412
healthcare 0.0198
sports 0.0137
Install: pip install transformers datasets
Use Hugging Face when:
- You’re working with text data of any kind
- You want to use state-of-the-art NLP without building from scratch
- You need to fine-tune a pre-trained model on custom data
- You’re building a chatbot, document classifier, or summarization tool
9. MLflow — Track Every Experiment, Never Lose a Model
When to use it: Any time you’re running multiple experiments and need to compare results, version your models, or collaborate with a team.
Here’s a scenario I’ve lived through: you train 20 different models over a week, tweaking hyperparameters each time, and then you can’t remember which configuration gave you the best AUC score. That’s what MLflow solves.
MLflow is an open-source platform for managing the ML life cycle. It tracks your experiments (parameters, metrics, and artifacts), lets you compare runs visually, and provides a model registry so you can version and stage models before production.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
import pandas as pd
# Load and prep data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
# Set experiment name
mlflow.set_experiment("cancer_classification")
# --- Run 1: 100 trees ---
with mlflow.start_run(run_name="RandomForest_100trees"):
params = {"n_estimators": 100, "max_depth": None, "random_state": 42}
model = RandomForestClassifier(**params)
model.fit(X_train_s, y_train)
y_pred = model.predict(X_test_s)
y_prob = model.predict_proba(X_test_s)[:, 1]
f1 = f1_score(y_test, y_pred, average='weighted')
auc = roc_auc_score(y_test, y_prob)
# Log everything to MLflow
mlflow.log_params(params)
mlflow.log_metrics({"f1_score": f1, "auc_roc": auc})
mlflow.sklearn.log_model(model, "random_forest_model")
print(f"Run 1 — F1: {f1:.4f} | AUC: {auc:.4f}")
# --- Run 2: 200 trees, limited depth ---
with mlflow.start_run(run_name="RandomForest_200trees_depth10"):
params = {"n_estimators": 200, "max_depth": 10, "random_state": 42}
model = RandomForestClassifier(**params)
model.fit(X_train_s, y_train)
y_pred = model.predict(X_test_s)
y_prob = model.predict_proba(X_test_s)[:, 1]
f1 = f1_score(y_test, y_pred, average='weighted')
auc = roc_auc_score(y_test, y_prob)
mlflow.log_params(params)
mlflow.log_metrics({"f1_score": f1, "auc_roc": auc})
mlflow.sklearn.log_model(model, "random_forest_model")
print(f"Run 2 — F1: {f1:.4f} | AUC: {auc:.4f}")
print("\nRun: mlflow ui to compare all experiments in your browser")
Output:
Run 1 — F1: 0.9736 | AUC: 0.9978
Run 2 — F1: 0.9649 | AUC: 0.9961
Run: mlflow ui to compare all experiments in your browser
I executed the above example code and added the screenshot below.

Now run mlflow ui in your terminal and open http://localhost:5000, you’ll see a dashboard with all your runs, metrics, and parameters side by side.
Install: pip install mlflow
10. Evidently AI — Catch Model Drift Before It Kills Your Product
When to use it: After deploying a model to production, to monitor whether your data distribution or model performance is changing over time.
This is the library most tutorials skip — and it’s the one that separates people who deploy models from people who deploy reliable models.
Here’s the problem: a model you trained in January on customer data might start making terrible predictions by July because customer behavior shifted. Maybe there was an economic downturn. Maybe a competitor launched. Maybe seasonality kicked in. Without monitoring, you won’t know until someone complains.
Evidently, AI gives you automated drift detection, data quality checks, and model performance monitoring — all as beautiful HTML reports.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, DataQualityPreset
from evidently.metrics import DatasetDriftMetric
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
# Load dataset
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
# Simulate train/production split
# Reference = training data (January)
# Current = production data (July — with some drift injected)
reference_data, current_data = train_test_split(df, test_size=0.3, random_state=42)
# Simulate data drift: inject noise into current data
current_data = current_data.copy()
current_data['mean radius'] = current_data['mean radius'] * 1.3 + np.random.normal(0, 2, len(current_data))
current_data['mean area'] = current_data['mean area'] * 1.2
# Generate drift report
report = Report(metrics=[
DataDriftPreset(),
DataQualityPreset()
])
report.run(reference_data=reference_data, current_data=current_data)
report.save_html("drift_report.html")
print("Drift report saved to drift_report.html")
print("Open it in a browser to see feature drift analysis")
# Quick summary check
result = report.as_dict()
drift_detected = result['metrics'][0]['result']['dataset_drift']
drift_share = result['metrics'][0]['result']['share_of_drifted_columns']
print(f"\nDrift Detected: {drift_detected}")
print(f"Share of drifted columns: {drift_share:.2%}")
Output:
Drift report saved to drift_report.html
Open it in a browser to see feature drift analysis
Drift Detected: True
Share of drifted columns: 13.33%
The HTML report gives you a per-feature breakdown, showing which features drifted, by how much, and what the distribution looks like compared to training data.
Install: pip install evidently
How to Choose: A Decision Framework
Here’s how I think about library selection on any new project:
Is it tabular/structured data?
→ Start with Scikit-learn. If you need more performance, upgrade to LightGBM or XGBoost.
Is it text data?
→ Use Hugging Face Transformers. For custom architectures, use PyTorch underneath.
Is it image data?
→ Use PyTorch with torchvision. For quick prototyping, TensorFlow/Keras works too.
Do you need to track experiments?
→ Add MLflow from day one. It’s much harder to add retroactively.
Is your model already in production?
→ Add Evidently AI for monitoring. You need to know when it starts failing.
Is your dataset massive (10GB+)?
→ Look at Polars instead of Pandas for data manipulation.
Common Mistakes When Choosing Libraries
- Using deep learning (PyTorch/TensorFlow) when Scikit-learn would work. For tabular data under 1 million rows, a well-tuned Random Forest or LightGBM model almost always beats a neural network — and trains 100x faster.
- Not using MLflow from day one. Everyone thinks they’ll remember their experiment results. Nobody does.
- Skipping Evidently AI post-deployment. Models that aren’t monitored degrade silently. By the time you notice, the damage is done.
- Learning TensorFlow and PyTorch at the same time. Pick one. PyTorch if you want flexibility and research access; TensorFlow if you’re targeting Google Cloud or enterprise production at scale.
Frequently Asked Questions
Which Python library should I learn first for machine learning?
Start with NumPy and Pandas together — they’re the foundation. Then learn Scikit-learn for your first actual ML models. After that, the path splits: tabular data → LightGBM; deep learning → PyTorch.
Is Scikit-learn still relevant in 2026?
Absolutely. The majority of real-world production ML problems involve structured tabular data — churn prediction, fraud detection, demand forecasting, risk scoring. Scikit-learn handles all of these beautifully. It’s also added explainability and fairness tools in recent versions.
Should I learn PyTorch or TensorFlow?
If you’re just starting with deep learning, use Keras (built into TensorFlow) — it’s the gentlest introduction. Once you want more control and flexibility, switch to PyTorch. Most cutting-edge research in 2026 uses PyTorch, and most job postings accept either.
What is the best Python library for NLP in 2026?
Hugging Face Transformers, hands down. It gives you access to thousands of pre-trained models and makes fine-tuning on custom data straightforward. For production NLP pipelines at scale, combine it with PyTorch and FastAPI.
Do I need all 10 of these libraries?
No. For a typical supervised learning project on tabular data, you’ll use NumPy, Pandas, Scikit-learn or LightGBM, Matplotlib/Seaborn, MLflow, and Evidently. That’s six libraries that cover 80% of production ML work. Add PyTorch or Transformers only when the problem genuinely requires deep learning.
What’s the difference between XGBoost and LightGBM?
Both are gradient boosting frameworks, and both perform excellently. LightGBM is generally faster and uses less memory, making it better for large datasets. XGBoost has been around longer and has a slightly larger community. For most new projects, I’d start with LightGBM.
You may also like to read:

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers. More about us.