Scikit-learn Logistic Regression

As a Python developer with over a decade of experience, I’ve worked extensively with machine learning models. Among them, logistic regression remains one of the most useful yet simple algorithms for classification problems.

In this article, I’ll walk you through how to implement logistic regression using Scikit-learn, the go-to Python library for machine learning. I’ll share practical methods and tips based on real-world experience so you can quickly apply this in your projects.

Let’s get started.


What is Logistic Regression?

Logistic regression is a classification algorithm used to predict binary outcomes such as yes/no, true/false, or 0/1. Unlike linear regression, which predicts continuous values, logistic regression estimates the probability that a given input belongs to a particular class, and the predicted class follows from comparing that probability to a cutoff (0.5 by default).

For example, suppose you want to predict whether a customer in the US will buy a product (1) or not (0) based on their age, income, and browsing history. Logistic regression can model this probability effectively.
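To make the probability idea concrete, here is a minimal sketch of the calculation behind a fitted model: a weighted sum of the features is passed through the sigmoid function, which returns a value between 0 and 1. The weights, intercept, and customer values below are made up for illustration and do not come from any real dataset.

```python
import math

def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for (age, income in $1000s); illustrative only
weights = [0.03, 0.02]
intercept = -3.0

def predict_probability(features):
    # Linear score (the log odds), then squashed by the sigmoid
    score = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

customer = [40, 80]  # hypothetical customer: 40 years old, $80k income
probability = predict_probability(customer)
predicted_class = 1 if probability >= 0.5 else 0
print(f"P(buy) = {probability:.3f}, predicted class = {predicted_class}")
```

Training a logistic regression model amounts to finding the weights and intercept that make these probabilities match the observed outcomes as closely as possible.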

Get Started with Logistic Regression in Scikit-learn

Let me show you how to create a logistic regression model step-by-step using a practical example. Imagine you have a dataset of US bank customers, and you want to predict whether they will subscribe to a term deposit based on their features.

Step 1: Import Libraries and Load Data

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load dataset (replace with your actual data source)
data = pd.read_csv('us_bank_customers.csv')

# Preview data
print(data.head())

Step 2: Prepare Data

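Select the relevant features and the target variable. If your data contains missing values or categorical columns, handle and encode them before this point; the snippet below assumes the four selected columns are already numeric and complete.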

# Features and target
X = data[['age', 'balance', 'duration', 'campaign']]
y = data['subscribed']  # 1 if subscribed, 0 otherwise

# Split into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
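Step 2 mentions handling missing values and encoding categorical variables, so here is a minimal sketch of both on a small made-up DataFrame. The `job` column and every value in it are hypothetical stand-ins for your own data, and median imputation is just one reasonable choice.

```python
import pandas as pd

# Small made-up sample; the 'job' column and all values are hypothetical
data = pd.DataFrame({
    'age': [34, 51, None, 42],
    'balance': [1200.0, None, 300.0, 560.0],
    'job': ['admin', 'technician', 'admin', None],
    'subscribed': [0, 1, 0, 1],
})

# Fill missing numeric values with each column's median
numeric_cols = ['age', 'balance']
data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].median())

# Fill missing categories with a placeholder, then one-hot encode
data['job'] = data['job'].fillna('unknown')
data = pd.get_dummies(data, columns=['job'], drop_first=True)

X = data.drop(columns='subscribed')
y = data['subscribed']
print(X)
```

On a real project, I would compute imputation values from the training split only (for example inside a Pipeline), so that the test set does not influence preprocessing.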

Step 3: Train the Logistic Regression Model

# Initialize the model
model = LogisticRegression(max_iter=1000)

# Train the model
model.fit(X_train, y_train)

Step 4: Make Predictions and Evaluate

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))

# Confusion matrix
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))

# Detailed classification report
print(classification_report(y_test, y_pred))
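Because logistic regression estimates probabilities, you can also work with them directly instead of the hard 0/1 labels. The sketch below uses synthetic data from `make_classification` as a stand-in for the bank dataset (an assumption, since the CSV is not available here), and the 0.3 cutoff is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bank data (4 numeric features, binary target)
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Each row holds [P(class 0), P(class 1)], in the order of model.classes_
proba = model.predict_proba(X_test)
p_subscribe = proba[:, 1]

# predict() corresponds to a 0.5 cutoff; a lower cutoff flags more positives
threshold = 0.3  # illustrative value, not a recommendation
y_pred_custom = (p_subscribe >= threshold).astype(int)

print(proba[:5].round(3))
print("Positives at 0.5:", int(model.predict(X_test).sum()))
print("Positives at 0.3:", int(y_pred_custom.sum()))
```

Lowering the cutoff trades precision for recall, which can be worthwhile when missing a likely subscriber costs more than contacting an unlikely one.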


Different Ways to Use Logistic Regression in Scikit-learn

Here are several common ways to configure logistic regression in Scikit-learn.

1. Regular Logistic Regression (Default)

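The example above uses the default settings: L2 regularization (penalty='l2') with strength C=1.0. L2 shrinks coefficients toward zero without eliminating them, which helps prevent overfitting and is a reasonable starting point for most problems.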

2. Logistic Regression with L1 Regularization (Feature Selection)

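L1 regularization can shrink some coefficients to exactly zero, effectively performing feature selection. This is useful when you have many features. The default lbfgs solver does not support L1, so choose a solver that does, such as liblinear or saga.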

model_l1 = LogisticRegression(penalty='l1', solver='liblinear', max_iter=1000)
model_l1.fit(X_train, y_train)
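To see the feature-selection effect, you can count how many coefficients the L1 penalty leaves non-zero. This is a sketch on synthetic data (an assumption: 20 features, of which only 4 are informative), with C=0.1 chosen only to make the effect visible.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data: 20 features, only 4 of which carry signal
X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)

# Smaller C means stronger regularization; 0.1 is an illustrative value
model_l1 = LogisticRegression(penalty='l1', solver='liblinear', C=0.1)
model_l1.fit(X, y)

coefs = model_l1.coef_.ravel()
kept = np.flatnonzero(coefs)  # indices of coefficients that survived
print(f"{kept.size} of {coefs.size} coefficients are non-zero")
print("Indices of retained features:", kept.tolist())
```

Decreasing C further removes more features, so it is worth tuning C with cross-validation rather than picking it by hand.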

3. Multiclass Logistic Regression

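Logistic regression is most often used for binary classification, but Scikit-learn also handles targets with three or more classes. With the default lbfgs solver, LogisticRegression fits a single multinomial (softmax) model automatically. The multi_class='ovr' argument is deprecated since scikit-learn 1.5, so if you specifically want a “one-vs-rest” strategy, wrap the model in OneVsRestClassifier instead.

from sklearn.multiclass import OneVsRestClassifier

# For a multiclass target variable (y_train with three or more classes)
model_multi = OneVsRestClassifier(LogisticRegression(max_iter=1000))
model_multi.fit(X_train, y_train)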
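Here is a self-contained sketch of multiclass classification on Scikit-learn's built-in Iris dataset, which has three classes. The dataset choice is my assumption for illustration; it is not the bank data used earlier. It fits the default model, which is multinomial under the lbfgs solver, next to an explicit one-vs-rest wrapper.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Default behaviour with the lbfgs solver: one multinomial (softmax) model
softmax_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Explicit one-vs-rest: one binary model per class
ovr_model = OneVsRestClassifier(
    LogisticRegression(max_iter=1000)).fit(X_train, y_train)

print("Multinomial accuracy:", softmax_model.score(X_test, y_test))
print("One-vs-rest accuracy:", ovr_model.score(X_test, y_test))
print("Class probabilities:", softmax_model.predict_proba(X_test[:1]).round(3))
```

Both approaches return one probability per class; they differ in whether the classes are modeled jointly or as separate binary problems.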

4. Use Logistic Regression with Pipeline and Scaling

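Regularized logistic regression is sensitive to feature scale: the penalty treats all coefficients alike, so a feature measured in thousands (such as balance) is penalized differently from one measured in single digits (such as campaign), and the solver can converge slowly. Using Scikit-learn’s Pipeline keeps scaling and modeling together in one object.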

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(max_iter=1000))
])

pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
print("Pipeline Accuracy:", accuracy_score(y_test, y_pred))
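A further benefit of the pipeline is that it can be cross-validated as a single estimator. The sketch below uses synthetic data with one deliberately inflated feature (an assumption standing in for a column like balance) and 5-fold cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, n_clusters_per_class=1,
                           random_state=42)
# Put one feature on a much larger scale, as an account balance might be
X[:, 0] *= 1000

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(max_iter=1000)),
])

# The scaler is re-fitted on the training part of every fold,
# so the held-out fold never influences the scaling statistics
scores = cross_val_score(pipeline, X, y, cv=5, scoring='accuracy')
print("Fold accuracies:", scores.round(3))
print("Mean accuracy:", scores.mean().round(3))
```

Scaling the full dataset before splitting would leak information from the validation folds into training; keeping the scaler inside the pipeline avoids that.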


Tips from My Experience

Feature Engineering Matters: Logistic regression assumes a linear relationship between features and the log odds of the outcome. Create meaningful features or apply transformations if needed.
Handle Imbalanced Data: In problems like fraud detection, the positive class is often rare. Consider class weighting (class_weight='balanced') or resampling, and evaluate with precision, recall, or F1 rather than accuracy alone.
Tune Hyperparameters: Use GridSearchCV or RandomizedSearchCV to find the best regularization strength (the C parameter; smaller values mean stronger regularization).
Interpretability: Logistic regression provides coefficients that show the impact of each feature. This is valuable in industries like finance and healthcare where understanding the model is critical.
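
The tips on imbalance, tuning, and interpretability can be combined in one short workflow. This is a sketch under stated assumptions: the data is synthetic with roughly 10% positives, the feature names are labels borrowed from the earlier example rather than real columns, and the grid values are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data: roughly 90% negatives, 10% positives
X, y = make_classification(n_samples=1000, n_features=4, n_informative=3,
                           n_redundant=1, n_clusters_per_class=1,
                           weights=[0.9, 0.1], random_state=42)
feature_names = ['age', 'balance', 'duration', 'campaign']  # labels only

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(max_iter=1000)),
])

# Candidate values are illustrative; widen or narrow the grid for your data
param_grid = {
    'logreg__C': [0.01, 0.1, 1, 10],
    'logreg__class_weight': [None, 'balanced'],
}

# F1 is used because plain accuracy can look good on imbalanced classes
search = GridSearchCV(pipeline, param_grid, cv=5, scoring='f1')
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best cross-validated F1:", round(search.best_score_, 3))

# exp(coefficient) is the multiplicative change in the odds of class 1
# for a one-standard-deviation increase in that (scaled) feature
best_logreg = search.best_estimator_.named_steps['logreg']
odds_ratios = np.exp(best_logreg.coef_.ravel())
for name, ratio in zip(feature_names, odds_ratios):
    print(f"{name}: odds ratio {ratio:.2f}")
```

An odds ratio above 1 means the feature pushes predictions toward the positive class, and below 1 means it pushes them away. Because the features are standardized inside the pipeline, the ratios are comparable across features.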

Logistic regression with Scikit-learn is a useful yet accessible tool for classification tasks. Whether you’re working on customer behavior in the US banking sector or predicting election outcomes, this method offers a solid foundation.

I encourage you to experiment with different regularization techniques and preprocessing steps to optimize your models. The examples here reflect practical workflows I’ve used repeatedly in production environments.

Bijay Kumar

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers.