When working with numerical data in Python, normalization is a common preprocessing step that can significantly improve the performance of machine learning algorithms.
Recently, I was analyzing US housing price data and needed to scale the values to the range 0 to 1 to make my model more effective. The issue was that the raw features had vastly different scales, with prices ranging from around $100,000 into the millions.
In this article, I'll show you several simple ways to normalize NumPy arrays to a range between 0 and 1. I'll cover both 1D and 2D arrays with practical examples that you can apply to your own projects.
Python NumPy Normalization
Normalization is the process of transforming data to bring it within a specific range, typically 0 to 1. This helps algorithms converge faster and improves the accuracy of many machine learning models.
The basic formula for min-max normalization is:
normalized_value = (value - min_value) / (max_value - min_value)
This scales all values proportionally so that the smallest value becomes 0 and the largest becomes 1.
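For instance, plugging in the value 85 with a minimum of 68 and a maximum of 102 (the temperature data used in Method 1 below) gives (85 - 68) / (102 - 68) = 0.5, which you can check directly:

```python
# Worked instance of the min-max formula using one value from the
# temperature example below: value 85, min 68, max 102
value, min_value, max_value = 85, 68, 102
normalized_value = (value - min_value) / (max_value - min_value)
print(normalized_value)  # 0.5
```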
Method 1: Use NumPy’s Built-in Functions
The simplest way to normalize an array between 0 and 1 is by using Python NumPy’s built-in functions. Let’s look at a practical example:
import numpy as np
# Sample US temperature data (Fahrenheit) for 10 cities
temperatures = np.array([85, 91, 72, 68, 95, 88, 79, 82, 75, 102])
# Get min and max values
min_temp = np.min(temperatures)
max_temp = np.max(temperatures)
# Normalize the array
normalized_temperatures = (temperatures - min_temp) / (max_temp - min_temp)
print("Original temperatures:", temperatures)
print("Normalized temperatures:", normalized_temperatures)
Output:
Original temperatures: [ 85 91 72 68 95 88 79 82 75 102]
Normalized temperatures: [0.5 0.67647059 0.11764706 0. 0.79411765 0.58823529
0.32352941 0.41176471 0.20588235 1.        ]

You can see how the lowest temperature (68°F) became 0 and the highest (102°F) became 1, with all others scaled proportionally.
Method 2: Create a Custom Normalization Function
For reusability, let’s create a custom function to normalize arrays:
import numpy as np
def normalize_array(arr):
    """
    Normalize a NumPy array to range [0, 1]
    """
    return (arr - np.min(arr)) / (np.max(arr) - np.min(arr))
# Sample US stock prices data
stock_prices = np.array([145.23, 167.89, 132.45, 189.67, 121.34, 156.78])
# Normalize the stock prices
normalized_prices = normalize_array(stock_prices)
print("Original stock prices:", stock_prices)
print("Normalized stock prices:", normalized_prices)
Output:
Original stock prices: [145.23 167.89 132.45 189.67 121.34 156.78]
Normalized stock prices: [0.34951456 0.68129063 0.16245066 1.         0.         0.51873199]

This custom function makes it easy to normalize any array without repeating the same code.
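One caveat: if every value in the input is identical, np.max(arr) - np.min(arr) is zero and the function divides by zero. A defensive variant might look like this (mapping a constant array to zeros is just one possible convention):

```python
import numpy as np

def normalize_array_safe(arr):
    """Normalize a NumPy array to [0, 1], returning zeros if the array is constant."""
    arr_min, arr_max = np.min(arr), np.max(arr)
    if arr_max == arr_min:
        # All values are identical; map everything to 0 by convention
        return np.zeros_like(arr, dtype=float)
    return (arr - arr_min) / (arr_max - arr_min)

print(normalize_array_safe(np.array([5.0, 5.0, 5.0])))  # [0. 0. 0.]
```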
Method 3: Normalize 2D Arrays
When working with 2D arrays (like datasets with multiple features), you might want to normalize each feature independently:
import numpy as np
# 2D array of US house data: [square_footage, price, age_in_years]
house_data = np.array([
    [1800, 350000, 15],
    [2200, 450000, 8],
    [1500, 290000, 25],
    [3000, 650000, 3],
    [1900, 380000, 12]
])
# Normalize each column (feature) independently
normalized_data = np.zeros_like(house_data, dtype=float)
for i in range(house_data.shape[1]):
    column = house_data[:, i]
    normalized_data[:, i] = (column - np.min(column)) / (np.max(column) - np.min(column))
print("Original house data:")
print(house_data)
print("\nNormalized house data:")
print(normalized_data)
Output:
Original house data:
[[1800 350000 15]
[2200 450000 8]
[1500 290000 25]
[3000 650000 3]
[1900 380000 12]]
Normalized house data:
[[0.2 0.16666667 0.54545455]
[0.46666667 0.44444444 0.22727273]
[0. 0. 1. ]
[1. 1. 0. ]
[0.26666667 0.25       0.40909091]]
In this example, each feature (square footage, price, and age) is normalized independently, making them comparable despite having different original scales.
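The loop is easy to follow, but the same per-column normalization can be done without one: passing axis=0 to min and max gives per-column statistics, and broadcasting then applies the formula to every column at once. A sketch with the same house data:

```python
import numpy as np

house_data = np.array([
    [1800, 350000, 15],
    [2200, 450000, 8],
    [1500, 290000, 25],
    [3000, 650000, 3],
    [1900, 380000, 12]
])

# axis=0 reduces down each column, producing one min and one max
# per feature; broadcasting then normalizes all columns in one step
col_min = house_data.min(axis=0)
col_max = house_data.max(axis=0)
normalized = (house_data - col_min) / (col_max - col_min)
print(normalized)
```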
Method 4: Use scikit-learn’s MinMaxScaler
For more sophisticated projects, you might want to use scikit-learn’s MinMaxScaler:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# Sample US population data for different counties (in thousands)
population_data = np.array([[423], [1200], [89], [567], [2100], [345]])
# Initialize the scaler
scaler = MinMaxScaler(feature_range=(0, 1))
# Fit and transform the data
normalized_population = scaler.fit_transform(population_data)
print("Original population data (thousands):", population_data.flatten())
print("Normalized population data:", normalized_population.flatten())
# The great thing about using MinMaxScaler is that you can inverse transform
original_data = scaler.inverse_transform(normalized_population)
print("Inverse transformed data:", original_data.flatten())
Output:
Original population data (thousands): [ 423 1200 89 567 2100 345]
Normalized population data: [0.16595745 0.55319149 0. 0.23758865 1. 0.12732919]
Inverse transformed data: [ 423. 1200.   89.  567. 2100.  345.]

The advantage of using MinMaxScaler is that it provides methods to reverse the normalization if needed, and it integrates well with scikit-learn pipelines.
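One practical consequence worth noting: once fitted, the scaler keeps the original min and max, so transforming later data reuses those parameters rather than recomputing them. In this sketch the new_counties values are hypothetical; a value above the fitted maximum lands above 1:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same county population data as above (in thousands)
population_data = np.array([[423], [1200], [89], [567], [2100], [345]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(population_data)  # learn the min and max once

# Hypothetical new counties arriving later: the fitted scaler reuses
# the original min/max, so 2500 (above the fitted max of 2100) maps above 1
new_counties = np.array([[500], [2500]])
scaled_new = scaler.transform(new_counties)
print(scaled_new.ravel())
```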
Method 5: Use np.linalg.norm for Vector Normalization
For some applications, you might want to normalize vectors to have a unit norm (length of 1):
import numpy as np
# US GDP growth rates (quarterly)
gdp_growth = np.array([2.3, 1.8, 3.2, 2.1])
# Normalize to unit length
normalized_gdp = gdp_growth / np.linalg.norm(gdp_growth)
print("Original GDP growth rates:", gdp_growth)
print("Normalized GDP growth rates:", normalized_gdp)
print("Length of normalized vector:", np.linalg.norm(normalized_gdp))
Output:
Original GDP growth rates: [2.3 1.8 3.2 2.1]
Normalized GDP growth rates: [0.43224222 0.33830486 0.60126996 0.39468067]
Length of normalized vector: 1.0
This method normalizes the vector to unit length (an L2 norm of 1), which is useful in certain machine learning and linear algebra applications.
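np.linalg.norm also takes an axis argument, which makes it easy to turn every row of a 2D array into a unit vector at once. A sketch with a made-up matrix:

```python
import numpy as np

# Hypothetical matrix whose rows we want as unit vectors
vectors = np.array([[3.0, 4.0],
                    [1.0, 0.0]])

# keepdims=True keeps the norms column-shaped, so broadcasting
# divides each row by its own L2 norm
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
unit_vectors = vectors / norms
print(unit_vectors)  # rows [0.6, 0.8] and [1.0, 0.0]
```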
Best Practices and Considerations for Normalizing Data
Here are some important considerations when normalizing data:
- Outliers: Be cautious of outliers as they can significantly skew your normalization. Consider removing or capping outliers before normalizing.
- Test-Train Split: Always fit your scaler on the training data only, then apply it to both training and test data to prevent data leakage.
- Feature Scaling: For algorithms like gradient descent, neural networks, and k-nearest neighbors, feature scaling is crucial for optimal performance.
- Save Normalization Parameters: In production, you’ll need to apply the same normalization to new data, so save your min and max values.
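The train/test point in particular is easy to get wrong. A minimal sketch with hypothetical prices, fitting on the training data only and reusing the same parameters on the test data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical train/test split of house prices
train_prices = np.array([[100000.0], [350000.0], [650000.0]])
test_prices = np.array([[290000.0], [700000.0]])

scaler = MinMaxScaler()
scaler.fit(train_prices)  # min/max come from the training data only

train_scaled = scaler.transform(train_prices)
test_scaled = scaler.transform(test_prices)  # same parameters, no refit
print(test_scaled.ravel())
```

Because the test set is scaled with the training minimum and maximum, test values outside the training range can legitimately fall outside [0, 1].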
I hope you found this article helpful for understanding how to normalize NumPy arrays between 0 and 1. The methods I covered in this tutorial are using NumPy's built-in functions, creating a custom normalization function, normalizing 2D arrays column by column, using scikit-learn's MinMaxScaler, and using np.linalg.norm for vector normalization. I also covered some best practices and considerations.
Other Python articles you may also like:
- np.count() function in Python
- Copy Elements from One List to Another in Python
- Use np.argsort in Descending Order in Python

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, etc., for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere. Check out my profile.