How to Create a NaN Array in NumPy

I was working on a data analysis project for a US retail chain and had to handle missing sales data. I needed placeholder arrays filled with NaN (Not a Number) values that would later be populated with actual data.

In this article, I’ll share several practical methods to create NaN arrays in NumPy, a fundamental skill for any data analysis work.

Create a NaN Array in NumPy

NaN (Not a Number) is a special floating-point value used to represent undefined or missing numerical values. In data analysis, NaN values are incredibly useful for marking missing data points that need special handling.
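One property of NaN is worth seeing before creating arrays of it: it does not compare equal to anything, including itself. The short sketch below illustrates this with a made-up three-element array; the variable names are only for illustration.

```python
import numpy as np

# np.nan is an ordinary Python float holding the IEEE 754 NaN value
print(type(np.nan))

# NaN never compares equal to anything, not even to itself
nan_equals_itself = np.nan == np.nan
print(nan_equals_itself)

# So an equality test cannot find it; np.isnan() can
values = np.array([1.5, np.nan, 3.0])
mask = np.isnan(values)
print(mask)
```

This is why the rest of the article relies on np.isnan() rather than == when looking for missing entries.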

NumPy, Python’s core numerical computing library, makes creating and working with NaN values straightforward.

Method 1: Use np.nan and np.full()

One of the simplest ways to create a NaN array is to call np.full() with NumPy’s built-in np.nan constant. Because np.nan is a float, the resulting array gets a floating-point dtype automatically.

import numpy as np

# Create a 3x3 array filled with NaN values
nan_array = np.full((3, 3), np.nan)
print(nan_array)

Output:

[[nan nan nan]
 [nan nan nan]
 [nan nan nan]]

This method is perfect when you need a NaN array of a specific shape. I often use this approach when initializing arrays for time-series data, where some timestamps might be missing.
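As a rough sketch of that time-series use, the snippet below starts with 24 hourly slots set to NaN and then fills in the few readings that are available. The hours and values are invented purely for illustration.

```python
import numpy as np

# Hypothetical setup: 24 hourly slots, all missing to begin with
hourly_readings = np.full(24, np.nan)

# Hypothetical (hour -> value) pairs standing in for real measurements
known = {0: 12.4, 1: 12.1, 7: 15.8, 13: 21.3}
for hour, value in known.items():
    hourly_readings[hour] = value

# Count the slots that are still unfilled
missing_hours = int(np.isnan(hourly_readings).sum())
print(hourly_readings.dtype)
print(missing_hours)
```

Any slot that never receives a value stays NaN, so gaps remain visible instead of looking like real zeros.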

Method 2: Use np.empty() and Fill with NaN

NumPy’s np.empty() function allocates an array without initializing its values, which we can then fill with NaN:

import numpy as np

# Create an empty array
empty_array = np.empty((2, 4))

# Fill it with NaN values
empty_array.fill(np.nan)
print(empty_array)

Output:

[[nan nan nan nan]
 [nan nan nan nan]]

This two-step approach gives the same result as np.full(). It is mainly useful when you’re working with very large arrays and want to allocate a buffer once and then fill, or later refill, it in place.
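Here is a minimal sketch of that allocate-once pattern. The shape and the choice of float32 are assumptions picked for illustration, not a recommendation for every workload.

```python
import numpy as np

shape = (1000, 1000)

# Allocate once; float32 takes half the space of the default float64
buffer = np.empty(shape, dtype=np.float32)
buffer.fill(np.nan)  # fills in place, no second array is created

# Later, the same buffer can be reset without allocating again
buffer[0, 0] = 1.0
buffer.fill(np.nan)

print(buffer.nbytes)            # 1000 * 1000 elements * 4 bytes each
print(np.isnan(buffer).all())
```

Keep in mind that the contents of an np.empty() array are arbitrary until you fill it, so always call fill() before reading from it.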

Method 3: Direct Assignment

We can also create NumPy arrays and directly assign NaN values:

import numpy as np

# Create a NumPy array and directly assign NaN
nan_array = np.array([np.nan, np.nan, np.nan])
print(nan_array)

Output:

[nan nan nan]

I find this method particularly handy for smaller arrays or when I need to manually specify a pattern of NaN and non-NaN values.
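To illustrate the mixed pattern case, here is a small sketch with invented readings. It also shows that a single NaN in the literal is enough to make NumPy choose a float dtype, even when the other entries are integers.

```python
import numpy as np

# Known values and NaN mixed in one literal
readings = np.array([72.5, np.nan, 68.0, np.nan, 70.1])
print(readings.dtype)

# Integers plus one NaN: the whole array is upcast to float
mixed = np.array([1, 2, np.nan])
print(mixed.dtype)
print(mixed)
```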

Method 4: Use ones_like() or zeros_like() with NaN

If you already have a NumPy array and want to create a NaN array with the same shape, you can use ones_like() or zeros_like() and multiply the result by np.nan:

import numpy as np

# Original array
original = np.array([[1, 2], [3, 4]])

# Create NaN array with same shape
nan_array = np.ones_like(original, dtype=float) * np.nan
print(nan_array)

Output:

[[nan nan]
 [nan nan]]

This approach is especially useful in data processing pipelines where you need to maintain the same array structure while replacing values.

Method 5: Use np.full_like() for NaN Arrays

The np.full_like() function creates an array with the same shape as another array but filled with a specific value:

import numpy as np

# Original array
original = np.array([[5, 6, 7], [8, 9, 10]])

# Create NaN array with same shape using full_like
nan_array = np.full_like(original, np.nan, dtype=float)
print(nan_array)

Output:

[[nan nan nan]
 [nan nan nan]]

When working with complex data structures, this method helps maintain consistency across multiple arrays.
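One detail worth knowing is that np.full_like() copies the dtype of the source array unless you override it. The sketch below, using made-up arrays, contrasts a float source (no dtype argument needed) with an integer source (dtype=float required, as in the example above).

```python
import numpy as np

# Float source: full_like keeps its dtype, so NaN fits without any override
prices = np.array([[9.99, 14.5], [3.25, 7.0]], dtype=np.float32)
nan_prices = np.full_like(prices, np.nan)
print(nan_prices.dtype)

# Integer source: override the dtype, because integers cannot store NaN
counts = np.array([[5, 6], [7, 8]])
nan_counts = np.full_like(counts, np.nan, dtype=float)
print(nan_counts.dtype)
```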

Method 6: Convert Sentinel Values to NaN

Sometimes you’ll need to convert specific placeholder values in an existing float array to NaN:

import numpy as np

# Create an array with some values
data = np.array([1.0, 2.0, -999.0, 4.0, -999.0])

# Convert -999 (our sentinel value) to NaN
data[data == -999.0] = np.nan
print(data)

Output:

[ 1.  2. nan  4. nan]

This technique is particularly useful when working with real-world datasets where missing values might be represented by sentinel values like -999, -1, or other placeholders.
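When a dataset uses more than one sentinel, one possible approach is to combine np.isin() with np.where(). The sketch below assumes -999 and -1 are the placeholder codes; unlike the in-place assignment above, it returns a new array and leaves the original untouched.

```python
import numpy as np

raw = np.array([12.0, -999.0, 7.5, -1.0, 3.2, -999.0])
sentinels = [-999.0, -1.0]  # hypothetical placeholder codes

# Where a value matches any sentinel, use NaN; otherwise keep the value
cleaned = np.where(np.isin(raw, sentinels), np.nan, raw)

print(cleaned)
print(raw)  # the original array is unchanged
```

Be careful with sentinels such as -1 that could also be legitimate values in your data.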

Practical Example: Stock Market Analysis

Let’s look at a practical example where NaN arrays are essential. Imagine we’re analyzing stock market data for the top 5 tech companies, but we’re missing data for some days:

import numpy as np
import pandas as pd

# Stock tickers for top tech companies
tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META']

# Create a 5x7 array for a week of stock prices (with missing data)
stock_data = np.full((5, 7), np.nan)

# Fill in some known values (simulated data)
stock_data[0, :5] = [150.2, 151.3, 149.8, 152.0, 153.5]  # AAPL (missing weekend)
stock_data[1, :] = [290.1, 288.5, 292.3, 295.0, 291.2, 290.8, 292.5]  # MSFT (complete data)
stock_data[2, [0, 1, 3, 5, 6]] = [135.2, 136.0, 134.8, 137.2, 138.0]  # GOOGL (some missing)
stock_data[3, 2:] = [130.5, 128.9, 132.1, 131.7, 133.4]  # AMZN (missing first two days)
stock_data[4, [0, 2, 3, 6]] = [220.4, 223.1, 225.8, 228.3]  # META (sporadic missing data)

# Create a DataFrame for better visualization
dates = pd.date_range('2023-05-01', periods=7)
stock_df = pd.DataFrame(stock_data, index=tickers, columns=dates)

print(stock_df)

Output:

           2023-05-01  2023-05-02  2023-05-03  2023-05-04  2023-05-05  2023-05-06  2023-05-07
AAPL          150.2      151.3       149.8       152.0       153.5         NaN         NaN
MSFT          290.1      288.5       292.3       295.0       291.2       290.8       292.5
GOOGL         135.2      136.0         NaN       134.8         NaN       137.2       138.0
AMZN            NaN        NaN       130.5       128.9       132.1       131.7       133.4
META          220.4        NaN       223.1       225.8         NaN         NaN       228.3

In this example, the NaN values indicate where data is missing, allowing us to make informed decisions about how to handle these gaps in our analysis.
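As one way to act on those gaps, the sketch below takes the MSFT and META rows from the example and compares an ordinary mean with NumPy’s NaN-aware np.nanmean(). It is only an illustration of the idea, not a recommendation for how to treat missing prices.

```python
import numpy as np

# Two rows from the example above: MSFT is complete, META has gaps
prices = np.array([
    [290.1, 288.5, 292.3, 295.0, 291.2, 290.8, 292.5],
    [220.4, np.nan, 223.1, 225.8, np.nan, np.nan, 228.3],
])

# How many days are missing for each ticker
missing_per_row = np.isnan(prices).sum(axis=1)

# A plain mean becomes NaN for any row that has a gap
plain_mean = prices.mean(axis=1)

# np.nanmean() skips NaN entries and averages the rest
nan_aware_mean = np.nanmean(prices, axis=1)

print(missing_per_row)
print(plain_mean)
print(nan_aware_mean)
```

The plain mean makes the gap obvious, while np.nanmean() gives a usable figure based only on the days that have data.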

Important Considerations When Working with NaN Arrays

  1. Data Type Requirements: NaN exists only in floating-point (and complex) data types; integer and boolean arrays cannot store it. np.full() with np.nan picks a float dtype automatically, but the *_like functions copy the source array’s dtype, so pass dtype=float when the source holds integers.
  2. NaN Propagation: Arithmetic operations involving NaN result in NaN. This is helpful because missing data shows up in the result instead of being silently ignored.
  3. Checking for NaN: NaN never compares equal to anything, including itself, so == cannot find it. Use np.isnan() to check which elements in an array are NaN:
   np.isnan(nan_array)  # Returns boolean array where True indicates NaN
  4. Memory Usage: NaN arrays need a floating-point dtype. The default float64 uses 8 bytes per element, the same as int64 but more than boolean or small integer types such as int8. For very large datasets, consider float32.
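The points above can be checked with a short sketch. The arrays and variable names here are illustrative only.

```python
import numpy as np

# Data type: assigning NaN into an integer array raises ValueError
int_array = np.zeros(3, dtype=int)
try:
    int_array[0] = np.nan
    assignment_failed = False
except ValueError:
    assignment_failed = True

# Propagation: a plain sum becomes NaN, np.nansum() skips the NaN entry
data = np.array([1.0, np.nan, 3.0])
total = data.sum()
nan_safe_total = np.nansum(data)

# Checking: np.isnan() returns a boolean mask
mask = np.isnan(data)

# Memory: bytes per element for a few common dtypes
sizes = {name: np.dtype(name).itemsize
         for name in ("float64", "float32", "int64", "int8", "bool")}

print(assignment_failed, total, nan_safe_total)
print(mask)
print(sizes)
```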

NumPy provides powerful functions for handling NaN values in your data analysis workflows. Whether you’re preprocessing data, performing exploratory analysis, or building machine learning models, knowing how to create and work with NaN arrays is an essential skill.

I hope you found this article helpful. If you have any questions or suggestions, kindly leave them in the comments below.
