NumPy Interview Questions and Answers

NumPy is the backbone of numerical computing in Python. It powers everything from simple arrays to complex calculations and machine learning workflows.

Anyone aiming for data science, machine learning, or scientific computing roles really needs to know NumPy well.

This guide collects 75 commonly asked NumPy interview questions. They range from basics like arrays and data types to advanced topics such as broadcasting, memory layout, and performance tricks.

The questions dig into array creation, manipulation, math operations, indexing, and how NumPy connects with other libraries. There are practical explanations of universal functions, vectorisation, linear algebra, and ways to handle complex data structures.

You’ll find key concepts that interviewers love to ask about. Think differences between lists and arrays, handling missing values, masking, fancy indexing, and making operations memory-efficient.

Advanced sections touch on strides, memory mapping, and parallel processing—stuff that separates everyday users from true NumPy pros.

1. What is NumPy, and why is it used?

NumPy stands for Numerical Python. It’s open-source and lets you work with arrays and matrices in Python.

The library is great at handling big, multi-dimensional arrays quickly. It comes with a bunch of math functions that work right on those arrays.

NumPy forms the base for scientific computing in Python. Developers love it because it’s just so much faster than regular Python lists.

Under the hood, NumPy runs optimised C code. That means array operations run much faster, which is a big deal for data science and scientific work.

import numpy as np

# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5])
print(arr)

It also has tools for linear algebra, random numbers, and Fourier transforms. These features make it a must-have for libraries like pandas and scikit-learn.

2. Explain ndarray in NumPy.

The ndarray is NumPy’s main data structure. It stands for N-dimensional array and stores elements that are all of the same type.

An ndarray can have any number of dimensions. A 1D array looks like a list, while a 2D array is more like a table.

import numpy as np

# 1D array
array_1d = np.array([1, 2, 3, 4])

# 2D array
array_2d = np.array([[1, 2], [3, 4]])

All elements need to be the same type, like all ints or all floats. That’s part of what makes ndarrays faster for math than Python lists.

You can run operations on the whole array at once. No need to write loops just to add or multiply everything.

3. How do you create a NumPy array?

You start by importing NumPy with import numpy as np. The most common way is to use np.array() to turn a Python list into an array.

import numpy as np
arr = np.array([1, 2, 3, 4, 5])

NumPy also has functions to make arrays from scratch. np.zeros() gives you an array of zeros, and np.ones() gives you ones.

zeros_arr = np.zeros((3, 3))
ones_arr = np.ones((2, 4))

If you want a sequence of numbers, np.arange() works a lot like Python’s range. np.linspace() makes evenly spaced numbers between two values.

range_arr = np.arange(0, 10, 2)
linspace_arr = np.linspace(0, 1, 5)

You can set the data type at creation with the dtype parameter if you need something specific.

4. Difference between Python lists and NumPy arrays

Python lists can hold anything—strings, numbers, objects. They’re flexible and good for general use.

NumPy arrays only hold one data type and keep everything in a single, continuous block of memory. This makes them way faster for math and scientific work.

import numpy as np

# Python list
python_list = [1, 2, 3, 4, 5]

# NumPy array
numpy_array = np.array([1, 2, 3, 4, 5])

Lists take up more memory because they store references to separate Python objects, each with its own overhead. NumPy arrays store the raw values side by side, so they’re more efficient.

With NumPy, you can do vector operations—process everything at once. Lists need you to loop or use a comprehension for that.

# NumPy vector operation
result = numpy_array * 2  # [2, 4, 6, 8, 10]

# List requires a loop or comprehension
result = [x * 2 for x in python_list]

5. What is broadcasting in NumPy?

Broadcasting is how NumPy handles operations on arrays with different shapes. It lets you do element-wise math without making extra copies or writing loops.

If two arrays have different shapes, NumPy stretches the smaller one to match the bigger one. It follows certain rules to decide if the shapes are compatible.

import numpy as np

# Broadcasting example
arr = np.array([1, 2, 3])
scalar = 5
result = arr + scalar  # Returns [6, 7, 8]

# Broadcasting with 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
result = matrix + vector  # Adds vector to each row

Broadcasting works when, comparing the shapes from the last dimension backwards, each pair of dimensions is equal or one of them is 1. It keeps code clean and fast, with no need for manual loops.

6. How do you reshape an array using NumPy?

Use the reshape() function to change an array’s shape without touching its data. Just pass in the new dimensions you want.

import numpy as np

# Original 1D array
arr = np.array([1, 2, 3, 4, 5, 6])

# Reshape to 2D array (2 rows, 3 columns)
reshaped = arr.reshape(2, 3)
print(reshaped)
# Output: [[1 2 3]
#          [4 5 6]]

The number of elements has to stay the same. You can use -1 for one of the dimensions, and NumPy figures it out for you.

# NumPy calculates the second dimension
arr.reshape(3, -1)  # Results in shape (3, 2)

The function works for any number of dimensions. You can go from 1D to 3D or flatten things back down to 1D.
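
As a small sketch of moving between dimensions (the array contents and variable names here are illustrative only):

```python
import numpy as np

arr = np.arange(12)            # 1D array holding 0..11

cube = arr.reshape(2, 3, 2)    # 1D -> 3D: 2 blocks of 3 rows and 2 columns
flat = cube.reshape(-1)        # back down to 1D

print(cube.shape)  # (2, 3, 2)
print(flat.shape)  # (12,)

# A shape whose sizes do not multiply to 12 is rejected
try:
    arr.reshape(5, 3)
except ValueError as exc:
    print("reshape failed:", exc)
```

The element count is the only hard constraint: 2 * 3 * 2 equals 12, while 5 * 3 does not.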

7. Explain the concept of vectorisation in NumPy.

Vectorisation means running operations on whole arrays at once, not looping through elements. You just write the operation, and NumPy applies it to every element.

If you add a number to a NumPy array, it adds that number to every value. This happens super fast because NumPy uses C code in the background.

import numpy as np

# Without vectorization (slow)
arr = [1, 2, 3, 4, 5]
result = []
for num in arr:
    result.append(num * 2)

# With vectorization (fast)
arr = np.array([1, 2, 3, 4, 5])
result = arr * 2

Vectorised code is quicker and shorter. You can use it for most math operations—addition, subtraction, multiplication, you name it.

8. How to perform basic arithmetic operations on arrays?

NumPy lets you do arithmetic right on arrays. You can add, subtract, multiply, or divide arrays directly.

These operations work element by element. So each value in one array lines up with the value in the other array.

import numpy as np

a = np.array([10, 20, 30])
b = np.array([5, 4, 3])

print(a + b)  # [15, 24, 33]
print(a - b)  # [5, 16, 27]
print(a * b)  # [50, 80, 90]
print(a / b)  # [2., 5., 10.]

You can also use single numbers. Multiply, add, or divide a whole array by one value—no need for a loop.

c = np.array([2, 4, 6])
print(c + 10)  # [12, 14, 16]
print(c * 3)   # [6, 12, 18]

9. Methods to slice and index NumPy arrays

NumPy gives you a bunch of ways to access and change array elements. Basic indexing uses numbers to get single items or slices.

For 1D arrays, just use brackets: arr[0] gets the first element. Slicing uses arr[start:end:step]—pretty similar to regular Python lists.

import numpy as np
arr = np.array([10, 20, 30, 40, 50])
print(arr[1])      # Output: 20
print(arr[1:4])    # Output: [20 30 40]

With multi-dimensional arrays, separate indices with commas, like arr[row, col]. You can also use boolean indexing for conditions or fancy indexing with arrays of positions.

arr_2d = np.array([[1, 2], [3, 4]])
print(arr_2d[0, 1])  # Output: 2

10. What are data types in NumPy and how are they specified?

Data types in NumPy tell you what kind of values are in the array. Unlike Python lists, every value in a NumPy array has to be the same type—that’s part of why it’s fast.

NumPy has more data types than regular Python. You’ll see things like int8, int16, int32 for integers, float32, float64 for floats, and unsigned types like uint8. Each one uses a different amount of memory.

You set the data type with the dtype parameter when you make an array:

import numpy as np

arr1 = np.array([1, 2, 3], dtype='int32')
arr2 = np.array([1.5, 2.7], dtype=np.float64)

You can always check the type with the dtype attribute. If you need to, you can convert types using the astype() method.
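
A short sketch of checking and converting types might look like this (the values are made up for illustration):

```python
import numpy as np

arr = np.array([1.7, 2.2, 3.9])
print(arr.dtype)        # float64
print(arr.itemsize)     # 8 (bytes per element)

# astype returns a new array; float -> int conversion drops the fractional part
as_int = arr.astype(np.int32)
print(as_int)           # [1 2 3]
print(as_int.dtype)     # int32
print(as_int.itemsize)  # 4
```

Note that astype() does not round. If rounding is what you want, apply np.round() first.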

11. Explain the difference between np.array() and np.asarray().

np.array() and np.asarray() look similar but treat existing arrays differently. If you pass an array to np.array(), it creates a fresh copy in memory by default (copy=True).

np.asarray(), on the other hand, returns the original array without copying, as long as no dtype or memory-order conversion is needed.

import numpy as np

original = np.array([1, 2, 3])
copy_version = np.array(original)
reference_version = np.asarray(original)

copy_version[0] = 99
reference_version[0] = 88

print(original)  # [88, 2, 3]

This choice affects memory and data safety. np.array() keeps your original safe but uses extra memory.

np.asarray() saves memory, but if you tweak the new array, you change the original too. If you pass in a list or some other type, both functions just make a new array.
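
One way to confirm which call copied is np.shares_memory(). The following is a minimal sketch, with variable names chosen just for this example:

```python
import numpy as np

original = np.array([1, 2, 3])

copied = np.array(original)                          # new memory
same = np.asarray(original)                          # same object, no copy
converted = np.asarray(original, dtype=np.float64)   # dtype differs, so a copy is made

print(same is original)                        # True
print(np.shares_memory(copied, original))      # False
print(np.shares_memory(converted, original))   # False
```

So np.asarray() only avoids the copy when the input already is an ndarray of the requested type.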

12. How to handle missing values in NumPy arrays?

NumPy uses NaN (Not a Number) to represent missing data in floating-point arrays; integer arrays can’t hold NaN. You can spot these values with np.isnan(), which gives you a Boolean array showing where the NaNs are.

If you want to remove missing values, try array[~np.isnan(array)]. That filters out all the NaNs and leaves only the good stuff.

import numpy as np
array = np.array([1, 2, np.nan, 4, 5])
clean_array = array[~np.isnan(array)]

To fill in missing values, np.nan_to_num() swaps NaNs for zero by default. You can give it a custom value with the nan parameter.

filled_array = np.nan_to_num(array, nan=0)

There are also functions like np.nanmean() and np.nansum() that do calculations while ignoring NaNs, which is pretty handy for stats.
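
To illustrate the NaN-aware functions, here is a small sketch. Filling with the mean of the valid values is just one possible strategy, chosen here for the example:

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

print(np.isnan(data).sum())  # 1 (one missing value)
print(np.mean(data))         # nan, because NaN propagates through normal functions
print(np.nanmean(data))      # 3.0
print(np.nansum(data))       # 12.0

# Replace NaNs with the mean of the valid values
filled = np.where(np.isnan(data), np.nanmean(data), data)
print(filled)                # [1. 2. 3. 4. 5.]
```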

13. What are universal functions (ufuncs) in NumPy?

Universal functions, or ufuncs, work on NumPy arrays one element at a time. No need for explicit loops—they handle it all for you.

Ufuncs are fast because they’re written in optimized C code. They let you do vectorized operations that leave plain Python loops in the dust.

They also support broadcasting and type casting. Broadcasting means you can use arrays of different shapes and NumPy figures out how to make it work.

import numpy as np

arr = np.array([1, 2, 3, 4])
result = np.sqrt(arr)  # ufunc operating element-wise
print(result)  # [1. 1.41421356 1.73205081 2.]

Common ufuncs include things like np.add(), np.sin(), and comparison functions. They also have extra methods like reduce() and accumulate() if you need more power.
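
The reduce() and accumulate() methods can be sketched on np.add and np.multiply like this:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

total = np.add.reduce(arr)               # same result as arr.sum()
running = np.add.accumulate(arr)         # same result as np.cumsum(arr)
products = np.multiply.accumulate(arr)   # same result as np.cumprod(arr)

print(total)     # 10
print(running)   # [ 1  3  6 10]
print(products)  # [ 1  2  6 24]
```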

14. How do you concatenate and stack arrays?

NumPy gives you a couple of main ways to combine arrays. np.concatenate() joins arrays along an existing axis.

np.stack() puts arrays together along a new axis, which is a bit different.

import numpy as np

# Concatenate arrays along existing axis
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = np.concatenate([arr1, arr2])
# Output: [1, 2, 3, 4, 5, 6]

If you’re working with 2D arrays, np.vstack() and np.hstack() are nice shortcuts for vertical and horizontal stacking.

# Stack arrays vertically
a = np.array([[1, 2]])
b = np.array([[3, 4]])
vstacked = np.vstack([a, b])
# Output: [[1, 2], [3, 4]]

# Stack along new axis
stacked = np.stack([arr1, arr2])
# Creates 2D array from 1D arrays
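
Comparing the result shapes makes the difference between the functions easier to see. A quick sketch:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print(np.concatenate([arr1, arr2]).shape)    # (6,)   still 1D
print(np.stack([arr1, arr2]).shape)          # (2, 3) inputs become rows
print(np.stack([arr1, arr2], axis=1).shape)  # (3, 2) inputs become columns
print(np.hstack([arr1, arr2]).shape)         # (6,)
print(np.vstack([arr1, arr2]).shape)         # (2, 3)
```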

15. Explain memory layout in NumPy arrays (C-contiguous vs. Fortran-contiguous).

NumPy stores arrays in a flat, continuous chunk of memory. How it arranges the elements decides if the array is C-contiguous or Fortran-contiguous.

C-contiguous arrays use row-major order. The last axis changes fastest, so in a 2D array, each row sits together in memory.

Fortran-contiguous arrays use column-major order. Here, the first axis changes fastest, so columns are stored together.

import numpy as np

arr_c = np.array([[1, 2], [3, 4]], order='C')
arr_f = np.array([[1, 2], [3, 4]], order='F')

print(arr_c.flags['C_CONTIGUOUS'])  # True
print(arr_f.flags['F_CONTIGUOUS'])  # True

Performance can depend on this layout. If you process data in the same order it’s stored, you’ll get better speed.
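
The two layouts show up directly in the strides. The sketch below fixes the dtype to int64 so the byte counts are predictable:

```python
import numpy as np

arr_c = np.arange(6, dtype=np.int64).reshape(2, 3)  # C order by default
arr_f = np.asfortranarray(arr_c)                    # same values, column-major

print(arr_c.strides)  # (24, 8): a row step skips 3 elements
print(arr_f.strides)  # (8, 16): a column step skips 2 elements

# order='K' reads the elements in the order they sit in memory
print(arr_c.ravel(order='K'))  # [0 1 2 3 4 5]
print(arr_f.ravel(order='K'))  # [0 3 1 4 2 5]
```

Both arrays compare equal element by element; only the physical order in memory differs.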

16. How to generate random numbers with NumPy?

NumPy gives you a few ways to make random numbers. The newer way is to use the Generator class with np.random.default_rng().

import numpy as np
rng = np.random.default_rng()

Want a single random number between 0 and 1? Just call rng.random(). For an array, pass in the size you want.

single_number = rng.random()
array_of_numbers = rng.random(10)

The Generator class has other tricks too. integers() spits out random whole numbers in your chosen range.

random_ints = rng.integers(low=0, high=100, size=5)

For normal distributions, use normal(). If you want reproducible results, just set a seed when you create the generator.

rng = np.random.default_rng(seed=42)
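
A minimal sketch of what reproducibility means in practice: two generators created with the same seed produce the same draws.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

sample_a = rng_a.normal(loc=0.0, scale=1.0, size=5)
sample_b = rng_b.normal(loc=0.0, scale=1.0, size=5)
print(np.array_equal(sample_a, sample_b))  # True

# integers() excludes the upper bound by default
ints = rng_a.integers(low=0, high=10, size=1000)
print(ints.min() >= 0, ints.max() < 10)    # True True
```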

17. What is the purpose of np.where()?

np.where() lets you do conditional stuff with NumPy arrays. You can use it to filter or transform data based on whatever criteria you set.

It’s got two main uses. You can find indices where a condition is true, or you can make new arrays by picking values from two sources depending on a condition.

If you just give it a condition, np.where() returns the indices where that’s true. For example:

import numpy as np
arr = np.array([10, 25, 30, 15])
indices = np.where(arr > 20)
# Returns indices where values are greater than 20

If you pass three arguments, it acts like a vectorized if-else. It picks from one array when the condition is true, and from another when it’s not:

result = np.where(arr > 20, "high", "low")
# Returns array with "high" or "low" based on condition
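
One detail that often comes up in interviews is that the one-argument form returns a tuple of index arrays, one per dimension. A short sketch:

```python
import numpy as np

arr = np.array([10, 25, 30, 15])

indices = np.where(arr > 20)
print(type(indices))   # <class 'tuple'>
print(indices[0])      # [1 2]
print(arr[indices])    # [25 30]

# Three-argument form: cap values above 20, keep the rest
capped = np.where(arr > 20, 20, arr)
print(capped)          # [10 20 20 15]
```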

18. Discuss the performance advantages of using NumPy.

NumPy is seriously faster than plain Python lists for number crunching. It stores data in contiguous memory, so accessing and processing is way quicker.

Most of the heavy lifting happens in C and Fortran, so operations run at compiled speed. Multiply two arrays and you’ll see—NumPy can be dozens of times faster than a Python loop.

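import numpy as np
import time

# NumPy array operation
arr = np.arange(1000000)
start = time.perf_counter()  # perf_counter is better suited to timing than time.time
result = arr * 2
numpy_time = time.perf_counter() - start

# Python list operation
lst = list(range(1000000))
start = time.perf_counter()
result = [x * 2 for x in lst]
list_time = time.perf_counter() - start

print(f"NumPy: {numpy_time:.4f}s, list: {list_time:.4f}s")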

NumPy also uses memory more efficiently than lists. And because you can do vectorized operations, you avoid most explicit loops, which makes your code both faster and cleaner.

19. How can you perform matrix multiplication in NumPy?

NumPy gives you three main options for matrix multiplication. The @ operator is probably the cleanest and easiest to read.

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A @ B

You can also use numpy.dot(), which works the same for 2D arrays.

result = np.dot(A, B)

numpy.matmul() is another option and acts just like the @ operator.

result = np.matmul(A, B)

All three require compatible shapes: the number of columns in the first matrix must match the number of rows in the second, otherwise NumPy raises a ValueError. Under the hood, NumPy hands the work to optimised compiled routines (typically a BLAS library).

20. Explain the difference between np.dot() and np.matmul().

Both do matrix multiplication, but they treat arrays differently when you get into higher dimensions. For 1D and 2D arrays, you’ll get the same answer from both.

np.dot() is the older function and multiplies over the last axis of the first array and the second-to-last of the second array. It’s also used for vector dot products.

import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
result = np.dot(a, b)  # Standard matrix multiplication

np.matmul() follows linear algebra rules more closely. For 3D and higher arrays, it treats them as stacks of matrices. The @ operator is just a shortcut for np.matmul().

result = np.matmul(a, b)  # Same as a @ b

When in doubt, go with np.matmul() for matrix math—it makes your intent clearer and handles multidimensional arrays better.
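
The difference only becomes visible above two dimensions. Here is a sketch using arrays of ones, where only the shapes matter:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats the inputs as two stacks of matrices and multiplies them pairwise
print(np.matmul(a, b).shape)  # (2, 3, 5)

# dot combines every matrix in a with every matrix in b
print(np.dot(a, b).shape)     # (2, 3, 2, 5)
```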

21. How do you compute statistics like mean, median, and standard deviation using NumPy?

NumPy has built-in functions for the most common stats. Use np.mean() for the average, and np.median() for the middle value.

import numpy as np

data = np.array([10, 20, 30, 40, 50])
mean_value = np.mean(data)
median_value = np.median(data)

np.std() gives you the standard deviation, which tells you how spread out your values are. By default, it’s the population standard deviation.

std_dev = np.std(data)

All these functions work on arrays of any shape. You can also pass an axis argument to get stats along a specific dimension, like columns in a 2D array.
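
The axis argument and the ddof parameter of np.std() can be sketched as follows (ddof=1 gives the sample standard deviation instead of the population one):

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(np.mean(table, axis=0))  # [2.5 3.5 4.5]  one value per column
print(np.mean(table, axis=1))  # [2. 5.]        one value per row

values = np.array([10, 20, 30, 40, 50])
population_std = np.std(values)       # divides by N
sample_std = np.std(values, ddof=1)   # divides by N - 1
print(population_std, sample_std)
```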

22. What is the role of shape and size attributes in NumPy arrays?

The shape attribute gives you a tuple with the length of each dimension in an array. Basically, it tells you how many rows and columns you’re dealing with.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # Output: (2, 3)

The size attribute counts the total number of elements in the array. It doesn’t care about the structure, just the total count.

print(arr.size)  # Output: 6

These attributes are super handy for checking array compatibility or reshaping. A lot of NumPy operations rely on arrays having the right dimensions.

23. How to flatten a multi-dimensional array?

NumPy gives you a few ways to turn a multi-dimensional array into a one-dimensional array. The most common is flatten(), which returns a new, flattened copy.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.flatten()
print(flattened)  # Output: [1 2 3 4 5 6]

If you want to save memory, ravel() is usually better. It returns a view of the original array whenever the memory layout allows, and only copies when it has to.

arr = np.array([[1, 2, 3], [4, 5, 6]])
raveled = arr.ravel()
print(raveled)  # Output: [1 2 3 4 5 6]

You can also use reshape(-1) for flattening. Like ravel(), it returns a view when possible and a copy otherwise.

arr = np.array([[1, 2, 3], [4, 5, 6]])
reshaped = arr.reshape(-1)
print(reshaped)  # Output: [1 2 3 4 5 6]
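
Whether you got a view or a copy can be checked with np.shares_memory(). A minimal sketch:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()
flat_view = arr.ravel()

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

flat_view[0] = 99
print(arr[0, 0])   # 99, because the view writes through to arr

# A transposed array is not C-contiguous, so ravel() has to copy
print(np.shares_memory(arr, arr.T.ravel()))  # False
```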

24. Explain the function np.linspace() and when to use it.

The np.linspace() function creates an array of evenly spaced numbers over a range. You give it a start, stop, and how many points you want in between.

import numpy as np
array = np.linspace(0, 10, 5)
# Output: [0. 2.5 5. 7.5 10.]

Unlike range() or np.arange(), which use step sizes, linspace() just figures out the spacing for you. You only need to say how many values you want.

It’s great for making smooth lines in plots or generating points for scientific calculations. By default, both the start and stop values are included.

x = np.linspace(0, 100, 11)
# Creates 11 points from 0 to 100

25. How do you mask arrays in NumPy?

NumPy has masked arrays in the numpy.ma module. A masked array keeps your data but also tracks which elements are invalid or missing.

You can create one with numpy.ma.masked_array() by passing your data and a boolean mask:

import numpy as np
import numpy.ma as ma

arr = np.array([1, 2, 3, 4, 5])
mask = [False, False, True, False, True]
masked_arr = ma.masked_array(arr, mask=mask)

It’s also easy to create masks based on conditions. The masked_where() function masks elements that meet your criteria:

arr = np.array([1, -2, 3, -4, 5])
masked_arr = ma.masked_where(arr < 0, arr)

You can combine multiple conditions with & (and) or | (or). Masked elements stay in the array but get ignored in calculations.
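
Here is a sketch of a combined condition and of how masked values are skipped in calculations. The thresholds are arbitrary and chosen only for the example:

```python
import numpy as np
import numpy.ma as ma

arr = np.array([1, -2, 3, -4, 50])

# Mask negatives and anything above 10
masked = ma.masked_where((arr < 0) | (arr > 10), arr)

print(masked)            # [1 -- 3 -- --]
print(masked.mean())     # 2.0, only 1 and 3 are counted
print(masked.count())    # 2 unmasked elements
print(masked.filled(0))  # [1 0 3 0 0]
```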

26. What are views and copies of NumPy arrays?

Views and copies are two ways to work with NumPy arrays. A view shares memory with the original array, so changing the view changes the original too.

A copy, on the other hand, gets its own memory. If you change a copy, the original stays the same.

import numpy as np

# Creating a view
arr = np.array([1, 2, 3, 4])
view = arr[1:3]
view[0] = 10  # Changes original array

# Creating a copy
arr2 = np.array([1, 2, 3, 4])
copy = arr2.copy()
copy[0] = 10  # Does not change original array

You can check if something’s a view with the base attribute. If base is None, it’s not a view.
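
The base check can be sketched like this:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
view = arr[1:3]
copy = arr.copy()

print(view.base is arr)   # True: the slice borrows arr's memory
print(copy.base is None)  # True: the copy owns its data
print(arr.base is None)   # True: so does the original
```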

27. How to save and load NumPy arrays efficiently?

NumPy has np.save() and np.load() for saving and loading arrays in binary format. These are fast and keep your data types intact.

import numpy as np

# Save array
arr = np.array([[1, 2, 3], [4, 5, 6]])
np.save('array.npy', arr)

# Load array
loaded_arr = np.load('array.npy')

If you want to save several arrays, use np.savez(). For smaller file sizes, np.savez_compressed() adds compression, though it’s a bit slower.

# Save multiple arrays compressed
np.savez_compressed('data.npz', x=arr, y=arr*2)

# Load arrays
data = np.load('data.npz')
x = data['x']
y = data['y']

The .npy format is best for single arrays. For massive datasets, you might want to look into HDF5 or Parquet formats—they’re more robust and efficient for big data.

28. What is the difference between np.zeros(), np.ones(), and np.empty()?

These three functions all create arrays, but they fill them differently. np.zeros() gives you an array full of zeros, and np.ones() fills it with ones.

import numpy as np
zeros_array = np.zeros((2, 3))  # Creates 2x3 array of zeros
ones_array = np.ones((2, 3))    # Creates 2x3 array of ones

np.empty() is a little different: it doesn’t set initial values, so you get whatever bytes were already in that memory. Skipping the initialisation makes it a bit quicker.

empty_array = np.empty((2, 3))  # Creates 2x3 array with uninitialised (arbitrary) values

Just make sure to assign values before using np.empty() arrays. It’s best when you plan to fill the array right away and want the fastest option.

29. Explain advanced indexing techniques in NumPy.

Advanced indexing lets you select array elements in ways that go beyond simple slicing. It always creates a copy, not a view.

Boolean indexing uses condition masks to filter elements. You apply logical conditions and pick out exactly what you want.

import numpy as np
arr = np.array([10, 20, 30, 40, 50])
mask = arr > 25
result = arr[mask]  # Returns [30, 40, 50]

Integer array indexing lets you select elements at specific positions using arrays of indices. This way, you can grab elements from anywhere in the array.

arr = np.array([10, 20, 30, 40, 50])
indices = np.array([0, 2, 4])
result = arr[indices]  # Returns [10, 30, 50]

You can use these with multi-dimensional arrays, too. It’s a flexible way to extract or manipulate data based on whatever conditions you need.
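
On a 2D array, these techniques could look like the following sketch (the grid values are illustrative):

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Paired index arrays pick (0, 2), (1, 1) and (2, 0)
rows = np.array([0, 1, 2])
cols = np.array([2, 1, 0])
print(grid[rows, cols])       # [3 5 7]

# A boolean mask returns the matching elements as a 1D array
print(grid[grid % 2 == 0])    # [2 4 6 8]

# The result is a copy, so changing it leaves grid alone
picked = grid[[0, 2]]
picked[0, 0] = 100
print(grid[0, 0])             # 1
```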

30. What is the use of np.tile()?

The np.tile() function repeats an input array a specified number of times and builds a new array. It repeats the whole array along each axis, not just individual elements.

import numpy as np
arr = np.array([1, 2, 3])
result = np.tile(arr, 2)
# Output: [1 2 3 1 2 3]

You can use np.tile() with arrays of any shape. The reps parameter tells it how many times to repeat along each axis.

arr = np.array([[1, 2], [3, 4]])
result = np.tile(arr, (2, 3))
# Repeats 2 times vertically, 3 times horizontally

It comes in handy for data augmentation, making patterns, or expanding arrays to a certain shape. If you’re prepping data for machine learning or need repeated sequences, np.tile() can save you some time.

31. How do you perform sorting operations on arrays?

NumPy’s np.sort() arranges the elements of an array in ascending order. It gives you a sorted copy and leaves the original array untouched.

import numpy as np
arr = np.array([3, 1, 4, 1, 5])
sorted_arr = np.sort(arr)
# Result: [1, 1, 3, 4, 5]

If you want descending order, just reverse the sorted array with slicing.

desc_arr = np.sort(arr)[::-1]
# Result: [5, 4, 3, 1, 1]

The array.sort() method sorts the array in place, so it changes the original data. That’s useful if you don’t need to keep the unsorted version.

arr.sort()
# arr is now [1, 1, 3, 4, 5]

For multi-dimensional arrays, you can use the axis parameter. axis=0 sorts the values within each column, and axis=1 sorts the values within each row. The default, axis=-1, sorts along the last axis.
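
A small sketch of sorting a 2D array along each axis:

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [0, 5, 4]])

print(np.sort(arr, axis=0))   # each column sorted top to bottom
# [[0 1 2]
#  [3 5 4]]

print(np.sort(arr, axis=1))   # each row sorted left to right
# [[1 2 3]
#  [0 4 5]]
```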

32. Explain the function np.unique() and its significance.

The np.unique() function finds all the unique elements in a NumPy array. It removes duplicates and returns just the distinct values.

import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3, 4])
unique_values = np.unique(arr)
print(unique_values)  # Output: [1 2 3 4]

You can also get extra info. return_counts=True tells you how many times each value appears, and return_index=True shows where each value first pops up.

values, counts = np.unique(arr, return_counts=True)
print(counts)  # Output: [1 2 3 1]

This function is a lifesaver for data cleaning and analysis. It’s great for spotting categories or getting rid of duplicates in your data.
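
Beyond counts, np.unique() can also report first positions and an inverse mapping. The sketch below uses return_inverse, which is not mentioned above but is part of the same function:

```python
import numpy as np

arr = np.array([3, 1, 3, 2, 1])

values, first_index, inverse = np.unique(
    arr, return_index=True, return_inverse=True
)

print(values)           # [1 2 3]
print(first_index)      # [1 3 0]  where each value first appears in arr
print(inverse)          # [2 0 2 1 0]  position of each element within values
print(values[inverse])  # [3 1 3 2 1]  rebuilds the original array
```

The inverse array is handy for turning category labels into integer codes.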

33. How does NumPy handle multi-dimensional arrays internally?

NumPy stores multi-dimensional arrays as contiguous blocks of memory. It uses a data structure called ndarray.

The array holds a pointer to the actual data. It also includes metadata that describes the shape, data type, and how the data sits in memory.

The shape attribute defines the size of each dimension. Stride tells NumPy how many bytes to skip in memory to move to the next element along each dimension.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # (2, 3)
print(arr.strides)  # (24, 8) for 64-bit integers

This setup lets NumPy handle arrays with many dimensions (up to 64 in NumPy 2.x, 32 in earlier releases). The contiguous memory layout means the CPU can access nearby elements fast.

34. How to perform element-wise comparison between arrays?

NumPy gives you comparison operators like ==, >, <, >=, <=, and != for element-wise array comparison. These return a Boolean array showing the result for each element.

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 5, 3, 2])

result = a == b  # Returns [True, False, True, False]
greater = a > b  # Returns [False, False, False, True]

The two arrays need the same shape, or shapes that are compatible under broadcasting. You can also compare an array against a single value, and NumPy broadcasts the comparison across all elements.

arr = np.array([10, 20, 30])
result = arr > 15  # Returns [False, True, True]

These Boolean arrays work great for filtering data or creating masks for later steps.

35. What is a structured array in NumPy?

A structured array in NumPy is a special array type that can hold multiple fields with different data types in each element. It’s kind of like a table or a database record—each array element contains several pieces of info.

While regular NumPy arrays store just one data type, structured arrays let you combine types. For instance, you can keep a name as a string, age as an integer, and salary as a float all together.

import numpy as np

# Create a structured array
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f4')])
employees = np.array([('John', 25, 50000), ('Sarah', 30, 65000)], dtype=dt)

print(employees['name'])  # Access by field name

This makes structured arrays handy for mixed data types while keeping NumPy’s speed.
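
Fields can be used for filtering and sorting as well. In the sketch below, the third employee record is a made-up addition so that the sort has something to reorder:

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f4')])
employees = np.array(
    [('John', 25, 50000), ('Sarah', 30, 65000), ('Amit', 28, 58000)],  # 'Amit' is hypothetical
    dtype=dt,
)

# Filter records with a boolean mask built from one field
older = employees[employees['age'] > 26]
print(older['name'])       # ['Sarah' 'Amit']

# Sort whole records by a field
by_salary = np.sort(employees, order='salary')
print(by_salary['name'])   # ['John' 'Amit' 'Sarah']
```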

36. Explain the broadcasting rules with examples.

Broadcasting lets NumPy work with arrays of different shapes without you having to reshape them by hand. There are three main rules.

First, NumPy compares shapes from right to left. If an array has fewer dimensions, it pads ones on the left until both shapes match in length.

Second, two dimensions are compatible if they’re equal or one of them is 1. If not, NumPy throws an error.

Third, if an array has a dimension size of 1, NumPy stretches it to match the other array’s dimension.

import numpy as np

# Example 1: Adding scalar to array
a = np.array([1, 2, 3])
b = 5
result = a + b  # [6, 7, 8]

# Example 2: 2D and 1D arrays
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([10, 20, 30])
result = x + y  # [[11, 22, 33], [14, 25, 36]]
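
The padding and stretching rules are easier to see when a column meets a row. The sketch below also shows an incompatible pair and one way to fix it with np.newaxis:

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,), treated as (1, 3)

print(col + row)                    # both are stretched to (3, 3)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

m = np.ones((2, 3))
v = np.array([1, 2])                # trailing dimensions 3 and 2 do not match
try:
    m + v
except ValueError as exc:
    print("incompatible:", exc)

# Turning v into a column of shape (2, 1) makes the shapes compatible
print((m + v[:, np.newaxis]).shape)  # (2, 3)
```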

37. How to convert a NumPy array to a Python list?

The easiest way to turn a NumPy array into a Python list is to use the tolist() method. It works for arrays of any shape and gives you a regular Python list.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
python_list = arr.tolist()
print(python_list)  # [1, 2, 3, 4, 5]

The tolist() method converts NumPy types to their closest Python types. For multi-dimensional arrays, it creates nested lists that match the array’s shape.

arr_2d = np.array([[1, 2], [3, 4]])
nested_list = arr_2d.tolist()
print(nested_list)  # [[1, 2], [3, 4]]

You could also use the list() constructor, but that keeps NumPy scalar types instead of converting them to native Python types.

38. Discuss the function np.argsort() and its applications.

The np.argsort() function gives you the indices that would sort an array. Unlike np.sort(), it doesn’t return sorted values, just their positions in sorted order.

import numpy as np
arr = np.array([4, 5, 1, 7, 3])
indices = np.argsort(arr)
print(indices)  # Output: [2 4 0 1 3]

This is super useful for ranking data or keeping relationships between arrays. If you have several related arrays and want to sort them all based on one array’s values, argsort() gives you the indices to do that consistently.

It also works with multidimensional arrays if you set the axis parameter. You can sort rows or columns separately. People often use it for ranking students, sorting records while preserving row relationships, or finding top-K elements by their positions.
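
As a sketch of the ranking use case, the example below sorts a names array by a related scores array (both made up for illustration):

```python
import numpy as np

names = np.array(['Ana', 'Ben', 'Cy', 'Di'])   # hypothetical students
scores = np.array([72, 95, 60, 88])

order = np.argsort(scores)       # indices from lowest to highest score
print(names[order])              # ['Cy' 'Ana' 'Di' 'Ben']
print(scores[order])             # [60 72 88 95]

# Top 2: reverse the order and take the first two indices
top2 = order[::-1][:2]
print(names[top2])               # ['Ben' 'Di']
```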

39. How do you use np.apply_along_axis()?

The np.apply_along_axis() function applies a custom function to 1-D slices along a chosen axis of an array. You’ll need three arguments: the function, the axis, and the array itself.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Apply sum function along axis 1 (rows)
result = np.apply_along_axis(np.sum, 1, arr)
# Output: [ 6 15]

It processes each slice independently and returns a new array with the results. If you set axis to 1, it works on rows; axis 0 means it works on columns.

# Apply function to columns
result = np.apply_along_axis(np.mean, 0, arr)
# Output: [2.5 3.5 4.5]

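This saves you from writing explicit loops when working with arrays. Keep in mind that it still calls your function once per slice in Python, so built-in reductions with an axis argument, such as arr.sum(axis=1), are usually faster when they exist.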
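
The function really earns its place with custom logic. Here is a sketch with a hypothetical helper, value_range, next to the equivalent built-in form:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

def value_range(row):
    """Hypothetical helper: spread between the largest and smallest value."""
    return row.max() - row.min()

per_row = np.apply_along_axis(value_range, 1, arr)
print(per_row)                               # [2 2]

# The same result using only built-in reductions
print(arr.max(axis=1) - arr.min(axis=1))     # [2 2]
```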

40. Explain the buffering of arrays during operations.

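Buffering in NumPy refers to the temporary storage that ufuncs use when they can’t work on an array’s memory directly. This happens, for example, when an operand has to be cast to another data type, or when its data is misaligned or byte-swapped.

Rather than converting a whole array up front, NumPy pushes the data through a fixed-size buffer in chunks. That keeps the extra memory small even for large arrays. You can inspect the buffer size with np.getbufsize() and change it with np.setbufsize().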

import numpy as np

# Mixed dtypes: arr1 has to be cast to float64 for the addition,
# which is the kind of case where internal buffers may be used
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5.0, 6.0, 7.0, 8.0])
result = arr1 + arr2

Buffers matter most when you work with other libraries or huge datasets. They help NumPy transfer data faster and reduce memory overhead during calculations.

The buffering system works in the background. Most of the time, you don’t need to think about it, but it’s one reason NumPy is faster than standard Python lists.

41. What warnings or errors should you watch for when working with NumPy?

NumPy throws floating-point warnings when operations go wrong. Division by zero pops up a lot.

import numpy as np
np.seterr(divide='warn', invalid='warn')

You’ll get warnings for things like inf/inf or 0/0 that create NaN values. Broadcasting errors happen if the array shapes don’t match for an operation.

a = np.array([1, 2, 3])
b = np.array([1, 2])
# a + b raises ValueError: shapes (3,) and (2,) cannot be broadcast together

Overflow and underflow warnings show up when calculations go beyond what the data type can handle. With seterr(), you can control whether NumPy ignores, warns, raises, or calls a custom function for these issues.

The errstate context manager gives you temporary control over error handling for specific bits of code.
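
A minimal sketch of errstate in both directions, first silencing warnings and then turning them into exceptions:

```python
import numpy as np

a = np.array([1.0, 0.0, -1.0])
b = np.array([0.0, 0.0, 0.0])

# Silence the warnings for this block only
with np.errstate(divide='ignore', invalid='ignore'):
    quiet = a / b
print(quiet)   # [ inf  nan -inf]

# Turn floating-point problems into exceptions instead
caught = False
with np.errstate(all='raise'):
    try:
        a / b
    except FloatingPointError as exc:
        caught = True
        print("caught:", exc)
```

Outside the with blocks, the previous error settings apply again.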

42. How to perform linear algebra operations in NumPy?

NumPy has a module called numpy.linalg for linear algebra. It has functions for matrix multiplication, solving equations, and finding eigenvalues.

You can do matrix multiplication using the @ operator or numpy.dot():

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A @ B

Other handy operations include np.linalg.det() for determinants, np.linalg.inv() for inverses, and np.linalg.eig() for eigenvalues. To solve equations, use np.linalg.solve():

A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)

All these functions handle n-dimensional arrays efficiently. They’re the backbone for data science and machine learning work.
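A quick way to sanity-check a solution from np.linalg.solve() is to plug it back into the equation. The sketch below reuses the system from above.

```python
import numpy as np

A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

x = np.linalg.solve(A, b)
print(x)  # approximately [2. 3.]

# Substitute back: A @ x should reproduce b (up to rounding)
is_valid = np.allclose(A @ x, b)
print(is_valid)  # True

# A non-zero determinant confirms the system has a unique solution
det = np.linalg.det(A)  # approximately 5.0
```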

43. Discuss the use of np.linalg module.

The np.linalg module provides functions for linear algebra in NumPy. It helps with tasks like solving equations, finding matrix properties, and doing decompositions.

import numpy as np

# Calculate matrix determinant
matrix = np.array([[1, 2], [3, 4]])
det = np.linalg.det(matrix)

# Find inverse of a matrix
inv = np.linalg.inv(matrix)

# Solve linear equations (Ax = b)
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)

The module covers eigenvalues, matrix norms, and decompositions like SVD. These tools help engineers and data scientists tackle tough numerical problems fast.

Common functions include eig() for eigenvalues, norm() for vector norms, and matrix_rank() for figuring out matrix rank.
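Here is a short sketch of those functions on small, hand-picked matrices (the names M and singular are just for illustration), so the results are easy to verify by eye.

```python
import numpy as np

M = np.array([[2.0, 0.0], [0.0, 3.0]])

# Eigenvalues of a diagonal matrix are its diagonal entries
eigenvalues, eigenvectors = np.linalg.eig(M)

# Euclidean length of the vector (3, 4)
length = np.linalg.norm(np.array([3.0, 4.0]))  # 5.0

# The second row is twice the first, so the rank is 1
singular = np.array([[1, 2], [2, 4]])
rank = np.linalg.matrix_rank(singular)  # 1

# Singular values come back sorted from largest to smallest
U, s, Vt = np.linalg.svd(M)
print(eigenvalues, length, rank, s)
```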

44. Explain NumPy’s role in scientific computing and machine learning.

NumPy sits at the core of numerical operations in Python. It gives you tools to handle arrays and matrices efficiently—absolutely critical for scientific work.

In scientific computing, NumPy powers fast calculations on large datasets. Its array-based methods make complex math feel much simpler to write and run.

Scientists rely on it for simulations, data analysis, and mathematical modelling. It’s not an exaggeration to say NumPy is everywhere in this field.

import numpy as np

# Efficient array operations
data = np.array([1, 2, 3, 4, 5])
squared = data ** 2  # Vectorized operation
mean = np.mean(data)

For machine learning, NumPy crunches the numbers behind algorithms. It stores training data, handles matrix math, and processes features.

Big libraries like scikit-learn build directly on NumPy arrays, and frameworks such as TensorFlow interoperate closely with them. Without NumPy, that ecosystem would be a lot slower and clunkier.

Broadcasting and vectorised operations make code run way faster than plain Python loops. That speed boost really matters when you’re working with millions of data points.

45. How do you perform FFT (Fast Fourier Transform) with NumPy?

NumPy gives you the numpy.fft.fft() function for Fast Fourier Transform. This function flips a signal from the time domain into the frequency domain.

Just import NumPy and pass your array to fft(). Here’s a quick example:

import numpy as np

# Create a simple signal
signal = np.array([1, 2, 3, 4, 5])

# Apply FFT
fft_result = np.fft.fft(signal)
print(fft_result)

Want something more practical, like a sine wave? Here’s how that looks:

import numpy as np

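# Create time series: 500 samples over one second
t = np.linspace(0, 1, 500, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t)

# Perform FFT
fft_result = np.fft.fft(signal)
frequencies = np.fft.fftfreq(len(signal), d=t[1] - t[0])  # d is the sample spacing

The fftfreq() function returns the frequency that belongs to each FFT bin. Pass the sample spacing as d to get values in hertz; without it, the result is in cycles per sample.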
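To actually read something off the spectrum, you can look for the bin with the largest magnitude. This sketch assumes a 500 Hz sampling rate (the name fs is made up here) and uses rfft(), the real-input variant of fft().

```python
import numpy as np

fs = 500                      # assumed sampling rate in Hz
t = np.arange(fs) / fs        # one second of samples
signal = np.sin(2 * np.pi * 5 * t)

# rfft returns only the non-negative frequencies for real input
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# The strongest bin should sit at the sine wave's frequency
peak_frequency = freqs[np.argmax(np.abs(spectrum))]
print(peak_frequency)  # 5.0
```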

46. What is the significance of order parameter in array creation?

The order parameter tells NumPy how to lay out array elements in memory. This can affect both how fast your code runs and how efficiently it stores data.

You’ve got two main choices: ‘C’ and ‘F’. The ‘C’ order means row-major, so it stores elements row by row. ‘F’ order means column-major, storing elements column by column.

import numpy as np

# Create array with C order (row-major)
array_c = np.array([[1, 2], [3, 4]], order='C')

# Create array with F order (column-major)
array_f = np.array([[1, 2], [3, 4]], order='F')

For arrays built from Python lists, and for creation functions such as np.zeros(), the default is ‘C’ order. If your program works with rows a lot, stick with ‘C’. If you’re mostly working with columns, ‘F’ order might give you a speed edge.
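You can see the effect of the order parameter by inspecting flags and strides. In this sketch the dtype is fixed to int64 so the stride values are predictable; the values themselves are identical in both layouts.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
array_c = np.array(data, dtype=np.int64, order='C')
array_f = np.array(data, dtype=np.int64, order='F')

print(array_c.flags.c_contiguous)  # True
print(array_f.flags.f_contiguous)  # True

# Same values, different memory layout
print(array_c.strides)  # (24, 8): next row is 24 bytes away
print(array_f.strides)  # (8, 16): next row is only 8 bytes away
same_values = np.array_equal(array_c, array_f)  # True
```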

47. How does NumPy facilitate performance optimisation?

NumPy speeds things up by using vectorisation. Instead of looping through elements one by one, it runs operations on whole arrays at once. Under the hood, it leverages optimised C code—so it’s way faster than regular Python.

import numpy as np

# Slow Python loop
result = []
for i in range(1000000):
    result.append(i * 2)

# Fast NumPy vectorization
arr = np.arange(1000000)
result = arr * 2

It stores data in big, contiguous memory blocks. That makes it easier for your computer to grab and process info quickly.

Broadcasting lets you run operations on arrays of different shapes without making extra copies. Saves both memory and time.

Most NumPy functions are actually written in C or Fortran. That’s a big reason why things run so much faster compared to pure Python.

48. Explain the difference between np.vstack() and np.hstack().

These two functions combine arrays in different directions. np.vstack() stacks arrays vertically—think of adding rows. np.hstack() stacks them horizontally, so you’re adding columns side by side.

With np.vstack(), your arrays need the same number of columns. It creates an array with more rows.

import numpy as np
a = np.array([[1, 2]])
b = np.array([[3, 4]])
result = np.vstack((a, b))  # [[1, 2], [3, 4]]

With np.hstack(), your arrays need the same number of rows. You end up with more columns.

a = np.array([[1], [2]])
b = np.array([[3], [4]])
result = np.hstack((a, b))  # [[1, 3], [2, 4]]
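One detail worth knowing is how the two functions treat 1D inputs. A small sketch with made-up arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack treats each 1D array as a row
stacked_rows = np.vstack((a, b))
print(stacked_rows.shape)  # (2, 3)

# hstack simply joins 1D arrays end to end
joined = np.hstack((a, b))
print(joined)  # [1 2 3 4 5 6]
```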

49. How do you obtain the diagonal elements of a matrix?

NumPy makes it easy to grab diagonal elements. Use np.diag() with your matrix, and you’ll get a one-dimensional array of the diagonal values.

import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

diagonal = np.diag(matrix)
print(diagonal)  # Output: [1 5 9]

Or, try np.diagonal() if you want more control. You can use the offset parameter to get diagonals above or below the main one.

main_diag = np.diagonal(matrix)
upper_diag = np.diagonal(matrix, offset=1)
print(main_diag)  # Output: [1 5 9]
print(upper_diag)  # Output: [2 6]
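np.diag() also works in the other direction: give it a 1D array and it builds a diagonal matrix. The sketch below shows that, plus the k argument for off-diagonals and np.trace() for the diagonal sum.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# 1D input: build a square matrix with the values on the diagonal
built = np.diag([1, 2, 3])

# k=-1 selects the diagonal just below the main one
lower_diag = np.diag(matrix, k=-1)
print(lower_diag)  # [4 8]

# Sum of the main diagonal
trace = np.trace(matrix)
print(trace)  # 15
```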

50. Discuss the role of strides attribute in NumPy arrays.

The strides attribute tells NumPy how to move through array memory. It’s a tuple of integers showing how many bytes to skip to hit the next element along each axis.

For a 1D array of 4-byte integers, the stride might be (4,). That means NumPy skips 4 bytes to reach the next value.

import numpy as np

arr = np.array([1, 2, 3, 4], dtype=np.int32)
print(arr.strides)  # (4,)

In 2D arrays, each dimension has its own stride. The first number tells you how many bytes to skip for the next row, the second for the next column.

arr_2d = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
print(arr_2d.strides)  # (12, 4)
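Strides also explain why transposing and slicing are cheap: NumPy only changes the stride values and leaves the data where it is. A small sketch:

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# Transposing swaps the strides; no data is moved
transposed = arr_2d.T
print(transposed.strides)  # (4, 12)

# Taking every other column doubles the column stride
every_other = arr_2d[:, ::2]
print(every_other.strides)  # (12, 8)

# Both results are views on the original buffer
shares = np.shares_memory(arr_2d, transposed)  # True
```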

51. How to convert structured data into NumPy arrays?

How you convert structured data into NumPy arrays depends on where the data comes from. If you’ve got lists of tuples or dictionaries, use numpy.array() with a specific dtype.

import numpy as np

# Define structured array dtype
dt = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]

# Create from list of tuples
data = [('Alice', 25, 5.5), ('Bob', 30, 6.0)]
structured_arr = np.array(data, dtype=dt)

There’s also structured_to_unstructured() in numpy.lib.recfunctions. It converts structured arrays to regular ones, handling the messy parts for you.

from numpy.lib.recfunctions import structured_to_unstructured

regular_arr = structured_to_unstructured(structured_arr[['age', 'height']])

This method works great when you’re converting table-like data from CSVs or databases for analysis.
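Once the data is in a structured array, you can treat each field like a column. The sketch below uses a made-up three-row dataset with the same dtype as above to show field access, sorting, and filtering.

```python
import numpy as np

dt = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
people = np.array([('Alice', 25, 5.5), ('Bob', 30, 6.0), ('Carol', 22, 5.8)],
                  dtype=dt)

# Access one field as a regular array
ages = people['age']
print(ages)  # [25 30 22]

# Sort the records by a field
by_age = np.sort(people, order='age')
print(by_age['name'])  # ['Carol' 'Alice' 'Bob']

# Filter rows with a boolean condition on a field
tall_names = people[people['height'] > 5.6]['name']
print(tall_names)  # ['Bob' 'Carol']
```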

52. What is fancy indexing in NumPy?

Fancy indexing lets you select multiple elements from an array using arrays or lists of indices. Instead of picking things one by one, you grab a bunch at once based on their positions.

Just pass an array of integers as indices to pull out exactly what you want. The order doesn’t matter, and they don’t have to be next to each other.

import numpy as np

a = np.array([10, 20, 30, 40, 50])
indices = np.array([0, 2, 4])
result = a[indices]
print(result)  # Output: [10 30 50]

Fancy indexing is different from slicing because you can pick non-consecutive elements in any order. It also always returns a copy, whereas basic slicing returns a view of the original data.

This also works for two-dimensional arrays. You can select whole rows, or even mix and match elements from different rows and columns.
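A short sketch of the 2D case, using a made-up 3x4 grid. It also shows that the result is a copy, so changing it leaves the original alone.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# Select whole rows 0 and 2
rows = grid[[0, 2]]

# Pair up row and column indices: elements (0, 1) and (2, 3)
picked = grid[[0, 2], [1, 3]]
print(picked)  # [ 1 11]

# Fancy indexing returns a copy, so the original is untouched
rows[0, 0] = -1
print(grid[0, 0])  # 0
```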

53. How to use slicing with steps in arrays?

Array slicing with steps follows the pattern array[start:stop:step]. The step decides how many elements to skip each time.

import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
result = arr[0:10:2]  # Returns [0, 2, 4, 6, 8]

Step can be positive or negative. A negative step walks through the array backwards, so arr[::-1] gives you the reversed array.

reversed_arr = arr[::-1]  # Reverses entire array
every_third = arr[::3]    # Returns [0, 3, 6, 9]

For multi-dimensional arrays, you can use steps on each axis separately.

arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
result = arr_2d[::2, ::2]  # Selects every other row and column
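To make the results concrete, here is a sketch that combines start, stop, and negative steps and checks the output of the 2D slice.

```python
import numpy as np

arr = np.arange(10)

# Walk backwards from index 8 down to (but not including) index 2
backwards = arr[8:2:-2]
print(backwards)  # [8 6 4]

arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Every other row and every other column
corners = arr_2d[::2, ::2]
print(corners)  # [[ 1  3] [ 9 11]]

# Rows in reverse order, columns 1 and 3 only
mixed = arr_2d[::-1, 1::2]
print(mixed)  # [[10 12] [ 6  8] [ 2  4]]
```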

54. Explain the function np.broadcast()

The np.broadcast() function creates an object that mimics how NumPy would broadcast arrays together. Give it a few arrays, and it returns a broadcast object showing how the shapes would combine.

This is handy if you want to double-check how broadcasting will work before running your computation. The broadcast object has properties like shape and nd that show the resulting dimensions.

import numpy as np

array1 = np.array([1, 2, 3])
array2 = np.array([[1], [2], [3]])

broadcast_obj = np.broadcast(array1, array2)
print(broadcast_obj.shape)  # Output: (3, 3)

Unlike standard broadcasting, this function just gives you info about how arrays will broadcast. It doesn’t do the math itself. You can also use the broadcast object as an iterator to walk through broadcasted values if you want.
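Iterating over the broadcast object could look like the sketch below (array names are illustrative). Each step yields one pair of values, in the same order NumPy would use when broadcasting the two arrays.

```python
import numpy as np

row = np.array([1, 2, 3])
col = np.array([[10], [20]])

b = np.broadcast(row, col)
shape = b.shape
print(shape, b.size)  # (2, 3) 6

# Walk through the broadcast pairs and add them by hand
sums = np.array([x + y for x, y in b]).reshape(shape)
print(sums)  # [[11 12 13] [21 22 23]]

# Same result as letting NumPy broadcast directly
matches = np.array_equal(sums, row + col)
```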

55. How to perform cumulative operations like cumsum() and cumprod()?

NumPy has two handy functions for cumulative stuff: cumsum() for running totals and cumprod() for running products. These go through the array in order, updating the total at each spot.

With cumsum(), each element gets added to the sum of everything before it. If you pass a simple array, you get back a new array where each value is the sum up to that point.

import numpy as np

arr = np.array([1, 2, 3, 4])
cumulative_sum = np.cumsum(arr)
# Result: [1, 3, 6, 10]

cumprod() does the same thing, but multiplies instead of adding.

cumulative_product = np.cumprod(arr)
# Result: [1, 2, 6, 24]

Both functions take an axis parameter for multi-dimensional arrays. Set axis=0 to work down the columns, or axis=1 to go across the rows.
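A small sketch of the axis parameter on a 2x3 array, including what happens when you leave it out:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

down_columns = np.cumsum(m, axis=0)   # [[1 2 3] [5 7 9]]
across_rows = np.cumsum(m, axis=1)    # [[1 3 6] [4 9 15]]

# Without axis, the array is flattened first
flat_total = np.cumsum(m)             # [1 3 6 10 15 21]

row_products = np.cumprod(m, axis=1)  # [[1 2 6] [4 20 120]]
```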

56. What are masked arrays, and how to use them?

Masked arrays in NumPy help you deal with missing or bad data. They’re made up of your regular data and a boolean mask that says which values to ignore.

The mask uses True for invalid data and False for good data. This way, you can run calculations on just the valid numbers without deleting anything.

To make a masked array, use numpy.ma like this:

import numpy as np
import numpy.ma as ma

data = np.array([1, 2, -999, 4, 5])
masked_data = ma.masked_equal(data, -999)

You can also create a mask manually if you want more control:

data = np.array([1, 2, 3, 4, 5])
mask = [False, False, True, False, False]
masked_array = ma.array(data, mask=mask)

Masked arrays work with most NumPy operations. Masked values just get skipped in things like mean, sum, and standard deviation.
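Here is a sketch of what that looks like with the -999 sentinel from the first example. The count() and filled() methods are handy for checking how many values survived and for exporting the data again.

```python
import numpy as np
import numpy.ma as ma

data = np.array([1, 2, -999, 4, 5])
masked_data = ma.masked_equal(data, -999)

mean_value = masked_data.mean()   # 3.0, the -999 is skipped
total = masked_data.sum()         # 12
valid_count = masked_data.count() # 4 unmasked values

# Replace masked entries with a fill value to get a plain array back
filled = masked_data.filled(0)
print(filled)  # [1 2 0 4 5]
```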

57. Difference between np.copy() and simple assignment

If you assign an array to a new variable with =, you’re just creating a second reference to the same array. Both variables point to the same data in memory, so a change in one shows up in the other.

import numpy as np
A = np.array([1, 2, 3])
B = A
B[0] = 99
print(A)  # Output: [99  2  3]

If you want a real, independent copy, use np.copy(). Now, changes to the new array don’t touch the original.

A = np.array([1, 2, 3])
B = np.copy(A)
B[0] = 99
print(A)  # Output: [1 2 3]

The big difference? Assignment shares memory, while np.copy() gives you a whole new chunk.
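Slices sit in between: a slice is a new array object, but it is a view on the same memory. The sketch below uses np.shares_memory() to tell the cases apart.

```python
import numpy as np

A = np.arange(5)

view = A[1:4]             # slice: a view on A's data
independent = A[1:4].copy()  # explicit copy

view[0] = 99
print(A)  # [ 0 99  2  3  4], the change shows up in A

view_shares = np.shares_memory(A, view)         # True
copy_shares = np.shares_memory(A, independent)  # False
print(independent[0])  # 1, the copy kept the old value
```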

58. How to handle complex numbers in NumPy?

NumPy supports complex numbers right out of the box with its complex types. A complex number has a real part and an imaginary part, written like a + bj.

You can make a complex array by using np.complex128, or just type in complex values:

import numpy as np

# Create complex numbers
z = np.complex128(3 + 4j)  # 3 + 4j (np.complex was removed in NumPy 1.24)
arr = np.array([1+2j, 3+4j, 5+6j])

To get the real or imaginary part, use .real and .imag:

print(arr.real)  # [1. 3. 5.]
print(arr.imag)  # [2. 4. 6.]

All the usual math operations work fine with complex arrays. You can also grab conjugates with np.conj() or get magnitudes using np.abs().
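A brief sketch of those operations, with values chosen so the results are easy to check:

```python
import numpy as np

arr = np.array([3 + 4j, 1 - 1j])

magnitudes = np.abs(arr)    # [5.0, 1.414...]
conjugates = np.conj(arr)   # [3-4j, 1+1j]

# A number times its conjugate gives the squared magnitude
squared = (arr * conjugates).real  # [25., 2.]

# Phase angle in radians
angle = np.angle(1j)        # pi / 2
```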

59. Explain the function np.meshgrid() and its applications.

np.meshgrid() builds a rectangular grid out of two one-dimensional arrays. Give it coordinate vectors, and it spits out coordinate matrices for every grid point.

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5])
X, Y = np.meshgrid(x, y)

You end up with two 2D arrays. X repeats x-coordinates for each y, and Y repeats y-coordinates for each x.

There are two indexing modes: ‘xy’ (the default, Cartesian) and ‘ij’ (matrix-style).

People use meshgrid() for things like evaluating functions on a 2D grid, making surface plots, or generating coordinate points for calculations. It’s way easier than writing nested loops by hand.
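For example, evaluating a function on the grid from above might look like this sketch. The function x**2 + y is arbitrary; the point is that no loops are needed.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5])

# Default 'xy' indexing: shape is (len(y), len(x))
X, Y = np.meshgrid(x, y)
print(X.shape)  # (2, 3)

# Evaluate z = x**2 + y at every grid point
Z = X ** 2 + Y
print(Z)  # [[ 5  8 13] [ 6  9 14]]

# 'ij' indexing flips the shape to (len(x), len(y))
Xi, Yi = np.meshgrid(x, y, indexing='ij')
print(Xi.shape)  # (3, 2)
```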

60. What is the dtype parameter, and how does it influence array creation?

The dtype parameter sets the data type for the elements in your NumPy array. It controls what kind of values the array holds—integers, floats, objects, you name it.

When you make an array, dtype affects both memory use and speed. NumPy checks this value to figure out how much space each element needs and how to read those bytes.

import numpy as np

# Create arrays with different dtypes
arr_int = np.array([1, 2, 3], dtype=np.int32)
arr_float = np.array([1, 2, 3], dtype=np.float64)
arr_object = np.array([1, 2, 3], dtype=object)

print(arr_int.dtype)    # int32
print(arr_float.dtype)  # float64
print(arr_object.dtype) # object

With dtype, every value in the array has the same type. That consistency is what lets NumPy work so much faster than regular Python lists.
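To see the memory impact, compare itemsize and nbytes for different dtypes. The sketch also shows astype(), which converts an existing array and truncates floats toward zero when casting to an integer type.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
large = np.array([1, 2, 3], dtype=np.float64)

print(small.itemsize, small.nbytes)  # 1 3
print(large.itemsize, large.nbytes)  # 8 24

# Converting floats to integers drops the fractional part
truncated = np.array([1.7, 2.2, -0.5]).astype(np.int32)
print(truncated)  # [1 2 0]
```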

61. How do you work with record arrays?

Record arrays let you access data by field names, not just by index. They’re like structured arrays but with a few extra tricks for easier access.

You can make a record array with np.recarray() or by converting a structured array using .view(np.recarray). Here’s a quick example:

import numpy as np

# Create a record array
data = np.recarray((3,), dtype=[('name', 'U10'), ('age', 'i4'), ('score', 'f4')])
data.name = ['Alice', 'Bob', 'Carol']
data.age = [25, 30, 28]
data.score = [95.5, 88.0, 92.3]

# Access fields directly
print(data.name)
print(data.age[0])

You can sort record arrays by field or apply functions across columns. They’re just easier to work with than plain structured arrays, honestly.
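The .view(np.recarray) route mentioned above could be sketched like this, with a made-up two-row dataset. The view shares memory with the structured array, so nothing is copied.

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('score', 'f4')]
structured = np.array([('Alice', 25, 95.5), ('Bob', 30, 88.0)], dtype=dtype)

# Reinterpret the structured array as a record array (no copy)
rec = structured.view(np.recarray)
print(rec.age)  # [25 30]

# Order the records by score, lowest first
by_score = rec[np.argsort(rec.score)]
print(by_score.name)  # ['Bob' 'Alice']
```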

62. Describe the process to reshape without copying data.

NumPy’s reshape tries to give you a view of the original array, not a copy, if the memory layout allows it. This works when the array is stored in one continuous block.

Arrays from things like arange() or zeros() are usually contiguous. Here’s a typical reshape:

import numpy as np

arr = np.arange(12)
reshaped = arr.reshape(3, 4)

The reshaped array points to the same memory as the original. Change one, and the other changes too.

If the array’s memory isn’t contiguous (maybe from slicing), NumPy might have to make a copy. This usually doesn’t matter, but if you’re working with giant arrays, it’s worth keeping in mind.
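If you want to know whether a particular reshape gave you a view or a copy, np.shares_memory() will tell you. A minimal sketch:

```python
import numpy as np

arr = np.arange(12)
reshaped = arr.reshape(3, 4)

# Contiguous data: reshape returns a view
is_view = np.shares_memory(arr, reshaped)  # True
reshaped[0, 0] = 100
print(arr[0])  # 100

# A transposed array can't be flattened without moving data,
# so this reshape has to copy
flattened = reshaped.T.reshape(12)
is_copy = not np.shares_memory(reshaped, flattened)  # True
```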

63. Explain the np.fromfunction() method.

np.fromfunction() builds an array by evaluating a function over the coordinates of every position. You give it a function and a shape, and the array is filled with the function’s output for each spot.

The function receives the coordinate indices as arguments. For a 2D array, that’s row and column; for 3D, it’s three indices, and so on. NumPy calls the function once with whole index arrays rather than once per element, so the function needs to work on arrays.

import numpy as np

# Create a 3x3 array where each element is i + j
arr = np.fromfunction(lambda i, j: i + j, (3, 3))
print(arr)
# Output: [[0. 1. 2.]
#          [1. 2. 3.]
#          [2. 3. 4.]]

This is great for making arrays with mathematical patterns. By default, you get floats, but you can set dtype if you want something else.
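As a sketch of the dtype option, here is a small multiplication table built with integer indices. The lambda is only an example pattern.

```python
import numpy as np

# i and j arrive as integer index arrays because dtype=int
table = np.fromfunction(lambda i, j: (i + 1) * (j + 1), (3, 3), dtype=int)
print(table)
# [[1 2 3]
#  [2 4 6]
#  [3 6 9]]
```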

64. How does NumPy integrate with other Python libraries?

NumPy is pretty much the backbone of Python’s scientific stack. Lots of other libraries lean on it for their array structures.

Pandas, for instance, uses NumPy arrays under the hood for DataFrames. That makes data wrangling much faster.

SciPy builds on NumPy by adding optimisation, signal processing, and stats tools. You can pass NumPy arrays right into SciPy functions, and it just works.

import numpy as np
import pandas as pd

# NumPy array to Pandas DataFrame
arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, columns=['A', 'B'])

Plotting libraries like Matplotlib take NumPy arrays directly for charts and graphs. Machine learning frameworks (TensorFlow, PyTorch, etc.) can convert NumPy arrays into their own tensor types, so moving data around is easy.

65. Discuss multi-threading and parallelism in NumPy.

NumPy can use multiple threads for some operations without you having to do anything, mainly the linear algebra routines backed by BLAS. Many NumPy functions also release Python’s GIL while they run, so other Python threads can make progress at the same time.

Under the hood, NumPy relies on BLAS libraries for heavy math. Those BLAS libraries handle threading automatically, so things like matrix multiplication get a big speed boost.

import numpy as np

# This operation may use multiple threads automatically
large_matrix = np.random.rand(1000, 1000)
result = np.dot(large_matrix, large_matrix)

Reading NumPy arrays from several threads is generally safe. If multiple threads write to the same array, you still need to coordinate them yourself, just as with any shared data.

If you want to parallelise work that’s not handled by BLAS, you can use multiprocessing to split things across CPUs yourself.
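Here is a minimal sketch of splitting work into chunks yourself. It uses a thread pool rather than multiprocessing to keep the example short; the helper name chunk_sum_of_squares is made up, and whether you see a real speed-up depends on the operation and your machine.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

data = np.arange(1_000_000, dtype=np.int64)
chunks = np.array_split(data, 4)

def chunk_sum_of_squares(chunk):
    # Hypothetical helper: each worker handles one chunk
    return int(np.sum(chunk * chunk))

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(chunk_sum_of_squares, chunks))

# Combine the partial results
total = sum(partial_sums)
print(total == int(np.sum(data * data)))  # True
```

The same split-then-combine pattern carries over to multiprocessing.Pool when the work is CPU-bound Python code.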

66. How to check array contiguity?

To see if an array is contiguous in memory, check the flags attribute. It has boolean values that show how the array is stored.

For C-contiguous arrays, use array.flags['C_CONTIGUOUS'] or array.flags.c_contiguous. For Fortran-contiguous, try array.flags['F_CONTIGUOUS'] or array.flags.f_contiguous.

import numpy as np

arr = np.array([[1, 2], [3, 4]])
print(arr.flags.c_contiguous)  # True
print(arr.flags.f_contiguous)  # False

arr_t = arr.T
print(arr_t.flags.c_contiguous)  # False
print(arr_t.flags.f_contiguous)  # True

If flags returns True, the array’s memory is contiguous in that order. This info can help you squeeze out more performance or make sure your code plays nice with other libraries.
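When a library needs contiguous input and your array isn’t, np.ascontiguousarray() gives you a contiguous version. A short sketch with a strided slice:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Taking every other column leaves gaps in memory
sliced = arr[:, ::2]
print(sliced.flags.c_contiguous)  # False
print(sliced.flags.f_contiguous)  # False

# Copy into a fresh C-contiguous block
fixed = np.ascontiguousarray(sliced)
print(fixed.flags.c_contiguous)  # True
same_values = np.array_equal(sliced, fixed)  # True
```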

67. Describe how to convert NumPy arrays to pandas DataFrames.

Turning a NumPy array into a pandas DataFrame is honestly pretty simple with pd.DataFrame(). Just pass your array to this function, and you’ll get a DataFrame back.

import numpy as np
import pandas as pd

array = np.array([[1, 2, 3], [4, 5, 6]])
df = pd.DataFrame(array)

Pandas automatically assigns numeric labels to both rows and columns, starting from zero. If you’d rather have custom column names, just provide a list with the columns parameter.

df = pd.DataFrame(array, columns=['A', 'B', 'C'])

You can also set custom row labels by using the index parameter. This conversion is handy because DataFrames make data analysis and manipulation a lot easier than working with plain NumPy arrays.

68. Explain memory mapping in NumPy arrays.

Memory mapping lets NumPy handle huge arrays that won’t fit into RAM. The actual data stays on disk, and NumPy pulls in only what you need, when you need it.

If you access certain elements, the operating system grabs just those parts from the file. Any changes get written back to disk, so you never have to load the whole thing into memory.

To do this, NumPy gives you the memmap function. It creates an object that acts like a normal array but is really just a window into a file.

import numpy as np

# Create a memory-mapped array
data = np.memmap('large_data.dat', dtype='float32', mode='w+', shape=(10000, 10000))

This approach is a lifesaver when your dataset is way bigger than your available RAM. It also lets multiple processes work with the same data file efficiently, without duplicating memory.
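A fuller round trip might look like the sketch below: write through a memory map, flush, then reopen the file read-only. The temporary directory and the small 100x100 shape are chosen just to keep the example light.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'example.dat')

# Create the file and write a few values through the map
writer = np.memmap(path, dtype='float32', mode='w+', shape=(100, 100))
writer[0, :5] = [1, 2, 3, 4, 5]
writer.flush()  # push pending changes to disk
del writer      # release the map

# Reopen read-only; dtype and shape must match what was written
reader = np.memmap(path, dtype='float32', mode='r', shape=(100, 100))
first_values = reader[0, :5].copy()
print(first_values)  # [1. 2. 3. 4. 5.]
```

The file itself stores only raw bytes, so you have to keep track of the dtype and shape yourself.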

69. How to compute covariance and correlation with NumPy?

NumPy has built-in functions for both covariance and correlation. Use np.cov() for the covariance matrix and np.corrcoef() for the correlation coefficient matrix.

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 6])

# Calculate covariance matrix
covariance = np.cov(x, y)
print(covariance)

# Calculate correlation matrix
correlation = np.corrcoef(x, y)
print(correlation)

The covariance matrix tells you how variables shift together. On the diagonal, you’ll see the variance; off-diagonal values show the covariance between variables. By default, np.cov() uses the sample formula, dividing by N - 1.

The correlation matrix normalises these values to between -1 and 1, making it easier to get a sense of the relationship’s strength.
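The link between the two matrices can be checked by hand: divide the covariance by the product of the standard deviations. A sketch with the same x and y:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 6])

cov = np.cov(x, y)
corr = np.corrcoef(x, y)

print(cov[0, 1])  # 2.0, covariance of x and y

# Correlation = covariance / (std_x * std_y)
manual_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(np.isclose(manual_corr, corr[0, 1]))  # True
```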

70. What are the best practices for debugging NumPy code?

Start by checking the shape and data type of your arrays using array.shape and array.dtype. These two will catch most dimension or type issues right away.

import numpy as np
arr = np.array([1, 2, 3])
print(arr.shape)  # (3,)
print(arr.dtype)  # int64 on most platforms

Adding print statements at key points helps you see what’s actually in your arrays. Sometimes that’s all you need to spot where things go off the rails.

You can also use np.set_printoptions() to tweak how arrays get displayed. Adjusting precision or suppress can make big arrays easier to read when you’re deep in debugging.

np.set_printoptions(precision=3, suppress=True)

Try things out with small arrays before scaling up. It’s easier to spot logic errors when you know what the output should be.

Don’t forget to check for NaN or infinity values with np.isnan() and np.isinf(). These can sneak in and quietly break your calculations.
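A quick sketch of those checks, plus np.testing.assert_allclose() for comparing floating-point results with a tolerance:

```python
import numpy as np

values = np.array([1.0, np.nan, np.inf, 4.0])

has_nan = np.isnan(values).any()  # True
has_inf = np.isinf(values).any()  # True

# isfinite is False for both NaN and infinity
finite_mask = np.isfinite(values)
clean = values[finite_mask]
print(clean)  # [1. 4.]

# Compare floats with a tolerance instead of ==
np.testing.assert_allclose(0.1 + 0.2, 0.3)
```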

71. How to calculate percentiles and quantiles in NumPy?

NumPy gives you np.percentile() and np.quantile() for this. Percentile takes a value from 0 to 100, while quantile wants something between 0 and 1.

import numpy as np

data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Calculate 75th percentile
p75 = np.percentile(data, 75)  # Returns 77.5

# Calculate equivalent quantile
q75 = np.quantile(data, 0.75)  # Returns 77.5

Both functions work just fine on multi-dimensional arrays. If you want to calculate along a specific axis, just use the axis parameter.

data_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Calculate along columns
col_percentiles = np.percentile(data_2d, 50, axis=0)

You can even pass a list of percentiles to get multiple results at once. Super convenient.
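Passing several percentiles at once could look like this, using the same data as above:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# One call, three results: quartiles of the data
quartiles = np.percentile(data, [25, 50, 75])
print(quartiles)  # [32.5 55.  77.5]

# The 50th percentile is the median
median = np.median(data)  # 55.0
```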

72. Explain the np.select() function and its use cases.

np.select() lets you build an output array by picking values from different choices based on a list of conditions. You give it the conditions and the corresponding choices.

import numpy as np

array = np.array([-3, 0, 7])
conditions = [array < 0, array == 0, array > 0]
choices = [-1, 0, 1]
result = np.select(conditions, choices, default=99)

It checks each condition in order, and as soon as one is true for an element, it pulls the value from the matching choice. If nothing matches, it uses the default value.

This is great for things like categorising data, setting up grading schemes, or any situation where you need to assign values based on multiple rules. For a single condition, though, np.where() is usually quicker and simpler.
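A grading scheme is a good way to see the “first match wins” rule in action. In this sketch the score thresholds are made up; note that 95 also satisfies the later conditions, but only the first one counts.

```python
import numpy as np

scores = np.array([95, 82, 67, 45])

# Conditions are checked in order; the first True one decides
conditions = [scores >= 90, scores >= 80, scores >= 60]
choices = ['A', 'B', 'C']

grades = np.select(conditions, choices, default='F')
print(grades)  # ['A' 'B' 'C' 'F']
```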

73. How to use masked arrays for conditional computation?

Masked arrays in NumPy let you run calculations while ignoring certain elements that meet a condition. You get these features from the numpy.ma module.

To make a masked array, use numpy.ma.masked_where() and give it a condition. It’ll mask the elements where your condition holds.

import numpy as np
import numpy.ma as ma

# Create an array
data = np.array([1, -2, 3, -4, 5])

# Mask negative values
masked_data = ma.masked_where(data < 0, data)

# Perform computation (ignores masked values)
result = masked_data.mean()  # Only uses positive values

Masked arrays play nicely with regular NumPy functions. The masked elements basically disappear from your calculations, so you don’t have to mess with new arrays or complicated filters.
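Conditions can be combined with the usual boolean operators. The sketch below masks readings outside a plausible range (the 0 to 100 limits and the sample values are invented for illustration).

```python
import numpy as np
import numpy.ma as ma

readings = np.array([12.5, -999.0, 18.0, 250.0, 21.5])

# Mask anything outside the assumed valid range of 0 to 100
valid = ma.masked_where((readings < 0) | (readings > 100), readings)

mean_reading = valid.mean()     # average of 12.5, 18.0, 21.5
kept = valid.compressed()       # plain array of the unmasked values
print(kept)  # [12.5 18.  21.5]
```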

74. What are potential pitfalls when using broadcasting?

Broadcasting sometimes surprises you when array shapes line up in ways you didn’t expect. For instance, mixing a 1D array with a 2D array might give you a shape you didn’t want.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1], [2], [3]])
result = a + b  # Creates 3x3 array, not 3x1

It can also chew up a lot of memory if you’re not careful. Some simple-looking operations actually create huge intermediate arrays in the background.

When shapes don’t match up, debugging gets tricky. NumPy will throw a ValueError, but the message doesn’t always make it obvious what went wrong.

Sometimes, the code runs without errors but gives the wrong answer, just because the arrays broadcasted in a way you didn’t intend. That’s a headache.

# Intended: subtract row means
data = np.array([[1, 2], [3, 4]])
means = np.array([1.5, 3.5])  # Wrong shape
result = data - means  # Broadcasts incorrectly
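One way to fix the row-means example is to keep the reduced axis with keepdims=True, so the means have shape (2, 1) and line up with the rows. A sketch:

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])

# Shape (2,): broadcasts across columns, which is not what we want
wrong = data - data.mean(axis=1)

# Shape (2, 1): each row gets its own mean subtracted
row_means = data.mean(axis=1, keepdims=True)
centered = data - row_means
print(centered)  # [[-0.5  0.5] [-0.5  0.5]]
```

Printing the shapes of both operands before the subtraction is usually the fastest way to catch this kind of bug.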

75. How to apply mathematical functions like exponential and logarithm to arrays?

NumPy makes it easy to apply math functions to whole arrays. Use np.exp() for exponentials (e^x) and np.log() for natural logarithms, all element-wise.

import numpy as np

arr = np.array([1, 2, 3, 4])
exponential = np.exp(arr)  # Returns [2.718, 7.389, 20.086, 54.598]
logarithm = np.log(arr)    # Returns [0, 0.693, 1.099, 1.386]

There are also np.log10() and np.log2() for base-10 and base-2 logs. These work on arrays of any shape and spit out a new array with the same dimensions.

log_base10 = np.log10(arr)  # Base-10 logarithm
log_base2 = np.log2(arr)    # Base-2 logarithm

These operations are way faster than looping through elements in Python. NumPy handles everything at once, and it’s honestly hard to beat that speed.
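For values very close to zero, np.log1p() and np.expm1() are the more accurate relatives of log and exp. A small sketch:

```python
import numpy as np

x = np.array([0.0, 1e-10, 1.0])

# log1p(x) computes log(1 + x) without losing tiny values of x
log_values = np.log1p(x)

# expm1 is the inverse: exp(x) - 1
round_trip = np.expm1(log_values)
recovered = np.allclose(round_trip, x)
print(recovered)  # True
```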

Conclusion

NumPy is a core tool for anyone working with data in Python. The 75 questions here cover everything from basic array operations to performance tricks you might not expect.

If you really master these concepts, you’ll walk into technical interviews with a lot more confidence. The questions jump from simple syntax to tricky, real-world problem-solving.

Key areas to focus on include:

Array creation and manipulation
Indexing and slicing operations
Mathematical and statistical functions
Broadcasting rules
Performance optimisation techniques

Honestly, practice makes all the difference. It’s so much better to write code for each concept than just read about it.

import numpy as np
# Practice with real examples
arr = np.array([1, 2, 3, 4, 5])

Interviewers want to see that you not only understand NumPy but can explain your choices. Why pick one function over another? That’s something they’ll ask.

Most interviews will test:

Array fundamentals
Data manipulation skills
Problem-solving ability
Code efficiency awareness

These questions reflect what hiring managers actually care about in data science, machine learning, and Python roles. Regular practice with these topics really does build confidence.

Don’t be afraid to revisit tough questions. Dig into the NumPy docs too; there’s always more to learn, and it pays off beyond just interview prep.

Bijay Kumar

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers.