In this Python NumPy tutorial, I will explain how to read a CSV file with a header using NumPy in Python, covering several methods with illustrative examples.
To read a CSV file with a header using NumPy in Python, we can use genfromtxt() with the names=True parameter, the loadtxt() function, which is suited to numerical data, or the reader() function from Python's csv module.
NumPy Read CSV with Header in Python Methods
There are three different methods to read a CSV file with a header in Python NumPy:
- genfromtxt function
- loadtxt function
- csv.reader() function
Let’s see them one by one in detail.
Method 1: NumPy read csv file using genfromtxt in Python
numpy.genfromtxt is a powerful function provided by NumPy in Python, designed to load data from delimited text files such as CSV files, including files with missing values.
Syntax:
data = np.genfromtxt(filename, delimiter=',', names=True)
Here,
Name | Description |
---|---|
filename | The name of the input csv file that we want to read. |
delimiter | The delimiter=',' parameter specifies that the file is comma-separated. |
names | The names=True parameter tells NumPy to use the header row to define the names of the columns, which allows us to reference columns by their header name in Python. |
Example: Let's see Python code that reads a CSV file with a header using NumPy's genfromtxt function.
import numpy as np
filename = 'C:/Users/kumar/OneDrive/Desktop/CSVFile.csv'
employee_data = np.genfromtxt(filename, delimiter=',', names=True, dtype=None, encoding='utf-8')
print(employee_data)
Here,
- delimiter=',': This specifies that the fields in the file are separated by commas.
- names=True: This tells genfromtxt to treat the first row as column headers.
- dtype=None: NumPy will infer the data type of each column.
- encoding='utf-8': This specifies the encoding of the file, which is important for reading text data.
Output:
[('John Doe', 28, 'Marketing', 50000)
('Jane smith', 32, 'Engineering', 65000)
('Emily Davis', 45, 'Sales', 52000)]
This way, NumPy reads a CSV file with a header in Python using the genfromtxt() function with the names=True parameter.
To read a single column from the CSV file by its header name as a NumPy array in Python, we can apply this code:
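For readers who do not have the CSV file on disk, the same call can be tried on in-memory text. This is a minimal sketch: the sample rows mirror the output shown above, and the header names (Name, Age, Department, Salary) are assumed to match the file used in this tutorial.

```python
import io
import numpy as np

# Assumed sample data that mirrors the CSV file used in this tutorial
csv_text = """Name,Age,Department,Salary
John Doe,28,Marketing,50000
Jane smith,32,Engineering,65000
Emily Davis,45,Sales,52000
"""

# io.StringIO lets genfromtxt read from a string as if it were a file
employee_data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=',',
    names=True,
    dtype=None,
    encoding='utf-8',
)

# With names=True the result is a 1-D structured array: one record per row
column_names = employee_data.dtype.names
print(column_names)
print(employee_data.shape)
```

The header names end up in `employee_data.dtype.names`, which is a convenient way to check what genfromtxt actually parsed from the first row.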
import numpy as np
filename = 'C:/Users/kumar/OneDrive/Desktop/CSVFile.csv'
employee_data = np.genfromtxt(filename, delimiter=',', names=True, dtype=None, encoding='utf-8')
Names = employee_data['Name']
print('Names of the Employees:', Names)
Output: Here we read only the names by referring to the header name ('Name'), which gives a NumPy array of the names in Python.
Names of the Employees: ['John Doe' 'Jane smith' 'Emily Davis']
This way, NumPy reads a single column from a CSV file with a header in Python by using its header name.
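Because each named column comes back as an ordinary NumPy array, it can be used directly in calculations and boolean filters. The sketch below assumes the same sample data as the output above, supplied as in-memory text; the variable names mean_salary and older_names are only illustrative.

```python
import io
import numpy as np

# Assumed sample data that mirrors the CSV file used in this tutorial
csv_text = """Name,Age,Department,Salary
John Doe,28,Marketing,50000
Jane smith,32,Engineering,65000
Emily Davis,45,Sales,52000
"""

employee_data = np.genfromtxt(
    io.StringIO(csv_text), delimiter=',', names=True, dtype=None, encoding='utf-8'
)

# Numeric columns support the usual NumPy reductions
mean_salary = employee_data['Salary'].mean()

# A boolean mask built from one column can select values from another
older_names = employee_data['Name'][employee_data['Age'] > 30]

print('Mean salary:', mean_salary)
print('Employees older than 30:', older_names)
```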
Method 2: NumPy load csv with header in Python using loadtxt() function
NumPy’s loadtxt function is a straightforward tool for loading data from text files, with an emphasis on numerical data in Python. Unlike genfromtxt, loadtxt is generally faster but less flexible; it does not handle missing values and is more limited in handling non-numerical columns.
However, loadtxt can still be used to read CSV files with headers, with some manual handling.
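The difference in missing-value handling can be sketched with a tiny, made-up two-column file in which one value is absent. The data and the variable names (filled, loadtxt_failed) are hypothetical and only serve to contrast the two functions.

```python
import io
import numpy as np

# Hypothetical data: the second data row has no value for column b
csv_text = "a,b\n1,2\n3,\n"

# genfromtxt fills the missing float with nan
filled = np.genfromtxt(
    io.StringIO(csv_text), delimiter=',', names=True, encoding='utf-8'
)
print(filled['b'])

# loadtxt cannot convert the empty field to a float and raises ValueError
try:
    np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)
    loadtxt_failed = False
except ValueError as exc:
    loadtxt_failed = True
    print('loadtxt raised:', exc)
```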
Syntax:
data = np.loadtxt(Filename, delimiter=',', skiprows=1)
Here,
Name | Description |
---|---|
Filename | The name of the input csv file that we want to read. |
delimiter | The delimiter=',' parameter specifies that the file is comma-separated. |
skiprows | The skiprows=1 argument tells loadtxt to skip the first row (which is typically the header in a CSV file). |
usecols | The optional usecols parameter selects specific columns by their index. |
Example: Here, let's read a CSV file with a header in Python using loadtxt. The header row is read separately, and only the numeric columns are loaded.
import numpy as np
Filename = 'C:/Users/kumar/OneDrive/Desktop/CSVFile.csv'
# Read the header row separately, because loadtxt skips it below
with open(Filename, 'r') as f:
    header = f.readline().strip().split(',')
# Skip the header and load only the numeric columns (index 1 = Age, index 3 = Salary)
data = np.loadtxt(Filename, delimiter=',', skiprows=1, usecols=(1, 3))
ages = data[:, 0]      # Age data
salaries = data[:, 1]  # Salary data
print(header)
print(ages)
print(salaries)
Output: Here, we print the list of headers, followed by the Age and Salary columns as NumPy arrays of floats.
['Name', 'Age', 'Department', 'Salary']
[28. 32. 45.]
[50000. 65000. 52000.]
This way, we can use NumPy's loadtxt() function to read a CSV file with a header in Python.
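Since loadtxt returns a plain array without column names, one option is to pair the header names with the loaded columns in a dictionary. This is a sketch under the assumption that the file has the four columns shown above; the names use_columns and columns are illustrative, and the data is passed as a list of lines so the example runs without a file on disk.

```python
import numpy as np

# Assumed sample data that mirrors the CSV file used in this tutorial
csv_text = """Name,Age,Department,Salary
John Doe,28,Marketing,50000
Jane smith,32,Engineering,65000
Emily Davis,45,Sales,52000
"""

lines = csv_text.splitlines()
header = lines[0].split(',')   # first line holds the column names
use_columns = (1, 3)           # indices of the numeric columns: Age and Salary

# loadtxt also accepts a list of strings, one per row
data = np.loadtxt(lines[1:], delimiter=',', usecols=use_columns)

# Map each selected header name to its column in the loaded array
columns = {header[col]: data[:, i] for i, col in enumerate(use_columns)}

print(columns['Age'])
print(columns['Salary'])
```

This keeps the speed and simplicity of loadtxt while still allowing lookups by header name.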
Method 3: Python NumPy read csv using csv.reader()
The csv.reader function reads the file line by line and returns each row as a list of strings in Python. This means we will need to manually convert the numerical data to the appropriate types.
Example: Here's how we can use csv.reader to read a CSV file and then convert it to a NumPy array in Python.
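import numpy as np
import csv
path = 'C:/Users/kumar/OneDrive/Desktop/CSVFile.csv'
# newline='' is the way the csv module documentation recommends opening CSV files
with open(path, 'r', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    headers = next(reader)         # the first row is the header
    data = np.array(list(reader))  # remaining rows as a 2-D array of strings
print(headers)
Output: The header row is printed as a list of strings: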
['Name', 'Age', 'Department', 'Salary']
This way, we can read a CSV file with a header into a NumPy array in Python through the csv.reader() function.
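As noted above, csv.reader returns every value as a string, so numeric columns need an explicit conversion. A minimal sketch of that step, assuming the same sample data supplied as in-memory text (the variable names rows, ages, and salaries are illustrative):

```python
import csv
import io
import numpy as np

# Assumed sample data that mirrors the CSV file used in this tutorial
csv_text = """Name,Age,Department,Salary
John Doe,28,Marketing,50000
Jane smith,32,Engineering,65000
Emily Davis,45,Sales,52000
"""

reader = csv.reader(io.StringIO(csv_text), delimiter=',')
headers = next(reader)          # header row as a list of strings
rows = np.array(list(reader))   # 2-D array of strings, one row per record

# Look up each column by header name, then cast the strings to numbers
ages = rows[:, headers.index('Age')].astype(int)
salaries = rows[:, headers.index('Salary')].astype(float)

print(ages)
print(salaries)
```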
Conclusion
This article explains how to read a CSV file with a header using NumPy in Python through three methods: the genfromtxt(), loadtxt(), and csv.reader() functions. I have explained each method with illustrative examples.
Now, the choice of the methods depends upon the requirement of the problem.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn, for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries.