In this Python tutorial, we will discuss how to work with Excel files in Python. In addition, we will also learn to read and write Excel files in Python with various examples.
Excel files are ubiquitous in the world of data handling and analysis. Whether you’re a data scientist, data analyst, or Python programmer, you’ll likely need to read or write Excel files at some point in your career.
While Python does not have built-in support for reading and writing Excel files, there are several libraries available that make this task simple. We will discuss the most popular libraries with examples in detail.
Write To Excel File in Python
Writing to an Excel file in Python involves creating or opening an Excel file and then adding or modifying data in it. Several libraries can be used for this purpose, such as openpyxl, xlsxwriter, xlwt, etc.
Here’s a brief explanation of how this process works:
- Create or open a workbook: The first step is to create a new Excel file or open an existing one. This is done using the appropriate function provided by the library you’re using.
- Add or select a worksheet: An Excel file can contain multiple worksheets. Therefore, you must add a new worksheet to the workbook or select an existing one to write data to it.
- Write data: You can write data to cells individually, or you can write rows or columns of data at once. The method for doing this depends on the library you’re using.
- Save the workbook: After writing the data, you need to save the changes to the workbook.
Let’s discuss various methods that help to write an Excel file in Python.
Writing to Excel Files with openpyxl in Python
Prerequisites: Ensure you have the following packages installed:
- Python
- openpyxl: pip install openpyxl
openpyxl
is a Python library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files. Here’s an example of writing to Excel files using openpyxl
:
from openpyxl import Workbook
# Initialize a workbook
workbook = Workbook()
# Get the active sheet in the workbook (default sheet)
worksheet = workbook.active
# US-based Student Data
data = [
["Name", "Age", "City", "Grade"],
["John Doe", 20, "Los Angeles", "B"],
["Jane Doe", 22, "New York", "A"],
["Jim Brown", 21, "Chicago", "B+"]
]
# Write data to worksheet
for row in data:
worksheet.append(row)
# Save workbook to a file
workbook.save(filename="students_openpyxl.xlsx")
This Python script uses the openpyxl
library to create an Excel file.
- It first imports the
Workbook
class fromopenpyxl
and creates a new workbook (Excel file) in memory. - It then gets the active worksheet from the workbook (the default sheet you see when you open a new Excel file).
- It defines the student data as a list of lists where each inner list is a row of data.
- It then iterates over the data, appending each row to the worksheet.
- Finally, it saves the workbook to a physical file called
students_openpyxl.xlsx
.
The resulting Excel file has a single sheet with the student data arranged in rows.
Output:
students_openpyxl.xlsx
‘Writing to Excel Files with xlwt in Python
Prerequisites: Ensure you have the following packages installed:
- Python
- xlwt: pip install xlwt
xlwt
is a Python library for writing data and formatting information to older Excel files (ie: .xls). Let’s write the same data using xlwt
:
import xlwt
# Initialize a workbook
workbook = xlwt.Workbook()
# Add a sheet to the workbook
worksheet = workbook.add_sheet("US Sports Teams")
# US-based Sports Team Data
teams = [
["Name", "City", "Sport"],
["Lakers", "Los Angeles", "Basketball"],
["Yankees", "New York", "Baseball"],
["Cowboys", "Dallas", "Football"]
]
# Write data to worksheet
for i, team in enumerate(teams):
for j, info in enumerate(team):
worksheet.write(i, j, info)
# Save workbook to a file
workbook.save("us_sports_teams_xlwt.xls")
This script uses the xlwt
library to write data about US sports teams to an Excel file:
- A new Excel workbook is created using
xlwt.Workbook()
. - A worksheet named “US Sports Teams” is added to the workbook with
workbook.add_sheet("US Sports Teams")
. - A list of sports teams, where each item is a list representing a row of data in the Excel file, is defined.
- The script then iterates over each team in the list, and for each team, it writes every piece of information (
info
) to the corresponding cell in the worksheet usingworksheet.write(i, j, info)
. - The workbook is saved to a file named “
us_sports_teams_xlwt.xls
” usingworkbook.save("us_sports_teams_xlwt.xls")
.
After running this script, you’ll have an Excel file named "us_sports_teams_xlwt.xls"
that contains the data on the sports teams.
Output:
us_sports_teams_xlwt.xls
“Writing to Excel Files with xlsxwriter in Python
Prerequisites: Ensure you have the following packages installed:
- Python
- xlsxwriter: pip install xlsxwriter
xlsxwriter
is a Python library for creating Excel .xlsx files. Let’s write some data to an Excel file using xlsxwriter
:
import xlsxwriter
# Create a workbook and add a worksheet
workbook = xlsxwriter.Workbook('us_companies_xlsxwriter.xlsx')
worksheet = workbook.add_worksheet()
# US-based Company Data
companies = [
["Name", "Headquarters", "CEO"],
["Apple", "Cupertino, CA", "Tim Cook"],
["Microsoft", "Redmond, WA", "Satya Nadella"],
["Google", "Mountain View, CA", "Sundar Pichai"],
["Amazon", "Seattle, WA", "Andy Jassy"],
]
# Write data to worksheet
for i, company in enumerate(companies):
for j, info in enumerate(company):
worksheet.write(i, j, info)
# Close the workbook
workbook.close()
This script uses the xlsxwriter
library to create an Excel file with US-based company data.
- It begins by creating a new workbook (Excel file) and a worksheet (tab within the Excel file).
- It then defines a list of companies, with each company represented as a list of data (name, headquarters, CEO).
- The script then loops over the companies list and for each company, it loops over its data and writes it to the worksheet using
worksheet.write(i, j, info)
. Thewrite()
method takes the row index (i
), the column index (j
), and the data to be written (info
). - Finally, it closes the workbook, effectively saving the Excel file as ‘
us_companies_xlsxwriter.xlsx
‘. The resulting Excel file contains a single sheet with the provided company data.
Output:
us_companies_xlsxwriter.xlsx
‘ using PythonWriting Excel Files with pyexcel in Python
Prerequisites: Ensure you have the following packages installed:
- Python
- pyexcel: pip install pyexcel pyexcel-xlsx
pyexcel
is a Python library for reading, writing, and manipulating Excel, csv, ods, odt, xlsx, and xlsm files. Here’s how you can write data to an Excel file:
import pyexcel
# US-based Landmark Data
landmarks = [
["Name", "Location"],
["Statue of Liberty", "New York"],
["Golden Gate Bridge", "San Francisco"],
["Grand Canyon", "Arizona"]
]
# Save data to an Excel file
pyexcel.save_as(array=landmarks, dest_file_name="us_landmarks_pyexcel.xlsx")
This Python script uses the pyexcel
library to write data about US landmarks to an Excel file.
- It starts by importing the
pyexcel
library. - It defines a list of landmarks, where each item is a list representing a row of data in the Excel file.
- It then uses the
pyexcel.save_as()
function to write this data to an Excel file named “us_landmarks_pyexcel.xlsx
“. Thearray
argument specifies the data to be written to the file, and thedest_file_name
argument specifies the name of the file to be created.
After running this script, an Excel file named "us_landmarks_pyexcel.xlsx"
is created with the landmarks data.
Output:
us_landmarks_pyexcel.xlsx
using PythonRead Excel Files in Python
Reading Excel files in Python involves importing the data contained in an Excel file into your Python script. This is useful when you have data stored in Excel files that you want to manipulate, analyze, or use in some way in Python.
One of the best libraries we use to read the Excel file in Python is openpyxl. The biggest advantage of this library is it works on all types of Excel file and also support multiple libraries that we used for writing the Excel files.
Let’s try to read all the Excel files that we have written above:
Example#1
from openpyxl import load_workbook
# Load in the workbook
workbook = load_workbook(filename='students_openpyxl.xlsx')
# Get the worksheet
worksheet = workbook.active
# Iterate over the rows
for row in worksheet.iter_rows(values_only=True):
print(row)
This script reads data from an Excel file using the openpyxl
library:
- It imports the
load_workbook
function fromopenpyxl
to load an existing Excel file. - It uses
load_workbook
to open the Excel file named ‘students_openpyxl.xlsx’. - It retrieves the active worksheet from the workbook using the
active
property. - It iterates over each row in the worksheet using
worksheet.iter_rows(values_only=True)
. This yields each row as a tuple of values (notCell
objects), because ofvalues_only=True
. - It prints each row’s data to the console. After running this script, the content of ‘students_openpyxl.xlsx’ will be printed to the console.
Output:
Example#2
from openpyxl import load_workbook
# Load in the workbook
workbook = load_workbook(filename='us_companies_xlsxwriter.xlsx')
# Get the worksheet
worksheet = workbook.active
# Iterate over the rows
for row in worksheet.iter_rows(values_only=True):
print(row)
This Python code uses openpyxl
to read data from an Excel file called ‘us_companies_xlsxwriter.xlsx
‘. It gets the active worksheet from the workbook and loops through its rows, printing each row’s data to the console.
Output:
us_companies_xlsxwriter.xlsx
‘Example#3
from openpyxl import load_workbook
# Load in the workbook
workbook = load_workbook(filename='us_landmarks_pyexcel.xlsx')
# Get the worksheet
worksheet = workbook.active
# Iterate over the rows
for row in worksheet.iter_rows(values_only=True):
print(row)
Here’s a detailed explanation:
from openpyxl import load_workbook
: Theload_workbook
function from theopenpyxl
library is imported. This function allows you to read existing Excel workbooks.workbook = load_workbook(filename='us_landmarks_pyexcel.xlsx')
: Theload_workbook
function is called with the filename of the Excel file you want to read. This creates aWorkbook
object that represents the Excel file.worksheet = workbook.active
: Theactive
property of theWorkbook
object is accessed to get the currently active worksheet (the tab that is opened by default when you open the workbook in Excel). This is aWorksheet
object.for row in worksheet.iter_rows(values_only=True):
: Theiter_rows
method of theWorksheet
object is called to generate all the rows in the worksheet as tuples. Thevalues_only=True
argument makes sure that the actual data in each cell is returned, instead ofCell
objects.print(row)
: Inside the loop, each row (which is a tuple of data values) is printed to the console.
Output:
us_landmarks_pyexcel.xlsx
in PythonThere is also one disadvantage of the openpyxl
library it does not work on .xls format files
. But you don’t need to take tension if you have an Excel file with .xls format. For this Python provides us a xlrd library
that helps to read.
The xlrd library
is compatible with the .xls file format. You can install xlrd
by running pip install xlrd
and then modify your code to use xlrd instead of openpyxl.
Here’s an example: In the above example, we have created an Excel file named us_sports_teams_xlwt.xls
. Let’s try to read this one.
import xlrd
# Load the workbook
workbook = xlrd.open_workbook('us_sports_teams_xlwt.xls')
# Get the worksheet
worksheet = workbook.sheet_by_index(0) # Assuming the first sheet
# Iterate over the rows
for row_idx in range(worksheet.nrows):
row = worksheet.row_values(row_idx)
print(row)
Here, we use the xlrd library to load an Excel file in the .xls format and iterate over its rows to print their values. It first opens the workbook, retrieves the first sheet, and then loops through each row, extracting and printing the row values.
Output:
us_sports_teams_xlwt.xls
Conclusion
In conclusion, working with Excel files in Python can be both straightforward and versatile, and the various powerful libraries such as openpyxl
, xlsxwriter
, xlwt
, and pyexcel
are there. Depending on the specific requirements of your task, you can choose the library that suits you best.
The ability to read Excel files into Python opens the door to a wide range of possibilities for data analysis and manipulation. Python provides a more flexible and powerful environment for working with data than Excel alone.
You may like the following Python tutorials:
- Multiply in Python with Examples
- Python invalid literal for int() with base 10
- Python sort list of tuples
- Remove Unicode characters in python
- Comment lines in Python
- Python dictionary append with examples
- Check if a list is empty in Python
- Python convert list to string
- Python input and raw_input function
- Python File methods (With Useful Examples)
- Python read a file line by line example
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I started working on Python, Machine learning, and artificial intelligence for the last 5 years. During this time I got expertise in various Python libraries also like Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, Scikit-Learn, etc… for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc. Check out my profile.